[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-10-13 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6535 )

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc@363
PS15, Line 363:   local_params.__set_bloom_filter(rpc_params->bloom_filter);
> Cleanup is definitely welcome.
Came across this while looking at Tianyi's patch. It looks like this patch 
actually removed the parallelisation. Before this patch the filter distribution 
was offloaded to exec_env_->rpc_pool(). After this change the filter 
distribution is done serially on the RPC thread that processes the 
UpdateFilter() message. I'm not sure what the consequences of this are, but we 
should keep an eye on this to see if it caused any regressions in Impala 2.9.

Unfortunately it looks like the change was snuck into the larger changes in 
this patch - it looks like it wasn't mentioned in the commit message or noticed 
by reviewers. Before the code was:

exec_env_->rpc_pool()->Offer(bind(DistributeFilters, rpc_params,
fragment_inst->impalad_address(), 
fragment_inst->fragment_instance_id()));



-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-Change-Number: 6535
Gerrit-PatchSet: 15
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 13 Oct 2017 18:56:52 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-09-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6535 )

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc@363
PS15, Line 363:   local_params.__set_bloom_filter(rpc_params->bloom_filter);
> Thanks for the reply! The attempt to parallelize makes sense. I'm working o
Cleanup is definitely welcome.



--
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-Change-Number: 6535
Gerrit-PatchSet: 15
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 25 Sep 2017 23:34:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-09-25 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6535 )

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc@363
PS15, Line 363:   local_params.__set_bloom_filter(rpc_params->bloom_filter);
> Note that this change only moved the code around (into coordinator.cc), it
Thanks for the reply! The attempt to parallelize makes sense. I'm working on 
surrounding code and may remove the unnecessary copy/shared_ptr along the way.



--
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-Change-Number: 6535
Gerrit-PatchSet: 15
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 25 Sep 2017 22:59:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-09-25 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6535 )

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc@363
PS15, Line 363:   local_params.__set_bloom_filter(rpc_params->bloom_filter);
> Sorry for commenting on an old CR. I'm reading this part of code and have s
Note that this change only moved the code around (into coordinator.cc), it 
didn't introduce this code. You'll have to go further back in time to see the 
change that introduced that code, and maybe there is insight there.

That said, I agree it doesn't seem like shared_ptr makes sense here (and 
usually, we should avoid use of shared_ptr in impala because it makes reasoning 
about lifetimes difficult).

>From the comment in Coordinator::UpdateFilter() it sounds like there was some 
>idea to parallelize the RPCs, which maybe is where the thought of shared_ptr 
>came in. I'm not sure if the code ever did that, though. Going back further in 
>git history may answer it.

// Make a 'master' copy that will be shared by all concurrent delivery RPC 
attempts.



--
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-Change-Number: 6535
Gerrit-PatchSet: 15
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 25 Sep 2017 22:43:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-09-25 Thread Tianyi Wang (Code Review)
Tianyi Wang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/6535 )

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/6535/15/be/src/runtime/coordinator-backend-state.cc@363
PS15, Line 363:   local_params.__set_bloom_filter(rpc_params->bloom_filter);
Sorry for commenting on an old CR. I'm reading this part of code and have 
several questions:
1. Why does rpc_params need to be a shared_ptr? I think it is only referenced 
within the scope of this function.
2. Why is a copy made here?
2. Why is __set_bloom_filter called, given that local_params is 
copy-constructed from rpc_params?



--
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-Change-Number: 6535
Gerrit-PatchSet: 15
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 25 Sep 2017 22:33:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 14: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 14:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/549/

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 14: Code-Review+2

fixing some 'unused results' warnings.

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Marcel Kornacker (Code Review)
Hello Impala Public Jenkins, Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6535

to look at the new patch set (#14).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
73 files changed, 3,884 insertions(+), 3,782 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/14
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 14
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 13: Verified-1

Build failed: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/544/

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 13:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/544/

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 13: Code-Review+2

rebase

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-08 Thread Marcel Kornacker (Code Review)
Hello Dan Hecht,

I'd like you to reexamine a change.  Please visit

http://gerrit.cloudera.org:8080/6535

to look at the new patch set (#13).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
73 files changed, 3,873 insertions(+), 3,775 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-07 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#12).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
73 files changed, 3,867 insertions(+), 3,755 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 12
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-07 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 11:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS10, Line 413: 
> In your current code, will the real status eventually be propagated as the 
clarified in query-state.h


PS10, Line 424: 
> It gets displayed as soon as there is a client request state registered, I 
we might want to fix that. i'll remove that dcheck for now.


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS11, Line 200: DCHECK(coord_instance_->WaitForPrepare().ok());
> DCHECK(coord_instance_->IsPrepared() && coord_instance_->GetPrepareStatus()
Done


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS11, Line 57: QueryState
> QueryState reference
Done


PS11, Line 59: exec-fragment-instance
> start-query-finstances
Done


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS10, Line 195:   ImpalaBackendConnection 
coord(ExecEnv::GetInstance()->impalad_client_cache(),
  :   query_ctx().coord_address, _status);
  :   if (!coord_status.ok()) {
> Yes, flooding the log. i.e. the TODO would be better if it says what should
the question is whether we want to cancel here if it's not the final report, 
ie,  done==false. i'm happy to change it, it's not clear that one approach is 
better than the other.


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS11, Line 211: if a single instance encountered an error, report an overall 
error status
  :   // for the whole query to initiate cancellation at the 
coordinator
> but why doesn't the coordinator initiate cancellation if any FIS reports er
this status was actually unused. removed.


PS11, Line 264: despire
> despite
Done


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS10, Line 132: e();
> Agree we'll revisit this in a later change.
Done


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS11, Line 250: CancelInternal
> that's no longer defined; delete.
Done


PS11, Line 260:  cancel_on_error is true.
> This is subtle as to why we have it so I think it would be helpful to expla
Done


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/test-env.cc
File be/src/runtime/test-env.cc:

Line 83:   //exec_env_->disk_io_mgr_.reset();
> what's this about?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-06 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 10:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS10, Line 204: GetPrepareStatus
> the barrier isn't really present, because getfinstancestate already waits f
Still, it's not a pattern I think we should encourage in the code base 
(DCHECK(x) where x has side-effects).  How about at least making the 
non-barrier explicit by shortcircuiting it like this:

DCHECK(coord_instance_->IsPrepared() && 
coord_instance_->GetPrepareStatus().ok());


PS10, Line 413:  backend_state->GetStatus();
> good catch, this could be cancelled at this point.
In your current code, will the real status eventually be propagated as the 
query status? Or will this race cause the query to result in CANCELLED even 
though there was a real error?


PS10, Line 424: DCHECK(query_status_.ok()); // nobody should have been able to 
cancel
> when does the query get displayed on the debug page? you still haven't retu
It gets displayed as soon as there is a client request state registered, I 
believe. (i.e. RegisterQuery()). Which is what I meant - webserver cancellation 
is definitely available before exec() returns.


PS10, Line 964: #if 0
  : do we really want this?
> remove it unless someone says we want it.
fine with me to delete.


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS11, Line 200: // fragment instance's executor will not comple
DCHECK(coord_instance_->IsPrepared() && 
coord_instance_->GetPrepareStatus().ok());

to make it explicit that this has no side-effects.


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS10, Line 51: Prepare
> i think we should do some initial setup here so we can return an error stat
the renaming helped.


PS10, Line 59: exec-fragment-instance-$0
> i left everything as is, because i didn't want to break existing tests. hap
Which tests? CAn't they be fixed? You renamed the other thread - I think it 
would be best to do this one as well. The current name is really confusing 
because it's not plural.


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/query-exec-mgr.cc
File be/src/runtime/query-exec-mgr.cc:

PS11, Line 57: n)
QueryState reference


PS11, Line 59: exec-fragment-instance
start-query-finstances


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS10, Line 195: // TODO: this might flood the log
  : LOG(WARNING) << "Couldn't get a client for " << 
query_ctx().coord_address
  : <<"\tReason: " << coord_status.GetDetail();
> address what, flooding the log?
Yes, flooding the log. i.e. the TODO would be better if it says what should 
happen in the future.

And before it looks like we'd remember the status via UpdateStatus() and then 
potentially report it back the next time we try (for periodic reports). Not 
saying that's the right solution but it does look like a subtle behavior change 
so want to think it through. I think the new behavior is probably not really 
worse though (for periodic reports, just report and try again later; for final 
report, in both old and new we'd drop it on the floor).


PS10, Line 255: // TODO: should we try to keep rpc_status for the final 
report? (but the final
  : // report, following this Cancel(), may not succeed anyway.)
> if the report rpc fails we fail the query on this node and then do nothing/
Okay. I think the old code used to remember this status (via UpdateStatus()) 
and then would report it back if it could talk to the coordinator later when 
the fragment is complete.

But I agree the protocol needs to be fixed.


Line 319:   prepare_status = instance_status;
> fis.h points out that all public non-getter functions block until the prepa
That fis.h comment was just added.

But I don't see how it's relevant to my initial comment here -- how can 
prepare_status be CANCELLED?


http://gerrit.cloudera.org:8080/#/c/6535/11/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS11, Line 211: arams.instance_exec_status.emplace_back();
  : params.__isset.instance_exec_status = true;
but why doesn't the coordinator initiate cancellation if any FIS reports error? 
 Is this to simplify the coordinator logic? I guess I'm okay with how you have 
this currently, but it just seems clearer to me to keep FIS status in only one 
place.


PS11, Line 264: 
despite


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS10, Line 59: cancelled
> vague in what way?
Vague because it's not clear what all the side effects of "cancel" are, and 
when they 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-03 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#11).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-v2-test.cc
M be/src/runtime/bufferpool/buffer-pool-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
M be/src/runtime/mem-tracker.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
73 files changed, 3,873 insertions(+), 3,744 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/11
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 11
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-03 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 10:

(72 comments)

http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS10, Line 327:  // if we get an error while trying to get a connection to the 
backend,
  :   // keep going
  :  
> what is this talking about?
removed


PS10, Line 352: keep on cancelling the other fragments
> seems misplaced now
Done


PS10, Line 354: eturn true;
> should this return false? we didn't actually succeed at sending the rpc, ri
it doesn't really matter, the return value is only used for a log message. the 
return value is true because it attempted to cancel.


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS10, Line 61: status
> GetStatus()?
Done


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS10, Line 178: close
> i'm not sure what this is talking about. is it referring to CloseConsumer()
removed comment, it was a leftover


PS10, Line 179: it should
  :   // cancel itself when it receives an error status after 
reporting its profile.
> I also don't see what this is trying to explain.
Done


PS10, Line 183:  Grab executor 
> what is this referring to?
Done


PS10, Line 184: order to get a reference to coord_instance_
  :   // so that coord_sink_ remains valid throughout query 
lifetime.
> how does getting a reference to coord_instance_ affect the validity of coor
Done


PS10, Line 189: at this point, the query is done with the Prepare phase,
> why do we know that? Oh, because GetFinstanceState() blocks until Prepare? 
added comment


PS10, Line 204: GetPrepareStatus
> this is a dangerous side-effect to have inside of a DCHECK -- it creates a 
the barrier isn't really present, because getfinstancestate already waits for 
all instances' prepare phase to finish, so it can't block here. the first 
sentence of this comment kind of explains that.


Line 263:   }
> DCHECK(!has_coord_fragment || coord_backend_idx_ != -1) ?
Done


PS10, Line 413:  backend_state->GetStatus();
> what is this trying to do? Oh, I guess get the status of BackendState::Exec
good catch, this could be cancelled at this point.

switching to per-backend status reporting will fix that.


PS10, Line 424: DCHECK(query_status_.ok()); // nobody should have been able to 
cancel
> why? because the query handle hasn't yet been returned?  But what about via
when does the query get displayed on the debug page? you still haven't returned 
from exec() at this point.


PS10, Line 964: #if 0
  : do we really want this?
> what are you going to do with this?
remove it unless someone says we want it.


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS10, Line 210: query 
> query execution? (as opposed to the query being closed)
Done


PS10, Line 299: /// True if execution has completed, false otherwise.
  :   bool execution_completed_ = false;
> doesn't appear to be used
Done


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

PS10, Line 547: ReleaseResources
> why is this called ReleaseResources and not Close? What are we saying the d
we haven't settled on anything yet. should we stipulate that close = release 
query result-related resources but keep control structures, and rename 
releaseresources to close everywhere? fine by me.


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

PS10, Line 287:   /// TODO: Move these into the new query-wide state, indexed 
by partition id.
> Remove.
Done


http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

> Like QES, I think the lifecycle state machine could be more clearly defined
Done


PS10, Line 95:   if (!status.ok() && !status.IsCancelled() && 
!status.IsMemLimitExceeded()) {
 : // Log error message in addition to returning in Status. 
Queries that do not fetch
 : // results (e.g. insert) may not receive the message 
directly and can only retrieve
 : // the log.
 : runtime_state_->LogError(status.msg());
 :   }
> you should delete this when rebasing e0d1db5
Done


PS10, Line 104:   // TODO: deal with final report failure, otherwise the 
coordinator doesn't know
  :   // we're done
> is that a regression or something that didn't work before as well?
i can probably remove this, if the final report fails, other communication may 
have failed as well


PS10, Line 213: // TODO: 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-05-01 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/10/be/src/runtime/descriptors.h
File be/src/runtime/descriptors.h:

PS10, Line 287:   /// TODO: Move these into the new query-wide state, indexed 
by partition id.
Remove.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-22 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#10).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
69 files changed, 3,772 insertions(+), 3,673 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-22 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 9:

(26 comments)

http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS9, Line 293: UpdateExecStats
> The similarity between UpdateExecStats() and UpdateExecStatus() is potentia
renamed.


Line 455: // TODO: we don't track cpu time per node now. Do that.
> More generally we should have an aggregate profile for each backend that in
rephrased


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

Line 132: const FInstanceExecParams& exec_params_;
> Might be helpful to document what owns the object being referenced.
Done


Line 164:   std::vector instance_params_list_;
> Might be helpful to document what owns the objects being referenced.
Done


Line 175:   /// Protects fields below. Can be held while doing an RPC, so 
SpinLock is a bad idea.
> Our current SpinLock implementation should be fine (it blocks after spinnin
Done


Line 219:   /// TODO: including the median doesn't compile, looks like some 
includes are missing
> This TODO has been in the code since 2012 - maybe just remove it?
Done


Line 253:   /// Root profile for all fragment instances for this fragment
> Stored in obj_pool?
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 364:   f->targets()->push_back(target);
> Maybe:
Done


Line 402:   int num_instances = schedule_.GetNumFragmentInstances();
> nit: intermediate variable seems unnecessary.
Done


Line 410:   // TODO: rename this metric
> There are a handful of minor TODOs like this. I assume there's some reason 
this was already taken care of


Line 938:   // Notify that we completed with an error.
> This comment seems wrong - the cancellation path is used to clean up even i
Done


PS9, Line 956:  
> nit: blank space
Done


Line 956:   // TODO: return here if returned_all_results_? 
> I believe we need to do cancellation even if we returned all results. Havin
i agree that we need to/should cancel as soon as we return all results. i left 
a corresponding todo in the .h file (cleaning all that up as part of this 
change is out of scope).


Line 983:   // TODO: clarify control flow here, it's unclear we should even 
process this status
> Henry had a patch that clarified this and improved the comment: https://ger
i already incorporated that


PS9, Line 1136: VLOG_QU
> delete
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 108: /// TODO: clean up locking behavior; in particular, clarify 
dependency on lock_
> Does any of the locking information in the impala-server.h need to be updat
i didn't change the locking behavior, so no.


PS9, Line 151:  WARN_UNUSED_RESULT;
> nit: long line.
Done


Line 261:   boost::mutex wait_lock_;
> A lot of these locks could safely be SpinLocks - unless they're used with a
out of scope for this change, it's already large enough.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/debug-options.cc
File be/src/runtime/debug-options.cc:

PS9, Line 71:  (*entry).
> entry->second ?
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 507: 
> Extra line
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 162:   /// True if and only if Cancel() has been called.
> I had a hard time reasoning about concurrent accesses to this until I reali
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 77:   /// Execute instances and decrement refcount.
> I find it clearer to understand as: "Acquires ownership of 'qs'".
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 43: #define RETRY_SLEEP_MS 100
> Make a class-level constant?
this will disappear with krpc


Line 311:   // don't return until every instance is prepared and record the 
first non-OK status
> Comment isn't really accurate - it doesn't record the first non-OK status b
Done


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 257:   void set_is_cancelled(bool v) { is_cancelled_ = v; }
> Not your change but related. This bool is accessed from multiple threads ye
we only ever change it from false to true, and in particular we don't do any 
compare, so i'm not sure how relevant synchronization is here.

i changed the signature to remove the bool parameter to make this clear.



[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-21 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 9:

(23 comments)

Did a review pass while I don't have access to my desktop.

http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS9, Line 293: UpdateExecStats
The similarity between UpdateExecStats() and UpdateExecStatus() is potentially 
confusing. Spell out Statistics() maybe?


Line 455: // TODO: we don't track cpu time per node now. Do that.
More generally we should have an aggregate profile for each backend that 
includes things like this (query memory consumption, I/O, other aggregates). We 
don't need to do this now but it will be helpful


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

Line 132: const FInstanceExecParams& exec_params_;
Might be helpful to document what owns the object being referenced.


Line 164:   std::vector instance_params_list_;
Might be helpful to document what owns the objects being referenced.


Line 175:   /// Protects fields below. Can be held while doing an RPC, so 
SpinLock is a bad idea.
Our current SpinLock implementation should be fine (it blocks after spinning 
for a short period). Mutex should also be fine since this won't be heavily 
contented - I don't feel strongly.


Line 219:   /// TODO: including the median doesn't compile, looks like some 
includes are missing
This TODO has been in the code since 2012 - maybe just remove it?


Line 253:   /// Root profile for all fragment instances for this fragment
Stored in obj_pool?


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 364:   f->targets()->push_back(target);
Maybe:

  ->emplace_back(t_target, fragment_params.fragment.idx);


Line 402:   int num_instances = schedule_.GetNumFragmentInstances();
nit: intermediate variable seems unnecessary.


Line 410:   // TODO: rename this metric
There are a handful of minor TODOs like this. I assume there's some reason not 
to do them in this patch, but it's not clear when/if it makes sense to address 
them. Do we have a JIRA or anything to track them?


Line 938:   // Notify that we completed with an error.
This comment seems wrong - the cancellation path is used to clean up even if 
there isn't an error. Maybe just remove the comment?


Line 956:   // TODO: return here if returned_all_results_? 
I believe we need to do cancellation even if we returned all results. Having 
returned all results from the coordinator fragment doesn't imply that all 
fragments have returned all their results.

The cancellation path is convoluted so I may be misunderstanding something 
though.

Henry had a related patch in review: https://gerrit.cloudera.org/#/c/5987


Line 983:   // TODO: clarify control flow here, it's unclear we should even 
process this status
Henry had a patch that clarified this and improved the comment: 
https://gerrit.cloudera.org/#/c/5987/1/be/src/runtime/coordinator.cc


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

Line 108: /// TODO: clean up locking behavior; in particular, clarify 
dependency on lock_
Does any of the locking information in the impala-server.h need to be updated?


Line 261:   boost::mutex wait_lock_;
A lot of these locks could safely be SpinLocks - unless they're used with a 
condition variable. Now that we have a blocking SpinLock it's better in most 
cases until you need a CV or strong fairness guarantees.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

Line 507: 
Extra line


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

Line 162:   /// True if and only if Cancel() has been called.
I had a hard time reasoning about concurrent accesses to this until I realised 
that it didn't actually influence query execution. Document that it's only used 
for DCHECKs? Or just remove it?


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/mem-tracker.cc
File be/src/runtime/mem-tracker.cc:

Line 111:   //VLOG_QUERY << "AddChildTracker(): parent=" << this << " child=" 
<< tracker << " " << tracker->label_ << "\n" << GetStackTrace();
I assume this will be removed.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-exec-mgr.h
File be/src/runtime/query-exec-mgr.h:

Line 77:   /// Execute instances and decrement refcount.
I find it clearer to understand as: "Acquires ownership of 'qs'".


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

Line 43: #define RETRY_SLEEP_MS 100
Make a class-level constant?


Line 311:   // don't return 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-21 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 9:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS9, Line 956:  
nit: blank space


PS9, Line 1136: VLOG_QU
delete


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS9, Line 151:  WARN_UNUSED_RESULT;
nit: long line.


http://gerrit.cloudera.org:8080/#/c/6535/9/be/src/runtime/debug-options.cc
File be/src/runtime/debug-options.cc:

PS9, Line 71:  (*entry).
entry->second ?


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#9).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
69 files changed, 3,775 insertions(+), 3,677 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 8:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS8, Line 80: insert(
> Please consider using emplace() instead.
Done


PS8, Line 96: ctx
> finstance_ctx could be a more meaningful name.
Done


PS8, Line 104: TPlanFragmentCtx& ctx
> May be fragment_ctx is a more meaningful name to distinguish from the TFrag
Done


Line 125: // instance separately
> Will fixing IMPALA-3548 make this code snippet unnecessary ?
not sure.


PS8, Line 126: DCHECK
> DCHECK_EQ()
Done


PS8, Line 136: // TODO: comment
> Missing comment ?
Done


Line 226: bool Coordinator::BackendState::IsDoneInternal() const {
> Consider adding inline.
Done


Line 261: if (instance_exec_status.done) --num_remaining_instances_;
> DCHECK_GT(num_remaining_instances_, 0); before decrementing it.
Done


Line 299: int64_t completion_time = instance_stats.stopwatch_.ElapsedTime();
> Not sure if it's paranoid but should we have DCHECK_GT(completion_time, 0);
Done


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS8, Line 63:  const DebugOptions& debug_options
> Please add comment for this parameter.
Done


PS8, Line 81: //
> ///
Done


PS8, Line 123: int per_fragment_instance_idx() const { return 
exec_params_.per_fragment_instance_idx; }
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS8, Line 228: new FragmentStats(avg_profile_name, root_profile_name, 
num_instances, obj_pool()));
> nit: long line
Done


PS8, Line 272: if (!fragment.__isset.plan)
> Just curious under what circumstances will this be true ?
i think if it's just a 'select 1'.


PS8, Line 330: // source
> Source plan node of the filter.
Done


PS8, Line 354: target
> Target plan node of the filter
Done


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS8, Line 210:  boost::unordered_map node_id_to_idx_map;
> Should this be private ? Same for lock.
made the struct private


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS8, Line 108: Status status;
> not used ?
Done


PS8, Line 297: insert
> emplace()
Done


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

PS8, Line 102: VLOG_QUERY << "SendFilter()";
> remove ?
Done


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS8, Line 102:   if (query_ctx().request_pool.empty()) {
 : const_cast(query_ctx()).request_pool = 
"test-pool";
 :   }
> This seems odd. Why don't we just modify the test to set up query_ctx corre
if you know it's a test, why not set it here?


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 85:   /// installs desc_tbl, if set. If query_ctx.request_pool isn't set, 
sets it to "test-pool".
> This seems odd. Can we modify the test to set up query_ctx.requet_pool corr
the tests don't really care about this (unless they tests for it, and then they 
set it themselves), so i think it's better to do it here.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-19 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 8:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS8, Line 80: insert(
Please consider using emplace() instead.


PS8, Line 96: ctx
finstance_ctx could be a more meaningful name.


PS8, Line 104: TPlanFragmentCtx& ctx
May be fragment_ctx is a more meaningful name to distinguish from the 
TFragmentInstanceCtx& in the outer scope.


Line 125: // instance separately
Will fixing IMPALA-3548 make this code snippet unnecessary ?


PS8, Line 126: DCHECK
DCHECK_EQ()


PS8, Line 136: // TODO: comment
Missing comment ?


Line 226: bool Coordinator::BackendState::IsDoneInternal() const {
Consider adding inline.


Line 261: if (instance_exec_status.done) --num_remaining_instances_;
DCHECK_GT(num_remaining_instances_, 0); before decrementing it.


Line 299: int64_t completion_time = instance_stats.stopwatch_.ElapsedTime();
Not sure if it's paranoid but should we have DCHECK_GT(completion_time, 0); or 
skip calcuating rates_ altogether if completion_time is 0 somehow ?


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS8, Line 63:  const DebugOptions& debug_options
Please add comment for this parameter.


PS8, Line 81: //
///


PS8, Line 123: int per_fragment_instance_idx() const { return 
exec_params_.per_fragment_instance_idx; }
nit: long line


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS8, Line 228: new FragmentStats(avg_profile_name, root_profile_name, 
num_instances, obj_pool()));
nit: long line


PS8, Line 272: if (!fragment.__isset.plan)
Just curious under what circumstances will this be true ?


PS8, Line 330: // source
Source plan node of the filter.


PS8, Line 354: target
Target plan node of the filter


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS8, Line 210:  boost::unordered_map node_id_to_idx_map;
Should this be private ? Same for lock.


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS8, Line 108: Status status;
not used ?


PS8, Line 297: insert
emplace()


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-filter-bank.cc
File be/src/runtime/runtime-filter-bank.cc:

PS8, Line 102: VLOG_QUERY << "SendFilter()";
remove ?


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS8, Line 102:   if (query_ctx().request_pool.empty()) {
 : const_cast(query_ctx()).request_pool = 
"test-pool";
 :   }
This seems odd. Why don't we just modify the test to set up query_ctx correctly.


http://gerrit.cloudera.org:8080/#/c/6535/8/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 85:   /// installs desc_tbl, if set. If query_ctx.request_pool isn't set, 
sets it to "test-pool".
This seems odd. Can we modify the test to set up query_ctx.requet_pool 
correctly before calling this function ?


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

Line 161:   NotifyBarrierOnExit notifier(exec_complete_barrier);
> long line
Done


PS4, Line 184: 
 :   if (!rpc_status.ok()) {
> how is this out of scope ?
it's general cleanup. this patch already contains a good deal of general 
cleanup, in addition to the actual logic changes for the exec rpc call, so i'll 
leave this particular piece of cleanup for later.


PS4, Line 370: 
> long line
Done


PS4, Line 379: tal_split_size_(0),
> Not needed ? Done in UpdateExecStats(), right ?
Done


PS4, Line 391: 
> *done = IsDone();
Done


PS4, Line 434: 
> I don't see how this can spam the log if it's not expected to occur frequen
this exact line will be changed in the very near future as part of switch to 
krpc. i don't think it's worth polishing it now. (in particular, this line will 
disappear.)


PS4, Line 444: 
> magic constant ? #define or constexpr ?
see previous comment about rpc retry. the new rpc stack abstracts that away 
anyway.


Line 463: const string& root_profile_name, int num_instances, ObjectPool* 
obj_pool)
> long line
Done


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 45: p
> the hierarchy fragment -> fragment instance is already well-documented else
Done


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 408: // TODO: rename this metric
> Correct.
Done


PS6, Line 885: TODO: do we need this error propagation path, in addition to the 
status report
 :   // we'll get anyway?
> It reports by making an RPC, which can be delayed for any one of a number o
Done


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 194:   query_ctx().c
> Yes, it's used in the line below but it isn't added to some status object s
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-19 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 7:

The stress test also passes.

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-18 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 408: // TODO: rename this metric
> meaning nothing depends on this name at the moment?
Correct.


PS6, Line 885: TODO: do we need this error propagation path, in addition to the 
status report
 :   // we'll get anyway?
> when a fragment instance fails it reports an error status right away.
It reports by making an RPC, which can be delayed for any one of a number of 
reasons. 

I don't quite see what the TODO is asking - is it whether we need to return an 
error from GetNext() if coord_sink_->GetNext() failed?


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-18 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 7:

(1 comment)

this now passes all tests in debug and asan mode.

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 113: 
> I missed the comment at line 108 and only saw the comment on line 99 which 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-18 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#7).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
69 files changed, 3,774 insertions(+), 3,665 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 113:  _log_;
> they don't, the comment above stipulates that you need to hold lock_
I missed the comment at line 108 and only saw the comment on line 99 which says 
no lock needs to be held. That's the cause for confusion. It'd be nice to have 
a DCHECK to verify that lock_ is actually held in this function.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(25 comments)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/common/status.h
File be/src/common/status.h:

Line 22: #include 
> Can you include iosfwd instead. status.h is included everywhere and it woul
Done


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

PS4, Line 66:   
> We use indentation 4 for line continuation. See line 119 below.
we use 4 spaces for the first line and 2 spaces subsequently for a multi-line 
statement.


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS6, Line 46: using boost::lock_guard;
: using boost::mutex;
> #include "common/names.h" for these.
Done


PS6, Line 348: true
> the class comment says that Cancel() returns true if the cancellation happe
changed comment. this is only informational anyway.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 97:  boost::mutex* lock() { return _; }
> Is it possible to not expose the lock ? It'd be easier to reason about the 
Done


PS4, Line 181:   /// If the status indicates an error status, execution has 
either been aborted by the
 :   /// executing impalad (which then reported the error) or 
cancellation has been
 :   /// initiated; either way, execution must not be cancelled.
 :   Status status_;
> Why the two separate data structures ? Why cannot this be a map of instance
Done


http://gerrit.cloudera.org:8080/#/c/6535/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS2, Line 252: reate BackendStates
 :   bool has_coord_fragment = schedule_.GetCoordFragment() != 
nullptr;
 :   const TNetworkAddress& coord_address = ExecE
> doesn't need to be two statements.
Done


PS2, Line 262: kend_state->Init(entry.second, fra
> const &?
Done


PS2, Line 264: 
> This would be better named coord_backend_idx_.
Done


Line 977:   << "\n" << s.str();
henry, is there any reason to keep this?


PS2, Line 1000:   // debugging aid for back
> Acquiring this lock is problematic for an RPC handler. See .
true, but i don't want to extend the scope of this change to fix locking 
problems as well.


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 408: // TODO: rename this metric
> Why not do this now? (But note that for the moment that name isn't publishe
meaning nothing depends on this name at the moment?


PS6, Line 415: || status.IsCancelled()
> When would we have a backend that was cancelled at this point (since cancel
good point, reverted.


PS6, Line 858: Report aggregate
 :   // query profiles at this point.
> This comment seems out-of-sync with the rename of ReportQuerySummary().
Done


PS6, Line 885: TODO: do we need this error propagation path, in addition to the 
status report
 :   // we'll get anyway?
> This error is needed so that client-facing operations can fail-fast. The ne
when a fragment instance fails it reports an error status right away.

i don't see any harm in leaving that todo in here.


PS6, Line 986: if (!(returned_all_results_ && status.IsCancelled()) && 
!status.ok()) {
> Will you include the fix for IMPALA-4925 here as discussed? (see https://ge
if returned_all_results_ is true, it's questionable whether we shouldn't simply 
return at the topic (since the query summary has already been computed).

i'm happy to include the "fix", but i think these control paths need more work.


PS6, Line 993: lock_guard l(lock_);
> We spoke offline about removing this lock acquisition in this patch (see ht
let's do it as a follow-on, it's purely additive.


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS6, Line 199: // TODO: what to do with this? should we treat a failure to get 
a backend
 : // connection as a failure of instance execution?
> Yes. 
Done


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS6, Line 65:  - guarantee delivery of the ReportExecStatus rpc in case of an 
error, otherwise
: ///   the query might hang
> We can't guarantee delivery - not quite sure what you mean here. 
Done


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 340:   /// only populated by test c'tor
> Isn't it also used by the fe-support code path?
Done


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/test-env.cc
File be/src/runtime/test-env.cc:

Line 23: #include "common/names.h"
> common/names.h 

Re: [Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Henry Robinson
Oops, ignore the outdated PS2 comments. The PS6 ones are current though!

On 17 April 2017 at 16:41, Henry Robinson (Code Review)  wrote:

> Henry Robinson has posted comments on this change.
>
> Change subject: IMPALA-2550: Switch to per-query exec rpc
> ..
>
>
> Patch Set 6:
>
> (14 comments)
>
> I quickly looked at some familiar files.
>
> http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/
> runtime/coordinator-backend-state.cc
> File be/src/runtime/coordinator-backend-state.cc:
>
> PS6, Line 46: using boost::lock_guard;
> : using boost::mutex;
> #include "common/names.h" for these.
>
>
> PS6, Line 348: true
> the class comment says that Cancel() returns true if the cancellation
> happened - but it almost certainly didn't here.
>
>
> http://gerrit.cloudera.org:8080/#/c/6535/2/be/src/runtime/coordinator.cc
> File be/src/runtime/coordinator.cc:
>
> PS2, Line 252: reate BackendStates
>  :   bool has_coord_fragment = schedule_.GetCoordFragment() !=
> nullptr;
>  :   const TNetworkAddress& coord_address = ExecE
> doesn't need to be two statements.
>
>
> PS2, Line 262: kend_state->Init(entry.second, fra
> const &?
>
>
> PS2, Line 264:
> This would be better named coord_backend_idx_.
>
>
> PS2, Line 1000:   // debugging aid for back
> Acquiring this lock is problematic for an RPC handler. See .
>
>
> http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator.cc
> File be/src/runtime/coordinator.cc:
>
> PS6, Line 408: // TODO: rename this metric
> Why not do this now? (But note that for the moment that name isn't
> published in the profile anywhere).
>
>
> PS6, Line 415: || status.IsCancelled()
> When would we have a backend that was cancelled at this point (since
> cancellation always originates from the coordinator)? And if there was a
> cancelled backend, why overwrite its status with the error state? From a
> user perspective, if they cancel the query, surfacing an error would be
> confusing.
>
>
> PS6, Line 858: Report aggregate
>  :   // query profiles at this point.
> This comment seems out-of-sync with the rename of ReportQuerySummary().
>
>
> PS6, Line 885: TODO: do we need this error propagation path, in addition
> to the status report
>  :   // we'll get anyway?
> This error is needed so that client-facing operations can fail-fast. The
> next coordinator status report might be seconds away.
>
>
> PS6, Line 986: if (!(returned_all_results_ && status.IsCancelled()) &&
> !status.ok()) {
> Will you include the fix for IMPALA-4925 here as discussed? (see
> https://gerrit.cloudera.org/#/c/5987/1/be/src/runtime/coordinator.cc)
>
>
> PS6, Line 993: lock_guard l(lock_);
> We spoke offline about removing this lock acquisition in this patch (see
> https://gerrit.cloudera.org/#/c/5971/2 for my patch a month or so ago).
> This is too expensive to take in an RPC handler - when a query completes
> this can lead to significant convoying on this lock, and with KRPC the
> reactor threads can get serialized (this would still be an issue even if
> the reports were batched). Do you want to tackle this in this patch or in a
> follow-on?
>
>
> http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.cc
> File be/src/runtime/query-state.cc:
>
> PS6, Line 199: // TODO: what to do with this? should we treat a failure to
> get a backend
>  : // connection as a failure of instance execution?
> Yes.
>
> The report RPC from a backend to the coordinator is how the backend tells
> if it's orphaned from the coordinator. That should cause the backend to
> fail, because it can't receive cancellation messages, and might otherwise
> block forever.
>
> We CancelInternal() later if the RPC itself fails. This is just another
> way that the RPC can logically fail. KRPC abstracts this away.
>
>
> http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.h
> File be/src/runtime/query-state.h:
>
> PS6, Line 65:  - guarantee delivery of the ReportExecStatus rpc in case of
> an error, otherwise
> : ///   the query might hang
> We can't guarantee delivery - not quite sure what you mean here.
> If the RPC can't be sent, the query state should get torn down on the
> assumption that the coordinator has failed.
>
>
> --
> To view, visit http://gerrit.cloudera.org:8080/6535
> To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
>
> Gerrit-MessageType: comment
> Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
> Gerrit-PatchSet: 6
> Gerrit-Project: Impala-ASF
> Gerrit-Branch: master
> Gerrit-Owner: Marcel Kornacker 
> Gerrit-Reviewer: Dan Hecht 
> Gerrit-Reviewer: Henry Robinson 
> Gerrit-Reviewer: Marcel Kornacker 
> Gerrit-Reviewer: Michael Ho 
> Gerrit-Reviewer: Tim Armstrong 
> 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Henry Robinson (Code Review)
Henry Robinson has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(14 comments)

I quickly looked at some familiar files.

http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS6, Line 46: using boost::lock_guard;
: using boost::mutex;
#include "common/names.h" for these.


PS6, Line 348: true
the class comment says that Cancel() returns true if the cancellation happened 
- but it almost certainly didn't here.


http://gerrit.cloudera.org:8080/#/c/6535/2/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS2, Line 252: reate BackendStates
 :   bool has_coord_fragment = schedule_.GetCoordFragment() != 
nullptr;
 :   const TNetworkAddress& coord_address = ExecE
doesn't need to be two statements.


PS2, Line 262: kend_state->Init(entry.second, fra
const &?


PS2, Line 264: 
This would be better named coord_backend_idx_.


PS2, Line 1000:   // debugging aid for back
Acquiring this lock is problematic for an RPC handler. See .


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS6, Line 408: // TODO: rename this metric
Why not do this now? (But note that for the moment that name isn't published in 
the profile anywhere).


PS6, Line 415: || status.IsCancelled()
When would we have a backend that was cancelled at this point (since 
cancellation always originates from the coordinator)? And if there was a 
cancelled backend, why overwrite its status with the error state? From a user 
perspective, if they cancel the query, surfacing an error would be confusing.


PS6, Line 858: Report aggregate
 :   // query profiles at this point.
This comment seems out-of-sync with the rename of ReportQuerySummary().


PS6, Line 885: TODO: do we need this error propagation path, in addition to the 
status report
 :   // we'll get anyway?
This error is needed so that client-facing operations can fail-fast. The next 
coordinator status report might be seconds away.


PS6, Line 986: if (!(returned_all_results_ && status.IsCancelled()) && 
!status.ok()) {
Will you include the fix for IMPALA-4925 here as discussed? (see 
https://gerrit.cloudera.org/#/c/5987/1/be/src/runtime/coordinator.cc)


PS6, Line 993: lock_guard l(lock_);
We spoke offline about removing this lock acquisition in this patch (see 
https://gerrit.cloudera.org/#/c/5971/2 for my patch a month or so ago).  This 
is too expensive to take in an RPC handler - when a query completes this can 
lead to significant convoying on this lock, and with KRPC the reactor threads 
can get serialized (this would still be an issue even if the reports were 
batched). Do you want to tackle this in this patch or in a follow-on?


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS6, Line 199: // TODO: what to do with this? should we treat a failure to get 
a backend
 : // connection as a failure of instance execution?
Yes. 

The report RPC from a backend to the coordinator is how the backend tells if 
it's orphaned from the coordinator. That should cause the backend to fail, 
because it can't receive cancellation messages, and might otherwise block 
forever. 

We CancelInternal() later if the RPC itself fails. This is just another way 
that the RPC can logically fail. KRPC abstracts this away.


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS6, Line 65:  - guarantee delivery of the ReportExecStatus rpc in case of an 
error, otherwise
: ///   the query might hang
We can't guarantee delivery - not quite sure what you mean here. 
If the RPC can't be sent, the query state should get torn down on the 
assumption that the coordinator has failed.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(4 comments)

Looked through some of the files I'm more familiar with - had a few minor 
comments.

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/common/status.h
File be/src/common/status.h:

Line 22: #include 
Can you include iosfwd instead. status.h is included everywhere and it would be 
nice not to pull in and compile the huge iostream headers for every file in the 
codebase.


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

Line 340:   /// only populated by test c'tor
Isn't it also used by the fe-support code path?


http://gerrit.cloudera.org:8080/#/c/6535/6/be/src/runtime/test-env.cc
File be/src/runtime/test-env.cc:

Line 23: #include "common/names.h"
common/names.h usually gets put below the other includes because it checks to 
see what other headers were pulled in.


Line 61:   if (buffer_pool_min_buffer_len_ != -1 && buffer_pool_capacity_ != 
-1) {
Thanks for fixing this.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 194: stringstream s;
> it's used in the line below.
Yes, it's used in the line below but it isn't added to some status object so 
it's essentially dead code. What I see is:

void foobar() {
...
...
  { 
 stringstream s;
 s << "some string";
 return;
  }
}


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-17 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

PS4, Line 66:   
> how so?
We use indentation 4 for line continuation. See line 119 below.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS4, Line 184: 
 :   if (!rpc_status.ok()) {
> out of scope for this patch.
how is this out of scope ?


Line 365:   TPublishFilterResult res;
> i left that in there as a reminder to myself to think this through again.
At which version of this patch will this be thought through ?


PS4, Line 391: 
> IsDone() simply returns a bool, not sure what you mean.
*done = IsDone();


PS4, Line 434: 
> it wasn't there before, and it has the potential to spam the log.
I don't see how this can spam the log if it's not expected to occur frequently. 
If it happens very frequently, then there is something wrong.


http://gerrit.cloudera.org:8080/#/c/6535/4/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS4, Line 399: across 
> there are two indices, which come in handy in different situation:
So, is this a per fragment index different from the globally unique instance 
index ? Can you provide more clarification on how each of them is used and add 
a remark they are really different. It's important to be as clear as possible 
to avoid confusion.


PS4, Line 400: num_fragment_instances
I don't see a field called num_fragment_instances in TPlanFragmentCtx defined 
above.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#5).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
M be/src/runtime/mem-tracker.cc
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-filter-bank.cc
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
M tests/common/test_result_verifier.py
69 files changed, 3,743 insertions(+), 3,654 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-15 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(50 comments)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

PS4, Line 66:   
> nit: wrong indentation
how so?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS4, Line 58: //done_(false) {
> delete
Done


Line 64:   VLOG_QUERY << "BS::Init(): query_id=" << PrintId(query_id_) << " 
state_idx=" << state_idx_ << " host=" << host_;
> long line
i'll remove all of the extra debug logging at the end.


PS4, Line 104: rpc_params->fragment_ctxs.back().fragment.idx
 :  != params.fragment_exec_params.fragment.idx) {
> The assumption is that all fragment instances belonging to the same fragmen
Done


PS4, Line 112: params.fragment_exec_params.fragment.idx
> May as well factor this out as a local variable as it's used above too. The
Done


PS4, Line 170: lock_guard l(lock_);
> Is it necessary to hold the lock while creating BackendConnection below ?
the lock is low-contention, and it protects status_.


PS4, Line 173: client_connect_status;
> Is this not used or did you intend to use this instead of status_ ?
Done


PS4, Line 184:   const string ERR_TEMPLATE =
 :   "ExecQueryFInstances rpc query_id=$0 failed: $1";
> Shouldn't this live in common/thrift/generate_error_codes.py ?
out of scope for this patch.


PS4, Line 332:  lock_guard l2(lock_);
> How heavily contended is this lock ? Will this become a bottleneck if we ha
no, it'll (eventually) be the backend doing the reporting, not the individual 
instances.


PS4, Line 346:   
> nit: indentation is off.
Done


Line 365: if (status_.ok()) {
> Is this block empty now ? It only contains comment, right ?
i left that in there as a reminder to myself to think this through again.


PS4, Line 391: num_remaining_instances_ == 0 || !status_.ok();
> Why not call IsDone() here ?
IsDone() simply returns a bool, not sure what you mean.


PS4, Line 393: //if (*done) stopwatch_.Stop();
> delete
Done


PS4, Line 434:   if (!status.ok()) return false;
> Does this warrant a log message ? It's not entirely clear what's the status
it wasn't there before, and it has the potential to spam the log.


PS4, Line 513:  NULL
> nullptr
Done


PS4, Line 549:
//DCHECK_EQ(fragment_profiles_[instance_state.fragment_idx()].num_instances,
 : //node_exec_summary.exec_stats.size());
> ?
Done


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 45: .
> Can you please also add a comment stating that multiple fragments run a sam
the hierarchy fragment -> fragment instance is already well-documented 
elsewhere, so i'm not going to point that out again.


PS4, Line 100:   //const vector& instance_stats() const { return 
instance_stats_; }
 :   //MonotonicStopWatch* stopwatch() { return _; }
> delete
Done


PS4, Line 109: //InstanceStats* GetInstanceStats(const TUniqueId& instance_id);
> delete
Done


Line 111: return num_remaining_instances_ == 0 || !status_.ok();
> nit: one line ?
only for getters and setters.


PS4, Line 113:  _log_;
> How does this work if multiple callers modify the map at the same time ?
they don't, the comment above stipulates that you need to hold lock_


PS4, Line 114:   //const std::vector& instance_profiles() 
const {
 : //return instance_profiles_;
 :   //}
> delete
Done


PS4, Line 133: #if 0
> delete
Done


PS4, Line 143: /
> delete
Done


PS4, Line 223: //bool done_;
> Still needed ?
Done


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 113: torn_down_(false) {}
> Should the constructor also initialize coord_state_idx_ to -1 in case there
Done


PS4, Line 187: FinishBackendStartup
> Would a better name be VerifyBackendStartup(); ?
Not particularly, because this really blocks until startup finished.


PS4, Line 234: new FragmentStats(avg_profile_name, root_profile_name, 
num_instances, obj_pool())
> Should this be added to obj_pool() too ? fragment_stats_ only contains poin
i think that was the intention.


Line 340:   // Set the 'pending_count_' to zero to indicate that for a 
filter with local-only
> long line
Done


Line 397:   //CountingBarrier* exec_complete_barrier = 
exec_complete_barrier_.get();
> Delete ?
Done


PS4, Line 400: [backend_state, this, _options]() {
 :   backend_state->Exec(query_ctx_, debug_options, 
filter_routing_table_,
 : exec_complete_barrier_.get());
 : })
> IMHO, it seems easier to read if this is defined as a function like the exi
what 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS4, Line 434:   if (!status.ok()) return false;
Does this warrant a log message ? It's not entirely clear what's the status of 
the remote fragment instances at this point, right ? The remote fragment 
instances could still be running. The log message may make debugging in the 
field slightly easier.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

PS4, Line 187: FinishBackendStartup
Would a better name be VerifyBackendStartup(); ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

PS4, Line 249: report_thread_.reset(
 : new Thread("plan-fragment-executor", "report-profile",
 : ::ReportProfileThread, this));
Is one reporting thread per fragment instance a bit much ? Have you considered 
using one reporting thread per fragment ?


PS4, Line 267:  //RETURN_IF_ERROR(quer
delete


PS4, Line 298: RETURN_IF_ERROR(status);
Why not RETURN_IF_ERROR(exec_tree_->GetNext(...))) at line 294 instead ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS4, Line 83: void Cancel();
Please also document whether this is idempotent.


PS4, Line 89: If the preparation phase encountered an error, GetOpenStatus() 
will
:   /// return that error without blocking.
I don't see the opened_promise_ being updated when Prepare() fails in Exec(). 
Is that expected ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 194: stringstream s;
Did you mean to use s somewhere ? It looks like dead code as it stands now.


PS4, Line 246: 3
How is this determined ? Should we use #define or constexpr to represent it so 
it has a more meaningful name ?


PS4, Line 250: rpc_status = Status(res.status);
Is this needed given line 256 below which assigns res.status to result_status ?


PS4, Line 317: Status status 
Not an error but there is a variable named status already in the outer scope. 
May want to have a more specific name.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 172: AtomicInt32 is_cancelled_;
> Is there any atomic read-modify-write operation on this flag ? Doesn't seem
To answer my own question, there is an AtomicCompareAndSwap() operations on 
this flag to avoid calling cancellation on fragments multiple times.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-14 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(41 comments)

Some more comments. Still going through the QueryState code and need to do more 
passes with the coordinator code.

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/exec/data-sink.cc
File be/src/exec/data-sink.cc:

PS4, Line 66:   
nit: wrong indentation


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

PS4, Line 58: //done_(false) {
delete


PS4, Line 104: rpc_params->fragment_ctxs.back().fragment.idx
 :  != params.fragment_exec_params.fragment.idx) {
The assumption is that all fragment instances belonging to the same fragment 
are grouped together in params.fragment_exec_params. Any chance we can have a 
DCHECK for that here or Init() ?


PS4, Line 112: params.fragment_exec_params.fragment.idx
May as well factor this out as a local variable as it's used above too. The 
code would be easier to parse that way.


PS4, Line 170: lock_guard l(lock_);
Is it necessary to hold the lock while creating BackendConnection below ?

Is it not safe to not hold the lock while making the RPC ? Is there some sort 
of race if we only acquire the lock after making the RPC to update rpc_sent_ 
and rpc_latency_ ?


PS4, Line 173: client_connect_status;
Is this not used or did you intend to use this instead of status_ ?

We may skip holding the lock here and only merge client_connect_status with 
status_ if there is any error.


PS4, Line 184:   const string ERR_TEMPLATE =
 :   "ExecQueryFInstances rpc query_id=$0 failed: $1";
Shouldn't this live in common/thrift/generate_error_codes.py ?


PS4, Line 332:  lock_guard l2(lock_);
How heavily contended is this lock ? Will this become a bottleneck if we have a 
lot of fragment instances reporting status ?


PS4, Line 346:   
nit: indentation is off.


Line 365: if (status_.ok()) {
Is this block empty now ? It only contains comment, right ?


PS4, Line 379: //UpdateAverageProfile(exec_state);
Not needed ? Done in UpdateExecStats(), right ?


PS4, Line 391: num_remaining_instances_ == 0 || !status_.ok();
Why not call IsDone() here ?


PS4, Line 393: //if (*done) stopwatch_.Stop();
delete


PS4, Line 444: 3
magic constant ? #define or constexpr ?


PS4, Line 513:  NULL
nullptr


PS4, Line 549:
//DCHECK_EQ(fragment_profiles_[instance_state.fragment_idx()].num_instances,
 : //node_exec_summary.exec_stats.size());
?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

PS4, Line 45: .
Can you please also add a comment stating that multiple fragments run a same 
backend share a single BackendState. So the hierarchy is:

BackendState -> Fragment -> Fragment Instance


PS4, Line 97:  boost::mutex* lock() { return _; }
Is it possible to not expose the lock ? It'd be easier to reason about the code 
if all uses of 'lock_' happens within class functions.


PS4, Line 100:   //const vector& instance_stats() const { return 
instance_stats_; }
 :   //MonotonicStopWatch* stopwatch() { return _; }
delete


PS4, Line 109: //InstanceStats* GetInstanceStats(const TUniqueId& instance_id);
delete


Line 111: return num_remaining_instances_ == 0 || !status_.ok();
nit: one line ?


PS4, Line 113:  _log_;
How does this work if multiple callers modify the map at the same time ?


PS4, Line 114:   //const std::vector& instance_profiles() 
const {
 : //return instance_profiles_;
 :   //}
delete


PS4, Line 133: #if 0
delete


PS4, Line 143: /
delete


PS4, Line 181:   std::vector instance_stats_;
 : 
 :   /// map from instance idx to InstanceStats
 :   std::unordered_map instance_stats_map_;
Why the two separate data structures ? Why cannot this be a map of instance idx 
to InstanceStats directly ? Not cache friendly ?


PS4, Line 223: //bool done_;
Still needed ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 113: torn_down_(false) {}
Should the constructor also initialize coord_state_idx_ to -1 in case there is 
no coordinator fragment ?


PS4, Line 234: new FragmentStats(avg_profile_name, root_profile_name, 
num_instances, obj_pool())
Should this be added to obj_pool() too ? fragment_stats_ only contains pointers.


Line 397:   //CountingBarrier* exec_complete_barrier = 
exec_complete_barrier_.get();
Delete ?


PS4, Line 400: [backend_state, this, _options]() {
 :   backend_state->Exec(query_ctx_, debug_options, 
filter_routing_table_,
 : exec_complete_barrier_.get());
 : })
IMHO, it seems easier to read 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-13 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

> (32 comments)
 > 
 > Did a quick pass. Mostly formatting comments for now. Seems to be
 > quite a bit of debugging logging and #if 0 left in the code.

yes, i left a bunch of debug output in there that i'll remove before everything 
goes in. those are often single long lines (so it's even more obvious they need 
to be removed).

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-Reviewer: Michael Ho 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-13 Thread Michael Ho (Code Review)
Michael Ho has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 4:

(32 comments)

Did a quick pass. Mostly formatting comments for now. Seems to be quite a bit 
of debugging logging and #if 0 left in the code.

http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

Line 64:   VLOG_QUERY << "BS::Init(): query_id=" << PrintId(query_id_) << " 
state_idx=" << state_idx_ << " host=" << host_;
long line


Line 161:   VLOG_QUERY << "BS::Exec(): query_id=" << PrintId(query_id_) << " 
state_idx=" << state_idx_ << " host=" << host_;
long line


PS4, Line 208: #if 0
huh ?


PS4, Line 370: in which
long line


Line 463:   VLOG_QUERY << "BS::PublishFilter(): query_id=" << 
PrintId(query_id_) << " state_idx=" << state_idx_ << " host=" << host_;
long line


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

Line 340:   // Set the 'pending_count_' to zero to indicate that for a 
filter with local-only
long line


PS4, Line 421: //VLOG_QUERY << "backend status: idx=" << backend_state->st
?


PS4, Line 424: VLOG_QUERY << "set new status";
Debugging output ?


PS4, Line 632:  //
Did you mean to comment this line out ?


Line 714: if 
(!partition_create_ops.Execute(ExecEnv::GetInstance()->hdfs_op_thread_pool(), 
false)) {
long line


Line 976: #if 0
huh ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

PS4, Line 216: //ExecSummary* exec_summary() { return _summary_; }
Is this not needed ?


PS4, Line 304: lock_
The lock acquisition order is still unclear wrt other locks in this class (e.g. 
wait_lock_, filter_lock_) although one may derive it by reading the code. In 
the long run, we should consider assigning rank to locks so their acquisition 
order is implicitly encoded in their ranks.

Also, what's the test coverage for the execution paths which may lead to taking 
multiple locks ?


PS4, Line 404: before
Mind explaining why it must be called before InitBackendState()  in the comment 
?


PS4, Line 444:  
blank space


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/descriptors.cc
File be/src/runtime/descriptors.cc:

PS4, Line 500: partition_key_value_ctxs
Isn't the partition expr Literal though ? Does it really need to allocate 
memory ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/fragment-instance-state.cc
File be/src/runtime/fragment-instance-state.cc:

Line 77:   //VLOG_QUERY << "~FIS: instance_id=" << PrintId(instance_id()) << " 
this=" << this << " sink=" << sink_;
Debugging output ?


Line 115:   //VLOG_QUERY << "Cancelling fragment instance " << 
PrintId(instance_id()) << " this=" << this << " sink=" << sink_;
?


Line 133:   //VLOG_QUERY << "Prepare(): instance_id=" << PrintId(instance_id()) 
<< " this=" << this << " sink=" << sink_;
?


Line 310:   //VLOG_QUERY << "Close(): instance_id=" << PrintId(instance_id()) 
<< " this=" << this << " sink=" << sink_;
?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/fragment-instance-state.h
File be/src/runtime/fragment-instance-state.h:

PS4, Line 52: for
long line


PS4, Line 91:  
blank space


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

PS4, Line 254: 100
magic constant. Could this be better to be #define or constexpr value ?


PS4, Line 379:   VLOG_QUERY << "PublishFilter: filter_id=" << filter_id << " 
fragment_idx=" << fragment_idx;
long line


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

PS4, Line 172: AtomicInt32 is_cancelled_;
Is there any atomic read-modify-write operation on this flag ? Doesn't seem 
strictly necessary to be AtomicInt32.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

PS4, Line 105: test-pool
What does "test-pool" mean ?


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

PS4, Line 99: ObjectPoool
typo.


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

PS4, Line 194:   // Make sure ClientRequestState::Wait() has completed before 
fetching rows. Wait() ensures
 :   // that rows are ready to be fetched (e.g., Wait() opens 
ClientRequestState::output_exprs_,
long lines


http://gerrit.cloudera.org:8080/#/c/6535/4/be/src/util/uid-util.h
File be/src/util/uid-util.h:

PS4, Line 81: IsValidId
IsValidFid ? It's not entirely clear from the function name what Id is.


http://gerrit.cloudera.org:8080/#/c/6535/4/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS4, Line 427: TExecQueryFInstancesParams 

[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-05 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 3:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6535/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS3, Line 381: the total number
 : // of fragment instances
> what's that?
Done


PS3, Line 386: i32
> Types.TFragmentIdx ?
Done


PS3, Line 412: 6: list destinations
> isn't that common across all instances?
Done


PS3, Line 414: fragment
> instance?
Done


PS3, Line 435: coord_state_idx
> not clear why the backend will care what index it was represented by in a c
Done


Line 441:   4: list fragment_ctxs
> are these in any particular order?
i explained the ordering for fragment_instance_ctxs


Line 453: // ReportExecStatus
> other RPC names have a blank line after them, which makes them easier to fi
Done


PS3, Line 552: overall
> what does "overall" mean here?  this makes it seem like it's the merged sta
changed code to match description


PS3, Line 560: insert_exec_status
> I guess this will hold the insert status for all instances, right?
rephrased


Line 563:   7: optional map error_log;
> and same here. this will be the aggregated error log across all instances?
rephrased


PS3, Line 572: CancelPlanFragment
> CancelQueryFInstances
Done


Line 669: 
> It would be helpful to add the RPC comment to show which RPC the following 
Done


Line 699: 
> // PublishFilter
Done


PS3, Line 734: ReportExecStatus
> Eventually this will report status for all the instances (at least in the p
well, it's overall backend execution status (this could report an error even 
before any fragment instances are started), so that's why i thought 'exec 
status' is still the right scope


Line 739:   TCancelQueryFInstancesResult CancelQueryFInstances(
i didn't want to name this CancelQuery, because that sounds too much like a 
client-related rpc


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-05 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#4).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
66 files changed, 3,820 insertions(+), 3,615 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-04 Thread Dan Hecht (Code Review)
Dan Hecht has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 3:

(14 comments)

Just looked at the thrift so far.

http://gerrit.cloudera.org:8080/#/c/6535/3/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

PS3, Line 381: the total number
 : // of fragment instances
what's that?


PS3, Line 386: i32
Types.TFragmentIdx ?


PS3, Line 412: 6: list destinations
isn't that common across all instances?


PS3, Line 414: fragment
instance?


PS3, Line 435: coord_state_idx
not clear why the backend will care what index it was represented by in a 
coordinator's datastructure. why is that?
oh, reading on I see this is really just used as to identify the backend when 
calling back via ReportExecStatus. it could use a little explanation.


Line 441:   4: list fragment_ctxs
are these in any particular order?


Line 453: // ReportExecStatus
other RPC names have a blank line after them, which makes them easier to find.


PS3, Line 552: overall
what does "overall" mean here?  this makes it seem like it's the merged status 
of all the instances, but from reading the code it's not.


PS3, Line 560: insert_exec_status
I guess this will hold the insert status for all instances, right?


Line 563:   7: optional map error_log;
and same here. this will be the aggregated error log across all instances?


PS3, Line 572: CancelPlanFragment
CancelQueryFInstances


Line 669: 
It would be helpful to add the RPC comment to show which RPC the following 
structures below to (like we have above):

// UpdateFilter


Line 699: 
// PublishFilter


PS3, Line 734: ReportExecStatus
Eventually this will report status for all the instances (at least in the 
periodic case), right? So it might be clearer to name it similar to the others: 
ReportFInstancesStatus() or similar.


-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Dan Hecht 
Gerrit-Reviewer: Henry Robinson 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-03 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has posted comments on this change.

Change subject: IMPALA-2550: Switch to per-query exec rpc
..


Patch Set 2:

Re debug output: there's too much of that in there, I'll clean up later.

Re tests: I ran a few of them by hand (aggregation, join-queries, mt-dop, 
cancellation, lifecycle, queries, tpch), they pass.

This most likely won't pass the full test suite, but I wanted to get the review 
process started before fixing all remaining bugs.

-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker 
Gerrit-Reviewer: Marcel Kornacker 
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-2550: Switch to per-query exec rpc

2017-04-03 Thread Marcel Kornacker (Code Review)
Marcel Kornacker has uploaded a new patch set (#2).

Change subject: IMPALA-2550: Switch to per-query exec rpc
..

IMPALA-2550: Switch to per-query exec rpc

Coordinator:
- FragmentInstanceState -> BackendState, which in turn records
  FragmentInstanceStats

QueryState
- does query-wide setup in a separate thread (which also launches
  the instance exec threads)
- has a query-wide 'prepared' state at which point all static setup
  is done and all FragmentInstanceStates are accessible

Also renamed QueryExecState to ClientRequestState.

Simplified handling of execution status (in FragmentInstanceState):
- status only transmitted via ReportExecStatus rpc
- in particular, it's not returned anymore from the Cancel rpc

FIS: Fixed bugs related to partially-prepared state (in Close() and 
ReleaseThreadToken())

Change-Id: I20769e420711737b6b385c744cef4851cee3facd
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/codegen/codegen-anyval.h
M be/src/common/status.cc
M be/src/common/status.h
M be/src/exec/data-sink.cc
M be/src/exec/data-sink.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/hdfs-avro-table-writer.cc
M be/src/exec/hdfs-avro-table-writer.h
M be/src/exec/hdfs-parquet-table-writer.cc
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-table-sink.cc
M be/src/exec/hdfs-table-sink.h
M be/src/exec/hdfs-table-writer.cc
M be/src/exec/hdfs-table-writer.h
M be/src/exprs/expr-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
A be/src/runtime/coordinator-backend-state.cc
A be/src/runtime/coordinator-backend-state.h
A be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/runtime/data-stream-test.cc
A be/src/runtime/debug-options.cc
A be/src/runtime/debug-options.h
M be/src/runtime/descriptors.cc
M be/src/runtime/descriptors.h
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
D be/src/runtime/plan-fragment-executor.cc
D be/src/runtime/plan-fragment-executor.h
M be/src/runtime/query-exec-mgr.cc
M be/src/runtime/query-exec-mgr.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/scheduling/query-schedule.cc
M be/src/scheduling/query-schedule.h
M be/src/service/CMakeLists.txt
M be/src/service/child-query.cc
M be/src/service/child-query.h
R be/src/service/client-request-state.cc
R be/src/service/client-request-state.h
M be/src/service/fe-support.cc
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/testutil/desc-tbl-builder.cc
M be/src/testutil/fault-injection-util.h
M be/src/util/error-util-test.cc
M be/src/util/error-util.cc
M be/src/util/error-util.h
M be/src/util/uid-util.h
M common/thrift/ExecStats.thrift
M common/thrift/ImpalaInternalService.thrift
66 files changed, 3,778 insertions(+), 3,581 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/6535/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I20769e420711737b6b385c744cef4851cee3facd
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Marcel Kornacker