Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/10855 )

Change subject: IMPALA-7213, IMPALA-7241: Port ReportExecStatus() RPC to use KRPC
......................................................................


Patch Set 12:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h
File be/src/exec/hdfs-parquet-table-writer.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/exec/hdfs-parquet-table-writer.h@199
PS10, Line 199:
> nit: should #include the appropriate .pb.h here ("include-what-you-use")
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@60
PS10, Line 60: const Coordinator&
> I think this can probably change back to being const if you take the sugges
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@176
PS10, Line 176: last_report_ti
> nit: I think the term "sequence number" is more usual here -- "version" to
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.h@220
PS10, Line 220:   /// Backend exec params, owned by the QuerySchedule and has query lifetime.
> This "back pointer" still seems error-prone to me. I think the object lifet
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@267
PS10, Line 267:   return num_remaining_instances_ == 0 || !status_.ok();
> I think a VLOG_QUERY about the skipped RPC is probably useful
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/coordinator-backend-state.cc@294
PS10, Line 294:     DCHECK(!instance_stats->done_);
> nit: why not:
The constructor is marked explicit, so I'm not sure that form is allowed:
  "explicit Status(const StatusPB& status);"

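For context, here is a minimal sketch of what 'explicit' rules out. StatusPB and Status below are simplified, hypothetical stand-ins rather than the real Impala classes, and FromPB is an illustrative helper name:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the protobuf-generated message.
struct StatusPB {
  int status_code = 0;
  std::string error_msg;
};

// Hypothetical stand-in for Impala's Status class.
class Status {
 public:
  Status() = default;
  // 'explicit' forbids implicit conversions such as 'Status s = pb;'.
  explicit Status(const StatusPB& pb)
      : code_(pb.status_code), msg_(pb.error_msg) {}
  bool ok() const { return code_ == 0; }
  const std::string& msg() const { return msg_; }

 private:
  int code_ = 0;
  std::string msg_;
};

// An explicit constructor can be called directly, but it is never used for
// implicit conversion (copy-initialization, argument passing, 'return pb;').
static_assert(std::is_constructible<Status, const StatusPB&>::value,
              "direct-initialization is allowed");
static_assert(!std::is_convertible<const StatusPB&, Status>::value,
              "implicit conversion is rejected");

inline Status FromPB(const StatusPB& pb) {
  Status s(pb);  // OK: direct-initialization
  // Status t = pb;  // would not compile: needs a non-explicit constructor
  assert(s.ok() == (pb.status_code == 0));  // stand-in for a DCHECK
  return s;
}
```

So any suggested form that relies on an implicit StatusPB-to-Status conversion would not compile, while one that names the constructor would.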

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@287
PS10, Line 287: atus = report
> Ah, I missed that we join on the reporter thread first.
Good idea about using DFAKE_MUTEX(). Also switched to using a non-atomic.

Also simplified the logic in Coordinator::BackendState::ApplyExecStatusReport() 
as we can rely purely on the sequence number as you suggested.
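
To illustrate the idea only (this is not the actual ApplyExecStatusReport() code): a report is applied only if its sequence number is newer than the last one applied. The struct, field names, and the rule that sequence numbers start at 1 are assumptions made for this sketch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of per-backend report tracking.
struct BackendReportState {
  int64_t last_report_seq_no = 0;  // highest sequence number applied so far
  int64_t rows_produced = 0;       // example payload carried by a report

  // Applies the report only if it is newer than anything applied before.
  // Returns true if applied, false if the report is stale or a duplicate.
  bool ApplyReport(int64_t seq_no, int64_t rows) {
    assert(seq_no > 0);  // assumption: sequence numbers start at 1
    if (seq_no <= last_report_seq_no) return false;
    last_report_seq_no = seq_no;
    rows_produced = rows;
    return true;
  }
};
```

With this shape, a retried or reordered RPC that carries an old sequence number is simply dropped, so no extra per-instance bookkeeping is needed to detect it.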


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@375
PS10, Line 375:     ReportExecStatusResponsePB resp;
> should we have a failure injection point on the RPC itself? I only saw fail
Please see the tests in test_rpc_timeout.py, which:

1. inject delays in the RPC handler to induce timeout
2. run with a very short service queue to emulate a busy server.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/query-state.cc@379
PS10, Line 379: reak;
> should we backoff?
I will refrain from changing the logic here too much. There will be a follow-up 
patch after IMPALA-4063 which will change the retry logic. TODO added.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc
File be/src/runtime/runtime-state.cc:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/runtime/runtime-state.cc@202
PS10, Line 202:   }
> the method doc says that new_errors is cleared, but it's actually written i
This was lost after refactoring this function. Fixed now.


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/service/control-service.cc
File be/src/service/control-service.cc:

PS10:
> This is a general krpc-in-Impala question: I can't find where you set up au
Very good point. This is definitely a bug, and it is now fixed in this commit:
(https://github.com/apache/impala/commit/5c541b960491ba91533712144599fb3b6d99521d)


http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h
File be/src/util/uid-util.h:

http://gerrit.cloudera.org:8080/#/c/10855/10/be/src/util/uid-util.h@79
PS10, Line 79:   DCHECK(uid_pb.IsInitialized());
> worth DCHECKs here that the fields are set by calling uid_pb.IsInitialized(
Done


http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/10855/10/bin/impala-config.sh@562
PS10, Line 562: export HBASE_CONF_DIR="$IMPALA_FE_DIR/src/test/resources"
> why's this necessary? Can we change cmake to invoke it from the full path i
FindProtobuf should have set PROTOBUF_PROTOC_EXECUTABLE. Not sure why I needed 
to set it before.


http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py
File tests/custom_cluster/test_rpc_exception.py:

http://gerrit.cloudera.org:8080/#/c/10855/10/tests/custom_cluster/test_rpc_exception.py@97
PS10, Line 97:
> can we change this flag to be in millis instead of seconds? Or do we advert
As far as I know, this flag is not documented. We can deprecate the old flag 
and add a renamed one with an '_ms' suffix.
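
A rough sketch of how the two flags could coexist during the deprecation period. The function name, the -1 "unset" sentinel, and the 0 fallback are all assumptions for illustration; none of them come from the patch:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: resolves the effective timeout in milliseconds.
// Both arguments use -1 to mean "flag not set" (an assumed convention).
inline int64_t EffectiveTimeoutMs(int64_t new_flag_ms, int64_t old_flag_s) {
  assert(new_flag_ms >= -1 && old_flag_s >= -1);  // stand-in for a DCHECK
  // The new '_ms' flag wins whenever it is set.
  if (new_flag_ms >= 0) return new_flag_ms;
  // Otherwise honor the deprecated seconds-based flag, converted to millis.
  if (old_flag_s >= 0) return old_flag_s * 1000;
  // Neither flag set: fall back to 0 (assumed here to mean "no timeout").
  return 0;
}
```

Existing tests that pass the old flag would keep working, while new tests could pass sub-second values through the '_ms' flag.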



--
To view, visit http://gerrit.cloudera.org:8080/10855
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7638583b433dcac066b87198e448743d90415ebe
Gerrit-Change-Number: 10855
Gerrit-PatchSet: 12
Gerrit-Owner: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Michael Ho <k...@cloudera.com>
Gerrit-Reviewer: Sailesh Mukil <sail...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <t...@apache.org>
Gerrit-Comment-Date: Thu, 06 Sep 2018 17:48:58 +0000
Gerrit-HasComments: Yes
