Michael Ho has posted comments on this change. ( http://gerrit.cloudera.org:8080/12049 )
Change subject: IMPALA-4555: Make QueryState's status reporting more robust ...................................................................... Patch Set 2: (8 comments) Some trivial comments for now. I have yet to think through the retry logic to see if they can be simplified and more consistent. http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc@300 PS2, Line 300: || instance_stats->done_ If we send the final report multiple times, won't report_seq_no == last_report_seq_no here ? http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/coordinator-backend-state.cc@320 PS2, Line 320: auto const auto& http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/fragment-instance-state.h File be/src/runtime/fragment-instance-state.h: http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/fragment-instance-state.h@101 PS2, Line 101: recieved nit: received http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.h File be/src/runtime/query-state.h: http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.h@387 PS2, Line 387: The number of consecutive failed intermediate reports. This is reset to 0 upon a successful RPC, right ? So, will it be clearer to say "The number of failed intermediate reports since the last successfully sent report" ? http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc@386 PS2, Line 386: TUniqueId const TUniqueId& http://gerrit.cloudera.org:8080/#/c/12049/2/be/src/runtime/query-state.cc@563 PS2, Line 563: FLAGS_status_report_interval_ms * (num_failed_reports_ + 1))) { This backoff time seems way larger than the retry interval used when failing to send the final report. This seems a bit inconsistent. http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@116 PS2, Line 116: FInstanceExecStatusStatefulPB Is there a more succinct name for it ? StatefulStatusPB ? http://gerrit.cloudera.org:8080/#/c/12049/2/common/protobuf/control_service.proto@138 PS2, Line 138: // Cumulative structural changes made by the table sink of this fragment : // instance. This is sent only when 'done' above is true. Not idempotent. : optional DmlExecStatusPB dml_exec_status = 5; Why isn't this moved into FInstanceExecStatusStatefulPB -- To view, visit http://gerrit.cloudera.org:8080/12049 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib6007013fc2c9e8eeba11b752ee58fb3038da971 Gerrit-Change-Number: 12049 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Thomas Marshall <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Comment-Date: Tue, 18 Dec 2018 00:43:38 +0000 Gerrit-HasComments: Yes
