Qifan Chen has posted comments on this change. ( http://gerrit.cloudera.org:8080/16369 )
Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes ...................................................................... Patch Set 4: (19 comments) Looks good! http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h@615 PS4, Line 615: std::unordered_set<UniqueIdPB> Use of & would be better. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h@620 PS4, Line 620: Same comment. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc@1043 PS4, Line 1043: Use of & here is better. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc@1114 PS4, Line 1114: Same comment. Use reference. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc File be/src/runtime/query-driver.cc: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@128 PS4, Line 128: s nit: Set http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@241 PS4, Line 241: Pass Maybe use the word 'Transfer'. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@241 PS4, Line 241: retried query to be retried? http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc File be/src/scheduling/admission-controller.cc: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc@1572 PS4, Line 1572: chedule_executor_group != executor_group) If all nodes are blacklisted, then there is no way to run the query again. Should we handle this case here? http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc@1582 PS4, Line 1582: delete schedule_executor_group; Not sure why the temporary executor group is destroyed here. Are we suppose to use it? http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.h File be/src/scheduling/executor-group.h: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.h@54 PS4, Line 54: except nit: maybe reword as 'excluding' http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc File be/src/scheduling/executor-group.cc: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@42 PS4, Line 42: find_blacklisted_executor nit: reword as blacklisted_executor_found http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@46 PS4, Line 46: blacklisted_backend_ids.find(desc.backend_id()) : != blacklisted_backend_ids.end() Use std::ordered_set::contains() or std::ordered_set::count() probably would be better as no iterator object is to be created. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@60 PS4, Line 60: .find(desc same comment as above. http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/service/client-request-state.h File be/src/service/client-request-state.h: http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/service/client-request-state.h@396 PS4, Line 396: Use reference &. http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py File tests/custom_cluster/test_query_retries.py: http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@37 PS4, Line 37: the tests simulate rpc errors at nit: reword as "to simulate rpc errors in tests". http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@291 PS4, Line 291: 3-rd nit: 3rd. http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@294 PS4, Line 294: nit: assigned to any fragment instances http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@302 PS4, Line 302: Add admission delay I wonder why a delay is added here. http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@332 PS4, Line 332: Maybe add two more test cases: 1) Two nodes are blacklisted; 2) All nodes are blacklisted. -- To view, visit http://gerrit.cloudera.org:8080/16369 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105 Gerrit-Change-Number: 16369 Gerrit-PatchSet: 4 Gerrit-Owner: Wenzhe Zhou <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Qifan Chen <[email protected]> Gerrit-Reviewer: Sahil Takiar <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Wenzhe Zhou <[email protected]> Gerrit-Comment-Date: Thu, 03 Sep 2020 14:21:33 +0000 Gerrit-HasComments: Yes
