Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16369 )

Change subject: IMPALA-9636: Don't run retried query on the blacklisted nodes
......................................................................


Patch Set 4:

(19 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h@615
PS4, Line 615: std::unordered_set<UniqueIdPB>
Use of & would be better.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.h@620
PS4, Line 620:
Same comment.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc@1043
PS4, Line 1043:
Use of & here is better.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/coordinator.cc@1114
PS4, Line 1114:
Same comment. Use reference.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc
File be/src/runtime/query-driver.cc:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@128
PS4, Line 128: s
nit: Set


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@241
PS4, Line 241: Pass
Maybe use the word 'Transfer'.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/runtime/query-driver.cc@241
PS4, Line 241: retried
query to be retried?


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc
File be/src/scheduling/admission-controller.cc:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc@1572
PS4, Line 1572: chedule_executor_group != executor_group)
If all nodes are blacklisted, then there is no way to run the query again. 
Should we handle this case here?


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/admission-controller.cc@1582
PS4, Line 1582: delete schedule_executor_group;
Not sure why the temporary executor group is destroyed here. Are we suppose to 
use it?


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.h
File be/src/scheduling/executor-group.h:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.h@54
PS4, Line 54: except
nit: maybe reword as 'excluding'


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc
File be/src/scheduling/executor-group.cc:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@42
PS4, Line 42: find_blacklisted_executor
nit: reword as blacklisted_executor_found


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@46
PS4, Line 46: blacklisted_backend_ids.find(desc.backend_id())
            :           != blacklisted_backend_ids.end()
Use std::ordered_set::contains() or std::ordered_set::count() probably would be 
better as no iterator object is to be created.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/scheduling/executor-group.cc@60
PS4, Line 60: .find(desc
same comment as above.


http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/service/client-request-state.h
File be/src/service/client-request-state.h:

http://gerrit.cloudera.org:8080/#/c/16369/4/be/src/service/client-request-state.h@396
PS4, Line 396:
Use reference &.


http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py
File tests/custom_cluster/test_query_retries.py:

http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@37
PS4, Line 37: the tests simulate rpc errors at
nit: reword as "to simulate rpc errors in tests".


http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@291
PS4, Line 291: 3-rd
nit: 3rd.


http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@294
PS4, Line 294:
nit: assigned to any fragment instances


http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@302
PS4, Line 302: Add admission delay
I wonder why a delay is added here.


http://gerrit.cloudera.org:8080/#/c/16369/4/tests/custom_cluster/test_query_retries.py@332
PS4, Line 332:
Maybe add two more test cases:

1) Two nodes are blacklisted;
2) All nodes are blacklisted.



--
To view, visit http://gerrit.cloudera.org:8080/16369
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I00bc1b5026efbd0670ffbe57bcebc457d34cb105
Gerrit-Change-Number: 16369
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Sahil Takiar <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>
Gerrit-Comment-Date: Thu, 03 Sep 2020 14:21:33 +0000
Gerrit-HasComments: Yes

Reply via email to