anujphadke has posted comments on this change.

Change subject: IMPALA-4866: Hash join node does not apply limits correctly
......................................................................

Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6778/1//COMMIT_MSG
Commit Message:

PS1, Line 11: exchange
> typo
Done


http://gerrit.cloudera.org:8080/#/c/6778/1/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 506: DCHECK(status.ok());
> there are a lot of other places that this loop can exit. Can't the out_batc
I have moved the check outside the while loop, so that it is invoked every time we break out of the loop.
I ran a bunch of queries to verify that the limit gets applied correctly, and found a few cases where num_rows_returned_ was not incremented and the ReachedLimit check was not applied correctly.


PS1, Line 580:
> >= ?
Removed this check now. Please ignore.


Line 581: DCHECK(current_probe_row_ == NULL);
> I think you need to update num_rows_returned_ as well
Removed this check. Please ignore.


http://gerrit.cloudera.org:8080/#/c/6778/1/tests/common/test_dimensions.py
File tests/common/test_dimensions.py:

PS1, Line 114: # Don't run with NUM_NODES=1 due to IMPALA-561
             : # ALL_CLUSTER_SIZES = [0, 1]
> update comment
I ran the private tests after changing this value (but without the fix that applies the limits correctly). The current tests did not catch this. I am reverting the change and adding a few tests that can catch this issue.


--
To view, visit http://gerrit.cloudera.org:8080/6778
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I414124f8bb6f8b2af2df468e1c23418d05a0e29f
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: anujphadke <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: anujphadke <[email protected]>
Gerrit-HasComments: Yes
