Joe McDonnell has posted comments on this change. ( http://gerrit.cloudera.org:8080/15497 )

Change subject: IMPALA-8005: Randomize partitioning exchanges.
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc
File be/src/runtime/data-stream-test.cc:

http://gerrit.cloudera.org:8080/#/c/15497/3/be/src/runtime/data-stream-test.cc@742
PS3, Line 742:     if (map_query_1[i] != map_query_2[i]) {
> It depends on what the core purpose of the test is.  I am wondering if it i
If I'm understanding this correctly, the probability of a false match should 
be very small.

The odds of a value being hashed to the same receiver for both query1 and 
query2 are roughly 1/#receivers = 1/4. A false match for this test would 
require every value on every receiver to be the same from query1 to query2. 
Assuming the values are hashed independently, the probability of a false 
match is (1/4)^(#distinct values). It looks like the Sender uses a distinct 
value for each send, so there should be about 1024 distinct values (we should 
double check that). (1/4)^1024 = 2^-2048, which is roughly 10^-617, so a 
false match is effectively impossible.
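
To put a number on that estimate, here is a minimal sketch. It assumes 4 receivers and 1024 distinct values (the latter still needs to be confirmed) and that values are hashed independently. Log10FalseMatchProbability is a hypothetical helper for illustration, not something in data-stream-test.cc. It works in log space because (1/4)^1024 underflows a double.

```cpp
#include <cmath>

// Hypothetical helper, not part of the Impala code base.
// Returns log10 of the probability that every distinct value lands on the
// same receiver in both queries, assuming each value matches independently
// with probability 1/num_receivers.
// Computed in log space: (1/4)^1024 = 2^-2048 is too small for a double.
double Log10FalseMatchProbability(int num_receivers, int num_distinct_values) {
  return -static_cast<double>(num_distinct_values) *
         std::log10(static_cast<double>(num_receivers));
}
```

Log10FalseMatchProbability(4, 1024) comes out to about -616.5, i.e. a false match probability on the order of 10^-617. Even if the sender used far fewer distinct values than 1024, the probability would still be negligible.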

I like the end-to-end aspect of the current check, but a direct comparison on 
the sender side would also work.



--
To view, visit http://gerrit.cloudera.org:8080/15497
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1936e6cc3e8d66420a5a9301f49221ca38f3e468
Gerrit-Change-Number: 15497
Gerrit-PatchSet: 3
Gerrit-Owner: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Aman Sinha <amsi...@cloudera.com>
Gerrit-Reviewer: Anurag Mantripragada <anu...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Joe McDonnell <joemcdonn...@cloudera.com>
Gerrit-Comment-Date: Wed, 25 Mar 2020 19:57:39 +0000
Gerrit-HasComments: Yes
