Jim Apple has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6025/5/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

PS5, Line 1122: ((double) state->source_size - r) / state->source_size
I'm not yet convinced this is correct. I'm not convinced it is correct in HEAD.

First, a nit: I think the denominator should be increased by 1. (I think the 
expected value of the maximum of two independent random variables over the 
uniform distribution on (0,1) is 2/3, not 3/4)

Second, while I understand that this produces values in the given range, I 
don't think I understand how this implements the algorithm mentioned in the .h 
file, in which values of weight k are mapped to r^(1/k), where r is uniform 
random from (0,1).


-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to