Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4787: Optimize APPX_MEDIAN() memory usage
......................................................................


Patch Set 8:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6025/8//COMMIT_MSG
Commit Message:

PS8, Line 22: No new tests were added,
> can update this now
Done


http://gerrit.cloudera.org:8080/#/c/6025/8/be/src/exprs/aggregate-functions-ir.cc
File be/src/exprs/aggregate-functions-ir.cc:

Line 126: // TODO: this file should be cross compiled and then all of the 
builtin
> is this done?
This file ends with a "-ir" and it's in be/src/codegen/impala-ir.cc, so this 
must be done. Removed.


Line 961: struct ReservoirSampleState {
> i'd say this really turned into a class
Done


Line 967:   // resize occurs, this needs to be updated from the outside.
> what does 'from the outside' mean?
I meant that whoever resizes and memcopies this struct over, they are also 
responsible for updating the capacity. It might be more clear if I remove that 
part.


Line 977:   ReservoirSampleState(int init_capacity) :
> use standard formatting
Done.


Line 1016:     // The array of ReservoirSamples starts right after 
ReservoirSampleState, so we use
> that's often done by putting an array of size 1 at the end of the header st
Done. Also made some of the functions non-const because we don't want a 
function like GetSample() to return a const ReservoirSample<T>*.


Line 1025:   int64_t GetNext64(int64_t max) {
> while you're at it, this deserves a comment
Done


Line 1033:   // Given a buffer that contains a ReservoirSampleState, resize the 
buffer so that it's
> its
nice catch


Line 1040:   if (new_capacity * 2 >= MAX_CAPACITY) new_capacity = MAX_CAPACITY;
> if state->capacity is 10 and max_capacity is 40, this line sets new_capacit
With the current constants, we would be resizing from about 8000 to 20,000. I 
think this is acceptable. Would it be better to resize to 16,000 then to 20,000?


Line 1062:   // If the array gets filled due to updates or merges, we 
reallocate a larger buffer to
> you should put this (= a brief description of what you're doing) somewhere 
I moved this description higher up. I kept it mostly unmodified.


http://gerrit.cloudera.org:8080/#/c/6025/8/testdata/workloads/functional-query/queries/QueryTest/aggregation.test
File testdata/workloads/functional-query/queries/QueryTest/aggregation.test:

PS8, Line 1163: mediam
> spelling
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6025
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I99adaad574d4fb0a3cf38c6cbad8b2a23df12968
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Taras Bobrovytsky <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to