Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16942 )

Change subject: IMPALA-10296: Fix analytic limit pushdown when predicates are 
present
......................................................................


Patch Set 11:

(10 comments)

I've only looked at the BE part and tests so far, but I can try to wrap my head 
around the FE changes too if its helpful.

I think the BE logic looks good, but since its complex it took me awhile to 
convince myself it was correct, so most of my suggestions are around cleaning 
up comments

http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc
File be/src/exec/topn-node-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc@44
PS11, Line 44:     priority_queue_.Push(insert_tuple);
Might be useful to explicitly mention here that we don't have to worry about 
ties going into the heap at this point, as we'll handle that with the 
complicated logic in InsertTupleWithTieHandling later if necessary.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc@91
PS11, Line 91: // 'materialized_tuple' needs to be added. Figure out which 
other tuples, if any,
             :     // need to be removed from the heap.
I might move this above the DCHECK, to make it clearer its referring to the 
entire 'else' branch.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc@93
PS11, Line 93: It will compare equal with any overflowed ties at the back of
             :     // 'priority_queue_'
Not sure what this is supposed to mean, since my understanding would be that 
'overflowed' ties aren't in 'priority_queue_'.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc@97
PS11, Line 97: 'top_tuple'
Might be worth explicitly saying that 'top_tuple' is what we just popped off 
the heap.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node-ir.cc@100
PS11, Line 100:       // Removing all the ties would bring the heap below its 
capacity. We need to
I find this comment a little confusing, since we've already done the one Pop() 
we need to do, its unclear why we would think about doing another Pop() and 
bring the heap below capacity just because the new top is tied with the old one.

I think it would be better to say something along the lines of "The new top is 
still tied with the tuples in 'overflowed_ties_' so don't clear it, just add 
the previous top as another overflowed tuple"


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.h
File be/src/exec/topn-node.h:

http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.h@86
PS11, Line 86: /// Unpartitioned TopN Implementation Details
nit: its a little confusing to call out 'unpartitioned mode' specifically when 
its the only mode currently supported. I guess it doesn't matter if the 
partitioned mode patch is going in soon, but you could add a "TODO: implement 
partitioned mode" or something.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.h@180
PS11, Line 180:   RuntimeProfile::Counter* num_partitions_counter_ = nullptr;
The counters here and below aren't used in this patch


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.h@200
PS11, Line 200: GetNextPartitioned()
nit: not in this patch


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.h@256
PS11, Line 256:   /// Helper to insert tuple row into the priority queue with 
tie handling.
If I understand correctly, this function only gets called once the heap has 
reached capacity. I suppose that's sort of implicit, since that's when tie 
handling starts to matter, but might be useful to call this out explicitly.


http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.cc
File be/src/exec/topn-node.cc:

http://gerrit.cloudera.org:8080/#/c/16942/11/be/src/exec/topn-node.cc@306
PS11, Line 306: addition to
I think this is a typo? i.e. these two words should be removed



--
To view, visit http://gerrit.cloudera.org:8080/16942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I801d7799b0d649c73d2dd1703729a9b58a662509
Gerrit-Change-Number: 16942
Gerrit-PatchSet: 11
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Aman Sinha <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Comment-Date: Sat, 16 Jan 2021 00:12:42 +0000
Gerrit-HasComments: Yes

Reply via email to