Thomas Tauber-Marshall has posted comments on this change.

Change subject: IMPALA-4731/IMPALA-397/IMPALA-4728: Materialize sort exprs
......................................................................


Patch Set 5:

(16 comments)

http://gerrit.cloudera.org:8080/#/c/6322/4/fe/src/main/java/org/apache/impala/analysis/SortInfo.java
File fe/src/main/java/org/apache/impala/analysis/SortInfo.java:

Line 39:   // All Exprs with cost greater than this will be materialized. Since 
we don't currently
> All ordering exprs with cost...
Done


Line 40:   // have any information about actual function costs, this value 
guarantees that all
> guarantees -> is intended to ensure
Done


Line 42:   // operations unmaterialized, for example 'SlotRef + SlotRef' would 
have cost 3 and not
> remove cost "3" since this might change and it will be hard to tell this co
Done


Line 167:    * result expressions. Those slot refs in the ordering and result 
exprs are substituted
> The "Done by" sentence sounds a little procedural/complicated, can you cond
Done


Line 170:    * TODO: We could do something more sophisticated than simply 
copying input slot refs -
> exprs
Done


Line 181:     // exprs that get materialized to slot refs in the materialized 
sort tuple.
> Can be improved for clarity. Also point 2 is not quite accurate.
Done


Line 191:     TreeNode.collect(resultExprs, 
Predicates.instanceOf(SlotRef.class), sourceSlots);
> // SlotRefs in the result and ordering exprs after substituting the materia
Done


Line 201:       sortTupleExprs.add(origSlotRef);
> contains() is expensive (and every iteration might grow the rhs).
Done


Line 222:    * the input to the sort with setMaterializedTupleInfo(), and a 
mapping from those
> Easier to read if you have bullet points for the things we materialize, e.g
Done


http://gerrit.cloudera.org:8080/#/c/6322/4/testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test:

Line 1: # expensive order exprs get materialized, cheap ones don't
> space after #
Done


Line 118: |
> Same as case above? We don't materialize in a merging exchange, the precedi
Done


Line 139
> Need additional tests to cover the tricky code to avoid redundant expr mate
Done


http://gerrit.cloudera.org:8080/#/c/6322/4/tests/query_test/test_sort.py
File tests/query_test/test_sort.py:

Line 158: class TestRandomSort(ImpalaTestSuite):
> This will run over all test vectors right? I think it makes sense to only r
Because I'm not taking 'vector' as a parameter to the tests, they will only be 
run once, even under exhaustive.

Not sure what the preferred flow here is - how I have it, or taking the vector 
but restricting it as you suggested, to make the intentions clearer.


Line 172:     # Include "random()" in the select list to check that its sorted 
correctly.
> it's
Done


Line 174:         "select random() as r from functional.alltypessmall order by 
r").data,
> also add one with an inline view:
Done


Line 185:     """Tests that a window function over 'order by random()' worked 
as expected."""
> works as expected
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6322
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ifefdaff8557a30ac44ea82ed428e6d1ffbca2e9e
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to