Tim Armstrong has uploaded a new patch set (#4).

Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants.
......................................................................

IMPALA-3344: Simplify sorter and document/enforce invariants.

Clarify relationships between classes and clean up the previous mess
where every class was a friend of every other, so that there is an
actual distinction between public and private members. TupleIterator
is no longer tied to TupleSorter, only to Run.

Factor out some functions from large functions.

Document and enforce invariants in many cases.

Simplify and document iterator logic.

Make management of buffers when iterating over output stream more
explicitly correct: either use MarkNeedToReturn() or attach block
to the batch as appropriate. Also use the atomic block exchange
operation when moving between blocks in unpinned runs to prevent
pin failures at that point. I have explicitly avoided changing
the hairy block management logic when allocating buffers for
merging; that will need addressing in a follow-up patch.

Add a SpilledRuns counter so that it's more explicit that spilling
occurred.

Testing:
Added some tests for corner cases with empty and NULL strings.

Performance:
Benchmarking against old code initially revealed some regressions from
changes in inlining. Force inlining the TupleComparator::operator() and
iterator Next()/Prev() functions helped and performance seems similar or
slightly better on the targeted orderby benchmarks.
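
For illustration only, a minimal sketch of what force inlining a
comparator's operator() can look like. This is not the code in this
patch: SKETCH_ALWAYS_INLINE, Row, RowLess and SortRows are hypothetical
names, and the attribute shown is the GCC/Clang one.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical macro for this sketch. GCC and Clang accept this attribute;
// other compilers fall back to a plain inline hint.
#if defined(__GNUC__) || defined(__clang__)
#define SKETCH_ALWAYS_INLINE inline __attribute__((always_inline))
#else
#define SKETCH_ALWAYS_INLINE inline
#endif

// Hypothetical row type standing in for a tuple; not Impala's Tuple class.
struct Row {
  int64_t key;
  int32_t payload;
};

// Hypothetical comparator that orders rows by ascending key. The attribute
// asks the compiler to inline the call into the sort loop instead of relying
// on its own inlining heuristics.
struct RowLess {
  SKETCH_ALWAYS_INLINE bool operator()(const Row& a, const Row& b) const {
    return a.key < b.key;
  }
};

// Sorts the rows in place using the force-inlined comparator.
inline void SortRows(std::vector<Row>* rows) {
  std::sort(rows->begin(), rows->end(), RowLess());
}
```

Whether forcing the inline helps depends on the compiler and the call
site, so it is worth confirming with a benchmark rather than assuming.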

Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
---
M be/src/runtime/sorted-run-merger.cc
M be/src/runtime/sorted-run-merger.h
M be/src/runtime/sorter.cc
M be/src/runtime/sorter.h
M be/src/util/tuple-row-compare.h
M testdata/workloads/functional-query/queries/QueryTest/single-node-large-sorts.test
M testdata/workloads/functional-query/queries/QueryTest/sort.test
M tests/query_test/test_sort.py
8 files changed, 705 insertions(+), 519 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/26/2826/4
-- 
To view, visit http://gerrit.cloudera.org:8080/2826
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Tim Armstrong <[email protected]>
