Tim Armstrong has uploaded a new patch set (#4). Change subject: IMPALA-3344: Simplify sorter and document/enforce invariants. ......................................................................
IMPALA-3344: Simplify sorter and document/enforce invariants. Clarify relationships between classes, clean up the previous mess where every class was friends with the other so there's an actual distinction between public and private members. TupleIterator is now no longer tied to TupleSorter, just Run. Factor out some functions from large functions. Document and enforce invariants in many cases. Simplify and document iterator logic. Make management of buffers when iterating over output stream more explicitly correct: either use MarkNeedToReturn() or attach block to the batch as appropriate. Also use the atomic block exchange operation when moving between blocks in unpinned runs to prevent pin failures at that point. I explicitly have avoided changing the hairy block management logic when allocating buffers for merging, that will need addressing in a follow-up patch. Add a SpilledRuns counter so that it's more explicit that spilling occurred. Testing: Added some tests for corner cases with empty and NULL strings. Performance: Benchmarking against old code initial revealed some regressions from changes in inlining. Force inlining the TupleComparator::operator() and iterator Next()/Prev() functions helped and performance seems similar or slightly better on the targeted orderby benchmarks. Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a --- M be/src/runtime/sorted-run-merger.cc M be/src/runtime/sorted-run-merger.h M be/src/runtime/sorter.cc M be/src/runtime/sorter.h M be/src/util/tuple-row-compare.h M testdata/workloads/functional-query/queries/QueryTest/single-node-large-sorts.test M testdata/workloads/functional-query/queries/QueryTest/sort.test M tests/query_test/test_sort.py 8 files changed, 705 insertions(+), 519 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/26/2826/4 -- To view, visit http://gerrit.cloudera.org:8080/2826 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: newpatchset Gerrit-Change-Id: I9c619e81fd1b8ac50e257172c8bce101a112b52a Gerrit-PatchSet: 4 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Tim Armstrong <[email protected]>
