Hello Dan Hecht,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/5811
to look at the new patch set (#13).
Change subject: IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
......................................................................
IMPALA-4674: Part 1: port BufferedTupleStream to BufferPool
Add a copy of BufferedTupleStream that allocates memory from BufferPool.
This will replace the original implementation when IMPALA-3200 is
completed.
The major changes are:
* Terminology is updated from "blocks" to "pages"
* No small buffer support is needed (hooray!).
* BufferedTupleStream needs to do its own tracking of # of rows per
page, etc instead of relying on BufferedBlockMgr to do it. A
wrapper around PageHandle is used.
* Profile counters (unpin, pin, get new block time) are removed -
similar counters in the BufferPool client are more useful.
* Changed the tuple null indicators so that they are allocated
before each tuple, rather than in a block at the start of the
page. This slightly reduces the memory density, but greatly
simplifies much logic. In particular, it avoids problems with
RowIdx and larger pages with offsets that don't fit in 16 bits.
* Buffer management of read/write streams uses the new pin-counting
support to separate pinning of the read and write pages.
This means that the reservation usage of an unpinned read/write
stream does not fluctuate and the read/write iterators can always
advance without requiring additional reservation.
Testing this required some further changes. TestEnv was refactored
so it can set up either BufferPool or BufferedBlockMgr. Some
BufferPool-related state is added to ExecEnv, QueryState and
RuntimeState, but is only created for backend tests that explicitly
create a BufferPool.
The following is left for future work:
* IMPALA-3808 (large row support) is not added. I've added
TODOs to the code to places that will require changes.
* IMPALA-4179 (remove MarkNeedsDeepCopy()) is not fixed, since
it requires global changes to operators that accumulate memory.
Testing:
All of the BufferedTupleStream unit tests are ported to the new
implementation, except for ones specifically testing the small
buffer functionality.
Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
---
M be/src/exec/hash-table-test.cc
M be/src/runtime/CMakeLists.txt
M be/src/runtime/buffered-block-mgr-test.cc
M be/src/runtime/buffered-tuple-stream-test.cc
A be/src/runtime/buffered-tuple-stream-v2-test.cc
A be/src/runtime/buffered-tuple-stream-v2.cc
A be/src/runtime/buffered-tuple-stream-v2.h
A be/src/runtime/buffered-tuple-stream-v2.inline.h
M be/src/runtime/bufferpool/buffer-pool-test.cc
M be/src/runtime/exec-env.cc
M be/src/runtime/exec-env.h
M be/src/runtime/query-state.cc
M be/src/runtime/query-state.h
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
M be/src/runtime/runtime-state.cc
M be/src/runtime/runtime-state.h
M be/src/runtime/test-env.cc
M be/src/runtime/test-env.h
M be/src/runtime/tmp-file-mgr-test.cc
20 files changed, 3,002 insertions(+), 100 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/5811/13
--
To view, visit http://gerrit.cloudera.org:8080/5811
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7bb47a818564ac19a6be4ca88db0578f4ea0b709
Gerrit-PatchSet: 13
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>