Omid Shahidi has uploaded a new patch set (#11). (
http://gerrit.cloudera.org:8080/18798 )
Change subject: IMPALA-6684: Fix untracked memory in KRPC
......................................................................
IMPALA-6684: Fix untracked memory in KRPC
During serialization of a row batch header, a tuple_data_ buffer is
created to hold the compressed tuple data for an outbound row batch.
This tuple data should be tracked because it accounts for a significant
portion of the untracked memory in the KRPC data stream sender. By using
a free pool, we can allocate the tuple data and the compression scratch
and account for both in the memory tracker of the KrpcDataStreamSender.
This change adds an RAII class responsible for the memory allocation and
changes the existing code to use a char buffer pointed to by a
char* tuple_data_ instead of the previously used std::string
tuple_data_. The Thrift implementation is left unchanged and the
protobuf implementation is separated.
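The RAII idea described above can be sketched roughly as follows. This
is only an illustration under stated assumptions: SimpleMemTracker,
TrackedBuffer and TrackedBytesWhileSerializing are hypothetical names,
not the classes added by this patch, and the sketch allocates with
new[] rather than from Impala's free pool. It shows only how a char
buffer's size can be charged to a tracker for exactly the buffer's
lifetime.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <memory>
#include <string>
#include <utility>

// Hypothetical, simplified tracker; NOT Impala's MemTracker API.
class SimpleMemTracker {
 public:
  void Consume(int64_t bytes) { consumption_ += bytes; }
  void Release(int64_t bytes) { consumption_ -= bytes; }
  int64_t consumption() const { return consumption_; }

 private:
  int64_t consumption_ = 0;
};

// Hypothetical RAII owner of a char buffer. The buffer size is charged to
// the tracker on construction and released on destruction.
class TrackedBuffer {
 public:
  TrackedBuffer(SimpleMemTracker* tracker, int64_t size)
    : tracker_(tracker), size_(size), data_(new char[size]) {
    assert(size >= 0);
    tracker_->Consume(size_);
  }

  ~TrackedBuffer() {
    // A moved-from buffer owns nothing, so it must not release anything.
    if (data_ != nullptr) tracker_->Release(size_);
  }

  TrackedBuffer(const TrackedBuffer&) = delete;
  TrackedBuffer& operator=(const TrackedBuffer&) = delete;

  TrackedBuffer(TrackedBuffer&& other) noexcept
    : tracker_(other.tracker_), size_(other.size_),
      data_(std::move(other.data_)) {
    other.size_ = 0;
  }

  char* data() { return data_.get(); }
  int64_t size() const { return size_; }

 private:
  SimpleMemTracker* tracker_;
  int64_t size_;
  std::unique_ptr<char[]> data_;
};

// Copies 'payload' into a tracked buffer (standing in for tuple_data_) and
// returns the tracker's consumption while that buffer is alive.
int64_t TrackedBytesWhileSerializing(
    SimpleMemTracker* tracker, const std::string& payload) {
  TrackedBuffer tuple_data(tracker, static_cast<int64_t>(payload.size()));
  std::memcpy(tuple_data.data(), payload.data(), payload.size());
  return tracker->consumption();
}
```

Because the accounting lives in the constructor and destructor, every
exit path from the serialization code returns the bytes to the tracker,
which is the property a bare std::string cannot provide.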
Testing:
- Passed core tests.
- Ran a single-node benchmark, which shows no regression.
- Updated row-batch-serialize-test and row-batch-serialize-benchmark to
test the row-batch serialization used by KRPC.
- Manually collected query profiles, heap growth, and memory usage logs
showing that untracked memory decreased by half.
- Added an end-to-end test to verify the new counters in the runtime
profile.
New row-batch serialization benchmark:
serialize:
Function                      10%ile  50%ile  90%ile  10%ile  50%ile  90%ile
                                                       (rel)   (rel)   (rel)
----------------------------------------------------------------------------
ser_no_dups_baseline            8.36     8.6     8.7      1X      1X      1X
ser_no_dups                     6.73    6.85    6.93  0.804X  0.796X  0.796X
ser_no_dups_full                5.28    5.38    5.55  0.631X  0.625X  0.637X
ser_adjacent_dups_baseline      12.9    13.2    13.4      1X      1X      1X
ser_adjacent_dups               23.2    23.7    24.1    1.8X    1.8X    1.8X
ser_adjacent_dups_full          19.9    20.3    20.7   1.54X   1.54X   1.55X
ser_dups_baseline               9.17    9.54    9.72      1X      1X      1X
ser_dups                        7.45    7.69    7.86  0.812X  0.806X  0.809X
ser_dups_full                   14.6      15    15.3    1.6X   1.57X   1.57X
deserialize:
Function                      10%ile  50%ile  90%ile  10%ile  50%ile  90%ile
                                                       (rel)   (rel)   (rel)
----------------------------------------------------------------------------
deser_no_dups_baseline          32.6    33.5      34      1X      1X      1X
deser_no_dups                   32.5    33.1    33.7  0.999X   0.99X  0.992X
deser_adjacent_dups_baseline    53.1      54    54.7      1X      1X      1X
deser_adjacent_dups             80.3    81.6    82.5   1.51X   1.51X   1.51X
deser_dups_baseline             52.4      54    54.7      1X      1X      1X
deser_dups                      86.8    88.4    89.7   1.66X   1.64X   1.64X
Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
---
M be/src/benchmarks/row-batch-serialize-benchmark.cc
M be/src/runtime/krpc-data-stream-sender.cc
M be/src/runtime/krpc-data-stream-sender.h
M be/src/runtime/row-batch-serialize-test.cc
M be/src/runtime/row-batch.cc
M be/src/runtime/row-batch.h
A be/src/runtime/row-batch.inline.h
A testdata/workloads/tpch/queries/datastream-sender.test
A tests/query_test/test_datastream_sender.py
9 files changed, 656 insertions(+), 214 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/18798/11
--
To view, visit http://gerrit.cloudera.org:8080/18798
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2ba2b907ce4f275a7a1fb8cf75453c7003eb4b82
Gerrit-Change-Number: 18798
Gerrit-PatchSet: 11
Gerrit-Owner: Omid Shahidi <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Omid Shahidi <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>