Hello Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/24089
to look at the new patch set (#17).
Change subject: IMPALA-2744: Codegen for tuple DeepCopy - part1
......................................................................
IMPALA-2744: Codegen for tuple DeepCopy - part1
Created a codegen'd version of BufferedTupleStream::DeepCopy.
The codegen'd function is only used by PartitionedHashJoinBuilder in
this patch.
Using Tuple's TryDeepCopy* functions for BufferedTupleStream was
considered, but it is better for BufferedTupleStream to keep its own
DeepCopy because the two differ:
- BufferedTupleStream doesn't copy tuples serially: first it copies
  the "fixed len" parts of all tuples, then all "string data" of all
  tuples, then all "collection data" of all tuples.
- BufferedTupleStream's DeepCopy doesn't set the strings' pointers.
  This also applies when copying a string from a collection.
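
The two differences above can be illustrated with a minimal sketch. It
is not the code in this patch: ToyTuple, StringSlot and PhasedDeepCopy
are hypothetical, simplified stand-ins, and collections are left out.
It only shows the copy order (all fixed-length parts, then all string
data) and that the copied slots keep their original pointer values.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; these are not Impala's real types.
struct StringSlot {
  const char* ptr;
  uint32_t len;
};

struct ToyTuple {
  int64_t id;       // fixed-length column
  StringSlot name;  // fixed-length slot pointing at variable-length data
};

// Copies a batch in two phases: the fixed-length parts of all tuples first,
// then the string payloads of all tuples. The copied slots are NOT patched:
// their 'ptr' fields still hold the source addresses.
std::vector<uint8_t> PhasedDeepCopy(const std::vector<ToyTuple>& batch) {
  size_t fixed_bytes = batch.size() * sizeof(ToyTuple);
  size_t var_bytes = 0;
  for (const ToyTuple& t : batch) var_bytes += t.name.len;

  std::vector<uint8_t> buf(fixed_bytes + var_bytes);
  uint8_t* pos = buf.data();

  // Phase 1: fixed-length parts of all tuples.
  for (const ToyTuple& t : batch) {
    std::memcpy(pos, &t, sizeof(ToyTuple));
    pos += sizeof(ToyTuple);
  }
  // Phase 2: string data of all tuples, in tuple order.
  for (const ToyTuple& t : batch) {
    if (t.name.len > 0) std::memcpy(pos, t.name.ptr, t.name.len);
    pos += t.name.len;
  }
  assert(pos == buf.data() + buf.size());
  return buf;
}
```

In this sketch a reader would recover each string by walking the
trailing payload in tuple order and using the stored lengths, which is
why the pointers do not need to be set at copy time.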
Measurements:
Measured with the following query:
select straight_join l_orderkey, o_custkey, o_orderkey, l_partkey
from tpch30.orders left join /*+broadcast*/ tpch30.lineitem
on o_orderkey = l_orderkey where o_totalprice<0;
Where tpch30 is generated by:
bin/load-data.py -s 30 -f --workloads tpch
--table_formats text/none,parquet/snap
Before:
BuildRowsPartitionTime: 3s996ms
After:
BuildRowsPartitionTime: 2s139ms
Testing:
Added tests to buffered-tuple-stream-test.cc that compare the results
of codegen'd and basic DeepCopy variations of BufferedTupleStream
with different data types.
Change-Id: I63e32babdbaf56095478c6c66afb9cb91189f946
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/exec/partitioned-hash-join-builder-ir.cc
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/partitioned-hash-join-builder.h
A be/src/exec/partitioned-hash-join-builder.inline.h
M be/src/runtime/CMakeLists.txt
A be/src/runtime/buffered-tuple-stream-ir.cc
M be/src/runtime/buffered-tuple-stream-test.cc
M be/src/runtime/buffered-tuple-stream.cc
M be/src/runtime/buffered-tuple-stream.h
M be/src/runtime/buffered-tuple-stream.inline.h
M be/src/runtime/spillable-row-batch-queue.h
A be/src/runtime/tuple-row-ir.cc
M be/src/runtime/tuple-row.h
15 files changed, 736 insertions(+), 176 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/24089/17
--
To view, visit http://gerrit.cloudera.org:8080/24089
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I63e32babdbaf56095478c6c66afb9cb91189f946
Gerrit-Change-Number: 24089
Gerrit-PatchSet: 17
Gerrit-Owner: Balazs Hevele <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>