Csaba Ringhofer created IMPALA-13509:
----------------------------------------
Summary: Avoid duplicate deepcopy during hash partitioning in
KrpcDataStreamSender
Key: IMPALA-13509
URL: https://issues.apache.org/jira/browse/IMPALA-13509
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Csaba Ringhofer
Currently all rows are deep copied twice:
1. to the RowBatch of the given channel
2. to an OutboundRowBatch when the collector RowBatch is at capacity
Copying directly to an OutboundRowBatch could avoid some CPU work.
This would also make the following improvements easier to implement:
- deduplicate tuples similarly to broadcast/unpartitioned exchange
(IMPALA-13225).
- keep outbound row batch size below data_stream_sender_buffer_size even for
variable-length data
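
The difference between the two copy paths can be illustrated with a small, self-contained sketch. It is not Impala code: Row, Channel, Counters, DeepCopy, PickChannel, SendTwoCopies and SendOneCopy are hypothetical stand-ins invented for this example, and the "outbound" vector only imitates an OutboundRowBatch. The sketch counts deep copies under the assumption that a batch is flushed once it holds `capacity` rows.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; NOT Impala's real classes.
struct Row { std::string payload; };
struct Counters { std::size_t deep_copies = 0; };

struct Channel {
  std::vector<Row> collector;  // per-channel staging batch (two-copy path only)
  std::vector<Row> outbound;   // imitates the batch handed to the network layer
};

// Deep copies one row into dst and records the work done.
inline void DeepCopy(const Row& src, std::vector<Row>* dst, Counters* c) {
  dst->push_back(Row{src.payload});
  ++c->deep_copies;
}

// Hash partitioning: choose the destination channel for a row.
inline std::size_t PickChannel(const Row& r, std::size_t num_channels) {
  return std::hash<std::string>{}(r.payload) % num_channels;
}

// Path described in the issue: row -> collector batch -> outbound batch.
inline Counters SendTwoCopies(const std::vector<Row>& input,
                              std::size_t num_channels, std::size_t capacity) {
  std::vector<Channel> channels(num_channels);
  Counters c;
  auto flush = [&c](Channel& ch) {
    for (const Row& r : ch.collector) DeepCopy(r, &ch.outbound, &c);  // copy #2
    ch.collector.clear();
    ch.outbound.clear();  // pretend the outbound batch was sent
  };
  for (const Row& r : input) {
    Channel& ch = channels[PickChannel(r, num_channels)];
    DeepCopy(r, &ch.collector, &c);  // copy #1
    if (ch.collector.size() == capacity) flush(ch);
  }
  for (Channel& ch : channels) {
    if (!ch.collector.empty()) flush(ch);  // send partially filled batches
  }
  return c;
}

// Proposed path: row -> outbound batch directly, one deep copy per row.
inline Counters SendOneCopy(const std::vector<Row>& input,
                            std::size_t num_channels, std::size_t capacity) {
  std::vector<Channel> channels(num_channels);
  Counters c;
  for (const Row& r : input) {
    Channel& ch = channels[PickChannel(r, num_channels)];
    DeepCopy(r, &ch.outbound, &c);  // the only copy
    if (ch.outbound.size() == capacity) ch.outbound.clear();  // pretend send
  }
  for (Channel& ch : channels) ch.outbound.clear();  // send the remainders
  return c;
}
```

For N input rows, SendTwoCopies performs 2N deep copies and SendOneCopy performs N, regardless of how rows are distributed across channels. In the one-copy path the sender also sees the outbound batch while filling it, which is where a size check or tuple deduplication could be placed.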