[
https://issues.apache.org/jira/browse/IMPALA-13509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Smith resolved IMPALA-13509.
------------------------------------
Fix Version/s: Impala 4.5.0
Resolution: Fixed
> Avoid duplicate deepcopy during hash partitioning in KrpcDataStreamSender
> -------------------------------------------------------------------------
>
> Key: IMPALA-13509
> URL: https://issues.apache.org/jira/browse/IMPALA-13509
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Csaba Ringhofer
> Assignee: Csaba Ringhofer
> Priority: Critical
> Labels: performance
> Fix For: Impala 4.5.0
>
>
> Currently all rows are deep copied twice:
> 1. to the RowBatch of the given channel
> 2. to an OutboundRowBatch when the collector RowBatch is at capacity
> Copying directly to an OutboundRowBatch could avoid some CPU work.
> This would also allow easier implementation of the following improvements:
> - deduplicate tuples similarly to broadcast/unpartitioned exchange
> (IMPALA-13225).
> - keep outbound row batch size below data_stream_sender_buffer_size even for
> variable-length data
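
The difference between the two copy paths described above can be illustrated with a toy model. This is only a sketch under stated assumptions: `Row`, `CopyStats`, `DeepCopy`, `ChannelFor`, `PartitionTwoStage` and `PartitionDirect` are hypothetical names invented for illustration, not Impala's actual classes or the actual patch, and the "outbound batch" here is simply a per-channel vector rather than a serialized buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for a row with variable-length data.
struct Row {
  std::string payload;
};

// Counts how many deep copies a partitioning strategy performs.
struct CopyStats {
  std::size_t deep_copies = 0;
};

// Models a deep copy of one row and records it in the stats.
static Row DeepCopy(const Row& row, CopyStats* stats) {
  ++stats->deep_copies;
  return Row{row.payload};
}

// Picks the destination channel by hashing the row (assumed hash scheme).
static std::size_t ChannelFor(const Row& row, std::size_t num_channels) {
  assert(num_channels > 0);
  return std::hash<std::string>{}(row.payload) % num_channels;
}

// Two-stage path: copy each row into a per-channel collector, then copy the
// collector's rows again into the outbound batch once it reaches capacity.
std::vector<std::vector<Row>> PartitionTwoStage(const std::vector<Row>& input,
                                                std::size_t num_channels,
                                                std::size_t capacity,
                                                CopyStats* stats) {
  assert(capacity > 0);
  std::vector<std::vector<Row>> collectors(num_channels);
  std::vector<std::vector<Row>> outbound(num_channels);
  auto flush = [&](std::size_t ch) {
    for (const Row& row : collectors[ch]) {
      outbound[ch].push_back(DeepCopy(row, stats));  // second copy
    }
    collectors[ch].clear();
  };
  for (const Row& row : input) {
    const std::size_t ch = ChannelFor(row, num_channels);
    collectors[ch].push_back(DeepCopy(row, stats));  // first copy
    if (collectors[ch].size() == capacity) flush(ch);
  }
  // Flush whatever is left in each collector at the end of input.
  for (std::size_t ch = 0; ch < num_channels; ++ch) flush(ch);
  return outbound;
}

// Direct path: copy each row straight into the outbound batch of its channel.
std::vector<std::vector<Row>> PartitionDirect(const std::vector<Row>& input,
                                              std::size_t num_channels,
                                              CopyStats* stats) {
  std::vector<std::vector<Row>> outbound(num_channels);
  for (const Row& row : input) {
    const std::size_t ch = ChannelFor(row, num_channels);
    outbound[ch].push_back(DeepCopy(row, stats));  // the only copy
  }
  return outbound;
}
```

In this model, partitioning N input rows costs 2N deep copies on the two-stage path and N on the direct path, while both produce the same rows per channel in the same order. The real saving depends on how expensive a deep copy is relative to the rest of the sender's work.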