Csaba Ringhofer created IMPALA-12433:
----------------------------------------
Summary: KrpcDataStreamSender could share some buffers between channels
Key: IMPALA-12433
URL: https://issues.apache.org/jira/browse/IMPALA-12433
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Csaba Ringhofer
Currently each channel has two outbound row batches, and each of those has two
buffers: one for serialization and another for compression.
https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/row-batch.h#L100
https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/krpc-data-stream-sender.cc#L236
https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java#L81
Since serialization and compression are always done from the fragment instance
thread, only one compression happens at a time, so a single compression buffer
could be shared between channels. If this buffer is sent via KRPC, it could be
swapped with the per-channel buffer.
As far as I understand, at least one buffer per channel is still needed because
async KRPC calls can use it from another thread (this is done to avoid an extra
copy of the buffer before RPCs). We can only reuse that buffer after getting a
callback from KRPC.
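
A rough sketch of the idea, to make the buffer ownership concrete. This is not
Impala code: Channel, Sender, CompressAndSend and OnRpcDone are hypothetical
names, std::string stands in for the real buffer type, and the "compression"
is a toy prefix copy. It only illustrates one shared compression buffer that is
swapped with the per-channel buffer, which is then left alone until the RPC
callback arrives.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a channel's outbound state (not Impala's type).
struct Channel {
  std::string rpc_buffer;      // may be read by an async RPC on another thread
  bool rpc_in_flight = false;  // true until the RPC completion callback runs
};

// Hypothetical sender that owns ONE compression buffer for all channels.
class Sender {
 public:
  explicit Sender(int num_channels) : channels_(num_channels) {}

  // Called only from the fragment instance thread, so at most one
  // compression is in progress and the shared buffer needs no lock.
  void CompressAndSend(int idx, const std::string& serialized) {
    Channel& ch = channels_[idx];
    // A real sender would wait here; the sketch just asserts the invariant.
    assert(!ch.rpc_in_flight);
    // Toy "compression" into the shared buffer (placeholder for a real codec).
    shared_compression_buffer_.clear();
    shared_compression_buffer_.append("z:").append(serialized);
    // Swap instead of copy: the channel takes the compressed bytes and the
    // sender takes the channel's old buffer as the next shared scratch space.
    std::swap(shared_compression_buffer_, ch.rpc_buffer);
    ch.rpc_in_flight = true;  // buffer is now off limits until the callback
  }

  // Stand-in for the RPC completion callback: the buffer may be reused again.
  void OnRpcDone(int idx) { channels_[idx].rpc_in_flight = false; }

  const Channel& channel(int idx) const { return channels_[idx]; }

 private:
  std::vector<Channel> channels_;
  std::string shared_compression_buffer_;
};
```

With N channels this keeps N per-channel buffers plus one shared compression
buffer, instead of a dedicated compression buffer for every outbound row batch.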