[ 
https://issues.apache.org/jira/browse/IMPALA-12433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17819175#comment-17819175
 ] 

ASF subversion and git services commented on IMPALA-12433:
----------------------------------------------------------

Commit 2f14fd29c0b47fc2c170a7f0eb1cecaf6b9704f4 in impala's branch 
refs/heads/master from Csaba Ringhofer
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=2f14fd29c ]

IMPALA-12433: Share buffers among channels in KrpcDataStreamSender

Before this patch, each KrpcDataStreamSender::Channel had two
OutboundRowBatch objects, each with its own serialization and
compression buffers.

This patch switches to a single buffer per channel, which is enough
to hold the data that is in flight in KRPC. The other buffers are only
used during serialization and compression, which are done for a
single channel at a time, so they can be shared among channels.
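
The buffer-sharing idea can be illustrated with a minimal, single-threaded
sketch. This is not the code from the patch: Sender, Channel, Send,
OnRpcDone and InFlight are hypothetical names, one shared scratch buffer
stands in for the separate serialization and compression buffers, and the
"ser:" prefix is only a placeholder for real serialization work.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a serialized (and compressed) row batch.
using Buffer = std::string;

// One destination. It owns exactly one buffer, which holds the data of the
// RPC that is currently in flight, if any.
struct Channel {
  Buffer in_flight_buf;
  bool rpc_in_flight = false;
};

class Sender {
 public:
  explicit Sender(std::size_t num_channels) : channels_(num_channels) {}

  // Builds the payload for channel `idx` in the shared scratch buffer, then
  // swaps it into the channel. Returns false, without touching any buffer,
  // if the previous RPC on this channel has not completed yet, because the
  // RPC layer may still be reading the channel's buffer.
  bool Send(std::size_t idx, const std::string& rows) {
    Channel& ch = channels_.at(idx);
    if (ch.rpc_in_flight) return false;
    shared_scratch_.clear();
    // Placeholder for serialization + compression (assumption).
    shared_scratch_.append("ser:").append(rows);
    // After the swap the channel owns the payload and the sender gets the
    // channel's old buffer back as its new scratch space.
    std::swap(shared_scratch_, ch.in_flight_buf);
    ch.rpc_in_flight = true;
    return true;
  }

  // Simulates the completion callback from the RPC layer.
  void OnRpcDone(std::size_t idx) { channels_.at(idx).rpc_in_flight = false; }

  const Buffer& InFlight(std::size_t idx) const {
    return channels_.at(idx).in_flight_buf;
  }

 private:
  std::vector<Channel> channels_;
  Buffer shared_scratch_;  // Shared: only one channel is serialized at a time.
};
```

In this sketch a sender with N channels holds N + 1 buffers in total. A
second Send on a channel is refused until OnRpcDone has been called for it,
which models the rule that the per-channel buffer can only be reused after
the RPC has completed.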

Memory estimates in the planner are not changed because the existing
calculation has several issues (see IMPALA-12594).

Change-Id: I64854a350a9dae8bf3af11c871882ea4750e60b3
Reviewed-on: http://gerrit.cloudera.org:8080/20719
Tested-by: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Reviewed-by: Kurt Deschler <kdesc...@cloudera.com>
Reviewed-by: Zihao Ye <eyiz...@163.com>
Reviewed-by: Michael Smith <michael.sm...@cloudera.com>


> KrpcDataStreamSender could share some buffers between channels
> --------------------------------------------------------------
>
>                 Key: IMPALA-12433
>                 URL: https://issues.apache.org/jira/browse/IMPALA-12433
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>            Reporter: Csaba Ringhofer
>            Priority: Major
>              Labels: memory-saving, performance
>
> Currently each channel has two outbound row batches, and each of those has two 
> buffers: one for serialization and another for compression.
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/row-batch.h#L100
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/be/src/runtime/krpc-data-stream-sender.cc#L236
> https://github.com/apache/impala/blob/0f55e551bc98843c79a9ec82582ddca237aa4fe9/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java#L81
> As serialization and compression are always done from the fragment instance 
> thread, only one compression is done at a time, so a single compression buffer 
> could be shared between channels. If this buffer is sent via KRPC, then it 
> could be swapped with the per-channel buffer. 
> As far as I understand, at least one buffer per channel is needed because 
> async KRPC calls can use it from another thread (this is done to avoid an 
> extra copy of the buffer before RPCs). We can only reuse that buffer after 
> getting a callback from KRPC.


