Omid Shahidi created IMPALA-11510:
-------------------------------------
Summary: Provide memory estimation for Exchange Sender's plan node
Key: IMPALA-11510
URL: https://issues.apache.org/jira/browse/IMPALA-11510
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Reporter: Omid Shahidi
Currently, no memory estimate is provided for DataStreamSink, which is
responsible for consuming the KRPC OutboundRowBatches that contain the tuple
data for RowBatches. The hypothesis is that providing such an estimate can
help reduce memory usage without affecting performance. A rough estimate for
the exchange sender is:
num_channel * 2 * (tuple_buffer_length + compressed_buffer_length)
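
As an illustration only, the rough estimate above could be computed as in the
following minimal Java sketch. The class name ExchangeSenderMemEstimate, the
method estimateBytes, and its parameter names are hypothetical and are not part
of the Impala code base; the factor of 2 is taken directly from the formula,
and all lengths are assumed to be in bytes.

```java
/**
 * Hypothetical helper (not part of Impala) that evaluates the rough
 * exchange sender estimate from this issue:
 *   num_channel * 2 * (tuple_buffer_length + compressed_buffer_length)
 */
public final class ExchangeSenderMemEstimate {
  private ExchangeSenderMemEstimate() {}

  /**
   * @param numChannels number of outbound channels (num_channel)
   * @param tupleBufferBytes assumed tuple buffer length, in bytes
   * @param compressedBufferBytes assumed compressed buffer length, in bytes
   * @return the estimated memory in bytes
   * @throws IllegalArgumentException if any input is negative
   * @throws ArithmeticException if the result overflows a long
   */
  public static long estimateBytes(
      int numChannels, long tupleBufferBytes, long compressedBufferBytes) {
    if (numChannels < 0 || tupleBufferBytes < 0 || compressedBufferBytes < 0) {
      throw new IllegalArgumentException("inputs must be non-negative");
    }
    // Combined size of the two buffers held by one outbound row batch.
    long perBatchBytes = Math.addExact(tupleBufferBytes, compressedBufferBytes);
    // Factor of 2 per channel, as given in the formula.
    long batchCount = Math.multiplyExact((long) numChannels, 2L);
    return Math.multiplyExact(batchCount, perBatchBytes);
  }

  public static void main(String[] args) {
    // Example inputs are made up: 10 channels, 1 MiB tuple buffer,
    // 512 KiB compression scratch buffer.
    long estimate = estimateBytes(10, 1024L * 1024L, 512L * 1024L);
    System.out.println(estimate); // prints 31457280
  }
}
```

The buffer lengths passed in are placeholders; realistic values would have to
come from measurement, which this sketch does not attempt.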
With IMPALA-6684, two new runtime profile counters were added: TupleDataBytes
and CompressionScratchBytes. They track the tuple buffer length and the
compressed buffer length within an OutboundRowBatch. These counters can be
used as a research source and for experimentation when estimating
tuple_buffer_length and compressed_buffer_length.
https://github.com/apache/impala/blob/26438d8e3e2cecfdab82643fcee7553df50198ca/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java#L60-L63