Omid Shahidi created IMPALA-11510:
-------------------------------------

             Summary: Provide memory estimation for Exchange Sender's plan node
                 Key: IMPALA-11510
                 URL: https://issues.apache.org/jira/browse/IMPALA-11510
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
            Reporter: Omid Shahidi


Currently there is no memory estimate for DataStreamSink, which is 
responsible for consuming the KRPC OutboundRowBatches that contain the tuple 
data for RowBatches. The hypothesis is that providing such an estimate can 
help reduce memory usage without affecting performance. A rough estimate for 
the exchange sender is:

  num_channel * 2 * (tuple_buffer_length + compressed_buffer_length)
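
As a minimal sketch only, the formula above could be computed as follows. The 
class name, method name, and parameter names are hypothetical and are not part 
of Impala's planner API; the inputs are assumed to be byte counts obtained 
elsewhere (for example from profile counters), and overflow-checked arithmetic 
is an added precaution rather than something the issue asks for.

```java
// Hypothetical helper illustrating the rough estimate; not Impala code.
public final class ExchangeSenderMemEstimate {
  // Constant factor taken directly from the formula in this issue.
  private static final long FORMULA_FACTOR = 2L;

  private ExchangeSenderMemEstimate() {}

  // Returns num_channel * 2 * (tuple_buffer_length + compressed_buffer_length).
  // All arguments are assumed to be non-negative; lengths are in bytes.
  static long estimateBytes(
      long numChannels, long tupleBufferBytes, long compressedBufferBytes) {
    if (numChannels < 0 || tupleBufferBytes < 0 || compressedBufferBytes < 0) {
      throw new IllegalArgumentException("inputs must be non-negative");
    }
    // The *Exact methods throw ArithmeticException on long overflow
    // instead of silently wrapping around.
    long perBatch = Math.addExact(tupleBufferBytes, compressedBufferBytes);
    long perChannel = Math.multiplyExact(FORMULA_FACTOR, perBatch);
    return Math.multiplyExact(numChannels, perChannel);
  }

  public static void main(String[] args) {
    // Example values are made up for illustration: 3 channels,
    // 1024-byte tuple buffer, 512-byte compression scratch buffer.
    System.out.println(estimateBytes(3L, 1024L, 512L));
  }
}
```

With these illustrative inputs the estimate is 3 * 2 * (1024 + 512) = 9216 
bytes.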

With IMPALA-6684, two new runtime profile counters were added: TupleDataBytes 
and CompressionScratchBytes, which track the tuple buffer length and the 
compressed buffer length within an OutboundRowBatch. These counters can be 
used as a data source for research and experimentation when estimating 
tuple_buffer_length and compressed_buffer_length.

https://github.com/apache/impala/blob/26438d8e3e2cecfdab82643fcee7553df50198ca/fe/src/main/java/org/apache/impala/planner/DataStreamSink.java#L60-L63



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
