Riza Suminto created IMPALA-11972:
-------------------------------------

             Summary: Factor in row width during ProcessingCost calculation.
                 Key: IMPALA-11972
                 URL: https://issues.apache.org/jira/browse/IMPALA-11972
             Project: IMPALA
          Issue Type: Improvement
          Components: Frontend
    Affects Versions: Impala 4.3.0
            Reporter: Riza Suminto
            Assignee: Riza Suminto


IMPALA-11604 added the ProcessingCost (PC) concept, which measures the cost for 
a distinct PlanNode / DataSink / PlanFragment to process its input rows 
globally across all of its instances.

We should investigate whether row width should be considered when computing PC 
for more operators, and whether doing so makes the PC model more accurate. The 
code in IMPALA-11604 has a materialization cost parameter to accommodate PC 
where row width should factor in. Currently, the PC of ScanNode, ExchangeNode, 
and DataStreamSink has row width factored in through this materialization 
parameter.
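
As a rough illustration of how a materialization parameter can carry row width 
into a cost estimate, the sketch below models the per-row cost as an expression 
cost plus a width-dependent materialization cost, multiplied by the input 
cardinality. The function name processing_cost, the per_byte_weight constant, 
and the formula itself are assumptions made for illustration; they are not 
taken from the Impala frontend code.

```python
import math


def processing_cost(cardinality, expr_cost_per_row, row_width_bytes,
                    per_byte_weight=1.0):
    """Hypothetical cost model, not Impala's actual implementation.

    cost = rows * (expression cost per row + materialization cost per row),
    where the materialization cost grows linearly with row width in bytes.
    """
    # Assumption: materialization cost is proportional to the row width.
    materialization_cost = row_width_bytes * per_byte_weight
    # Treat an unknown (negative) cardinality as zero rows.
    rows = max(cardinality, 0)
    return math.ceil(rows * (expr_cost_per_row + materialization_cost))


# Same cardinality and expression cost, but a wider row yields a higher cost.
narrow = processing_cost(1000, 2.0, row_width_bytes=16)
wide = processing_cost(1000, 2.0, row_width_bytes=64)
```

Under this model, an operator that ignores row width corresponds to 
per_byte_weight = 0, which is one way to compare the two variants when 
evaluating whether width improves accuracy.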

For VARCHAR, we can use some kind of average width stats, if available. For 
fixed-width columns, we just use the width. In both cases, the unit should be 
bytes. The idea of including width in costing is to make the resulting 
estimate more precise and less error-prone.
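
The width rule described above could be sketched as follows. This is a minimal 
illustration only: the Column class, the FIXED_WIDTH_BYTES table, and the 
fallback to the declared VARCHAR length when no average width stat exists are 
all assumptions, not part of the issue or of Impala's code.

```python
from dataclasses import dataclass
from typing import Optional

# Byte widths for a few fixed-width types (illustrative subset).
FIXED_WIDTH_BYTES = {
    "TINYINT": 1, "SMALLINT": 2, "INT": 4, "BIGINT": 8,
    "FLOAT": 4, "DOUBLE": 8,
}


@dataclass
class Column:
    """Hypothetical column descriptor used only for this sketch."""
    name: str
    type_name: str
    avg_width_bytes: Optional[float] = None  # average width stat, if available
    max_width_bytes: Optional[int] = None    # declared VARCHAR(n) length


def column_width_bytes(col: Column) -> float:
    """Return the estimated width of one column, in bytes."""
    if col.type_name in FIXED_WIDTH_BYTES:
        # Fixed-width columns: just use the type's width.
        return float(FIXED_WIDTH_BYTES[col.type_name])
    if col.avg_width_bytes is not None:
        # Variable-width columns: prefer the average width stat.
        return col.avg_width_bytes
    if col.max_width_bytes is not None:
        # Assumed fallback when stats are missing: the declared maximum.
        return float(col.max_width_bytes)
    raise ValueError(f"no width information for column {col.name}")


def row_width_bytes(columns) -> float:
    """Sum the per-column estimates to get a row width in bytes."""
    return sum(column_width_bytes(c) for c in columns)


cols = [
    Column("id", "BIGINT"),
    Column("name", "VARCHAR", avg_width_bytes=12.5, max_width_bytes=64),
]
width = row_width_bytes(cols)  # 8 bytes + 12.5 bytes
```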

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

