[ 
https://issues.apache.org/jira/browse/IMPALA-2138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong reassigned IMPALA-2138:
-------------------------------------

    Assignee:     (was: Alexander Behm)

> Get rid of unused columns by upstream operators at points of materialization
> ----------------------------------------------------------------------------
>
>                 Key: IMPALA-2138
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2138
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 1.4, Impala 2.0, Impala 2.2
>            Reporter: Ippokratis Pandis
>            Priority: Critical
>              Labels: performance
>
> It would be a significant performance improvement if we could drop columns 
> as soon as we know that no operator upstream is going to use them. That 
> reduces the amount of data we handle, which makes network and I/O (spilling) 
> transfers more efficient. It will also improve cache performance.
> The current row-wise in-memory format does not make it easy to get rid of 
> such unused columns. However, there are points of materialization where we 
> copy out the tuples, and at those points we can actually perform these 
> projections. There are multiple points of materialization, notably:
> * The exchange operator
> * The build side of hash join
> * The probe side of hash join when we have spilling
> * The aggregation
> * Sorts and analytic function evaluation
> In order to do these projections we need to modify the FE so that it knows, 
> at each operator, the minimum set of columns referenced by that operator and 
> all the upstream ones. (That minimum set is easy to calculate during an 
> additional top-down traversal of the plan.) We also need to modify the BE to 
> make the copy-out operation aware of such projections.
> Assigning first to Alex, because of the needed FE changes. Happy to take care 
> of the needed BE changes. Perhaps we could split this issue into 2 sub-tasks, 
> the FE and the BE changes.
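
The top-down traversal and the projecting copy-out described above could look roughly like the sketch below. It is an illustration only, not Impala code: `PlanNode`, `assign_required_columns` and `copy_out` are hypothetical names, column names are assumed to be unique across the plan, and "upstream" is read as "the operators that consume a node's output", which is an interpretation of the description rather than something it states.

```python
from dataclasses import dataclass, field


@dataclass
class PlanNode:
    """Hypothetical plan operator; not an Impala class."""
    name: str
    reads: frozenset = frozenset()      # columns this operator itself references
    produces: frozenset = frozenset()   # columns introduced here (e.g. by a scan)
    children: list = field(default_factory=list)
    required: frozenset = frozenset()   # filled in by assign_required_columns


def available_columns(node):
    """All columns that can appear in the output of this subtree."""
    cols = set(node.produces)
    for child in node.children:
        cols |= available_columns(child)
    return frozenset(cols)


def assign_required_columns(node, needed_above):
    """Top-down pass: record the columns each node must still output."""
    # A node only has to output what its consumers need and it can supply.
    node.required = frozenset(needed_above) & available_columns(node)
    # Its children must also supply whatever this node references itself.
    needed_below = frozenset(needed_above) | node.reads
    for child in node.children:
        assign_required_columns(child, needed_below)


def copy_out(row, required):
    """Projecting copy at a materialization point: keep required columns only."""
    return {col: val for col, val in row.items() if col in required}


# Example plan: SELECT b, y FROM t1 JOIN t2 ON a = x
t1 = PlanNode("scan t1", produces=frozenset({"a", "b", "c", "d"}))
t2 = PlanNode("scan t2", produces=frozenset({"x", "y", "z"}))
join = PlanNode("hash join", reads=frozenset({"a", "x"}), children=[t1, t2])
assign_required_columns(join, needed_above={"b", "y"})

# Build-side copy-out for t2 keeps the join key x and the output column y.
build_row = copy_out({"x": 1, "y": 2, "z": 3}, t2.required)
```

In this example the join itself only has to pass on `b` and `y`, while each scan keeps its join key plus the one column needed further up; `c`, `d` and `z` are dropped at the first copy-out. Derived columns (for example aggregate results) are not modelled here.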



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)