[ 
https://issues.apache.org/jira/browse/IMPALA-2737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong resolved IMPALA-2737.
-----------------------------------
    Resolution: Later

Closing until we have more concrete plans.

> Investigate partition-oriented agg and join processing
> ------------------------------------------------------
>
>                 Key: IMPALA-2737
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2737
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.3.0
>            Reporter: Tim Armstrong
>            Priority: Minor
>              Labels: performance
>         Attachments: partition-oriented-pagg-preview.diff
>
>
> Currently the partitioned aggregations and joins add each row to its
> partition as they process the input. This leads to poor memory access
> patterns, since consecutive rows can land in any of the 16 partitions. An
> alternative approach is to do an initial pass that hashes the rows and
> divides them between partitions, then a second pass per partition that
> inserts all of that partition's rows. This avoids the random access to
> partitions.
> It can also enable some additional optimisations, e.g. prefetching hash
> table buckets for the next row.
> An initial prototype was posted here: http://gerrit.cloudera.org/#/c/628 . 
> The diff is attached.
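
For readers of the archive, the two-pass scheme described above can be sketched roughly as follows. This is an illustrative sketch only, not the attached prototype or Impala's actual code: the Row struct, PartitionTable, PartitionRows and AggregatePartitioned are hypothetical names, std::unordered_map stands in for the real hash table, and a simple sum stands in for the aggregate functions. The prefetching optimisation is not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_map>
#include <vector>

// Hypothetical row type: a grouping key plus a value to sum.
struct Row {
  int64_t key;
  int64_t value;
};

// One hash table per partition, mapping key -> running sum (an assumption;
// a real engine would use its own hash table and aggregate state).
using PartitionTable = std::unordered_map<int64_t, int64_t>;

// Pass 1: hash every row and record its index under the partition it maps to.
// Only the per-partition index lists are appended to; no hash table is touched.
std::vector<std::vector<size_t>> PartitionRows(const std::vector<Row>& rows,
                                               size_t num_partitions) {
  assert(num_partitions > 0);
  std::vector<std::vector<size_t>> buckets(num_partitions);
  std::hash<int64_t> hasher;
  for (size_t i = 0; i < rows.size(); ++i) {
    buckets[hasher(rows[i].key) % num_partitions].push_back(i);
  }
  return buckets;
}

// Pass 2: handle one partition at a time, inserting all of its rows into
// that partition's table before moving on to the next partition.
std::vector<PartitionTable> AggregatePartitioned(const std::vector<Row>& rows,
                                                 size_t num_partitions) {
  std::vector<std::vector<size_t>> buckets = PartitionRows(rows, num_partitions);
  std::vector<PartitionTable> tables(num_partitions);
  for (size_t p = 0; p < num_partitions; ++p) {
    PartitionTable& table = tables[p];
    for (size_t idx : buckets[p]) {
      table[rows[idx].key] += rows[idx].value;
    }
  }
  return tables;
}
```

Calling AggregatePartitioned(rows, 16) mirrors the 16 partitions mentioned in the description. Because the inner loop only ever touches one partition's table, the next row to insert is known in advance from the index list, which is what would make bucket prefetching possible with a suitable hash table.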



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
