[
https://issues.apache.org/jira/browse/IMPALA-3825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Lars Volker updated IMPALA-3825:
--------------------------------
Labels: runtime-filters scalability (was: runtime-filters)
> Distribute runtime filter aggregation across cluster
> ----------------------------------------------------
>
> Key: IMPALA-3825
> URL: https://issues.apache.org/jira/browse/IMPALA-3825
> Project: IMPALA
> Issue Type: Improvement
> Components: Distributed Exec
> Affects Versions: Impala 2.6.0
> Reporter: Henry Robinson
> Assignee: Abhishek Rawat
> Priority: Major
> Labels: runtime-filters, scalability
>
> Runtime filters can be tens of MB or more, and incasting all filters from all
> shuffle joins to the coordinator can put a lot of memory pressure on that
> node. To alleviate this we should consider spreading out the aggregation
> operation across the cluster, so that a different node aggregates each
> runtime filter.
> This still restricts aggregation to as many nodes as there are runtime
> filters, which will usually be fewer than the number of nodes in the
> cluster. If we want to smooth that out further we could use tree-based
> aggregation, but let's measure the benefits of simply distributing the
> aggregation work first.
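
A minimal sketch of the idea described above, not Impala's actual implementation: each runtime filter is assigned to one aggregating node, and every sender routes its partial filter to that node, where the partials are merged. The names assign_aggregators and aggregate, the node names, the round-robin assignment, and the use of small integers as stand-ins for Bloom filter bitmaps (merged with bitwise OR) are all assumptions made for illustration.

```python
from collections import defaultdict


def assign_aggregators(filter_ids, nodes):
    """Map each filter ID to one aggregating node (hypothetical round-robin policy)."""
    return {fid: nodes[i % len(nodes)] for i, fid in enumerate(sorted(filter_ids))}


def aggregate(updates, assignment):
    """Merge partial filters on the node that owns each filter.

    updates: iterable of (sender, filter_id, bits) tuples, where bits is an
    int standing in for a Bloom filter bitmap (an assumption of this sketch).
    Returns {aggregating_node: {filter_id: merged_bits}}.
    """
    merged = defaultdict(dict)
    for _sender, fid, bits in updates:
        node = assignment[fid]
        # Partial filters are combined with a bitwise OR in this sketch.
        merged[node][fid] = merged[node].get(fid, 0) | bits
    return dict(merged)


nodes = ["node-a", "node-b", "node-c", "node-d"]
assignment = assign_aggregators([0, 1], nodes)
updates = [
    ("node-a", 0, 0b0011),
    ("node-b", 0, 0b0100),
    ("node-c", 1, 0b1000),
    ("node-d", 1, 0b0001),
]
result = aggregate(updates, assignment)
for node, filters in sorted(result.items()):
    print(node, {fid: bin(bits) for fid, bits in filters.items()})
```

With two filters and four nodes, only two nodes end up holding aggregated state, which illustrates the limitation noted in the description: the work is spread over at most as many nodes as there are filters.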