[
https://issues.apache.org/jira/browse/IMPALA-9951?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17173486#comment-17173486
]
Tim Armstrong commented on IMPALA-9951:
---------------------------------------
Linking a couple of related JIRAs that can help with specific cases of analytics
that solved Q67.
> Skew in analytic sorts when partition key has low cardinality
> -------------------------------------------------------------
>
> Key: IMPALA-9951
> URL: https://issues.apache.org/jira/browse/IMPALA-9951
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Tim Armstrong
> Priority: Major
> Labels: multithreading, tpcds
>
> In queries like TPC-DS Q67, the cardinality of the PARTITION BY expression of
> the analytic may be much lower than the parallelism of the input fragment. In
> this case the runtime of the sort can be skewed. We could mitigate the
> problem by doing the expensive sort *before* the exchange, so that the
> analytic fragment only needs to merge together its sorted input and evaluate
> the analytic over it.
> The impact of this is greater with multithreading, so I am considering only
> changing the default when mt_dop > 0.
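
The idea in the description can be illustrated with a small sketch. This is not Impala's planner or executor code; it is a toy Python model under stated assumptions: rows are (partition_key, value) tuples, "producers" stand in for instances of the input fragment, "receivers" stand in for instances of the analytic fragment, and all function names are hypothetical. It contrasts sorting after the exchange with sorting before it and merging on the receiving side.

```python
import heapq
from collections import defaultdict
from itertools import groupby


def sort_after_exchange(producers, num_receivers):
    """Baseline: hash-partition unsorted rows, then each receiver sorts everything it got."""
    receivers = defaultdict(list)
    for rows in producers:
        for row in rows:
            receivers[hash(row[0]) % num_receivers].append(row)
    # With few distinct keys, only a few receivers do all of the sorting work.
    return {r: sorted(rows) for r, rows in receivers.items()}


def sort_before_exchange(producers, num_receivers):
    """Alternative: every producer sorts locally; receivers only merge sorted runs."""
    runs = defaultdict(list)
    for rows in producers:
        local = defaultdict(list)
        for row in sorted(rows):  # expensive sort, spread across all producers
            local[hash(row[0]) % num_receivers].append(row)
        for r, run in local.items():
            runs[r].append(run)  # each run keeps its sorted order
    # heapq.merge combines already-sorted runs without re-sorting them.
    return {r: list(heapq.merge(*rs)) for r, rs in runs.items()}


def row_number(sorted_rows):
    """Toy analytic: row_number() OVER (PARTITION BY key ORDER BY value)."""
    out = []
    for key, group in groupby(sorted_rows, key=lambda row: row[0]):
        out.extend((key, value, n) for n, (_, value) in enumerate(group, start=1))
    return out


# Three producers, eight receivers, but only two distinct partition keys.
producers = [
    [(1, 5), (0, 3), (1, 2)],
    [(0, 9), (1, 1)],
    [(0, 4), (1, 7), (0, 1)],
]
baseline = sort_after_exchange(producers, num_receivers=8)
merged = sort_before_exchange(producers, num_receivers=8)
```

Both strategies hand the analytic the same sorted input, and in both only two of the eight receivers get any rows. The difference is where the sort cost lands: in the second variant those two receivers perform a merge rather than a full sort.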