[
https://issues.apache.org/jira/browse/IMPALA-8081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-8081:
----------------------------------
Summary: Avoid over-parallelizing queries when there are small input splits
(was: Automatically choose mt_dop)
> Avoid over-parallelizing queries when there are small input splits
> ------------------------------------------------------------------
>
> Key: IMPALA-8081
> URL: https://issues.apache.org/jira/browse/IMPALA-8081
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: Janaki Lahorani
> Priority: Major
> Labels: multithreading
>
> Currently we maximise parallelism based on the number of input splits available.
> This is usually a good decision, except when there are very many small input
> splits, particularly from small files. We could avoid this pathological
> behaviour by enforcing a minimum threshold of input bytes per instance (this is
> still pretty crude, since file input bytes only correlate loosely with the
> amount of work required).
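
As a rough illustration of the proposed threshold, the sketch below caps the number of fragment instances so that each one receives at least a minimum number of input bytes. It is not Impala's implementation: the function name, the parameters, and the 64 MiB default are all hypothetical and chosen only to show the arithmetic.

```python
from typing import Sequence

MIB = 1024 * 1024


def choose_num_instances(
    split_sizes: Sequence[int],
    max_instances: int,
    min_bytes_per_instance: int = 64 * MIB,  # hypothetical default, not an Impala setting
) -> int:
    """Pick an instance count that avoids over-parallelizing small inputs.

    split_sizes: size in bytes of each input split (hypothetical input).
    max_instances: upper bound on parallelism, e.g. a configured degree of parallelism.
    """
    if max_instances < 1 or min_bytes_per_instance < 1:
        raise ValueError("max_instances and min_bytes_per_instance must be >= 1")

    total_bytes = sum(split_sizes)
    # Largest instance count for which every instance still gets at least
    # min_bytes_per_instance on average.
    by_bytes = total_bytes // min_bytes_per_instance
    # Never exceed the number of splits or the configured maximum,
    # and always run at least one instance.
    return max(1, min(len(split_sizes), max_instances, by_bytes))


if __name__ == "__main__":
    # Many tiny files: 1000 splits of 1000 bytes total less than the threshold.
    print(choose_num_instances([1000] * 1000, max_instances=16))
    # A few large splits: parallelism is limited by the split count instead.
    print(choose_num_instances([256 * MIB] * 4, max_instances=16))
```

With the split count alone, the first case would run at the maximum of 16 instances; with the byte threshold it collapses to a single instance. As the description notes, input bytes are only a loose proxy for the work required, so a threshold like this is a heuristic rather than a cost model.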