[
https://issues.apache.org/jira/browse/IMPALA-13645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17933986#comment-17933986
]
Riza Suminto commented on IMPALA-13645:
---------------------------------------
Filtering partitions in the coordinator should be feasible.
We can leave planning and scheduling as they are. Once the coordinator finishes
aggregating the partition filter, and has not heard that any scanner has moved
ahead, it can send a "reschedule" signal to all executors with the new
ScanRange list.
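
A rough sketch of the reschedule step described above, in Python for brevity.
This is not Impala code: ScanRange, reschedule(), and the executor names are
hypothetical, and the greedy balancing ignores data locality, which the real
scheduler would have to consider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanRange:
    # Hypothetical stand-in for a scan range; not the Impala type.
    path: str
    partition_id: int
    length: int  # bytes


def reschedule(assignment, surviving_partitions, started_executors):
    """Build a new executor -> [ScanRange] map after the partition filter.

    Returns None if any scanner has already moved ahead, in which case
    the original schedule is kept.
    """
    if started_executors:
        return None

    # Drop every range whose partition was eliminated by the filter.
    survivors = [
        r
        for ranges in assignment.values()
        for r in ranges
        if r.partition_id in surviving_partitions
    ]

    # Greedy balancing (an assumption): largest range first, always given
    # to the executor with the least bytes assigned so far.
    survivors.sort(key=lambda r: (-r.length, r.path))
    new_assignment = {e: [] for e in sorted(assignment)}
    load = {e: 0 for e in new_assignment}
    for r in survivors:
        target = min(load, key=lambda e: (load[e], e))
        new_assignment[target].append(r)
        load[target] += r.length
    return new_assignment


initial = {
    "exec-a": [ScanRange("f1", 1, 100), ScanRange("f2", 1, 100)],
    "exec-b": [ScanRange("f3", 2, 100), ScanRange("f4", 2, 100)],
}
# The filter keeps only partition 1, which was scheduled entirely on exec-a.
balanced = reschedule(initial, {1}, started_executors=set())
```

Without the reschedule, exec-a would scan both surviving ranges and exec-b
none. With it, each executor gets one.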
> Account for impact of runtime filters when scheduling scan ranges
> -----------------------------------------------------------------
>
> Key: IMPALA-13645
> URL: https://issues.apache.org/jira/browse/IMPALA-13645
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Reporter: David Rorke
> Priority: Major
>
> Runtime filters can introduce significant skew in the number of scan ranges
> scanned by a given executor or scan fragment. Initial scan range scheduling
> attempts to balance ranges across hosts, and with multithreading (e.g.
> mt_dop>0) there's additional balancing done locally within a given executor,
> but neither of these balancing steps accounts for the impact of runtime
> filters.
> One option for remote partition filters might be to wait for the filters to
> arrive at the coordinator, apply the filters prior to scheduling, and then
> balance the surviving scan ranges during scheduling. Another more limited
> approach would be to only use the filters for balancing across fragments
> within a given executor (wait for the filter to arrive and then prune the
> ranges assigned to that executor and hand them out to fragments in a
> deterministic way).
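
The more limited per-executor approach quoted above could look roughly like
the following Python sketch. hand_out() and the (path, partition_id) tuples are
hypothetical, and round-robin over a sorted list is just one possible
"deterministic way" to split the ranges.

```python
def hand_out(ranges, surviving_partitions, num_instances):
    """Prune one executor's ranges, then split them across fragment instances.

    ranges: iterable of (path, partition_id) tuples (hypothetical shape).
    Sorting first makes the result independent of input order, so every
    fragment instance can compute the same split and pick its own bucket.
    """
    survivors = sorted(r for r in ranges if r[1] in surviving_partitions)
    buckets = [[] for _ in range(num_instances)]
    for i, r in enumerate(survivors):
        buckets[i % num_instances].append(r)
    return buckets


local_ranges = [("f3", 2), ("f1", 1), ("f4", 1), ("f2", 2), ("f5", 1)]
per_instance = hand_out(local_ranges, {1}, num_instances=2)
```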