[
https://issues.apache.org/jira/browse/SPARK-35036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17399843#comment-17399843
]
Min Shen commented on SPARK-35036:
----------------------------------
Not sure if this is fixable, given the reasons you already described.
The partial set of map indexes is used in AQE only to handle skewed partitions.
Since this applies only to skewed partitions, in practice it would affect
very few shuffle partitions.
Alternatively, we could handle skewed partitions with push-based shuffle
differently from how AQE handles them: instead of subdividing a shuffle
partition using contiguous map index sub-ranges, we could subdivide a skewed
merged shuffle partition along the boundaries of its MB-sized chunks.
This should be relatively easy to achieve and would still handle skewed
partitions.
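
To make the chunk-boundary idea concrete, here is a rough sketch in Python. It
is not Spark code: the function name, its parameters, and the greedy grouping
policy are all hypothetical, and it assumes the sizes of the chunks in a merged
shuffle partition are known up front.

```python
from typing import List, Tuple


def split_by_chunk_boundaries(chunk_sizes: List[int],
                              target_size: int) -> List[Tuple[int, int]]:
    """Greedily group consecutive chunks into sub-partitions.

    Hypothetical helper, not a Spark API. Returns half-open
    (start_chunk, end_chunk) index ranges. A split is only ever cut on a
    chunk boundary, so a single chunk larger than target_size still ends
    up in a split of its own.
    """
    if target_size <= 0:
        raise ValueError("target_size must be positive")

    splits: List[Tuple[int, int]] = []
    start = 0          # index of the first chunk in the current split
    accumulated = 0    # bytes accumulated in the current split

    for i, size in enumerate(chunk_sizes):
        # Close the current split if adding this chunk would exceed the target.
        if accumulated > 0 and accumulated + size > target_size:
            splits.append((start, i))
            start = i
            accumulated = 0
        accumulated += size

    # Emit the trailing split, if any chunks remain.
    if start < len(chunk_sizes):
        splits.append((start, len(chunk_sizes)))
    return splits
```

Each returned range could then be read by a separate reduce task, without
needing to know which map indexes contributed to which chunk.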
Furthermore, to clarify: push-based shuffle already works with AQE for
shuffle partition coalescing.
> Improve push based shuffle to work with AQE by fetching partial map indexes
> for a reduce partition
> --------------------------------------------------------------------------------------------------
>
> Key: SPARK-35036
> URL: https://issues.apache.org/jira/browse/SPARK-35036
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 3.1.1
> Reporter: Venkata krishnan Sowrirajan
> Priority: Major
>
> Currently, when both push-based shuffle and AQE are enabled and a partial
> set of map indexes is requested from MapOutputTracker, the request is
> delegated to the regular shuffle, reading map blocks, instead of push-based
> shuffle. This is because blocks from mappers in push-based shuffle are merged
> out of order, which makes it hard to get only the blocks of the reduce
> partition that match the requested start and end map indexes.