Mridul Muralidharan created SPARK-36892:
-------------------------------------------
Summary: Disable batch fetch for a shuffle when push based shuffle
is enabled
Key: SPARK-36892
URL: https://issues.apache.org/jira/browse/SPARK-36892
Project: Spark
Issue Type: Bug
Components: Shuffle
Affects Versions: 3.2.0
Reporter: Mridul Muralidharan
When push-based shuffle is enabled, reducers can efficiently fetch merged
mapper shuffle output.
Unfortunately, this currently interacts badly with
spark.sql.adaptive.fetchShuffleBlocksInBatch: the shuffle fetch can hang,
and/or duplicate data can be fetched, which causes correctness issues.
Since batch fetch does not benefit Spark stages that read merged blocks when
push-based shuffle is enabled, ShuffleBlockFetcherIterator.doBatchFetch can be
disabled whenever push-based shuffle is enabled.
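
To make the proposed rule concrete, here is a minimal sketch of the decision in plain Python. It is not Spark's implementation: the helper names (effective_batch_fetch, _as_bool) are hypothetical, the configuration is modelled as a plain dict, and the defaults used (push-based shuffle off, batch fetch on) are assumptions to check against the Spark version in use. It also treats spark.shuffle.push.enabled as the only condition for push-based shuffle, which is a simplification.

```python
# Sketch only: models the proposed rule, not Spark's internal code path.
PUSH_ENABLED_KEY = "spark.shuffle.push.enabled"
BATCH_FETCH_KEY = "spark.sql.adaptive.fetchShuffleBlocksInBatch"


def _as_bool(value, default):
    """Parse a Spark-style boolean string; fall back to default if unset."""
    if value is None:
        return default
    return str(value).strip().lower() == "true"


def effective_batch_fetch(conf):
    """Hypothetical helper: should the reader fetch shuffle blocks in batch?

    conf is a dict of configuration keys to string values.
    Assumed defaults: push-based shuffle disabled, batch fetch enabled.
    """
    push_enabled = _as_bool(conf.get(PUSH_ENABLED_KEY), False)
    batch_requested = _as_bool(conf.get(BATCH_FETCH_KEY), True)
    # Proposed rule: never batch fetch while push-based shuffle is enabled.
    return batch_requested and not push_enabled


if __name__ == "__main__":
    conf = {PUSH_ENABLED_KEY: "true", BATCH_FETCH_KEY: "true"}
    print(effective_batch_fetch(conf))
```

Until a fix is available, a plausible workaround under the same assumptions is to set spark.sql.adaptive.fetchShuffleBlocksInBatch to false explicitly for any job that enables push-based shuffle.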
Thanks to [~Ngone51] for surfacing this issue.
+CC [~Gengliang.Wang]