akpatnam25 opened a new pull request, #36601: URL: https://github.com/apache/spark/pull/36601
… blocks are corrupted and spark.shuffle.detectCorrupt is set to true

### What changes were proposed in this pull request?
Adds corruption exception handling for merged shuffle chunks when spark.shuffle.detectCorrupt is set to true (the default).

### Why are the changes needed?
Prior to Spark 3.0, spark.shuffle.detectCorrupt (true by default) was the knob for early corruption detection, so the fallback could be triggered as expected. Since Spark 3.0, spark.shuffle.detectCorrupt is still true by default, but early corruption detection is controlled by a new configuration, spark.shuffle.detectCorrupt.useExtraMemory, which defaults to false. As a result, with only Magnet enabled after Spark 3.2.0 (internal li-3.1.1), the default behavior disables early corruption detection: no fallback is triggered, and the read fails with an exception once the corrupted blocks start to be read. This PR handles a corrupted stream for merged blocks by throwing a FetchFailedException in that case, which triggers a retry based on the values of spark.shuffle.detectCorrupt.useExtraMemory and spark.shuffle.detectCorrupt.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
- Tested on internal cluster
- Added UT

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
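For context, the interplay of the two configurations discussed above can be summarized in spark-defaults.conf form. This is an illustrative sketch only; the property names and defaults are taken from the Spark configuration documentation, and the comments paraphrase the behavior described in this PR:

```
# Default since Spark 3.0: corruption in fetched shuffle blocks is detected,
# but only when the block is actually read, which surfaces as an exception
# rather than an early fallback.
spark.shuffle.detectCorrupt                  true

# Default false since Spark 3.0: set to true to re-enable early corruption
# detection (at the cost of extra memory), allowing the fallback/retry path
# to be triggered before the corrupted block is consumed.
spark.shuffle.detectCorrupt.useExtraMemory   false
```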
