yabola commented on PR #38560:
URL: https://github.com/apache/spark/pull/38560#issuecomment-1321786141

   > One thing I know needs to be addressed: some merge data info is not saved 
on the driver because the shuffle data is too small (controlled by 
`spark.shuffle.push.minShuffleSizeToWait`); please see 
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L2295
   
   @mridulm Sorry, in my previous implementation I needed to pass the reduceId 
to the external shuffle service, but I found a problem: the driver cannot 
record the complete set of merged reduceIds (see my comment for the reason).
   I have since changed my implementation, so this may no longer be a problem: 
we can save the merged reduceIds in the shuffle service. Please see the 
[code](https://github.com/apache/spark/pull/38560/files#diff-d544fbb952b61283b3d18ca10a5027528efc4f2f65047130da015ae7754c117fR791).
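
   To illustrate the problem described above, here is a minimal sketch in Python. It is a toy model, not Spark code: the class `DriverMergeView` and its `finalize` method are hypothetical names. It assumes that when the total shuffle size is below `spark.shuffle.push.minShuffleSizeToWait`, the driver finalizes immediately and does not register the merge results reported by the shuffle service.

```python
from dataclasses import dataclass, field


@dataclass
class DriverMergeView:
    """Hypothetical model of what the driver records for one shuffle."""

    # Bytes; stands in for spark.shuffle.push.minShuffleSizeToWait.
    min_shuffle_size_to_wait: int
    # Merged reduceIds the driver has registered so far.
    merged_reduce_ids: set = field(default_factory=set)

    def finalize(self, total_shuffle_bytes, reduce_ids_merged_on_service):
        """Return True if the driver registered the merge results."""
        # Assumption: below the threshold the driver does not wait for
        # or record the merge results, so its view stays incomplete.
        if total_shuffle_bytes < self.min_shuffle_size_to_wait:
            return False
        self.merged_reduce_ids |= set(reduce_ids_merged_on_service)
        return True


# A small shuffle: the service merged reduceIds 0 and 1,
# but the driver never learns about them.
view = DriverMergeView(min_shuffle_size_to_wait=500)
registered = view.finalize(total_shuffle_bytes=100,
                           reduce_ids_merged_on_service={0, 1})
```

   In this model the shuffle service still holds the merged data for reduceIds 0 and 1 while the driver's set stays empty, which is why keeping the merged reduceIds on the shuffle service side avoids depending on the driver's view.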


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


