gaoyajun02 commented on PR #46934: URL: https://github.com/apache/spark/pull/46934#issuecomment-2176222670
> @gaoyajun02 , trying to understand the scenario better here - did you observe disk issues which resulted in this inconsistency ?
>
> If yes, should this be checksum'ed - to ensure correctness. I would prefer that to adding additional rpc calls to the driver - which will now be incurred for all calls - given this should be a rare enough scenario.
>
> Thoughts ?
>
> Also, +CC @zhouyejoe as well.

The first two paragraphs are my description of these scenarios.

The added getMergeStatusMapTracker call does not incur an additional RPC to the driver at runtime: after the first metadata request, the mergeStatus is already cached in the mergeStatuses of the executor-side MapOutputTrackerWorker.

Since some system-level errors cannot be fully covered, I think it is necessary to verify the merge metadata and fall back on the reduce side.

@mridulm
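
To illustrate the caching argument above, here is a minimal sketch in Python. It is a toy model, not Spark's code: the real MapOutputTrackerWorker is written in Scala, and the names `MergeStatusCache`, `get_merge_statuses`, `fake_driver_lookup`, and `rpc_calls` are hypothetical. It only shows the general pattern in which the first lookup for a shuffle goes to the driver and later lookups are served from an executor-side map.

```python
from typing import Callable, Dict, List


class MergeStatusCache:
    """Toy model of an executor-side cache of merge statuses, keyed by shuffle ID.

    Hypothetical names throughout; this is not Spark's API.
    """

    def __init__(self, fetch_from_driver: Callable[[int], List[str]]) -> None:
        self._fetch_from_driver = fetch_from_driver
        self._merge_statuses: Dict[int, List[str]] = {}
        self.rpc_calls = 0  # counts simulated round trips to the driver

    def get_merge_statuses(self, shuffle_id: int) -> List[str]:
        cached = self._merge_statuses.get(shuffle_id)
        if cached is not None:
            # Cache hit: no driver round trip is needed.
            return cached
        # Cache miss: one simulated RPC, then remember the result.
        self.rpc_calls += 1
        statuses = self._fetch_from_driver(shuffle_id)
        self._merge_statuses[shuffle_id] = statuses
        return statuses


def fake_driver_lookup(shuffle_id: int) -> List[str]:
    # Stand-in for the driver; returns made-up placeholder statuses.
    return [f"merge-status-{shuffle_id}-{i}" for i in range(2)]


cache = MergeStatusCache(fake_driver_lookup)
first = cache.get_merge_statuses(7)   # goes to the simulated driver
second = cache.get_merge_statuses(7)  # served from the local map
print(cache.rpc_calls)  # prints 1
```

Under this model, a second call for the same shuffle costs only a local map lookup, which is the behavior the comment relies on.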
