Re: [PR] [SPARK-48580][CORE] Add consistency check and fallback for mapIds in push-merged block meta [spark]

via GitHub Tue, 18 Jun 2024 06:56:04 -0700


gaoyajun02 commented on PR #46934:
URL: https://github.com/apache/spark/pull/46934#issuecomment-2176170520


   The example scenario above was identified through metric collection and 
troubleshooting in our production environment. However, there are still many 
mapId inconsistencies at the application layer that we cannot explain, and 
this PR cannot guarantee that the shuffle data ends up consistent. The affected 
service nodes (a very small percentage of the cluster nodes, 0.1%) have file 
system errors at the system level in common, and data loss cases occur on them 
daily with small probability. Given these types of issues, my current approach 
is to consider rolling back the data of the entire reduce partition, not just 
the inconsistent mapIds.
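
   To make the difference between the two fallback scopes concrete, here is a 
minimal sketch. It is not the code in this PR: the function name, its 
parameters, and the idea of comparing an "expected" set of mapIds against the 
set recorded in the merged block meta are all assumptions made for 
illustration.

```python
from typing import Iterable, Set


def map_ids_to_refetch(expected_map_ids: Iterable[int],
                       meta_map_ids: Iterable[int],
                       all_map_ids: Iterable[int]) -> Set[int]:
    """Return the mapIds whose original (unmerged) blocks should be fetched.

    All three inputs are hypothetical:
      expected_map_ids: mapIds the reader believes were merged.
      meta_map_ids:     mapIds recorded in the push-merged block meta.
      all_map_ids:      every mapId that contributes to this reduce partition.
    """
    expected, meta = set(expected_map_ids), set(meta_map_ids)
    if expected == meta:
        # Consistent: only the maps that were never merged need fetching.
        return set(all_map_ids) - meta
    # Any mismatch: distrust the merged data and fall back for the whole
    # reduce partition, not only for the mapIds that disagree.
    return set(all_map_ids)
```

   With a consistent meta, `map_ids_to_refetch([0, 1], [0, 1], [0, 1, 2])` 
selects only mapId 2. With a mismatch such as 
`map_ids_to_refetch([0, 1], [0, 2], [0, 1, 2])`, it selects every mapId of the 
partition, trading extra fetches for not trusting a partially damaged merged 
file.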


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


