jerrypeng commented on PR #56008: URL: https://github.com/apache/spark/pull/56008#issuecomment-4551616215
> Looks OK to me in general. There are several test coverage questions from AI assisted review: > > Scenario Covered? > getAvailableShuffleWriterTaskLocations through RPC (worker) No -- only tested in-process on master > Duplicate registerShuffle with same shuffleId No > Re-registration of same mapId (task retry) No > getAvailableShuffleWriterTaskLocations for non-existent shuffle through RPC No > Concurrent registerShuffleWriterTask + unregisterShuffle race No > But I don't have a full implementation so it's hard to say this is a real gap or it will be a part of following PRs. > > Also, one potential caveat was also called out from registerShuffleWriterTask - race condition between registerShuffleWriterTask and unregisterShuffle, though the outcome is mostly a memory leakage of entry in taskLocations. Added the proposed tests and addressed the potential race condition. Also added another test to verify that. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
