abstractdog commented on code in PR #150:
URL: https://github.com/apache/tez/pull/150#discussion_r1118054397


##########
tez-runtime-library/src/main/java/org/apache/tez/runtime/library/common/shuffle/orderedgrouped/ShuffleScheduler.java:
##########
@@ -842,7 +846,7 @@ private void penalizeHost(MapHost host, int failures) {
     penalties.add(new Penalty(host, penaltyDelay));
   }
 
-  private int getFailureCount(InputAttemptIdentifier srcAttempt) {
+  private synchronized int getFailureCount(InputAttemptIdentifier srcAttempt) {

Review Comment:
   This method is used in only one call path: isShuffleHealthy -> 
isAbortLimitExceeedFor -> getFailureCount, none of which is synchronized at 
the moment. Could you help me understand how making getFailureCount 
synchronized relates to the actual patch?
   
   UPDATE: I think I got it; please correct me if I'm misunderstanding 
something. copyFailed has become non-synchronized, so isShuffleHealthy (which 
is called from copyFailed) would need to become synchronized. That is not 
possible because of the actual problem (the whole of isShuffleHealthy cannot 
be synchronized), so you're synchronizing parts of it instead.
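
   To make sure we mean the same thing, here is a minimal sketch of the 
pattern as I understand it. All names here (FailureTracker, recordFailure, 
isHealthy, String attempt ids) are hypothetical stand-ins and not the actual 
ShuffleScheduler code; the point is only that the caller stays unsynchronized 
while the small helpers that touch shared state take the lock:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the scheduler; illustrative only, not the Tez implementation.
class FailureTracker {
  private final Map<String, Integer> failureCounts = new HashMap<>();
  private final int abortLimit;

  FailureTracker(int abortLimit) {
    this.abortLimit = abortLimit;
  }

  // Short critical section: mutates shared state while holding this object's monitor.
  private synchronized void recordFailure(String srcAttempt) {
    failureCounts.merge(srcAttempt, 1, Integer::sum);
  }

  // Short critical section: reads shared state under the same monitor.
  private synchronized int getFailureCount(String srcAttempt) {
    return failureCounts.getOrDefault(srcAttempt, 0);
  }

  // Deliberately NOT synchronized as a whole: only the read of the shared
  // map inside getFailureCount happens under the lock.
  boolean isHealthy(String srcAttempt) {
    return getFailureCount(srcAttempt) < abortLimit;
  }

  // Also not synchronized: it composes the two synchronized helpers, so the
  // lock is released between recording the failure and checking health.
  boolean copyFailed(String srcAttempt) {
    recordFailure(srcAttempt);
    return isHealthy(srcAttempt);
  }
}
```

   Note that in this sketch the record-then-check sequence is not atomic: 
another thread may record a failure between the two calls. Whether that is 
acceptable for isShuffleHealthy is part of what I'd like to confirm.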


