shameersss1 commented on code in PR #7828:
URL: https://github.com/apache/hadoop/pull/7828#discussion_r2235400661


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/nodelabels/NodeAttributesManagerImpl.java:
##########
@@ -741,7 +741,15 @@ public void refreshNodeAttributesToScheduler(NodeId 
nodeId) {
     if (host == null || host.attributes == null) {
       return;
     }
-    newNodeToAttributesMap.put(hostName, host.attributes.keySet());
+
+    // Use read lock and create defensive copy since
+    // other threads might access host.attributes
+    readLock.lock();
+    try {
+      newNodeToAttributesMap.put(hostName, new 
HashSet<>(host.attributes.keySet()));

Review Comment:
   The issue is that.
   
   1. The LOG statement which prints `newNodeToAttributesMap` tries to iterate 
`host.attribute`
   2. `host.attribute` gets modified by some other thread - leading to 
concurrent modification exception.
   
   There are two ways to solve this
   1. As you said to readLock before LOG statement so that `host.attribute` 
does not get modified during LOG statement
   2. Create a defensive copy of `host.attribute` (under read lock because the 
modification can happen at that time as well).
   
   The rationale behind using option 2 to avoid logging inconsistency- Assume 
that we readLock before LOG statement. Once the LOG statement is executed, some 
other thread modifies the `host.attribute` this will lead to we logging 
something and processing something else.
   
   Creating a defensive copy make sure that we don't change value. i.e what is 
LOGed gets processed as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org

Reply via email to