steveloughran commented on code in PR #4909:
URL: https://github.com/apache/hadoop/pull/4909#discussion_r975119488


##########
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/WeakReferenceMap.java:
##########
@@ -132,35 +145,92 @@ public WeakReference<V> lookup(K key) {
    * @return an instance.
    */
   public V get(K key) {
-    final WeakReference<V> current = lookup(key);
-    V val = resolve(current);
-    if (val != null) {
+    final WeakReference<V> currentWeakRef = lookup(key);
+    // resolve it, after which if not null, we have a strong reference
+    V strongVal = resolve(currentWeakRef);
+    if (strongVal != null) {
       // all good.
-      return  val;
+      return  strongVal;
     }
 
-    // here, either no ref, or the value is null
-    if (current != null) {
+    // here, either currentWeakRef was null, or its reference was GC'd.
+    if (currentWeakRef != null) {
+      // garbage collection removed the reference.
+
+      // explicitly remove the weak ref from the map if it has not
+      // been updated by this point
+      // this is here just for completeness.
+      map.remove(key, currentWeakRef);
+
+      // log/report the loss.
       noteLost(key);
     }
+
+    // create a new value and add it to the map
     return create(key);
   }
 
   /**
    * Create a new instance under a key.
+   * <p>
    * The instance is created, added to the map and then the
    * map value retrieved.
    * This ensures that the reference returned is that in the map,
    * even if there is more than one entry being created at the same time.
+   * If that race does occur, it will be logged the first time it happens
+   * for this specific map instance.
+   * <p>
+   * HADOOP-18456 highlighted the risk of a concurrent GC resulting a null
+   * value being retrieved and so returned.
+   * To prevent this:
+   * <ol>
+   *   <li>A strong reference is retained to the newly created instance
+   *       in a local variable.</li>
+   *   <li>That variable is used after the resolution process, to ensure
+   *       the JVM doesn't consider it "unreachable" and so eligible for 
GC.</li>
+   *   <li>A check is made for the resolved reference being null, and if so,
+   *       the put() is repeated</li>
+   * </ol>
    * @param key key
-   * @return the value
+   * @return the created value
    */
   public V create(K key) {
     entriesCreatedCount.incrementAndGet();
-    WeakReference<V> newRef = new WeakReference<>(
-        requireNonNull(factory.apply(key)));
-    map.put(key, newRef);
-    return map.get(key).get();
+    /*
+     Get a strong ref so even if a GC happens in this method the reference is 
not lost.
+     It is NOT enough to have a reference in a field, it MUST be used
+     so as to ensure the reference isn't optimized away prematurely.
+     "A reachable object is any object that can be accessed in any potential 
continuing
+      computation from any live thread."
+    */
+
+    final V strongRef = requireNonNull(factory.apply(key));
+    V resolvedStrongRef;
+    do {
+      WeakReference<V> newWeakRef = new WeakReference<>(strongRef);
+
+      // put it in the map
+      map.put(key, newWeakRef);
+
+      // get it back from the map
+      WeakReference<V> retrievedWeakRef = map.get(key);
+      // resolve that reference, handling the situation where somehow it was 
removed from the map
+      // between the put() and the get()
+      resolvedStrongRef = resolve(retrievedWeakRef);
+      if (resolvedStrongRef == null) {
+        referenceLostDuringCreation.warn("reference to %s lost during 
creation", key);
+        noteLost(key);
+      }
+    } while (resolvedStrongRef == null);
+
+    // note if there was any change in the reference.
+    // as this forces strongRef to be kept in scope
+    if (strongRef != resolvedStrongRef) {
+      LOG.debug("Created instance for key {}: {} overwritten by {}",

Review Comment:
   no, it's just addressing a race condition...the reason we do the lookup is 
to ensure that all threads share the same instance. the exact choice of which 
one is not considered relevant



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to