Github user mridulm commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2019#discussion_r16631699
  
    --- Diff: core/src/main/scala/org/apache/spark/network/Connection.scala ---
    @@ -118,14 +118,33 @@ abstract class Connection(val channel: SocketChannel, 
val selector: Selector,
       }
     
       def close() {
    -    closed = true
    -    val k = key()
    -    if (k != null) {
    -      k.cancel()
    +    synchronized {
    +      /**
    +       * We should avoid executing closing sequence
    +       * double by a same thread.
    +       * Otherwise we can fail to call connectionsById.get() in
    +       * ConnectionManager#removeConnection() at the 2nd time
    +       */
    +      if (!closed) {
    +        disposeSasl()
    +
    +        /**
    +          * callOnCloseCallback() should be invoked
    +          * before k.cancel() and channel.close()
    +          * to avoid key() returns null.
    +          * If key() returns null before callOnCloseCallback(),
    +          * We cannot remove entry from connectionsByKey in 
ConnectionManager
    +          * and end up being threw CancelledKeyException.
    +          */
    +        callOnCloseCallback()
    +        val k = key()
    +        if (k != null) {
    +          k.cancel()
    +        }
    +        channel.close()
    +        closed = true
    +      }
    --- End diff --
    
    The way to handle this is to make closed an AtomicBoolean and do a 
getAndSet.
    If the result of getAndSet is false, which means closed was false on 
invocation, only then do the actual logic of close from earlier : it is a bug 
that all invocations of close was trying to do the same thing.
    
    Essentially :
    a) Change 
    ```var closed = false```
    to
    ```var closed = new AtomicBoolean(false)```
    
    b) Change close() to
    ```
    def close() {
      val prev = closed.getAndSet(true)
      if (! prev) {
        closeImpl()
      }
    }
    ```
    
    Where closeImpl is a private method containing the logic from earlier close 
(except for the closed variable update).
    
    
    This will ensure that failures in closeImpl will still result in connection 
being marked as close; and repeated invocations will not cause same code to be 
executed and other failures to surface (like missing id from map, etc).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to