Github user sarutak commented on a diff in the pull request:
https://github.com/apache/spark/pull/2019#discussion_r16631373
--- Diff: core/src/main/scala/org/apache/spark/network/Connection.scala ---
@@ -118,14 +118,33 @@ abstract class Connection(val channel: SocketChannel,
val selector: Selector,
}
def close() {
- closed = true
- val k = key()
- if (k != null) {
- k.cancel()
+ synchronized {
+ /**
+ * We should avoid executing closing sequence
+ * double by a same thread.
+ * Otherwise we can fail to call connectionsById.get() in
+ * ConnectionManager#removeConnection() at the 2nd time
+ */
+ if (!closed) {
+ disposeSasl()
+
+ /**
+ * callOnCloseCallback() should be invoked
+ * before k.cancel() and channel.close()
+ * to avoid key() returns null.
+ * If key() returns null before callOnCloseCallback(),
+ * We cannot remove entry from connectionsByKey in
ConnectionManager
+ * and end up being threw CancelledKeyException.
+ */
+ callOnCloseCallback()
+ val k = key()
+ if (k != null) {
+ k.cancel()
+ }
+ channel.close()
+ closed = true
+ }
--- End diff --
SendingConnection#close is called from 3 threads on the same instance.
For example, 1st thread of handle-read-write-executor calls
ReceivingConnection#close -> SendingConnection#close, 2nd thread of
handle-read-write-executor calles SendingConnection#close and 3rd thread of
connection-manager-thread calls ConnectionManager#run ->
SendingConnection#close.
I think, if it threw exception from any methods in close(), connection is
not marked as closed because one of those thread is expected to close resources
even if another thread fail to close.
And synchronized block is for protect being called SendingConnection#close
from 3 threads.
It can be one of following situation.
(1) One thread of handle-read-write-execuor evaluates key.cancel in
SendingConnection#close
(2) Then, connection-manager-thread calls removeConnection via
callOnCloseCallback and evaluates "connectionsyKey -= connection.key". This
should be fail because connection.key is null at this time.
After (2) above, connection-manager-thread expects connectionsByKey.size !=
0 in ConnectionManager#stop but that size cannot be 0 and we get log message
"All connections not cleaned up".
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]