Github user revans2 commented on the issue: https://github.com/apache/storm/pull/2669

The original code added this so that when shutting down the context we could be sure that all ongoing connections were also terminated as cleanly as possible. After this change that is only true for the server side, not the client side. What are the ramifications of that? I think it was just defensive programming, so it is probably fine to make this change from that perspective, but I would like to make sure that my understanding of the problem matches yours.

From the comments in STORM-3055, the error appears to be triggered when a supervisor is shut down, wiped clean (so it gets a new supervisor id), and then brought back up. At that point nimbus schedules the worker on the same node/slot as before, but with a new supervisor id. This confuses the connection caching: when updating its connections, the worker computes a list of connections to shut down and a separate list of connections to create. The new connections are created first, but in this case no new one is actually created because we already have an open connection to that node/slot. Then the old, unneeded connections are torn down, but in this case that connection is still needed.

Looking at the javadocs for IContext, it looks like the caching really does violate the spec, but it is a bit of a gray area. https://github.com/apache/storm/blob/53f38bc31f2fd315a520ba6b86c0a60be08381cc/storm-client/src/jvm/org/apache/storm/messaging/IContext.java#L48-L57

I am fine with removing the caching like this JIRA does, but I do want to see the code cleaned up, because without the caching there is a lot of extra code that can be removed.
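To make the failure mode concrete, here is a rough sketch of the caching hazard described above. All names (`connect`, `terminate`, the `host:port` cache key) are illustrative assumptions for this sketch, not Storm's actual messaging API; the point is only that the cache key does not include the supervisor id, so the "create" step reuses a connection that the subsequent "tear down" step then closes.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the connection-caching ordering hazard.
// This is NOT Storm's real API; names and structure are assumptions.
public class ConnectionCacheSketch {
    // Cache keyed by the remote worker's "host:port" -- note the key
    // does NOT include the supervisor id.
    static final Map<String, String> cache = new HashMap<>();

    static String connect(String hostPort) {
        // With caching, an existing connection to the same host:port is reused.
        return cache.computeIfAbsent(hostPort, hp -> "conn-to-" + hp);
    }

    static void terminate(String hostPort) {
        cache.remove(hostPort);
    }

    public static void main(String[] args) {
        // Worker on node n1, slot 6700, under the old supervisor id.
        connect("n1:6700");

        // Supervisor wiped and restarted: same node/slot, new supervisor id.
        // Step 1: "create" connections for the new assignment -- the cache
        // hands back the already-open connection instead of a fresh one.
        connect("n1:6700");

        // Step 2: tear down connections for the old assignment -- which is
        // the very connection step 1 just handed out.
        terminate("n1:6700");

        if (cache.containsKey("n1:6700")) {
            throw new IllegalStateException("expected the connection to be gone");
        }
        System.out.println("bug reproduced: the live connection was torn down");
    }
}
```

Removing the caching sidesteps this entirely: each "create" opens a genuinely new connection, so tearing down the old one is harmless.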
---