Will take a look at it. On Wed, Aug 28, 2019 at 3:30 PM Stig Rohde Døssing <[email protected]> wrote:
> Yes, you are right, that is better. > > Regarding tests, there's an AsyncLocalizerTest that stubs the blob store, > and makes some assertions about download behavior. Maybe that would be a > good fit? > > Den ons. 28. aug. 2019 kl. 14.30 skrev Diogo Monteiro < > [email protected]>: > >> Hi Stig, >> Thanks for the reply. >> >> Makes sense. But isn't it probably better if we just explicitly create an >> empty resources directory (*Files.createDirectories(extractionDest)* is >> enough) in an else clause, instead of calling *extractDirFromJar*? >> >> I say this because if we don't have a resources jar or a classpath url, >> we aren't really extracting anything from a jar. >> >> Another thing, what's your opinion on how I should test this? >> *LocallyCachedTopologyBlob* do not have unit tests. >> >> Diogo. >> >> On Tue, Aug 27, 2019 at 5:32 PM Stig Rohde Døssing < >> [email protected]> wrote: >> >>> Hi Diogo, >>> >>> Thanks for your thorough explanation. I think you are right, and this is >>> a bug. We'd be happy to see a PR to fix this. >>> >>> I think a decent way to handle this could be adding an extra else clause >>> to >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L146, >>> and simply create an empty resources directory in the blob extraction >>> directory, by calling extractDirFromJar(resourcesJar, ServerConfigUtils. >>> RESOURCES_SUBDIR, extractionDest);. This is just me spitballing, so >>> please feel free to fix it some other way if you have a better idea. >>> >>> Den tir. 27. aug. 2019 kl. 14.50 skrev Diogo Monteiro < >>> [email protected]>: >>> >>>> Hi all, >>>> >>>> My name is Diogo and I am a dev for Paddy Power Betfair in Porto, >>>> Portugal. We're running Storm 1.x.x in production for a couple of years and >>>> the time has come for us to upgrade to 2.0.0. We use *LocalCluster* to >>>> run topologies in our local machines to perform manually tests. >>>> >>>> So, going to the point: I was trying to launch a topology that I'm >>>> developing (in 2.0.0) and noticed that the worker was getting restarted >>>> each ~30 seconds. >>>> I placed a breakpoint in the *kill* method of *LocalContainer* ( >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/LocalContainer.java#L66) >>>> to try and understand why the worker was getting restarted. >>>> >>>> The call stack was: >>>> >>>> kill:66, LocalContainer (org.apache.storm.daemon.supervisor) >>>> killContainerFor:269, Slot (org.apache.storm.daemon.supervisor) >>>> handleRunning:724, Slot (org.apache.storm.daemon.supervisor) >>>> stateMachineStep:218, Slot (org.apache.storm.daemon.supervisor) >>>> run:931, Slot (org.apache.storm.daemon.supervisor) >>>> >>>> >>>> With this I can understand that the worker is killed because a blob has >>>> changed ( >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java#L724). >>>> In fact, there's a changing blob in the *dynamicState* at that point. >>>> >>>> I checked the *AsyncLocalizer *which downloads, caches blobs locally, >>>> and notifies the Slot state machine of a changing blob. >>>> >>>> I noticed this: >>>> >>>> - >>>> >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L339 >>>> - >>>> >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L265 >>>> - >>>> >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L142 >>>> - >>>> >>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L192 >>>> >>>> >>>> Which tell me that (correct me if I'm wrong): >>>> >>>> - Supervisor tries to update blobs each 30 seconds. >>>> - The topology jar blob requires extraction of the resources >>>> directory (either from a jar or directly in a classpath URL). It does >>>> so in >>>> *fetchUnzipToTemp *and it's existence is checked in >>>> *isFullyDownloaded*. >>>> - The Slot is notified of a changing blob if: >>>> - the remote version is different from the local version (the >>>> code has changed). >>>> - OR the blob is not fully downloaded (the jar exists, and the >>>> extracted resources directory exists). >>>> >>>> Well, I did not have a resources folder under the root of the >>>> classpath, and that's why the worker was being restarted each ~30 seconds, >>>> as the Slot was being notified of a changing blob everytime >>>> *updateBlobs* ran. >>>> I created a resources folder (with dummy files) under the root of the >>>> classpath and the problem is now solved. >>>> >>>> However, if I understand correctly, the resources folder is only >>>> required for *multilang*. Our topologies do not use *multilang *and >>>> this do not happen in Storm 1.1.3 for instance. >>>> >>>> Am I seeing or doing something wrong and this is an expected behaviour? >>>> I am happy to contribute if this is in fact something worth to open an >>>> issue and fix. >>>> >>>> Hope this is the right place for these questions, and thanks in advance >>>> for taking your time to look at this. >>>> >>>> Regards, >>>> Diogo >>>> >>>
