Yes, you are right, that is better. Regarding tests, there's an AsyncLocalizerTest that stubs the blob store, and makes some assertions about download behavior. Maybe that would be a good fit?
Den ons. 28. aug. 2019 kl. 14.30 skrev Diogo Monteiro < [email protected]>: > Hi Stig, > Thanks for the reply. > > Makes sense. But isn't it probably better if we just explicitly create an > empty resources directory (*Files.createDirectories(extractionDest)* is > enough) in an else clause, instead of calling *extractDirFromJar*? > > I say this because if we don't have a resources jar or a classpath url, we > aren't really extracting anything from a jar. > > Another thing, what's your opinion on how I should test this? > *LocallyCachedTopologyBlob* do not have unit tests. > > Diogo. > > On Tue, Aug 27, 2019 at 5:32 PM Stig Rohde Døssing <[email protected]> > wrote: > >> Hi Diogo, >> >> Thanks for your thorough explanation. I think you are right, and this is >> a bug. We'd be happy to see a PR to fix this. >> >> I think a decent way to handle this could be adding an extra else clause >> to >> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L146, >> and simply create an empty resources directory in the blob extraction >> directory, by calling extractDirFromJar(resourcesJar, ServerConfigUtils. >> RESOURCES_SUBDIR, extractionDest);. This is just me spitballing, so >> please feel free to fix it some other way if you have a better idea. >> >> Den tir. 27. aug. 2019 kl. 14.50 skrev Diogo Monteiro < >> [email protected]>: >> >>> Hi all, >>> >>> My name is Diogo and I am a dev for Paddy Power Betfair in Porto, >>> Portugal. We're running Storm 1.x.x in production for a couple of years and >>> the time has come for us to upgrade to 2.0.0. We use *LocalCluster* to >>> run topologies in our local machines to perform manually tests. >>> >>> So, going to the point: I was trying to launch a topology that I'm >>> developing (in 2.0.0) and noticed that the worker was getting restarted >>> each ~30 seconds. >>> I placed a breakpoint in the *kill* method of *LocalContainer* ( >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/LocalContainer.java#L66) >>> to try and understand why the worker was getting restarted. >>> >>> The call stack was: >>> >>> kill:66, LocalContainer (org.apache.storm.daemon.supervisor) >>> killContainerFor:269, Slot (org.apache.storm.daemon.supervisor) >>> handleRunning:724, Slot (org.apache.storm.daemon.supervisor) >>> stateMachineStep:218, Slot (org.apache.storm.daemon.supervisor) >>> run:931, Slot (org.apache.storm.daemon.supervisor) >>> >>> >>> With this I can understand that the worker is killed because a blob has >>> changed ( >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java#L724). >>> In fact, there's a changing blob in the *dynamicState* at that point. >>> >>> I checked the *AsyncLocalizer *which downloads, caches blobs locally, >>> and notifies the Slot state machine of a changing blob. >>> >>> I noticed this: >>> >>> - >>> >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L339 >>> - >>> >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L265 >>> - >>> >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L142 >>> - >>> >>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L192 >>> >>> >>> Which tell me that (correct me if I'm wrong): >>> >>> - Supervisor tries to update blobs each 30 seconds. >>> - The topology jar blob requires extraction of the resources >>> directory (either from a jar or directly in a classpath URL). It does so >>> in >>> *fetchUnzipToTemp *and it's existence is checked in >>> *isFullyDownloaded*. >>> - The Slot is notified of a changing blob if: >>> - the remote version is different from the local version (the >>> code has changed). >>> - OR the blob is not fully downloaded (the jar exists, and the >>> extracted resources directory exists). >>> >>> Well, I did not have a resources folder under the root of the classpath, >>> and that's why the worker was being restarted each ~30 seconds, as the Slot >>> was being notified of a changing blob everytime *updateBlobs* ran. >>> I created a resources folder (with dummy files) under the root of the >>> classpath and the problem is now solved. >>> >>> However, if I understand correctly, the resources folder is only >>> required for *multilang*. Our topologies do not use *multilang *and >>> this do not happen in Storm 1.1.3 for instance. >>> >>> Am I seeing or doing something wrong and this is an expected behaviour? >>> I am happy to contribute if this is in fact something worth to open an >>> issue and fix. >>> >>> Hope this is the right place for these questions, and thanks in advance >>> for taking your time to look at this. >>> >>> Regards, >>> Diogo >>> >>
