Will take a look at it.

On Wed, Aug 28, 2019 at 3:30 PM Stig Rohde Døssing <[email protected]>
wrote:

> Yes, you are right, that is better.
>
> Regarding tests, there's an AsyncLocalizerTest that stubs the blob store,
> and makes some assertions about download behavior. Maybe that would be a
> good fit?
>
> Den ons. 28. aug. 2019 kl. 14.30 skrev Diogo Monteiro <
> [email protected]>:
>
>> Hi Stig,
>> Thanks for the reply.
>>
>> Makes sense. But isn't it probably better if we just explicitly create an
>> empty resources directory (*Files.createDirectories(extractionDest)* is
>> enough) in an else clause, instead of calling *extractDirFromJar*?
>>
>> I say this because if we don't have a resources jar or a classpath url,
>> we aren't really extracting anything from a jar.
>>
>> Another thing, what's your opinion on how I should test this?
>> *LocallyCachedTopologyBlob* do not have unit tests.
>>
>> Diogo.
>>
>> On Tue, Aug 27, 2019 at 5:32 PM Stig Rohde Døssing <
>> [email protected]> wrote:
>>
>>> Hi Diogo,
>>>
>>> Thanks for your thorough explanation. I think you are right, and this is
>>> a bug. We'd be happy to see a PR to fix this.
>>>
>>> I think a decent way to handle this could be adding an extra else clause
>>> to
>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L146,
>>> and simply create an empty resources directory in the blob extraction
>>> directory, by calling extractDirFromJar(resourcesJar, ServerConfigUtils.
>>> RESOURCES_SUBDIR, extractionDest);. This is just me spitballing, so
>>> please feel free to fix it some other way if you have a better idea.
>>>
>>> Den tir. 27. aug. 2019 kl. 14.50 skrev Diogo Monteiro <
>>> [email protected]>:
>>>
>>>> Hi all,
>>>>
>>>> My name is Diogo and I am a dev for Paddy Power Betfair in Porto,
>>>> Portugal. We're running Storm 1.x.x in production for a couple of years and
>>>> the time has come for us to upgrade to 2.0.0. We use *LocalCluster* to
>>>> run topologies in our local machines to perform manually tests.
>>>>
>>>> So, going to the point: I was trying to launch a topology that I'm
>>>> developing (in 2.0.0) and noticed that the worker was getting restarted
>>>> each ~30 seconds.
>>>> I placed a breakpoint in the *kill* method of *LocalContainer* (
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/LocalContainer.java#L66)
>>>> to try and understand why the worker was getting restarted.
>>>>
>>>> The call stack was:
>>>>
>>>> kill:66, LocalContainer (org.apache.storm.daemon.supervisor)
>>>> killContainerFor:269, Slot (org.apache.storm.daemon.supervisor)
>>>> handleRunning:724, Slot (org.apache.storm.daemon.supervisor)
>>>> stateMachineStep:218, Slot (org.apache.storm.daemon.supervisor)
>>>> run:931, Slot (org.apache.storm.daemon.supervisor)
>>>>
>>>>
>>>> With this I can understand that the worker is killed because a blob has
>>>> changed (
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/daemon/supervisor/Slot.java#L724).
>>>> In fact, there's a changing blob in the *dynamicState* at that point.
>>>>
>>>> I checked the *AsyncLocalizer *which downloads, caches blobs locally,
>>>> and notifies the Slot state machine of a changing blob.
>>>>
>>>> I noticed this:
>>>>
>>>>    -
>>>>    
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L339
>>>>    -
>>>>    
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/AsyncLocalizer.java#L265
>>>>    -
>>>>    
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L142
>>>>    -
>>>>    
>>>> https://github.com/apache/storm/blob/2ba95bbd1c911d4fc6363b1c4b9c4c6d86ac9aae/storm-server/src/main/java/org/apache/storm/localizer/LocallyCachedTopologyBlob.java#L192
>>>>
>>>>
>>>> Which tell me that (correct me if I'm wrong):
>>>>
>>>>    - Supervisor tries to update blobs each 30 seconds.
>>>>    - The topology jar blob requires extraction of the resources
>>>>    directory (either from a jar or directly in a classpath URL). It does 
>>>> so in
>>>>    *fetchUnzipToTemp *and it's existence is checked in
>>>>    *isFullyDownloaded*.
>>>>    - The Slot is notified of a changing blob if:
>>>>       - the remote version is different from the local version (the
>>>>       code has changed).
>>>>       - OR the blob is not fully downloaded (the jar exists, and the
>>>>       extracted resources directory exists).
>>>>
>>>> Well, I did not have a resources folder under the root of the
>>>> classpath, and that's why the worker was being restarted each ~30 seconds,
>>>> as the Slot was being notified of a changing blob everytime
>>>> *updateBlobs* ran.
>>>> I created a resources folder (with dummy files) under the root of the
>>>> classpath and the problem is now solved.
>>>>
>>>> However, if I understand correctly, the resources folder is only
>>>> required for *multilang*. Our topologies do not use *multilang *and
>>>> this do not happen in Storm 1.1.3 for instance.
>>>>
>>>> Am I seeing or doing something wrong and this is an expected behaviour?
>>>> I am happy to contribute if this is in fact something worth to open an
>>>> issue and fix.
>>>>
>>>> Hope this is the right place for these questions, and thanks in advance
>>>> for taking your time to look at this.
>>>>
>>>> Regards,
>>>> Diogo
>>>>
>>>

Reply via email to