Ideally, use multiple instances of a bolt/spout rather than multithreading
inside one. If you must use internal threads, you will also need to do your
own synchronization.
In general, distributed systems like this are designed to relieve users of
doing their own multithreading and synchronization.
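
For anyone who does have to emit from internal threads, here is a rough sketch of what "do your own synchronization" could look like. It does not use Storm's classes: `Emitter` is a hypothetical stand-in for the emit side of an output collector, and `LockedEmitter` and `LockedEmitterSketch` are made-up names for illustration. The only idea shown is that every emit goes through one lock, so the wrapped object never sees two callers at once.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the emit side of a collector; NOT Storm's API.
interface Emitter {
    void emit(List<Object> values);
}

// Funnels all emits through one lock so the wrapped emitter
// only ever sees one caller at a time.
final class LockedEmitter implements Emitter {
    private final Emitter delegate;
    private final Object lock = new Object();

    LockedEmitter(Emitter delegate) {
        this.delegate = delegate;
    }

    @Override
    public void emit(List<Object> values) {
        synchronized (lock) {
            delegate.emit(values);
        }
    }
}

final class LockedEmitterSketch {
    // Emits from several threads and returns how many tuples reached the sink.
    static int run(int threads, int perThread) {
        List<List<Object>> sink = new ArrayList<>(); // deliberately not thread-safe
        Emitter emitter = new LockedEmitter(sink::add);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final int id = t;
                pending.add(pool.submit(() -> {
                    for (int i = 0; i < perThread; i++) {
                        emitter.emit(Arrays.<Object>asList(id, i));
                    }
                }));
            }
            for (Future<?> f : pending) {
                f.get(); // wait for the task; also makes its writes visible here
            }
            return sink.size();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(run(4, 1000));
    }
}
```

In a real bolt the delegate would be whatever collector the bolt was handed, and every thread would have to go through the same wrapper. The lock makes emits sequential, which is part of why separate bolt instances are usually the better option.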

-roshan





On Thursday, April 9, 2020, 07:28:52 AM PDT, Ethan Li 
<ethanopensou...@gmail.com> wrote: 





In your case,

1. Every executor has an instance of ExecutorTransfer

https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/Executor.java#L146

2. Every ExecutorTransfer has its own serializer

https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/ExecutorTransfer.java#L44

3. Every executor has its own outputCollector

https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/executor/bolt/BoltExecutor.java#L146-L147

4. When outputCollector is called to emit to remote workers, it uses
ExecutorTransfer to transfer data

https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/executor/ExecutorTransfer.java#L66

5. That transfer in turn tries to serialize the data

https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/daemon/worker/WorkerTransfer.java#L116

6. But the serializer is not thread-safe

https://github.com/apache/storm/blob/00f48d60e75b28e11a887baba02dc77876b2bb3d/storm-client/src/jvm/org/apache/storm/serialization/KryoTupleSerializer.java#L33-L43
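
To illustrate why the chain above breaks when two threads emit through the same executor, here is a simplified sketch. It is not Storm's code: `ReusedBufferSerializer` is a hypothetical stand-in that, as an assumption for the example, reuses one output buffer across calls, and `InterleavingSketch` is a made-up name. It replays on a single thread one interleaving that two unsynchronized emitting threads could produce.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a serializer that reuses one output buffer
// across calls; NOT Storm's KryoTupleSerializer.
final class ReusedBufferSerializer {
    private final ByteArrayOutputStream out = new ByteArrayOutputStream();

    // Start a new tuple by discarding whatever is in the shared buffer.
    void begin() {
        out.reset();
    }

    // Append one field to the shared buffer.
    void write(String field) {
        byte[] b = field.getBytes(StandardCharsets.UTF_8);
        out.write(b, 0, b.length);
    }

    // Return a copy of the bytes written since the last begin().
    byte[] finish() {
        return out.toByteArray();
    }
}

final class InterleavingSketch {
    // Replays, on one thread, an interleaving two emitting threads could produce.
    static String interleaved() {
        ReusedBufferSerializer s = new ReusedBufferSerializer();
        s.begin();      // thread A starts tuple [a1, a2]
        s.write("a1");  // thread A
        s.begin();      // thread B starts tuple [b1], wiping A's partial output
        s.write("b1");  // thread B
        s.write("a2");  // thread A resumes
        return new String(s.finish(), StandardCharsets.UTF_8); // what thread A sends
    }

    public static void main(String[] args) {
        System.out.println(interleaved()); // prints "b1a2", not "a1a2"
    }
}
```

Thread A ends up sending bytes that mix both tuples, which matches the kind of corruption that only shows up when the receiving worker deserializes.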


Filed a JIRA https://issues.apache.org/jira/browse/STORM-3620

Thanks for reporting this issue.

Best,
Ethan

On Thu, Apr 9, 2020 at 9:03 AM Ethan Li <ethanopensou...@gmail.com> wrote:

> You are right. I think output collectors are not thread-safe in storm
> 2.x.
>
>
> On Tue, Apr 7, 2020 at 7:35 AM Simon Cooper <
> simon.coo...@featurespace.co.uk> wrote:
>
>> Hi Storm devs,
>>
>> We've narrowed down the issue - multiple threads in our bolts are
>> accessing output collectors at the same time. In storm1, this was fine
>> (presumably something in clojure synchronized it, or it was thread-safe),
>> but in storm2, this causes multiple threads to try and write to the same
>> buffer at the same time, causing data corruption and really weird behaviour
>> when the corrupted data was deserialized at the other end. We've fixed this
>> by putting a mutex on all the collectors.
>>
>> I know there's been some back-and-forth around this in the past, but now,
>> output collectors are definitely not thread-safe in storm2.
>>
>> Simon
>>
>> -----Original Message-----
>> From: Simon Cooper <simon.coo...@featurespace.co.uk>
>> Sent: 03 April 2020 12:16
>> To: dev@storm.apache.org
>> Subject: Crashes when running storm 2.1 topologies on multiple workers
>>
>> Hi Storm devs,
>>
>> We've encountered a serious problem when trying to run a storm 2.1
>> topology across multiple workers - it looks like the data is being
>> corrupted somewhere between being serialized on the sending worker and
>> being deserialized on the receiving worker. This means it's impossible for
>> us to run storm 2.1 on topologies with more than one worker!
>>
>> The relevant storm bug is STORM-3582. We've seen multiple exceptions in
>> this area, some of which point towards kryo being at fault here, maybe
>> involving custom serializers. However, I've so far been unable to reproduce
>> the issue in test cases.
>>
>> Is anyone able to give us some pointers to try and work out what may be
>> going wrong? This does seem like a very serious issue for the latest storm
>> release (unfortunately we're unable to try storm 2.0 to see when it was
>> introduced due to other issues with that release).
>>
>> Many thanks,
>> Simon Cooper
>>
>
