But how would that solve the syncing problem?
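A common in-topology alternative (a sketch, not something from this thread) is to skip the external store entirely: have Bolt-A tag every emitted fragment with the original message's id and the total fragment count, route the fragments with a fieldsGrouping on that id so they all land on the same aggregator task, and have that task emit once all fragments have arrived. The class and field names below (`FragmentAggregator`, `msgId`) are illustrative; the aggregation logic is what a Bolt-B-downstream bolt would hold per task:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative per-message aggregation state for an aggregator bolt.
// Assumes Bolt-A emits (msgId, totalFragments, fragment) and the topology
// uses fieldsGrouping on "msgId", so all fragments of one original JSON
// reach the same task -- no cross-worker sharing needed.
public class FragmentAggregator {
    private static class State {
        final int expected;              // fragments Bolt-A said to expect
        final List<String> parts = new ArrayList<>();
        State(int expected) { this.expected = expected; }
    }

    private final Map<String, State> pending = new HashMap<>();

    // Returns the combined result when the last fragment of a message
    // arrives, or null while fragments are still outstanding.
    public String accept(String msgId, int totalFragments, String fragment) {
        State s = pending.computeIfAbsent(msgId, k -> new State(totalFragments));
        s.parts.add(fragment);
        if (s.parts.size() == s.expected) {
            pending.remove(msgId);       // free state once complete
            return String.join("+", s.parts);
        }
        return null;
    }

    public static void main(String[] args) {
        FragmentAggregator agg = new FragmentAggregator();
        System.out.println(agg.accept("m1", 2, "a")); // null: 1 of 2 seen
        System.out.println(agg.accept("m1", 2, "b")); // a+b: complete
    }
}
```

In a real topology you would also want a timeout that evicts incomplete entries (or rely on Storm's tuple-timeout/ack mechanism) so a lost fragment cannot leak state forever.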


On Tue, Sep 20, 2016 at 8:12 PM, Alberto São Marcos <alberto....@gmail.com>
wrote:

> I would dump the *Bolt-A* results in a shared-data-store/queue and have a
> separate workflow with another spout and Bolt-B draining from there
>
> On Tue, Sep 20, 2016 at 9:20 AM, Harsh Choudhary <shry.ha...@gmail.com>
> wrote:
>
>> Hi
>>
>> I am thinking of doing the following.
>>
>> The spout subscribes to Kafka and receives JSONs. It emits each JSON as
>> an individual tuple.
>>
>> Bolt-A subscribes to the spout. Bolt-A creates multiple JSONs from each
>> single JSON and emits them as multiple streams.
>>
>> Bolt-B receives these streams and does the computation on them.
>>
>> I need a bolt to build a cumulative result from all the multiple JSONs
>> that emerged from a single original JSON. But a static variable in a
>> bolt is only shared between tasks within one worker. How do I achieve
>> this syncing?
>>
>>                               --->
>> Spout ---> Bolt-A   --->   Bolt-B  ---> Final result
>>                               --->
>>
>> The final result is per JSON which was read from Kafka.
>>
>> Or is there a better way to achieve this?
>>
>
>
