I am thinking of doing the following.

The spout subscribes to Kafka and receives JSON messages. It emits each
JSON as an individual tuple.

Bolt-A subscribes to the spout. From each incoming JSON it creates multiple
JSONs and emits them on multiple streams.

Bolt-B receives these streams and does the computation on them.

I need to build a cumulative result in a bolt from all the JSONs that were
derived from a single source JSON. But a static variable in a bolt is shared
only between the tasks within one worker, not across workers. How do I
achieve this syncing?

Spout ---> Bolt-A   --->   Bolt-B  ---> Final result

There should be one final result for each JSON read from Kafka.
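
To make the requirement concrete, here is a minimal sketch of the per-JSON
accumulation I have in mind, in plain Java with no Storm APIs. It assumes
Bolt-A tags every derived JSON with the ID of its source JSON and the number
of parts it was split into, and that all parts of one source JSON reach the
same task (in Storm, a fields grouping on that ID would do this). The class
name PerJsonAggregator, the method names, and the choice of a simple sum as
the "cumulative result" are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Storm: accumulates partial results
// per source JSON until all expected parts have arrived.
final class PerJsonAggregator {

    // Running state for one source JSON.
    private static final class State {
        int seen;   // number of parts received so far
        long sum;   // cumulative result so far (a sum, as an assumption)
    }

    // Keyed by the ID of the source JSON read from Kafka.
    private final Map<String, State> pending = new HashMap<>();

    // Adds one partial result. Returns the final result once
    // expectedParts parts have been seen for sourceId, else empty.
    Optional<Long> add(String sourceId, int expectedParts, long partial) {
        State s = pending.computeIfAbsent(sourceId, k -> new State());
        s.seen++;
        s.sum += partial;
        if (s.seen < expectedParts) {
            return Optional.empty();
        }
        pending.remove(sourceId); // free the state once complete
        return Optional.of(s.sum);
    }

    // Number of source JSONs still waiting for parts.
    int pendingCount() {
        return pending.size();
    }
}
```

Because the state lives in an ordinary instance field keyed by the source
ID, nothing needs to be static or shared between workers, as long as the
routing assumption above holds. The sketch does not handle replayed or lost
tuples; that would need a timeout or deduplication on top.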

Or is there a better way to achieve this?
