Topologies can be as big as you want them to be, but I have personally
preferred small ones because of the same problem that you are facing:
timeouts.
In your case, I'd recommend that once your Trident state has inserted into
the DB, you send an ack and don't do any further emits.
Let the lower part of your diagram be a separate topology that queries what
Trident wrote and processes it.
Also, each REST call section and the "each" bolt before it could be moved
into the spout.
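
To put rough numbers on why long chains run into timeouts, here is a
back-of-the-envelope sketch in plain Java (no Storm dependency). The 100
child tuples come from your description, and I am reading the 2-3 seconds as
the cost of one REST call. The 4 REST executors are purely my assumption, as
is the idea that the REST stage is the bottleneck. The 30 seconds is Storm's
default for topology.message.timeout.secs.

```java
// Rough estimate only: assumes the REST stage dominates and that child
// tuples are spread evenly over the executors of the REST bolt.
public class TimeoutEstimate {
    // Storm's default for topology.message.timeout.secs.
    static final int DEFAULT_MESSAGE_TIMEOUT_SECS = 30;

    // Worst-case seconds spent in the REST stage for one spout tuple.
    static int worstCaseSeconds(int childTuples, int secondsPerCall, int restExecutors) {
        // Ceiling division: how many sequential calls the busiest executor makes.
        int rounds = (childTuples + restExecutors - 1) / restExecutors;
        return rounds * secondsPerCall;
    }

    // True if the estimate stays below the given message timeout.
    static boolean fitsInTimeout(int worstCaseSeconds, int timeoutSeconds) {
        return worstCaseSeconds < timeoutSeconds;
    }

    public static void main(String[] args) {
        // 100 child tuples, 3 s per call, 4 executors (the 4 is hypothetical).
        int estimate = worstCaseSeconds(100, 3, 4);
        System.out.println("REST stage estimate (s): " + estimate);
        System.out.println("Fits default timeout: "
                + fitsInTimeout(estimate, DEFAULT_MESSAGE_TIMEOUT_SECS));
    }
}
```

Under those assumptions the REST stage alone is already well past the default
timeout, before any inserts or state queries, which is why I would cut the
chain after the first DB write.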

On Mon, Aug 22, 2016 at 12:26 PM, Pratyusha Rasamsetty <
[email protected]> wrote:

> In my use case, the chain is quite big and the processing time is
> unpredictable. Each tuple emitted by the spout can produce about 100
> tuples. Those child tuples then have to make REST calls, which can take
> about 2-3 seconds. After that come inserts, state maintenance, stateQuery,
> further processing, and at the end another save to the database.
>
> 1. The flow looks somewhat like this; I have left out many of the
> intermediate "each" blocks. How exactly does guaranteed message
> processing work in this case?
> 2. Setting proper timeouts at the spout level does not help, as the time
> taken to process a tuple is unpredictable and processing tuples multiple
> times is a costly operation.
> - How can I make sure that Storm does not re-emit based on time?
> 3. How big can a normal topology or Trident topology chain be?
>
> Thanks
> Pratyusha
>



-- 
Regards,
Navin
