We’ve implemented this internally.

Within the spout, we keep track of all the ‘in-flight’ tuples in a Set. Then, 
when we need to do a flush (determined by several internal and external 
conditions), we stop sending new tuples through and wait for all the pending 
tuples to be acked (or failed). Then we send a flush tuple through (it’s 
emitted on a separate ‘control’ stream connected to all the running bolts, 
independent of main tuple processing), wait for that to be acked, and then 
start sending normal tuples through again.

This is implemented using a reasonably simple state machine inside the spout to 
ensure nextTuple() always does the correct thing whatever’s going on in the 
topology. It also keeps track of whether the topology is ‘idle’, and doesn’t 
bother flushing when there’s nothing to flush.

Hope this helps,
SimonC

From: Satish Mittal [mailto:[email protected]]
Sent: 26 June 2015 16:17
To: [email protected]
Subject: Re: Flushing a topology?

Assuming each flush period is associated with a start and end event, one 
approach is to instantiate a HashSet in the spout at flush start event. Each 
emitted tuple is added to the set, and upon receiving ack, it is removed from 
the set. Upon receiving flush end event, wait for all pending messages to be 
acked.

Assuming that processing latency is small, the pending set size should be 
relatively bounded in steady state at any point of time.

When the HashSet size becomes 0, returns success. Of course, if flush periods 
are back-to-back, then start of the next flush period is same as the end of 
previous period.

Thanks,
Satish

On Wed, Jun 24, 2015 at 4:20 PM, Javier Gonzalez 
<[email protected]<mailto:[email protected]>> wrote:

Hi Satish,

Thank you for your response.

The ack at the spout guarantees that the flush message has traversed the 
topology, but makes no guarantees about the tuples emitted before the flush 
tuple. It would be possible for tuples to be in flight and be "beaten to the 
ack"  by the flush tuple, particularly if the flush tuple does not participate 
in all the processing done in the bolts to the regular tuples.

Regards,
Javier
On Jun 24, 2015 2:55 AM, "Satish Mittal" 
<[email protected]<mailto:[email protected]>> wrote:
From the way Storm's acking mechanism works, if a spout tuple gets acked, then 
it implies that tuple has completely traversed the topology. Isn't that 
sufficient?

On Tue, Jun 23, 2015 at 10:50 PM, Javier Gonzalez 
<[email protected]<mailto:[email protected]>> wrote:

Hi,

Question: how would you implement a "flush" in a topology: sending a special 
message to the topology that will in time return with a message that says 
everything up to the flush message has finished traversing the topology? (does 
that make sense?)

Regards,
Javier


_____________________________________________________________
The information contained in this communication is intended solely for the use 
of the individual or entity to whom it is addressed and others authorized to 
receive it. It may contain confidential or legally privileged information. If 
you are not the intended recipient you are hereby notified that any disclosure, 
copying, distribution or taking any action in reliance on the contents of this 
information is strictly prohibited and may be unlawful. If you have received 
this communication in error, please notify us immediately by responding to this 
email and then delete it from your system. The firm is neither liable for the 
proper and complete transmission of the information contained in this 
communication nor for any delay in its receipt.


_____________________________________________________________
The information contained in this communication is intended solely for the use 
of the individual or entity to whom it is addressed and others authorized to 
receive it. It may contain confidential or legally privileged information. If 
you are not the intended recipient you are hereby notified that any disclosure, 
copying, distribution or taking any action in reliance on the contents of this 
information is strictly prohibited and may be unlawful. If you have received 
this communication in error, please notify us immediately by responding to this 
email and then delete it from your system. The firm is neither liable for the 
proper and complete transmission of the information contained in this 
communication nor for any delay in its receipt.
This message, and any files/attachments transmitted together with it, is 
intended for the use only of the person (or persons) to whom it is addressed. 
It may contain information which is confidential and/or protected by legal 
privilege. Accordingly, any dissemination, distribution, copying or use of this 
message, or any part of it or anything sent together with it, other than by 
intended recipients, may constitute a breach of civil or criminal law and is 
hereby prohibited. Unless otherwise stated, any views expressed in this message 
are those of the person sending it and not the sender's employer. No 
responsibility, legal or otherwise, of whatever nature, is accepted as to the 
accuracy of the contents of this message or for the completeness of the message 
as received. Anyone who is not the intended recipient of this message is 
advised to make no use of it and is requested to contact Featurespace Limited 
as soon as possible. Any recipient of this message who has knowledge or 
suspects that it may have been the subject of unauthorised interception or 
alteration is also requested to contact Featurespace Limited.

Reply via email to