We’ve implemented this internally. Within the spout, we keep track of all the ‘in-flight’ tuples in a Set. Then, when we need to do a flush (determined by several internal and external conditions), we stop sending new tuples through and wait for all the pending tuples to be acked (or failed). Then we send a flush tuple through (it’s emitted on a separate ‘control’ stream connected to all the running bolts, independent of main tuple processing), wait for that to be acked, and then start sending normal tuples through again.
This is implemented using a reasonably simple state machine inside the spout to ensure nextTuple() always does the correct thing whatever’s going on in the topology. It also keeps track of whether the topology is ‘idle’, and doesn’t bother flushing when there’s nothing to flush. Hope this helps, SimonC From: Satish Mittal [mailto:[email protected]] Sent: 26 June 2015 16:17 To: [email protected] Subject: Re: Flushing a topology? Assuming each flush period is associated with a start and end event, one approach is to instantiate a HashSet in the spout at flush start event. Each emitted tuple is added to the set, and upon receiving ack, it is removed from the set. Upon receiving flush end event, wait for all pending messages to be acked. Assuming that processing latency is small, the pending set size should be relatively bounded in steady state at any point of time. When the HashSet size becomes 0, returns success. Of course, if flush periods are back-to-back, then start of the next flush period is same as the end of previous period. Thanks, Satish On Wed, Jun 24, 2015 at 4:20 PM, Javier Gonzalez <[email protected]<mailto:[email protected]>> wrote: Hi Satish, Thank you for your response. The ack at the spout guarantees that the flush message has traversed the topology, but makes no guarantees about the tuples emitted before the flush tuple. It would be possible for tuples to be in flight and be "beaten to the ack" by the flush tuple, particularly if the flush tuple does not participate in all the processing done in the bolts to the regular tuples. Regards, Javier On Jun 24, 2015 2:55 AM, "Satish Mittal" <[email protected]<mailto:[email protected]>> wrote: From the way Storm's acking mechanism works, if a spout tuple gets acked, then it implies that tuple has completely traversed the topology. Isn't that sufficient? On Tue, Jun 23, 2015 at 10:50 PM, Javier Gonzalez <[email protected]<mailto:[email protected]>> wrote: Hi, Question: how would you implement a "flush" in a topology: sending a special message to the topology that will in time return with a message that says everything up to the flush message has finished traversing the topology? (does that make sense?) Regards, Javier _____________________________________________________________ The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt. _____________________________________________________________ The information contained in this communication is intended solely for the use of the individual or entity to whom it is addressed and others authorized to receive it. It may contain confidential or legally privileged information. If you are not the intended recipient you are hereby notified that any disclosure, copying, distribution or taking any action in reliance on the contents of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify us immediately by responding to this email and then delete it from your system. The firm is neither liable for the proper and complete transmission of the information contained in this communication nor for any delay in its receipt. This message, and any files/attachments transmitted together with it, is intended for the use only of the person (or persons) to whom it is addressed. It may contain information which is confidential and/or protected by legal privilege. Accordingly, any dissemination, distribution, copying or use of this message, or any part of it or anything sent together with it, other than by intended recipients, may constitute a breach of civil or criminal law and is hereby prohibited. Unless otherwise stated, any views expressed in this message are those of the person sending it and not the sender's employer. No responsibility, legal or otherwise, of whatever nature, is accepted as to the accuracy of the contents of this message or for the completeness of the message as received. Anyone who is not the intended recipient of this message is advised to make no use of it and is requested to contact Featurespace Limited as soon as possible. Any recipient of this message who has knowledge or suspects that it may have been the subject of unauthorised interception or alteration is also requested to contact Featurespace Limited.
