Thank you, Satish and Simon. We're thinking of a design analogous to the
ones you propose (keeping state in the spout). One question: how would you
deal with the case of a spout crash? You would lose the in memory state,
right?

Regards,
Javier

On Fri, Jun 26, 2015 at 12:48 PM, Simon Cooper <
[email protected]> wrote:

>  We’ve implemented this internally.
>
>
>
> Within the spout, we keep track of all the ‘in-flight’ tuples in a Set.
> Then, when we need to do a flush (determined by several internal and
> external conditions), we stop sending new tuples through and wait for all
> the pending tuples to be acked (or failed). Then we send a flush tuple
> through (it’s emitted on a separate ‘control’ stream connected to all the
> running bolts, independent of main tuple processing), wait for that to be
> acked, and then start sending normal tuples through again.
>
>
>
> This is implemented using a reasonably simple state machine inside the
> spout to ensure nextTuple() always does the correct thing whatever’s going
> on in the topology. It also keeps track of whether the topology is ‘idle’,
> and doesn’t bother flushing when there’s nothing to flush.
>
>
>
> Hope this helps,
>
> SimonC
>
>
>
> *From:* Satish Mittal [mailto:[email protected]]
> *Sent:* 26 June 2015 16:17
> *To:* [email protected]
> *Subject:* Re: Flushing a topology?
>
>
>
> Assuming each flush period is associated with a start and end event, one
> approach is to instantiate a HashSet in the spout at flush start event.
> Each emitted tuple is added to the set, and upon receiving ack, it is
> removed from the set. Upon receiving flush end event, wait for all pending
> messages to be acked.
>
>
>
> Assuming that processing latency is small, the pending set size should be
> relatively bounded in steady state at any point of time.
>
>
>
> When the HashSet size becomes 0, returns success. Of course, if flush
> periods are back-to-back, then start of the next flush period is same as
> the end of previous period.
>
>
>
> Thanks,
>
> Satish
>
>
>
> On Wed, Jun 24, 2015 at 4:20 PM, Javier Gonzalez <[email protected]>
> wrote:
>
> Hi Satish,
>
> Thank you for your response.
>
> The ack at the spout guarantees that the flush message has traversed the
> topology, but makes no guarantees about the tuples emitted before the flush
> tuple. It would be possible for tuples to be in flight and be "beaten to
> the ack"  by the flush tuple, particularly if the flush tuple does not
> participate in all the processing done in the bolts to the regular tuples.
>
> Regards,
> Javier
>
> On Jun 24, 2015 2:55 AM, "Satish Mittal" <[email protected]> wrote:
>
>   From the way Storm's acking mechanism works, if a spout tuple gets
> acked, then it implies that tuple has completely traversed the topology.
> Isn't that sufficient?
>
>
>
> On Tue, Jun 23, 2015 at 10:50 PM, Javier Gonzalez <[email protected]>
> wrote:
>
> Hi,
>
> Question: how would you implement a "flush" in a topology: sending a
> special message to the topology that will in time return with a message
> that says everything up to the flush message has finished traversing the
> topology? (does that make sense?)
>
> Regards,
> Javier
>
>
>
>
>
> _____________________________________________________________
>
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>
>
>
>
>
> _____________________________________________________________
>
> The information contained in this communication is intended solely for the
> use of the individual or entity to whom it is addressed and others
> authorized to receive it. It may contain confidential or legally privileged
> information. If you are not the intended recipient you are hereby notified
> that any disclosure, copying, distribution or taking any action in reliance
> on the contents of this information is strictly prohibited and may be
> unlawful. If you have received this communication in error, please notify
> us immediately by responding to this email and then delete it from your
> system. The firm is neither liable for the proper and complete transmission
> of the information contained in this communication nor for any delay in its
> receipt.
>  This message, and any files/attachments transmitted together with it, is
> intended for the use only of the person (or persons) to whom it is
> addressed. It may contain information which is confidential and/or
> protected by legal privilege. Accordingly, any dissemination, distribution,
> copying or use of this message, or any part of it or anything sent together
> with it, other than by intended recipients, may constitute a breach of
> civil or criminal law and is hereby prohibited. Unless otherwise stated,
> any views expressed in this message are those of the person sending it and
> not the sender's employer. No responsibility, legal or otherwise, of
> whatever nature, is accepted as to the accuracy of the contents of this
> message or for the completeness of the message as received. Anyone who is
> not the intended recipient of this message is advised to make no use of it
> and is requested to contact Featurespace Limited as soon as possible. Any
> recipient of this message who has knowledge or suspects that it may have
> been the subject of unauthorised interception or alteration is also
> requested to contact Featurespace Limited.
>



-- 
Javier González Nicolini

Reply via email to