Derek Dagit created STORM-746:
---------------------------------

             Summary: Disable Spout Ack Init when there is no output task
                 Key: STORM-746
                 URL: https://issues.apache.org/jira/browse/STORM-746
             Project: Apache Storm
          Issue Type: Improvement
    Affects Versions: 0.9.2-incubating
            Reporter: Derek Dagit
            Assignee: Derek Dagit
            Priority: Minor


Suppose a user cannot easily modify the spout in a topology and, for 
debugging, has temporarily disabled the transfer of tuples from that spout.

In this case, when acking is enabled, the spout still sends a tuple to the 
acker bolt each time it emits.  The acker executes on this tuple by 
initializing the bit field used for tracking when the tuple "tree" has 
completed processing (XOR-ing the new field with 0), then checking whether 
processing is complete (by comparing the field to 0), and finally sending an 
ack back to the spout.
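
The acker steps described above can be sketched roughly as follows. This is an 
illustrative model only, not Storm's actual acker code: the class name 
AckerSketch, the method update, and the use of a plain HashMap are all 
assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the acker's bookkeeping.
final class AckerSketch {
    // Root tuple id -> XOR of the ids still outstanding in the tuple tree.
    private final Map<Long, Long> pending = new HashMap<>();

    // XORs `value` into the entry for `rootId` and reports whether the
    // tree is complete (entry equals 0). On completion the real acker
    // would send an ack back to the spout; here we only return true.
    boolean update(long rootId, long value) {
        long field = pending.getOrDefault(rootId, 0L) ^ value;
        if (field == 0L) {
            pending.remove(rootId);
            return true;
        }
        pending.put(rootId, field);
        return false;
    }

    public static void main(String[] args) {
        AckerSketch acker = new AckerSketch();
        // Spout emitted to no downstream task: the init value is 0, so the
        // tree is "complete" immediately, yet the round trip still happened.
        System.out.println(acker.update(1L, 0L));
        // Spout emitted one tuple downstream: init, then the bolt's ack.
        System.out.println(acker.update(2L, 0xABCL));
        System.out.println(acker.update(2L, 0xABCL));
    }
}
```

In the no-output case the first call completes at once, which shows that the 
message to the acker carries no tracking information and is pure overhead.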

Normally, this is not a problem beyond the overhead, but on at least one 
occasion while debugging topology performance, the acker bolt's host was so 
overloaded that it could not send the ack back to the spout before the spout 
timed the tuple out.  This resulted in many failures reported for tuples that 
were never supposed to go anywhere in the first place, and in those tuples 
counting unnecessarily against topology.max.spout.pending, which made the 
topology even harder to debug.

This was very confusing to the user.


I propose that we short-cut the ack init in the case when the spout does not 
emit to any downstream tasks.
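
A minimal sketch of what such a short-cut might look like on the spout side 
is given here. It is not a patch against Storm: SpoutEmitSketch, AckerClient, 
and the ackLocally callback are hypothetical names, and acking the tuple 
locally when there are no output tasks is an assumption about how the 
short-cut would behave.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of skipping ack init when nothing was emitted.
final class SpoutEmitSketch {
    // Stand-in for "send an init tuple to the acker bolt" (assumed name).
    interface AckerClient {
        void sendInit(long rootId, long initialXor);
    }

    // `edgeIds` holds one id per tuple sent to a downstream task.
    // Returns true if an init message was sent to the acker.
    static boolean emit(long rootId, List<Long> edgeIds,
                        AckerClient acker, Runnable ackLocally) {
        if (edgeIds.isEmpty()) {
            // No output task: nothing to track, so skip the acker
            // round trip and complete the tuple directly (assumption).
            ackLocally.run();
            return false;
        }
        long xor = 0L;
        for (long id : edgeIds) {
            xor ^= id;
        }
        acker.sendInit(rootId, xor);
        return true;
    }

    public static void main(String[] args) {
        AckerClient acker = (root, xor) ->
                System.out.println("init sent for root " + root);
        Runnable local = () -> System.out.println("acked locally");
        emit(1L, Collections.<Long>emptyList(), acker, local);
        emit(2L, Arrays.asList(5L, 9L), acker, local);
    }
}
```

With this shape, a spout that emits nowhere never waits on the acker, so an 
overloaded acker host cannot cause timeouts for such tuples.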

I do have some misgivings already about making this change, as a spout emitting 
nowhere could be considered outside the set of normal use cases for Storm.  
That said, I will not be unhappy if someone gives a -1.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
