Sounds like you are setting the tick tuple frequency globally for the whole
topology.  You could enable bolt-level configuration instead and set the
value for each bolt as you like.  That would be a small amount of new code
per bolt.
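As a sketch of what that per-bolt code looks like: Storm lets a component
override getComponentConfiguration(), and a non-null map returned there
overrides the topology-level setting for that component only.  The config
key is Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, literally
"topology.tick.tuple.freq.secs"; the helper class below is illustrative,
only the key and the hook are Storm API:

```java
import java.util.HashMap;
import java.util.Map;

public class PerBoltTickConfig {
    // Storm's config key, i.e. Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS.
    static final String TICK_FREQ_KEY = "topology.tick.tuple.freq.secs";

    // In a real bolt, this map would be built and returned inside the
    // bolt's getComponentConfiguration() override; returning non-null
    // overrides the topology-level tick frequency for that bolt alone.
    static Map<String, Object> componentConfiguration(int tickSeconds) {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TICK_FREQ_KEY, tickSeconds);
        return conf;
    }

    public static void main(String[] args) {
        // e.g. one bolt flushes every 60s, another every 45s
        System.out.println(componentConfiguration(60));
        System.out.println(componentConfiguration(45));
    }
}
```

Giving each bolt a slightly different frequency (60s, 61s, 59s, ...) also
drifts their ticks apart over time, which may be enough on its own.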

On 1/11/16, 9:59 AM, "Steve Miller" <[email protected]> wrote:

>Hi.  In the project I'm working on, we have a lot of code that basically:
>
>    * consumes normal tuples as they come in, building up some sort of
>aggregated representation of what was in those tuples
>    * then when a tick tuple comes in, it publishes the whole set of data
>(e.g., it sends the aggregates to some other bolt for processing, or
>publishes to Kafka or Cassandra, whatever)
>
>Of course, with the most straightforward implementation of that, since
>the bolts typically start at more or less the same time, the tick tuples
>all get delivered at the same time.  So it's really easy to end up
>in a circumstance where some downstream consumer spends 59 seconds out of
>60 doing nothing, then gets completely pounded on for a second, then
>spends the next 59 seconds doing nothing.
>
>In our use cases, generally we want to do things like aggregate data for
>60 seconds, but the aggregates don't all need to line up.
>
>I keep thinking that if there was a way to tell Storm that we want a tick
>tuple every 60 seconds, but delay for a random number of seconds between
>0 and 60 before you send the first one, that'd just fix this right up.
>But I don't see an obvious way to do that.
>
>Clearly there are ways in which we can take care of this in our code,
>they just involve more code. (-:
>
>It seems like this would be a common use case.  Are there better
>approaches?  Is there some trick that would make it possible to smear the
>tick tuples out over time?  If you're in this situation, how do you
>handle it?
>
>I'd love to be missing something easy and obvious.
>
>Thanks!
>
>       -Steve
>
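On the smearing question itself: absent built-in support for a randomized
first tick, one small-code workaround is for each bolt to pick a random
startup offset and treat ticks that arrive before that offset has elapsed
as no-ops.  A hedged sketch (the class and method names here are
illustrative, not Storm API; shouldFlush would be called from execute()
when a tick tuple arrives):

```java
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class TickJitter {
    private final long startMillis;
    private final long jitterMillis;

    // Pick a random offset in [0, period) at startup.  Ticks arriving
    // before the offset has elapsed are ignored, so bolts that all start
    // together still take their first flush at staggered times.
    public TickJitter(long startMillis, long periodSeconds, Random rng) {
        this.startMillis = startMillis;
        this.jitterMillis = (long) (rng.nextDouble()
                * TimeUnit.SECONDS.toMillis(periodSeconds));
    }

    // Call from execute() on each tick tuple: false means "skip this
    // tick", true means "publish the aggregates as usual".
    public boolean shouldFlush(long nowMillis) {
        return nowMillis - startMillis >= jitterMillis;
    }

    public static void main(String[] args) {
        TickJitter jitter =
                new TickJitter(System.currentTimeMillis(), 60, new Random());
        System.out.println(jitter.shouldFlush(System.currentTimeMillis()));
    }
}
```

After the initial offset passes, every subsequent tick flushes, so each
bolt keeps the 60-second cadence but at its own randomly chosen phase.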
