Ah, I see. I don't believe that is possible at all with tick tuples, but perhaps someone else will know.
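
One workaround that lives in bolt code rather than in Storm itself, offered only as a rough sketch: ask for a tick tuple every second (for example by returning Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS = 1 from the bolt's getComponentConfiguration()), count the ticks, and publish only when the count reaches a per-bolt offset within the 60-tick period. The StaggeredPublisher class below is a hypothetical helper, not a Storm API, and it has no Storm dependency; it assumes the bolt calls onTick() once for each tick tuple it receives.

```java
import java.util.concurrent.ThreadLocalRandom;

/**
 * Hypothetical helper (not part of Storm): decides on which ticks a bolt
 * should publish so that bolts with different offsets do not all publish
 * in the same second.
 */
final class StaggeredPublisher {
    private final int periodTicks;   // e.g. 60 for a 60-second period with 1-second ticks
    private final int offsetTicks;   // this bolt's slot within the period, 0 <= offset < period
    private long ticksSeen = 0;      // number of ticks handled so far

    StaggeredPublisher(int periodTicks, int offsetTicks) {
        if (periodTicks <= 0) {
            throw new IllegalArgumentException("periodTicks must be positive");
        }
        this.periodTicks = periodTicks;
        // floorMod keeps the offset inside [0, periodTicks) even for negative input.
        this.offsetTicks = Math.floorMod(offsetTicks, periodTicks);
    }

    /** Picks a random slot, so each bolt instance lands somewhere in the period. */
    static StaggeredPublisher withRandomOffset(int periodTicks) {
        if (periodTicks <= 0) {
            throw new IllegalArgumentException("periodTicks must be positive");
        }
        return new StaggeredPublisher(periodTicks,
                ThreadLocalRandom.current().nextInt(periodTicks));
    }

    /** Call once per tick tuple; returns true when this bolt should publish. */
    boolean onTick() {
        boolean publish = (ticksSeen % periodTicks) == offsetTicks;
        ticksSeen++;
        return publish;
    }
}
```

With this, the first publish happens after a partial period and every later one is a full period apart. A random offset only spreads the bolts out on average; if an even spread matters, something like the task index from TopologyContext.getThisTaskIndex() modulo the period could be used as the offset instead. The cost is that each bolt now handles 60 tick tuples per minute instead of one.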

On 1/11/16, 11:40 AM, "Steve Miller" <[email protected]> wrote:

>Thanks. It seems like that'd help if we wanted different values per
>bolt, but we want the same value -- we just want the actual times on the
>tuples to happen at different times. That is, if we want stats over a
>60-second period, we want the tick tuple interval to be 60 seconds
>everywhere.
>
>But let's say we have 60 bolts. What we have now is that if all the
>bolts started running at 00:00:23, they all emit/publish at 00:01:23,
>00:02:23, and so forth. What I'd like is to have one bolt publish at
>00:01:23, one at 00:01:24, and so forth up to 00:02:22, or a close
>approximation.
>
> -Steve
>
>On Mon, Jan 11, 2016 at 04:42:03PM +0000, Aaron.Dossett wrote:
>> Sounds like you are setting the tick tuple value globally for the whole
>> topology. You could enable bolt-level configuration and then set the
>> values of each bolt as you like. That would be a small amount of new code
>> per bolt.
>>
>> On 1/11/16, 9:59 AM, "Steve Miller" <[email protected]> wrote:
>>
>> >Hi. In the project I'm working on, we have a lot of code that basically:
>> >
>> > * consumes normal tuples as they come in, building up some sort of
>> >aggregated representation of what was in those tuples
>> > * then when a tick tuple comes in, it publishes the whole set of data
>> >(e.g., it sends the aggregates to some other bolt for processing, or
>> >publishes to Kafka or Cassandra, whatever)
>> >
>> >Of course, given the most straightforward implementation of that, given
>> >that the bolts typically start at more or less the same time, the tick
>> >tuples all get delivered at the same time. So it's really easy to end up
>> >in a circumstance where some downstream consumer spends 59 seconds out of
>> >60 doing nothing, then gets completely pounded on for a second, then
>> >spends the next 59 seconds doing nothing.
>> >
>> >In our use cases, generally we want to do things like aggregate data for
>> >60 seconds, but the aggregates don't all need to line up.
>> >
>> >I keep thinking that if there was a way to tell Storm that we want a tick
>> >tuple every 60 seconds, but delay for a random number of seconds between
>> >0 and 60 before you send the first one, that'd just fix this right up.
>> >But I don't see an obvious way to do that.
>> >
>> >Clearly there are ways in which we can take care of this in our code,
>> >they just involve more code. (-:
>> >
>> >It seems like this would be a common use case. Are there better
>> >approaches? Is there some trick that would make it possible to smear the
>> >tick tuples out over time? If you're in this situation, how do you
>> >handle it?
>> >
>> >I'd love to be missing something easy and obvious.
>> >
>> >Thanks!
>> >
>> > -Steve
