Implement two bolts that subscribe to the same spout stream: one bolt groups by timestamp and the other groups by resource id.
A spout can be followed by multiple bolts.

On 3/21/14, Jonas Bergström <[email protected]> wrote:
> Hi, I've just started using Storm and have a couple of questions.
>
> I have a stream of events that consists of a timestamp and a resource-id
> and I want to "bucket" them into discrete time-buckets, e.g. 1 minute long,
> and also group on resource-id so that even if the same resource-id is
> encountered multiple times during the same time bucket it is only counted
> as one.
>
> I'm mapping the timestamp onto a date-string with minute granularity and
> grouping on that, which works fine. But I don't understand how to add the
> grouping on resource-id as well.
>
> For example, I want the following stream [timestamp,id]:
> "2014-03-20 14:18:32,887,1"
> "2014-03-20 14:18:42,887,2"
> "2014-03-20 14:18:52,887,1"
> "2014-03-20 14:18:57,887,1"
> "2014-03-20 14:18:58,887,3"
> "2014-03-20 14:19:07,887,1"
>
> to result in [timebucket,count]:
> "2014-03-20 14:18:00,3"
> "2014-03-20 14:19:00,1"
>
> Any ideas?
> I already implemented this using tick-tuples and grouping on resource-id,
> but I want to use Trident instead and be able to catch up properly if I
> restart the Storm cluster.
>
> Also, I read in several places that one can have a spout batch by
> "punctuation", which fits my use case well. But I haven't understood how
> this can be implemented. Does anybody have any pointers?
>
> Many thanks / Jonas
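
For reference, the bucketing and de-duplication described in the quoted question can be sketched in plain Java, independent of any Storm or Trident API. This is only an illustration of the counting logic, not a topology: the class name MinuteBuckets and the method countDistinct are made up for this example, and the input layout "yyyy-MM-dd HH:mm:ss,SSS,<id>" is assumed from the sample lines.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.ChronoUnit;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper for illustration; not part of Storm or Trident.
class MinuteBuckets {
    // Assumed input layout, taken from the sample lines: "yyyy-MM-dd HH:mm:ss,SSS,<id>".
    private static final DateTimeFormatter IN =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Returns, per one-minute bucket, the number of distinct resource ids seen in it.
    static Map<String, Integer> countDistinct(List<String> events) {
        Map<String, Set<String>> idsPerBucket = new TreeMap<>();
        for (String event : events) {
            // The id is everything after the last comma; the rest is the timestamp.
            int split = event.lastIndexOf(',');
            String id = event.substring(split + 1);
            LocalDateTime ts = LocalDateTime.parse(event.substring(0, split), IN);
            // Truncating to the minute yields the bucket key, e.g. "2014-03-20 14:18:00".
            String bucket = ts.truncatedTo(ChronoUnit.MINUTES).format(OUT);
            // A set per bucket means a repeated id is only counted once.
            idsPerBucket.computeIfAbsent(bucket, k -> new HashSet<>()).add(id);
        }
        Map<String, Integer> counts = new TreeMap<>();
        idsPerBucket.forEach((bucket, ids) -> counts.put(bucket, ids.size()));
        return counts;
    }

    public static void main(String[] args) {
        List<String> events = Arrays.asList(
                "2014-03-20 14:18:32,887,1",
                "2014-03-20 14:18:42,887,2",
                "2014-03-20 14:18:52,887,1",
                "2014-03-20 14:18:57,887,1",
                "2014-03-20 14:18:58,887,3",
                "2014-03-20 14:19:07,887,1");
        // Prints one "bucket,count" line per minute bucket, in time order.
        countDistinct(events)
                .forEach((bucket, n) -> System.out.println(bucket + "," + n));
    }
}
```

In a topology, the bucket string would presumably be the field to group on, and the per-bucket set of ids would have to live in bolt or Trident state so it survives across tuples or batches.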
