Implement two bolts that subscribe to the same spout stream:
bolt1 groups by timestamp and bolt2 groups by resource id.

A spout can have multiple bolts subscribed to it.
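
To make the expected result concrete, here is a minimal sketch of the
bucketing and de-duplication logic in plain Java. It does not use the Storm
API at all; the class and method names (MinuteBucketDistinct, bucketOf,
countDistinct) are made up for illustration, and it assumes events arrive as
"timestamp,id" strings exactly as in the example quoted below.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, not part of Storm: counts distinct resource ids per minute.
final class MinuteBucketDistinct {

    // Truncate "yyyy-MM-dd HH:mm:ss,SSS" to minute granularity, e.g.
    // "2014-03-20 14:18:32,887" becomes "2014-03-20 14:18:00".
    static String bucketOf(String timestamp) {
        return timestamp.substring(0, 16) + ":00";
    }

    // Each event is "timestamp,id"; the id follows the last comma because
    // the timestamp itself contains a comma before the milliseconds.
    static Map<String, Integer> countDistinct(String[] events) {
        Map<String, Set<String>> idsPerBucket = new TreeMap<>();
        for (String event : events) {
            int lastComma = event.lastIndexOf(',');
            String timestamp = event.substring(0, lastComma);
            String id = event.substring(lastComma + 1);
            // A set per bucket ensures a repeated id is counted only once.
            idsPerBucket.computeIfAbsent(bucketOf(timestamp), k -> new HashSet<>()).add(id);
        }
        Map<String, Integer> counts = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : idsPerBucket.entrySet()) {
            counts.put(entry.getKey(), entry.getValue().size());
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] events = {
            "2014-03-20 14:18:32,887,1",
            "2014-03-20 14:18:42,887,2",
            "2014-03-20 14:18:52,887,1",
            "2014-03-20 14:18:57,887,1",
            "2014-03-20 14:18:58,887,3",
            "2014-03-20 14:19:07,887,1",
        };
        countDistinct(events).forEach((bucket, count) -> System.out.println(bucket + "," + count));
    }
}
```

In a topology, the same per-bucket set could live inside a bolt that receives
tuples through a fields grouping on the time bucket, so that all events for
one minute reach the same task. That wiring is not shown here.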


On 3/21/14, Jonas Bergström <[email protected]> wrote:
> Hi, I've just started using Storm and have a couple of questions.
>
> I have a stream of events that consists of a timestamp and a resource-id
> and I want to "bucket" them into discrete time-buckets, e.g. 1 minute long,
> and also group on resource-id so that even if the same resource-id is
> encountered multiple times during the same time bucket it is only counted
> as one.
>
> I'm mapping the timestamp onto a date-string with minute granularity and
> grouping on that, which works fine. But I don't understand how to add the
> grouping on resource-id as well.
>
> For example, I want the following stream [timestamp,id]:
> "2014-03-20 14:18:32,887,1"
> "2014-03-20 14:18:42,887,2"
> "2014-03-20 14:18:52,887,1"
> "2014-03-20 14:18:57,887,1"
> "2014-03-20 14:18:58,887,3"
> "2014-03-20 14:19:07,887,1"
>
> to result in [timebucket,count]:
> "2014-03-20 14:18:00,3"
> "2014-03-20 14:19:00,1"
>
> Any ideas?
> I already implemented this using tick-tuples and grouping on resource-id,
> but I want to use Trident instead and be able to catch up properly if I
> restart the Storm cluster.
>
> Also, I read in several places that one can have a spout batch by
> "punctuation", which fits my use case well. But I haven't understood how
> this can be implemented. Does anybody have any pointers?
>
>
> Many thanks / Jonas
>
