Hi Daniela,

> Okay, could I do the grouping already in Kafka? For example would it be 
> possible to use one topic per region or to use one topic with a partition for 
> every region? Then the messages would already be grouped when they arrive at 
> Storm. Is this correct?

You would need one Kafka spout instance per topic, plus a separate windowed 
bolt instance that receives from the corresponding Kafka spout. Such a topology 
would become difficult to manage as the number of topics increases. The other 
option is to do the grouping within the windowed bolt, as I mentioned in my 
last mail. 

> Would the windowing and the aggregation for each time window be separated in 
> two bolts or is both done in one bolt?

Separate bolts are not needed for aggregation; it can be done inside the 
windowed bolt.
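
As a rough sketch of grouping and summing within one window (plain Java with 
no Storm dependency; the class RegionWindowSum, the Reading type, and the 
"region"/"value" fields are made-up names for illustration, not Storm APIs):

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Storm: sums the values of one window per region.
class RegionWindowSum {

    // Stand-in for a tuple carrying a region and a value (assumed field names).
    static final class Reading {
        final String region;
        final double value;

        Reading(String region, double value) {
            this.region = region;
            this.value = value;
        }
    }

    // Groups the readings of a single window by region and adds up their values.
    static Map<String, Double> sumByRegion(List<Reading> window) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (Reading r : window) {
            // merge() inserts the value for a new region or adds it to the running sum.
            sums.merge(r.region, r.value, Double::sum);
        }
        return sums;
    }
}
```

In a Storm core topology, a loop like this would presumably sit in the 
windowed bolt's execute method, reading the region and value from each tuple 
in the window. The resulting per-region sums could then be written to Redis 
or emitted downstream.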


Thanks,
Arun




On 3/31/16, 1:23 AM, "Maria Musterfrau" <[email protected]> wrote:

>Hi Arun
>
>Sorry, I did not see your reply in the dev mailing list. Thank you very much!
>
>Okay, could I do the grouping already in Kafka? For example would it be 
>possible to use one topic per region or to use one topic with a partition for 
>every region? Then the messages would already be grouped when they arrive at 
>Storm. Is this correct?
>
>Would the windowing and the aggregation for each time window be separated in 
>two bolts or is both done in one bolt?
>
>Thank you in advance.
>
>Regards,
>Daniela
> 
> 
>
>Sent: Wednesday, 30 March 2016 at 20:15
>From: "Arun Iyer" <[email protected]>
>To: "[email protected]" <[email protected]>, "[email protected]" 
><[email protected]>
>Subject: Re: Combining group by and time window
>
>Reposting the reply that was posted to the dev mailing list:
> 
>
>For Storm core, windowed bolts would give you the tuples from the last minute, 
>but you would have to do the grouping yourself. You could of course use a 
>fields grouping to split the load across the windowed bolts. For Trident, you 
>might want to take a look at the windowing APIs that were added recently and 
>see if they fit your needs. You have to choose between Trident and core based 
>on your use cases, the guarantees you need, whether you need batching or 
>per-tuple processing, etc.
> 
>- Arun
>
> 
> 
>From: Maria Musterfrau
>Reply-To: "[email protected]"
>Date: Wednesday, March 30, 2016 at 10:56 PM
>To: "[email protected]"
>Subject: Fw: Combining group by and time window
> 
>
>Does anyone have an idea?
> 
>Thank you in advance.
> 
>Regards,
>Daniela
> 
>
>Sent: Monday, 28 March 2016 at 21:06
>From: "Maria Musterfrau" <[email protected]>
>To: [email protected]
>Subject: Combining group by and time window
>
>Hi,
> 
>I have a stream with time series data from different regions. I would like to 
>group the stream by the different regions and to add up the values of the last 
>minute (time window) per region. The sums should be persisted to Redis or 
>something similar.
> 
>I already found out that Storm Trident provides a group by function to split 
>the stream. I think this could be useful.
>Storm core provides time windows, so I could use it for the aggregation.
> 
>But how can I combine these two components? Or is this not possible?
> 
>Would it be useful to do the grouping already in Kafka (with different topics) 
>or is it better to do it in Storm?
> 
>Thank you in advance.
> 
>Regards,
>Daniela
>
