Re: Streaming groupby and aggregation by field expressions

Fabian Hueske Wed, 05 Nov 2014 10:27:13 -0800

Hi Gyula,

Great to see so much progress with Flink Streaming!


Regarding the improved aggregations: there is a parallel effort to improve
the aggregations of the batch API, which was recently discussed on the
mailing list [1].
I think it would make sense to keep the streaming and batch APIs close
together. Would the proposed batch approach also work for the streaming
case, and if not, what is missing?

Cheers, Fabian

[1] http://mail-archives.apache.org/mod_mbox/flink-dev/201410.mbox/%3C44B1AB07-F993-436F-AE23-8CC4CCC08A54%40tu-berlin.de%3E

2014-11-05 18:37 GMT+01:00 Gyula Fóra <[email protected]>:

> Hey guys,
>
> Just a quick note on some upcoming API updates for the streaming API.
>
> It will now be possible to use field expressions for both grouping and
> aggregations in the streaming API. You can check it out here:
> <https://github.com/mbalassi/incubator-flink/blob/daba36e142537ca0bd7e4d0f1209ce8b0ebecda5/flink-addons/flink-streaming/flink-streaming-examples/src/main/java/org/apache/flink/streaming/examples/wordcount/PojoWordCount.java#L102>
>
> Or in a concise form:
> DataStream<Word> counts = text.flatMap(new Tokenizer()).groupBy("word")
> .sum("frequency");
>
> I will do some more testing before it becomes available in the master
> branch.
>
> I am also planning to extend aggregations to several fields at once, like
> sum(1,2,2) or max("a","c").
>
> Regards,
> Gyula
>
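
For readers without the linked example at hand, here is a minimal sketch in
plain Java of what a call like groupBy("word").sum("frequency") could compute.
It does not use Flink at all. The class FieldExpressionSum, the helpers read
and groupedRollingSum, and the shape of the Word POJO are hypothetical names
made up for this illustration. It also assumes that a streaming sum emits the
updated per-key total for every incoming record, and that a field expression
simply names a public field. The real implementation may differ on both points.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldExpressionSum {

    // Hypothetical POJO, loosely modelled on the Word type in the quoted snippet.
    public static class Word {
        public String word;
        public int frequency;

        public Word(String word, int frequency) {
            this.word = word;
            this.frequency = frequency;
        }
    }

    // Resolves a "field expression" by reading a public field by name (assumption).
    static Object read(Object record, String fieldName) {
        try {
            Field field = record.getClass().getField(fieldName);
            return field.get(record);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("No accessible field: " + fieldName, e);
        }
    }

    // For each record, in arrival order, adds sumField to the running total
    // of the record's key and emits "key=total" for that key.
    static List<String> groupedRollingSum(List<?> records, String keyField, String sumField) {
        Map<Object, Long> totals = new HashMap<>();
        List<String> emitted = new ArrayList<>();
        for (Object record : records) {
            Object key = read(record, keyField);
            long value = ((Number) read(record, sumField)).longValue();
            long updated = totals.merge(key, value, Long::sum);
            emitted.add(key + "=" + updated);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<Word> text = Arrays.asList(
                new Word("to", 1), new Word("be", 1), new Word("to", 1));
        // prints [to=1, be=1, to=2]
        System.out.println(groupedRollingSum(text, "word", "frequency"));
    }
}
```

Under these assumptions the second "to" yields a total of 2 while "be" stays
at 1, since each key keeps its own running sum.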
