Hi,

Assuming FLINK-6465 lands, will something like

SELECT COUNT(*) FROM (SELECT FIRST_VALUE(names) FROM stream) GROUP BY
HOP(proctime, INTERVAL '1' MINUTE, INTERVAL '1' MINUTE)

work?

~Haohui

On Fri, Sep 29, 2017 at 6:52 PM Ron Crocker <rcroc...@newrelic.com> wrote:

> Hi -
>
> I have a colleague who is trying to write a Flink job that will determine
> deltas from period to period. Let’s say each period is 1 minute. What he
> would like to do is report in minute 2 those things that are new since
> minute 1, then in minute 3 report those things that are new relative to
> everything seen since minute 1, and so on.
>
> For example, consider the stream looks like
>
> minute | name
> =======|=======
>      1 | abc
>      1 | def
>      2 | abc
>      2 | ghi
>      3 | abc
>      3 | def
>      4 | ghi
>      4 | jkl
>
>
> What we would like to report is:
>
> minute | count | names
> =======|=======|=======
>      1 |     2 | abc, def
>      2 |     1 | ghi
>      3 |     0 |
>      4 |     1 | jkl
>
>
> In minute 2, abc was already seen but ghi is new, so it gets reported out
> as new. In minute 3, abc and def have already been seen, so there are no new
> names, and again in minute 4 ghi has been seen but jkl is new, so we report
> out the 1 new name.
>
> I’m struggling to help and thought someone here might be able to help. I
> have thought about merging two streams (the stream of new things and the
> stream of the full set seen so far) but haven’t tried that yet.
>
> I welcome any of your inputs.
>
> Thanks!
>
> Ron
> —
> Ron Crocker
> Principal Engineer & Architect
> ( ( •)) New Relic
> rcroc...@newrelic.com
> M: +1 630 363 8835
>
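
For reference, the reporting logic described in the quoted message can be
sketched in plain Java. This is only an illustration of the "seen so far" set
idea, not a Flink job and not the SQL approach asked about above. The class
name NewNamesPerMinute and the method newNamesByMinute are hypothetical names
made up for this sketch, and the input is assumed to be small enough to hold
in memory.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class NewNamesPerMinute {

    /**
     * For each minute (ascending), returns the names that did not appear
     * in any earlier minute. minutes[i] and names[i] form one input row.
     */
    public static Map<Integer, List<String>> newNamesByMinute(int[] minutes, String[] names) {
        // Group the rows by minute; TreeMap keeps the minutes in ascending order.
        TreeMap<Integer, List<String>> byMinute = new TreeMap<>();
        for (int i = 0; i < minutes.length; i++) {
            byMinute.computeIfAbsent(minutes[i], k -> new ArrayList<>()).add(names[i]);
        }

        // Every name seen in the minutes processed so far.
        Set<String> seen = new HashSet<>();
        Map<Integer, List<String>> result = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> entry : byMinute.entrySet()) {
            List<String> fresh = new ArrayList<>();
            for (String name : entry.getValue()) {
                // Set.add returns true only the first time a name is added.
                if (seen.add(name)) {
                    fresh.add(name);
                }
            }
            result.put(entry.getKey(), fresh);
        }
        return result;
    }

    public static void main(String[] args) {
        // The example stream from the quoted message.
        int[] minutes = {1, 1, 2, 2, 3, 3, 4, 4};
        String[] names = {"abc", "def", "abc", "ghi", "abc", "def", "ghi", "jkl"};
        newNamesByMinute(minutes, names).forEach((minute, fresh) ->
                System.out.println(minute + " | " + fresh.size() + " | " + String.join(", ", fresh)));
    }
}
```

In a real streaming job the "seen" set would presumably have to live in
managed state and be bounded somehow (for example with a retention time),
since it otherwise grows with every distinct name; that part is left out here.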
