Hi all,

With the continuous efforts from the community, we already supported
`flatAggregate`[1] on TableAPI in retract mode. I think It's better to add
upsert mode for  `flatAggregate`.

The result table of streaming non-window `flatAggregate` is a table
contains updates. We can, of course, use a RetractStreamTableSink[2] to
emit the table, but we can get better performance in upsert mode.  However,
due to the lack of keys, we can’t use an UpsertStreamTableSink to emit the
table. We don’t have this problem for a normal aggregate as it emits a
single row for each group, so the unique keys are exactly the same with the
group keys. While for a `flatAggregate`, its pretty difference that due to
emits multi rows(a “sub-table”) for a single group. To solve this problem,
we need to find a way to define keys on flatAggregate, so that we can also
use upsert sink to emit the result table after flatAggregate.

So, Aljoscha, Hequn and I prepared a design document for how to define the
update keys for  `flatAggregate` in upsert mode.  The detail can be found
here:

https://docs.google.com/document/d/183qHo8PDG-xserDi_AbGP6YX9aPY0rVr80p3O3Gyz6U/edit?usp=sharing

I appreciate it if you can have look at the document and any comments are
welcome!


Best,

Jincheng


[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=97552739

[2]
https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sourceSinks.html#defining-a-streamtablesource

Reply via email to