Hi Shaoxuan,

thanks a lot for this proposal!
Support for retractions is a super nice and important feature and will
enable many more use cases for the Table API / SQL.
I'm really excited to see this happening. I made a first pass over your
proposal and added a few comments. I'll do another pass soon.

Since there are only six weeks left until the feature freeze for Flink 1.3, I
propose to develop the retraction support in a feature branch.
IMO, we must make sure that either all operators support retraction or
none. Otherwise, the behavior of the Table API / SQL will not be
predictable.

I also think that we should define which operators we want to support in
Flink 1.3 in order to coordinate the development of retraction support.

What do others think?

Cheers, Fabian


2017-03-14 16:53 GMT+01:00 Shaoxuan Wang <wshaox...@gmail.com>:

> Hello everyone,
>
> Flink is widely used in Alibaba Group, especially in our Search and
> Recommendation Infra. Retraction is one of the most important features that
> we need. We have spent a lot of effort on this problem, and we are glad to
> have developed an approach that addresses most of the retraction problems
> in our production scenarios. As usual, we (the Alibaba search-data infra
> team) would like to share our retraction solution with the entire Flink
> community. If you like this proposal, I would also like to turn it into a
> FLIP. I am attaching the design doc of "Retraction
> for Flink Streaming" as well as the introduction section below. I have also
> created a master jira (FLINK-6047) to track the discussion and design of
> the Flink retraction. All suggestions and comments are welcome.
>
>
> *Design doc:*
> https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGTQjnz7emkVpZlkw
>
> *Introduction:*
>
> "Retraction" is an important building block for data streaming: it refines
> early fired results. “Early firing” is very common and widely used in many
> streaming scenarios, for instance “window-less” or unbounded aggregates and
> stream-stream inner joins, as well as windowed (with early firing)
> aggregates and stream-stream inner joins. As described in Streaming
> 102, there are mainly two cases that require retractions: 1) an update on
> the keyed table (the key is either a primaryKey (PK) on the source table,
> or a groupKey/partitionKey in an aggregate); 2) when dynamic windows (e.g.,
> session windows) are in use, the new value may replace more than one
> previous window due to window merging.
>
> To the best of our knowledge, retraction of early fired streaming
> results has never been practically solved before. In this proposal, we
> develop a retraction solution and explain how it works for the problem of
> “update on the keyed table”. The same solution can be easily extended to
> dynamic window merging, as the key component of retraction - how to
> refine early fired results - is the same across different problems.
>
> *Master Jira:*
> https://issues.apache.org/jira/browse/FLINK-6047
>
>
> Regards,
> Shaoxuan
>
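
To make the "update on the keyed table" case from the quoted introduction
concrete, here is a minimal sketch in plain Python. It is not Flink code and
does not reflect the design doc's implementation. The names count_per_key,
histogram_of_counts and honor_retractions are hypothetical, as is the
(is_add, key, count) message shape. The sketch assumes an upstream unbounded
count per key feeding a downstream aggregate that counts how many keys have
each count.

```python
from collections import defaultdict


def count_per_key(events):
    """Upstream unbounded aggregate (hypothetical).

    Emits (is_add, key, count) messages. Every update to a key first
    retracts the previously emitted count (is_add=False) and then emits
    the refined count (is_add=True).
    """
    counts = defaultdict(int)
    for key in events:
        old = counts[key]
        if old > 0:
            yield (False, key, old)  # retract the early fired result
        counts[key] = old + 1
        yield (True, key, old + 1)   # emit the refined result


def histogram_of_counts(messages, honor_retractions=True):
    """Downstream aggregate: number of keys that currently have each count."""
    hist = defaultdict(int)
    for is_add, _key, cnt in messages:
        if is_add:
            hist[cnt] += 1
        elif honor_retractions:
            hist[cnt] -= 1
            if hist[cnt] == 0:
                del hist[cnt]
    return dict(hist)


events = ["a", "b", "a", "a"]
with_retraction = histogram_of_counts(count_per_key(events))
without_retraction = histogram_of_counts(
    count_per_key(events), honor_retractions=False
)
```

In this sketch, with_retraction reflects the final state (one key seen once,
one key seen three times), while without_retraction also keeps the stale
intermediate counts for key "a". That stale state is the kind of incorrect
result an operator without retraction support would pass downstream.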
