Hi all,

A couple of first comments on this:
1. I'm missing the problem statement in the overall introduction. It
immediately goes into proposal mode; I would like to first read what the
actual problem is before diving into solutions.
2. "Each ETL job creates snapshots with checkpoint info on sink tables in
Table Store"  -> That reads like you're proposing that snapshots need to be
written to Table Store?
3. If you introduce a MetaService, it becomes a single point of failure
because it coordinates everything. But I can't find anything in the FLIP on
making the MetaService highly available or on how to deal with failovers
there.
4. The FLIP states under Rejected Alternatives "Currently watermark in
Flink cannot align data." which is not true, given that there is FLIP-182
https://cwiki.apache.org/confluence/display/FLINK/FLIP-182%3A+Support+watermark+alignment+of+FLIP-27+Sources

5. Given the MetaService role, it feels like this is introducing a tight
dependency between Flink and the Table Store. How pluggable is this
solution, given the changes that need to be made to Flink in order to
support this?

Best regards,

Martijn


On Thu, Dec 1, 2022 at 4:49 AM Shammon FY <zjur...@gmail.com> wrote:

> Hi devs:
>
> I'd like to start a discussion about FLIP-276: Data Consistency of
> Streaming and Batch ETL in Flink and Table Store[1]. In the whole data
> stream processing, there are consistency problems such as how to manage the
> dependencies of multiple jobs and tables, how to define and handle E2E
> delays, and how to ensure the data consistency of queries on flowing data?
> This FLIP aims to support data consistency and answer these questions.
>
> I've discussed the details of this FLIP with @Jingsong Lee and @libenchao
> offline several times. We hope to support data consistency of queries on
> tables, managing relationships between Flink jobs and tables and revising
> tables on streaming in Flink and Table Store to improve the whole data
> stream processing.
>
> Looking forward to your feedback.
>
> [1]
>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-276%3A+Data+Consistency+of+Streaming+and+Batch+ETL+in+Flink+and+Table+Store
>
>
> Best,
> Shammon
>
