Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-15 Thread meneldor
The query which I'm testing now(trying to avoid the deduplication query because of tombstones) is *almost* correct but there are two questions which I can find an answer to: 1. Some of the *id*'s are just stopping to be produced. 2. Does the Tuble window select only the records whose upd_ts is new

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-11 Thread meneldor
> > Are you sure that the null records are not actually tombstone records? If > you use upsert tables you usually want to have them + compaction. Or how > else will you deal with deletions? yes they are tombstone records, but i cannot avoid them because the deduplication query cant produce an appe

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-11 Thread Arvid Heise
Hi, Are you sure that the null records are not actually tombstone records? If you use upsert tables you usually want to have them + compaction. Or how else will you deal with deletions? Is there anyone who is successfully deduplicating CDC records into either > kafka topic or S3 files(CSV/parquet

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-09 Thread meneldor
Unfortunately using row_ts doesn't help. Setting the kafka topic cleanup.policy to compact is not a very good idea as it increases cpu, memory and might lead to other problems. So for now I'll just ignore the null records. Is there anyone who is successfully deduplicating CDC records into either ka

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-08 Thread meneldor
Thanks for the quick reply, Timo. Ill test with the row_ts and compaction mode suggestions. However, ive read somewhere in the archives that the append only stream is only possible if i extract "the first" record from the ranking only which in my case is the oldest record. Regards On Mon, Feb 8,

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-08 Thread Timo Walther
Hi, could the problem be that you are mixing OVER and TUMBLE window with each other? The TUMBLE is correctly defined over time attribute `row_ts` but the OVER window is defined using a regular column `upd_ts`. This might be the case why the query is not append-only but updating. Maybe you ca

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-08 Thread Khachatryan Roman
Hi, AFAIK this should be supported in 1.12 via FLINK-19568 [1] I'm pulling in Timo and Jark who might know better. https://issues.apache.org/jira/browse/FLINK-19857 Regards, Roman On Mon, Feb 8, 2021 at 9:14 AM meneldor wrote: > Any help please? Is there a way to use the "Last row" from a de

Re: Kafka connector doesn't support consuming update and delete changes in Table SQL API

2021-02-08 Thread meneldor
Any help please? Is there a way to use the "Last row" from a deduplication in an append-only stream or tell upsert-kafka to not produce *null* records in the sink? Thank you On Thu, Feb 4, 2021 at 1:22 PM meneldor wrote: > Hello, > Flink 1.12.1(pyflink) > I am deduplicating CDC records coming f