Re: [ANNOUNCE] Apache Flink 1.18.1 released

2024-01-22, posted by Jing Ge
Hi folks,

I am still working on the official images because of the issue
https://issues.apache.org/jira/browse/FLINK-34165. Images under
apache/flink are
available.

Best regards,
Jing

On Sun, Jan 21, 2024 at 11:06 PM Jing Ge  wrote:

> Thanks Leonard for the feedback! Also thanks to @Jark Wu, @Chesnay
> Schepler, and everyone who worked closely
> with me on this release. We made it together!
>
> Best regards,
> Jing
>
> On Sun, Jan 21, 2024 at 9:25 AM Leonard Xu  wrote:
>
>> Thanks Jing for driving the release, nice work!
>>
>> Thanks to all who were involved in this release!
>>
>> Best,
>> Leonard
>>
>> > On Jan 20, 2024, at 12:01 AM, Jing Ge wrote:
>> >
>> > The Apache Flink community is very happy to announce the release of
>> Apache
>> > Flink 1.18.1, which is the first bugfix release for the Apache Flink
>> 1.18
>> > series.
>> >
>> > Apache Flink® is an open-source stream processing framework for
>> > distributed, high-performing, always-available, and accurate data
>> streaming
>> > applications.
>> >
>> > The release is available for download at:
>> > https://flink.apache.org/downloads.html
>> >
>> > Please check out the release blog post for an overview of the
>> improvements
>> > for this bugfix release:
>> >
>> https://flink.apache.org/2024/01/19/apache-flink-1.18.1-release-announcement/
>> >
>> > Please note: users who have state compression enabled should not
>> > migrate to 1.18.1 (nor 1.18.0) due to a critical bug that could lead to
>> > data loss. Please refer to FLINK-34063 for more information.
>> >
>> > The full release notes are available in Jira:
>> >
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353640
>> >
>> > We would like to thank all contributors of the Apache Flink community
>> who
>> > made this release possible! Special thanks to @Qingsheng Ren @Leonard Xu
>> > @Xintong Song @Matthias Pohl @Martijn Visser for the support during this
>> > release.
>> >
>> > A series of Jira tasks based on the Flink release wiki has been created
>> > for the 1.18.1 release. Tasks that need to be done by the PMC have been
>> > created separately, which makes it convenient for the release manager
>> > to reach out to the PMC for those tasks. Future patch releases could
>> > consider cloning it and following the standard release process.
>> > https://issues.apache.org/jira/browse/FLINK-33824
>> >
>> > Feel free to reach out to the release managers (or reply to this
>> > thread) with feedback on the release process. Our goal is to constantly
>> > improve the release process, so feedback on what could be improved or
>> > what didn't go so well is appreciated.
>> >
>> > Regards,
>> > Jing
>>
>>


RE: Re:RE: Re:RE: How to build a custom unified full + incremental source connector on flink cdc 3.x that is not based on debezium?

2024-01-22, posted by Jiabao Sun
Hi,

ResumeToken[1] can be considered globally unique[2].

Best,
Jiabao

[1] https://www.mongodb.com/docs/manual/changeStreams/#resume-tokens
[2] 
https://img.alicdn.com/imgextra/i4/O1CN010e81SP1vkgoyL0nhd_!!66211-0-tps-2468-1360.jpg
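To make the contrast with a per-partition source concrete: a MongoDB change stream resumes from one opaque, globally ordered token, while a kafka-like source has one offset per partition, so its offset state is naturally a map. A minimal, hypothetical Java sketch of the two offset shapes (class and field names are illustrative, not the actual flink-cdc API):

```java
import java.util.HashMap;
import java.util.Map;

public class OffsetShapes {

    // MongoDB-style offset: a single opaque token for the whole stream,
    // stored by the connector as a JSON string.
    static final class ResumeTokenOffset {
        final String resumeTokenJson; // opaque to the connector

        ResumeTokenOffset(String resumeTokenJson) {
            this.resumeTokenJson = resumeTokenJson;
        }
    }

    // Kafka/binlog-per-shard style offset: one position per partition,
    // so the offset field is a map rather than a single value.
    static final class PartitionedOffset {
        final Map<String, Long> partitionOffsets = new HashMap<>();

        void advance(String partition, long offset) {
            // keep the largest offset seen per partition
            partitionOffsets.merge(partition, offset, Math::max);
        }

        long positionOf(String partition) {
            return partitionOffsets.getOrDefault(partition, -1L);
        }
    }

    public static void main(String[] args) {
        PartitionedOffset state = new PartitionedOffset();
        state.advance("shard-0", 42L);
        state.advance("shard-1", 7L);
        state.advance("shard-0", 40L); // stale position, ignored by merge/max
        System.out.println(state.positionOf("shard-0")); // 42
        System.out.println(state.positionOf("shard-1")); // 7
    }
}
```

With a single global token, comparing two offsets is a plain token comparison; with a map, "offset A is behind offset B" has to be decided per partition, which is why the two shapes lead to different Offset class designs.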

On 2024/01/22 09:36:42 "casel.chen" wrote:
> 
> The V1 version depends on DebeziumSourceFunction, which in turn relies on
> DebeziumEngine to produce the changelog.
> The V2 version depends on flink-connector-debezium, but does not use any
> debezium internal classes.
> 
> Another question: is the resumeToken used by mongodb change streams for
> resumption globally unique, like a mysql binlog offset?
> If a data source instead has a per-partition binlog offset, like kafka,
> should the corresponding xxxOffset class define a Map-typed offsetField
> (similar to the resumeTokenField in mongodb's ChangeStreamOffset)?
> Currently mongodb defines it as a JSON String.
> 
> On 2024-01-22 11:03:55, "Jiabao Sun" wrote:
> >Hi,
> >
> >Flink CDC MongoDB connector V1 is implemented on top of mongo-kafka [1],
> >not on debezium.
> >Flink CDC MongoDB connector V2 is implemented on the incremental snapshot
> >read framework and depends on neither mongo-kafka nor debezium.
> >
> >Best,
> >Jiabao
> >
> >[1] https://github.com/mongodb/mongo-kafka
> >
> >
> >On 2024/01/22 02:57:38 "casel.chen" wrote:
> >> 
> >> The Flink CDC MongoDB connector is still implemented based on debezium, isn't it?
> >> 
> >> On 2024-01-22 10:14:32, "Jiabao Sun" wrote:
> >> >Hi,
> >> >
> >> >You can refer to the implementation of the Flink CDC MongoDB connector.
> >> >
> >> >Best,
> >> >Jiabao
> >> >
> >> >
> >> >On 2024/01/22 02:06:37 "casel.chen" wrote:
> >> >> We have a data source that debezium does not support, and we need to
> >> >> consume it with flink sql in a unified full + incremental way, so we are
> >> >> thinking of building a connector on flink cdc 3.x. I checked, and most
> >> >> existing flink cdc source connectors are built on the debezium library;
> >> >> only oceanbase and tidb are not, but those two still use the V1 source
> >> >> interface rather than the latest V2 incremental snapshot, which means the
> >> >> full-snapshot phase does not support checkpointing and a failed job has
> >> >> to re-consume from the beginning.
> >> >> Is there any example of a V2 source connector that is not based on
> >> >> debezium? Thanks!
> >> 
> 
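For context on the V2 "incremental snapshot" point raised above: the framework splits the full-snapshot phase into bounded chunks over the key space, and finished chunks are recorded in checkpointed state, so a failed job resumes from the unfinished chunks instead of re-reading everything. A greatly simplified, hypothetical sketch of the chunk-splitting step (not the actual flink-cdc API):

```java
import java.util.ArrayList;
import java.util.List;

public class SnapshotChunkSplitter {

    // A bounded slice of the key space. Each chunk can be read and
    // acknowledged independently, which is what makes the snapshot
    // phase checkpointable in the V2 incremental snapshot design.
    record Chunk(long startInclusive, long endExclusive) {}

    // Split the key range [min, max) into chunks of at most chunkSize keys.
    static List<Chunk> split(long min, long max, long chunkSize) {
        List<Chunk> chunks = new ArrayList<>();
        for (long start = min; start < max; start += chunkSize) {
            chunks.add(new Chunk(start, Math.min(start + chunkSize, max)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // e.g. primary keys 0..1000 split into chunks of at most 300 keys
        for (Chunk c : split(0, 1000, 300)) {
            System.out.println(c);
        }
    }
}
```

In the real framework the split boundaries come from sampling the table's primary key rather than a fixed numeric range, but the checkpointing benefit is the same: state only needs to record which chunks are done plus the stream position to switch to afterwards.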
