Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations!

Best,
Zakelly

On Thu, Mar 21, 2024 at 12:05 PM weijie guo wrote:
> Congratulations! Well done.
>
> Best regards,
> Weijie
>
> On Thu, Mar 21, 2024 at 11:40 AM Feng Jin wrote:
>> Congratulations!
>>
>> Best,
>> Feng
>>
>> On Thu, Mar 21, 2024 at 11:37 AM Ron liu wrote:
>>> Congratulations!
>>>
>>> Best,
>>> Ron
>>>
>>> On Thu, Mar 21, 2024 at 10:46 AM Jark Wu wrote:
>>>> Congratulations and welcome!
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Thu, 21 Mar 2024 at 10:35, Rui Fan <1996fan...@gmail.com> wrote:
>>>>> Congratulations!
>>>>>
>>>>> Best,
>>>>> Rui
>>>>>
>>>>> On Thu, Mar 21, 2024 at 10:25 AM Hang Ruan wrote:
>>>>>> Congratulations!
>>>>>>
>>>>>> Best,
>>>>>> Hang
>>>>>>
>>>>>> On Thu, Mar 21, 2024 at 9:54 AM Lincoln Lee wrote:
>>>>>>> Congrats, thanks for the great work!
>>>>>>>
>>>>>>> Best,
>>>>>>> Lincoln Lee
>>>>>>>
>>>>>>> On Wed, Mar 20, 2024 at 10:48 PM Peter Huang wrote:
>>>>>>>> Congratulations
>>>>>>>>
>>>>>>>> Best Regards
>>>>>>>> Peter Huang
>>>>>>>>
>>>>>>>> On Wed, Mar 20, 2024 at 6:56 AM Huajie Wang wrote:
>>>>>>>>> Congratulations
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Huajie Wang
>>>>>>>>>
>>>>>>>>> On Wed, Mar 20, 2024 at 9:36 PM Leonard Xu wrote:
>>>>>>>>>> Hi devs and users,
>>>>>>>>>>
>>>>>>>>>> We are thrilled to announce that the donation of Flink CDC as a
>>>>>>>>>> sub-project of Apache Flink has completed. We invite you to
>>>>>>>>>> explore the new resources available:
>>>>>>>>>>
>>>>>>>>>> - GitHub Repository: https://github.com/apache/flink-cdc
>>>>>>>>>> - Flink CDC Documentation:
>>>>>>>>>>   https://nightlies.apache.org/flink/flink-cdc-docs-stable
>>>>>>>>>>
>>>>>>>>>> After the Flink community accepted this donation[1], we completed
>>>>>>>>>> software copyright signing, code repo migration, code cleanup,
>>>>>>>>>> website migration, CI migration, GitHub issues migration, etc.
>>>>>>>>>> Here I am particularly grateful to Hang Ruan, Zhongqiang Gong,
>>>>>>>>>> Qingsheng Ren, Jiabao Sun, LvYanquan, loserwang1024 and other
>>>>>>>>>> contributors for their contributions and help during this process!
>>>>>>>>>>
>>>>>>>>>> For all previous contributors: The contribution process has
>>>>>>>>>> slightly changed to align with the main Flink project. To report
>>>>>>>>>> bugs or suggest new features, please open tickets in Apache Jira
>>>>>>>>>> (https://issues.apache.org/jira). Note that we will no longer
>>>>>>>>>> accept GitHub issues for these purposes.
>>>>>>>>>>
>>>>>>>>>> You are welcome to explore the new repository and documentation.
>>>>>>>>>> Your feedback and contributions are invaluable as we continue to
>>>>>>>>>> improve Flink CDC.
>>>>>>>>>>
>>>>>>>>>> Thanks everyone for your support and happy exploring Flink CDC!
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Leonard
>>>>>>>>>> [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob
Re: [ANNOUNCE] Apache Flink 1.19.0 released
Congratulations!

Thanks Lincoln, Yun, Martijn and Jing for driving this release.
Thanks everyone involved.

Best,
Zakelly

On Mon, Mar 18, 2024 at 5:05 PM weijie guo wrote:
> Congratulations!
>
> Thanks release managers and all the contributors involved.
>
> Best regards,
> Weijie
>
> On Mon, Mar 18, 2024 at 4:45 PM Leonard Xu wrote:
>> Congratulations, thanks release managers and all involved for the great work!
>>
>> Best,
>> Leonard
>>
>> On Mar 18, 2024, at 4:32 PM, Jingsong Li wrote:
>>> Congratulations!
>>>
>>> On Mon, Mar 18, 2024 at 4:30 PM Rui Fan <1996fan...@gmail.com> wrote:
>>>> Congratulations, thanks for the great work!
>>>>
>>>> Best,
>>>> Rui
>>>>
>>>> On Mon, Mar 18, 2024 at 4:26 PM Lincoln Lee wrote:
>>>>> The Apache Flink community is very happy to announce the release of
>>>>> Apache Flink 1.19.0, which is the first release for the Apache Flink
>>>>> 1.19 series.
>>>>>
>>>>> Apache Flink® is an open-source stream processing framework for
>>>>> distributed, high-performing, always-available, and accurate data
>>>>> streaming applications.
>>>>>
>>>>> The release is available for download at:
>>>>> https://flink.apache.org/downloads.html
>>>>>
>>>>> Please check out the release blog post for an overview of the
>>>>> improvements for this release:
>>>>> https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
>>>>>
>>>>> The full release notes are available in Jira:
>>>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
>>>>>
>>>>> We would like to thank all contributors of the Apache Flink community
>>>>> who made this release possible!
>>>>>
>>>>> Best,
>>>>> Yun, Jing, Martijn and Lincoln
Re: Re:Re: Checkpoint size keeps growing in RocksDB incremental mode
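The image is broken and I can't see it. Could you copy the text information over instead?
Also, does your ProcessWindowFunction access state? If it does, have you implemented the clear method?

On Thu, Jan 18, 2024 at 3:01 PM fufu wrote:
> Looking at HDFS, the shared files are much larger than chk-xxx.
>
> At 2024-01-18 14:49:14, "fufu" wrote:
>
> It is a DataStream job. The window operator itself has no TTL set; the other
> operators do. In the Flink UI I can see the window operator's size growing
> continuously, by about 600 to 800 MB per day. Taking the figure below as an
> example: the checkpoint with ID 313 is nearly 10 MB larger than the one with
> ID 304, and it keeps growing like this for as long as the job runs. I am
> still looking at the checkpoint files and the RocksDB files.
>
> At 2024-01-18 10:56:51, "Zakelly Lan" wrote:
>
> >Hi, could you provide some more details? For example: is this a DataStream
> >job? Is State TTL configured? Did you observe the growth through checkpoint
> >monitoring, and what is the order of magnitude of the total size? Which
> >files are the largest among the checkpoint files or in the local RocksDB
> >directory?
> >
> >On Wed, Jan 17, 2024 at 4:09 PM fufu wrote:
> >
> >> I have a Flink job running on Flink 1.14.6. It contains a window that
> >> combines incremental aggregation (AggregateFunction) with full-window
> >> processing (ProcessWindowFunction). While the job runs, the state of
> >> this operator keeps growing, by a few hundred MB per day. How can I
> >> troubleshoot this? The job uses event time and watermarks are emitted
> >> normally. All other operators behave normally; only this operator keeps
> >> growing, which is very strange. I found an article online describing a
> >> similar problem:
> >> https://blog.csdn.net/RL_LEEE/article/details/123864487
> >> I would like to try it, but I don't know how to set the manifest size
> >> and could not find the corresponding parameter.
> >> Could the community advise, or is there another solution? Thanks!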
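For the question of which files are the largest, a rough sketch like the one below could be run against the TaskManager's local RocksDB directory or a local copy of the checkpoint directory. It assumes the directory is reachable as a local path, and the function names are made up for this illustration. For data that only lives on HDFS, `hdfs dfs -du` would be the usual tool instead.

```python
import os
from collections import defaultdict


def _file_sizes(root):
    """Yield (size in bytes, path relative to root) for every file under root."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                # A live RocksDB directory may delete files while we scan.
                continue
            yield size, os.path.relpath(path, root)


def size_by_top_level(root):
    """Total bytes per top-level entry under root (e.g. shared, chk-313)."""
    totals = defaultdict(int)
    for size, rel in _file_sizes(root):
        parts = rel.split(os.sep)
        # Files sitting directly in root are grouped under ".".
        top = parts[0] if len(parts) > 1 else "."
        totals[top] += size
    return dict(totals)


def largest_files(root, n=10):
    """The n largest files under root as (bytes, relative path), biggest first."""
    return sorted(_file_sizes(root), reverse=True)[:n]
```

Calling `size_by_top_level` on a checkpoint directory (the path is whatever your job uses) shows at a glance whether the growth sits under `shared` or under the `chk-*` directories, and `largest_files` then narrows it down to individual files.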
Re: Checkpoint size keeps growing in RocksDB incremental mode
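Hi, could you provide some more details? For example: is this a DataStream job? Is State TTL configured? Did you observe the growth through checkpoint monitoring, and what is the order of magnitude of the total size? Which files are the largest among the checkpoint files or in the local RocksDB directory?

On Wed, Jan 17, 2024 at 4:09 PM fufu wrote:
> I have a Flink job running on Flink 1.14.6. It contains a window that
> combines incremental aggregation (AggregateFunction) with full-window
> processing (ProcessWindowFunction). While the job runs, the state of this
> operator keeps growing, by a few hundred MB per day. How can I troubleshoot
> this? The job uses event time and watermarks are emitted normally. All other
> operators behave normally; only this operator keeps growing, which is very
> strange. I found an article online describing a similar problem:
> https://blog.csdn.net/RL_LEEE/article/details/123864487
> I would like to try it, but I don't know how to set the manifest size and
> could not find the corresponding parameter.
> Could the community advise, or is there another solution? Thanks!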
Re: flink-checkpoint issue
> 748)
>
> JM log; there is no trigger record for 25548:
> 2023-12-31 18:39:10.664 [jobmanager-future-thread-20] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 25546 for job d12f3c6e836f56fb23d96e31737ff0b3 (411347921 bytes in 50128 ms).
> 2023-12-31 18:40:10.681 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Triggering checkpoint 25547 (type=CHECKPOINT) @ 1704019210665 for job d12f3c6e836f56fb23d96e31737ff0b3.
> 2023-12-31 18:50:10.681 [Checkpoint Timer] INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Checkpoint 25547 of job d12f3c6e836f56fb23d96e31737ff0b3 expired before completing.
> 2023-12-31 18:50:10.698 [flink-akka.actor.default-dispatcher-3] INFO org.apache.flink.runtime.jobmaster.JobMaster - Trying to recover from a global failure.
> org.apache.flink.util.FlinkRuntimeException: Exceeded checkpoint tolerable failure threshold.
>     at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleCheckpointException(CheckpointFailureManager.java:90)
>     at org.apache.flink.runtime.checkpoint.CheckpointFailureManager.handleJobLevelCheckpointException(CheckpointFailureManager.java:65)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1760)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.abortPendingCheckpoint(CheckpointCoordinator.java:1733)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator.access$600(CheckpointCoordinator.java:93)
>     at org.apache.flink.runtime.checkpoint.CheckpointCoordinator$CheckpointCanceller.run(CheckpointCoordinator.java:1870)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748)
>
> Under the checkpoint path:
> 25546: normal
> 25547: absent
> 25548: present, but the directory is empty
>
> Manually restoring the job from 25548 failed with an exception saying the
> _metadata file could not be found.
>
> 吴先生
> 15951914...@163.com
>
> ---- Replied message ----
> | From    | Xuyang |
> | Date    | January 11, 2024 14:55 |
> | To      | |
> | Subject | Re:Reply: flink-checkpoint issue |
> Hi, your images are broken. You could use an image host, or paste the log
> directly.
>
> --
> Best!
> Xuyang
>
> At 2024-01-11 13:40:43, "吴先生" <15951914...@163.com> wrote:
>
> JM log at the time the checkpoint failed; there is no trigger record for 25548:
>
> Automatic recovery failed:
>
> TM log:
>
> Checkpoint file path; 25548 is empty:
>
> 吴先生
> 15951914...@163.com
>
> ---- Replied message ----
> | From    | Zakelly Lan |
> | Date    | January 10, 2024 18:20 |
> | To      | |
> | Subject | Re: flink-checkpoint issue |
> Hi,
> If convenient, please paste the JobManager log; there should be some clues in it.
>
> On Wed, Jan 10, 2024 at 5:55 PM 吴先生 <15951914...@163.com> wrote:
>
> Flink version: 1.12
> Checkpoint configuration: HDFS
>
> Symptom: for some reason the Nth checkpoint of the job failed, which caused
> the job to retry, and the retry failed. The path for checkpoint N does not
> exist in HDFS, but why is there a path for checkpoint N+1, and why is that
> path empty?
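When checking which checkpoint IDs the JobManager actually triggered, completed, or timed out, a small script can save some scrolling. The sketch below is only an illustration: the regular expressions are derived from the three CheckpointCoordinator messages quoted in this thread (Flink 1.12), so other versions may word them differently, and the function name is made up here.

```python
import re

# Patterns modelled on the CheckpointCoordinator lines quoted in this thread.
PATTERNS = {
    "triggered": re.compile(r"Triggering checkpoint (\d+) "),
    "completed": re.compile(r"Completed checkpoint (\d+) for job"),
    "expired": re.compile(r"Checkpoint (\d+) of job \S+ expired before completing"),
}


def checkpoint_events(lines):
    """Map checkpoint ID -> list of events seen for it, in log order."""
    events = {}
    for line in lines:
        for status, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                events.setdefault(int(match.group(1)), []).append(status)
    return events


if __name__ == "__main__":
    import sys

    # Usage: python checkpoint_events.py < jobmanager.log
    for checkpoint_id, statuses in sorted(checkpoint_events(sys.stdin).items()):
        print(checkpoint_id, ", ".join(statuses))
```

An ID that exists as a directory on HDFS but never shows up in this summary, like 25548 here, has no trigger, completion, or expiry line in the part of the log that was scanned.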
Re: flink-checkpoint issue
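Hi,
If convenient, please paste the JobManager log; there should be some clues in it.

On Wed, Jan 10, 2024 at 5:55 PM 吴先生 <15951914...@163.com> wrote:
> Flink version: 1.12
> Checkpoint configuration: HDFS
>
> Symptom: for some reason the Nth checkpoint of the job failed, which caused
> the job to retry, and the retry failed. The path for checkpoint N does not
> exist in HDFS, but why is there a path for checkpoint N+1, and why is that
> path empty?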
Re: Problems with the state.backend.fs.memory-threshold parameter
Hi rui,

The 'state.backend.fs.memory-threshold' option configures the threshold below
which state is stored as part of the checkpoint metadata rather than in
separate files. As a result, the JM uses its own memory to merge small
checkpoint files and write them into one file. FLIP-306[1][2] proposes merging
small checkpoint files without consuming JM memory. This feature is currently
being worked on and is targeted for the next minor release (1.19).

Best,
Zakelly

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints
[2] https://issues.apache.org/jira/browse/FLINK-32070

On Fri, Oct 13, 2023 at 6:28 PM rui chen wrote:
> We found that for some tasks, the JM memory continued to increase. I set
> the state.backend.fs.memory-threshold parameter to 0, and the JM memory no
> longer increased, but many small files may be written this way. Does the
> community have any optimization plan for this area?
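The trade-off described here can be pictured with a toy model. This is not Flink's code: the function and its inputs are invented for illustration, `memory_threshold` merely stands in for state.backend.fs.memory-threshold, and the exact boundary behaviour (whether a chunk equal to the threshold is inlined) is not modelled.

```python
def plan_checkpoint(state_sizes, memory_threshold):
    """Toy model: decide which state chunks are inlined and which become files.

    state_sizes: sizes in bytes of individual state chunks (hypothetical input).
    memory_threshold: stand-in for state.backend.fs.memory-threshold, in bytes.
    """
    inline = [size for size in state_sizes if size < memory_threshold]
    files = [size for size in state_sizes if size >= memory_threshold]
    return {
        # Bytes that travel through the JM and end up in the metadata file.
        "inline_bytes": sum(inline),
        # Number of separate files written to checkpoint storage.
        "file_count": len(files),
    }
```

With chunk sizes of 100, 500 and 50000 bytes, a threshold of 1024 keeps 600 bytes inline and writes one file, while a threshold of 0 keeps nothing inline and writes three files. That matches the observation in the question: JM memory stops growing, at the cost of more small files.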