Re: [DISCUSS] Planning Flink 1.20
Thanks Leonard! > I'd like to help you if you need some help like permissions from PMC side, please feel free to ping me. Nice to know. It'll help a lot! Best regards, Weijie Leonard Xu 于2024年3月22日周五 12:09写道: > +1 for the proposed release managers (Weijie Guo, Rui Fan), both the two > candidates are pretty active committers thus I believe they know the > community development process well. The recent releases have four release > managers, and I am also looking forward to having other volunteers > join the management of Flink 1.20. > > +1 for targeting date (feature freeze: June 15, 2024), referring to the > release cycle of recent versions, release cycle of 4 months makes sense to > me. > > > I'd like to help you if you need some help like permissions from PMC side, > please feel free to ping me. > > Best, > Leonard > > > > 2024年3月19日 下午5:35,Rui Fan <1996fan...@gmail.com> 写道: > > > > Hi Weijie, > > > > Thanks for kicking off 1.20! I'd like to join you and participate in the > > 1.20 release. > > > > Best, > > Rui > > > > On Tue, Mar 19, 2024 at 5:30 PM weijie guo > > wrote: > > > >> Hi everyone, > >> > >> With the release announcement of Flink 1.19, it's a good time to kick > off > >> discussion of the next release 1.20. > >> > >> > >> - Release managers > >> > >> > >> I'd like to volunteer as one of the release managers this time. It has > been > >> good practice to have a team of release managers from different > >> backgrounds, so please raise you hand if you'd like to volunteer and get > >> involved. > >> > >> > >> > >> - Timeline > >> > >> > >> Flink 1.19 has been released. With a target release cycle of 4 months, > >> we propose a feature freeze date of *June 15, 2024*. > >> > >> > >> > >> - Collecting features > >> > >> > >> As usual, we've created a wiki page[1] for collecting new features in > 1.20. > >> > >> > >> In addition, we already have a number of FLIPs that have been voted or > are > >> in the process, including pre-works for version 2.0. 
> >> > >> > >> In the meantime, the release management team will be finalized in the > next > >> few days, and we'll continue to create Jira Boards and Sync meetings > >> to make it easy > >> for everyone to get an overview and track progress. > >> > >> > >> > >> Best regards, > >> > >> Weijie > >> > >> > >> > >> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release > >> > >
Re: [DISCUSS] Planning Flink 1.20
+1 for the proposed release managers (Weijie Guo, Rui Fan), both the two candidates are pretty active committers thus I believe they know the community development process well. The recent releases have four release managers, and I am also looking forward to having other volunteers join the management of Flink 1.20. +1 for targeting date (feature freeze: June 15, 2024), referring to the release cycle of recent versions, release cycle of 4 months makes sense to me. I'd like to help you if you need some help like permissions from PMC side, please feel free to ping me. Best, Leonard > 2024年3月19日 下午5:35,Rui Fan <1996fan...@gmail.com> 写道: > > Hi Weijie, > > Thanks for kicking off 1.20! I'd like to join you and participate in the > 1.20 release. > > Best, > Rui > > On Tue, Mar 19, 2024 at 5:30 PM weijie guo > wrote: > >> Hi everyone, >> >> With the release announcement of Flink 1.19, it's a good time to kick off >> discussion of the next release 1.20. >> >> >> - Release managers >> >> >> I'd like to volunteer as one of the release managers this time. It has been >> good practice to have a team of release managers from different >> backgrounds, so please raise you hand if you'd like to volunteer and get >> involved. >> >> >> >> - Timeline >> >> >> Flink 1.19 has been released. With a target release cycle of 4 months, >> we propose a feature freeze date of *June 15, 2024*. >> >> >> >> - Collecting features >> >> >> As usual, we've created a wiki page[1] for collecting new features in 1.20. >> >> >> In addition, we already have a number of FLIPs that have been voted or are >> in the process, including pre-works for version 2.0. >> >> >> In the meantime, the release management team will be finalized in the next >> few days, and we'll continue to create Jira Boards and Sync meetings >> to make it easy >> for everyone to get an overview and track progress. >> >> >> >> Best regards, >> >> Weijie >> >> >> >> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the efforts. On Fri, Mar 22, 2024 at 10:00 AM Yanfei Lei wrote: > Congratulations! > > Best regards, > Yanfei > > Xuannan Su 于2024年3月22日周五 09:21写道: > > > > Congratulations! > > > > Best regards, > > Xuannan > > > > On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang > wrote: > > > > > > Congratulations! > > > > > > Best wishes, > > > Charles Zhang > > > from Apache InLong > > > > > > > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > > > > > Great news! Congratulations! > > > > > > > > Regards, > > > > Jeyhun > > > > > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan > wrote: > > > > > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > > > > > > > Best, > > > > > Yuxin > > > > > > > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > > > > > Congratulations ! > > > > > > > > > > > > Bests > > > > > > Samrat > > > > > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy < > hamdy10...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > Congratulations, great work and great news. > > > > > > > Best Regards > > > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > > > > wrote: > > > > > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > > > > > Best > > > > > > > > > Yuan > > > > > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li > wrote: > > > > > > > > > > > > > > > > > > > Congratulations and look forward to its further > development! > > > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > > Yu > > > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam < > jam.gz...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Congrattulations! 
> > > > > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of > Flink CDC > > > > > as a > > > > > > > > > > > > sub-project of Apache Flink has completed. We invite > you to > > > > > > > explore > > > > > > > > > > the new > > > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > > > > > - GitHub Repository: > https://github.com/apache/flink-cdc > > > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we > have > > > > > > > completed > > > > > > > > > > > > software copyright signing, code repo migration, code > > > > > cleanup, > > > > > > > > website > > > > > > > > > > > > migration, CI migration and github issues migration > etc. > > > > > > > > > > > > Here I am particularly grateful to Hang Ruan, > Zhongqaing > > > > > Gong, > > > > > > > > > > Qingsheng > > > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > > > contributors > > > > > > > > for > > > > > > > > > > their > > > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution > process has > > > > > > > > slightly > > > > > > > > > > > > changed to align with the main Flink project. To > report > > > > bugs > > > > > or > > > > > > > > > > suggest new > > > > > > > > > > > > features, please open tickets > > > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note > that > > > > we > > > > > > will > > > > > > > > no > > > > > > > > > > > > longer accept GitHub issues for these purposes. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and > documentation. > > > > Your > > > > > > > > feedback > > > > > > > > > > and > > > > > > > > > > > > contributions are invaluable as we continue to > improve > > > > Flink > > > > > > CDC. > > > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring > Flink > > > > > CDC! > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > Leonard > > > > > > > > > > > > [1] > > > > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > Best > > > > > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > Best, > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Best, Hangxiang.
Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
Hi Jeyhun, Thanks for your thoughtful feedback! > Why don't we consider an option where checkpoint directory just contains > metadata. So, we do not need to copy the data all the time from working > directory to the checkpointing directory. > Basically, when checkpointing, 1) we mark files in working directories as > "read-only", 2) optionally change the working directory, and 3) update the > metadata in the checkpoint directory (that points to some files in the > working directory). The method you described is essentially consistent with the "Mid/long term follow up work"[1] outlined in this FLIP. Under this approach, the ForStDB (TM side) needs to manage the lifecycle of both state files and checkpoint files, so that the checkpoint files can reuse state db files. For instance, files that are still referenced by the checkpoint but no longer used by ForStDB should be guaranteed not to be deleted by ForStDB. However, this approach conflicts with the current mechanism where the JobManager manages the lifecycle of checkpoint files, and we need to refine it step by step. As outlined in FLIP-423[2], we will introduce this "zero-copy" faster checkpointing & restoring at milestone-2. [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046865#FLIP428:FaultTolerance/RescaleIntegrationforDisaggregatedState-Mid/longtermfollowupwork [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046855#FLIP423:DisaggregatedStateStorageandManagement(UmbrellaFLIP)-RoadMap+LaunchingPlan > This is more of a technical question about RocksDB. Do we have a > guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as > well? I think we don't need to consider WAL; in fact, we can even disable WAL. The reasons are: (1) we will set flush_memtable=true when DB.GetLiveFiles() is invoked, ensuring that MemTable data could also be persisted. 
(2) DB.GetLiveFiles() is called during the snapshot synchronization phase, during which there are no concurrent state read/write operations. > As far as I understand, DB.GetLiveFiles() acquires the global mutex > lock. I am wondering if RocksDB's optimistic transactions can be of any help > in this situation? As mentioned above, there is no concurrent read/write during GetLiveFiles(), and since GetLiveFiles() is a pure in-memory operation, I'm not particularly worried about the mutex lock impacting performance. Best, Jinzhong On Fri, Mar 22, 2024 at 6:36 AM Jeyhun Karimov wrote: > Hi Jinzhong, > > Thanks for the FLIP. +1 for it. > > I have a few questions: > > - Why don't we consider an option where checkpoint directory just contains > metadata. So, we do not need to copy the data all the time from working > directory to the checkpointing directory. > Basically, when checkpointing, 1) we mark files in working directories as > "read-only", 2) optionally change the working directory, and 3) update the > metadata in the checkpoint directory (that points to some files in the > working directory). > > - This is more of a technical question about RocksDB. Do we have a > guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as > well? > > - As far as I understand, DB.GetLiveFiles() acquires the global mutex > lock. I am wondering if RocksDB's optimistic transactions can be of any help > in this situation? > > Regards, > Jeyhun > > On Wed, Mar 20, 2024 at 1:35 PM Jinzhong Li > wrote: > > > Hi Yue, > > > > Thanks for your feedback! > > > > > 1. If we choose Option-3 for ForSt , how would we handle Manifest File > > > ? Should we take a snapshot of the Manifest during the synchronization > > phase? 
> > > > IIUC, the GetLiveFiles() API in Option-3 can also capture the fileInfo of > > Manifest files, and this API also returns the manifest file size, which > > means this API could take a snapshot of the Manifest FileInfo (filename + > > fileSize) during the synchronization phase. > > You could refer to the rocksdb source code[1] to verify this. > > > > > > > However, many distributed storage systems do not support the > > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > > directly read and write remote files. Can we not copy or Fast duplicate > > > these files, but instead directly reuse and reference these remote > > > files? I think this can reduce file download time and may be more > useful > > > for most users who use HDFS (do not support Fast Duplicate)? > > > > Firstly, as far as I know, most remote file systems support the > > FastDuplicate, e.g. S3 copyObject/Azure Blob Storage copyBlob/OSS > > copyObject, and the HDFS indeed does not support FastDuplicate. > > > > Actually, we have considered the design which reuses remote files. And > that > > is what we want to implement in the coming future, where both checkpoints > > and restores can reuse existing files residing on the remote state > storage. > > However, this design conflicts with the current file management system in > > Flink. At present, remote state files are managed by the ForStDB (TaskManager side), while checkpoint files are managed by the JobManager, which is a major hindrance to file reuse.
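The trade-off discussed in this thread — a cheap server-side "fast duplicate" on stores like S3/OSS/Azure Blob versus a full byte copy on file systems like HDFS that lack it — can be sketched as follows. This is an illustrative sketch only, not Flink/ForSt code; the function name and the local-disk stand-in for a remote store are invented for the example.

```python
# Sketch: during a checkpoint, each live state file is duplicated into the
# checkpoint directory. "fast-duplicate" stands in for a server-side copy
# (e.g. S3 copyObject / OSS copyObject / Azure Blob copyBlob), where no bytes
# move through the client; the fallback streams the bytes, which is why
# directly reusing remote files is attractive on HDFS.
import shutil
from pathlib import Path

def checkpoint_file(src: Path, dst: Path, supports_fast_duplicate: bool) -> str:
    """Duplicate src into the checkpoint dir; return the strategy used."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if supports_fast_duplicate:
        # Stand-in for a server-side copy on an object store.
        shutil.copyfile(src, dst)
        return "fast-duplicate"
    # Byte-level fallback for stores without server-side copy (e.g. HDFS).
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return "byte-copy"
```

Either way the checkpoint directory ends up with its own copy, which is what keeps JobManager-owned checkpoint files independent of TaskManager-owned state files until the milestone-2 file-reuse work lands.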
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Regards, Yuepeng Pan 在 2024-03-22 04:11:32,"Jeyhun Karimov" 写道: >+1 (non-binding) > >Regards, >Jeyhun > >On Thu, Mar 21, 2024 at 2:04 PM Márton Balassi >wrote: > >> +1(binding) >> >> On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: >> >> > +1(binding) >> > >> > Best, >> > Leonard >> > >> > > 2024年3月21日 下午5:21,Martijn Visser 写道: >> > > >> > > +1 (binding) >> > > >> > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < >> > gongzhongqi...@apache.org> >> > > wrote: >> > > >> > >> +1 (non-binding) >> > >> >> > >> Bests, >> > >> Zhongqiang Gong >> > >> >> > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: >> > >> >> > >>> Hello devs, >> > >>> >> > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to >> > >>> externalize the Kudu >> > >>> connector from the recently retired Apache Bahir project [2] to keep >> it >> > >>> maintainable and >> > >>> make it up to date as well. Discussion thread [3]. >> > >>> >> > >>> The vote will be open for at least 72 hours (until 2024 March 23 >> 14:03 >> > >>> UTC) unless there >> > >>> are any objections or insufficient votes. >> > >>> >> > >>> Thanks, >> > >>> Ferenc >> > >>> >> > >>> [1] >> > >>> >> > >> >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir >> > >>> [2] https://attic.apache.org/projects/bahir.html >> > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >> > >> >> > >> > >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best regards, Yanfei Xuannan Su 于2024年3月22日周五 09:21写道: > > Congratulations! > > Best regards, > Xuannan > > On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang wrote: > > > > Congratulations! > > > > Best wishes, > > Charles Zhang > > from Apache InLong > > > > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > > > Great news! Congratulations! > > > > > > Regards, > > > Jeyhun > > > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > > > > Best, > > > > Yuxin > > > > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > > > Congratulations ! > > > > > > > > > > Bests > > > > > Samrat > > > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > > > wrote: > > > > > > > > > > > Congratulations, great work and great news. > > > > > > Best Regards > > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > > wrote: > > > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > > > Best > > > > > > > > Yuan > > > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > Yu > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > > > as a > > > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you > > > > > > > > > > > to > > > > > > explore > > > > > > > > > the new > > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > > > completed > > > > > > > > > > > software copyright signing, code repo migration, code > > > > cleanup, > > > > > > > website > > > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > > > Gong, > > > > > > > > > Qingsheng > > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > > contributors > > > > > > > for > > > > > > > > > their > > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process > > > > > > > > > > > has > > > > > > > slightly > > > > > > > > > > > changed to align with the main Flink project. To report > > > bugs > > > > or > > > > > > > > > suggest new > > > > > > > > > > > features, please open tickets > > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > > > we > > > > > will > > > > > > > no > > > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > > > Your > > > > > > > feedback > > > > > > > > > and > > > > > > > > > > > contributions are invaluable as we continue to improve > > > Flink > > > > > CDC. > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > > > CDC! 
> > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Leonard > > > > > > > > > > > [1] > > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Best > > > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > Best, > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best regards, Xuannan On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang wrote: > > Congratulations! > > Best wishes, > Charles Zhang > from Apache InLong > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > Great news! Congratulations! > > > > Regards, > > Jeyhun > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > Best, > > > Yuxin > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > Congratulations ! > > > > > > > > Bests > > > > Samrat > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > > wrote: > > > > > > > > > Congratulations, great work and great news. > > > > > Best Regards > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > wrote: > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > Best > > > > > > > Yuan > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > Yu > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > > wrote: > > > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > > as a > > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you to > > > > > explore > > > > > > > > the new > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > > completed > > > > > > > > > > software copyright signing, code repo migration, code > > > cleanup, > > > > > > website > > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > > Gong, > > > > > > > > Qingsheng > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > contributors > > > > > > for > > > > > > > > their > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > > > slightly > > > > > > > > > > changed to align with the main Flink project. To report > > bugs > > > or > > > > > > > > suggest new > > > > > > > > > > features, please open tickets > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > > we > > > > will > > > > > > no > > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > > Your > > > > > > feedback > > > > > > > > and > > > > > > > > > > contributions are invaluable as we continue to improve > > Flink > > > > CDC. > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > > CDC! 
> > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Leonard > > > > > > > > > > [1] > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Best > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Best, > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best wishes, Charles Zhang from Apache InLong Jeyhun Karimov 于2024年3月22日周五 04:16写道: > Great news! Congratulations! > > Regards, > Jeyhun > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > Congratulations! Thanks for the efforts. > > > > > > Best, > > Yuxin > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > Congratulations ! > > > > > > Bests > > > Samrat > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > wrote: > > > > > > > Congratulations, great work and great news. > > > > Best Regards > > > > Ahmed Hamdy > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > wrote: > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > Congratulations > > > > > > > > > > > > Best > > > > > > Yuan > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > Best Regards, > > > > > > > Yu > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > wrote: > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > as a > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you to > > > > explore > > > > > > > the new > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > completed > > > > > > > > > software copyright signing, code repo migration, code > > cleanup, > > > > > website > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > Gong, > > > > > > > Qingsheng > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > contributors > > > > > for > > > > > > > their > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > > slightly > > > > > > > > > changed to align with the main Flink project. To report > bugs > > or > > > > > > > suggest new > > > > > > > > > features, please open tickets > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > we > > > will > > > > > no > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > Your > > > > > feedback > > > > > > > and > > > > > > > > > contributions are invaluable as we continue to improve > Flink > > > CDC. > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > CDC! 
> > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Leonard > > > > > > > > > [1] > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Best, > > > > > Benchao Li > > > > > > > > > > > > > > >
Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
Hi Jinzhong, Thanks for the FLIP. +1 for it. I have a few questions: - Why don't we consider an option where checkpoint directory just contains metadata. So, we do not need to copy the data all the time from working directory to the checkpointing directory. Basically, when checkpointing, 1) we mark files in working directories as "read-only", 2) optionally change the working directory, and 3) update the metadata in the checkpoint directory (that points to some files in the working directory). - This is more of a technical question about RocksDB. Do we have a guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as well? - As far as I understand, DB.GetLiveFiles() acquires the global mutex lock. I am wondering if RocksDB's optimistic transactions can be of any help in this situation? Regards, Jeyhun On Wed, Mar 20, 2024 at 1:35 PM Jinzhong Li wrote: > Hi Yue, > > Thanks for your feedback! > > > 1. If we choose Option-3 for ForSt , how would we handle Manifest File > > ? Should we take a snapshot of the Manifest during the synchronization > phase? > > IIUC, the GetLiveFiles() API in Option-3 can also capture the fileInfo of > Manifest files, and this API also returns the manifest file size, which > means this API could take a snapshot of the Manifest FileInfo (filename + > fileSize) during the synchronization phase. > You could refer to the rocksdb source code[1] to verify this. > > > > However, many distributed storage systems do not support the > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > directly read and write remote files. Can we not copy or Fast duplicate > > these files, but instead directly reuse and reference these remote > > files? I think this can reduce file download time and may be more useful > > for most users who use HDFS (do not support Fast Duplicate)? > > Firstly, as far as I know, most remote file systems support the > FastDuplicate, e.g. 
S3 copyObject/Azure Blob Storage copyBlob/OSS > copyObject, and the HDFS indeed does not support FastDuplicate. > > Actually,we have considered the design which reuses remote files. And that > is what we want to implement in the coming future, where both checkpoints > and restores can reuse existing files residing on the remote state storage. > However, this design conflicts with the current file management system in > Flink. At present, remote state files are managed by the ForStDB > (TaskManager side), while checkpoint files are managed by the JobManager, > which is a major hindrance to file reuse. For example, issues could arise > if a TM reuses a checkpoint file that is subsequently deleted by the JM. > Therefore, as mentioned in FLIP-423[2], our roadmap is to first integrate > checkpoint/restore mechanisms with existing framework at milestone-1. > Then, at milestone-2, we plan to introduce TM State Ownership and Faster > Checkpointing mechanisms, which will allow both checkpointing and restoring > to directly reuse remote files, thus achieving faster checkpointing and > restoring. > > [1] > > https://github.com/facebook/rocksdb/blob/6ddfa5f06140c8d0726b561e16dc6894138bcfa0/db/db_filesnapshot.cc#L77 > [2] > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046855#FLIP423:DisaggregatedStateStorageandManagement(UmbrellaFLIP)-RoadMap+LaunchingPlan > > Best, > Jinzhong > > > > > > > > On Wed, Mar 20, 2024 at 4:01 PM yue ma wrote: > > > Hi Jinzhong > > > > Thank you for initiating this FLIP. > > > > I have just some minor question: > > > > 1. If we choice Option-3 for ForSt , how would we handle Manifest File > > ? Should we take snapshot of the Manifest during the synchronization > phase? > > Otherwise, may the Manifest and MetaInfo information be inconsistent > during > > recovery? > > 2. For the Restore Operation , we need Fast Duplicate Checkpoint Files > to > > Working Dir . 
However, many distributed storage systems do not support > the > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > directly read and write remote files. Can we not copy or Fast duplicate > > these files, but instand of directly reuse and. reference these remote > > files? I think this can reduce file download time and may be more useful > > for most users who use HDFS (do not support Fast Duplicate)? > > > > -- > > Best, > > Yue > > >
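A side note on the GetLiveFiles() point discussed above: the API reports the MANIFEST's size at snapshot time precisely because that file keeps growing afterwards, so an uploader should copy only the reported prefix. A minimal, hypothetical sketch of such a prefix copy (not ForSt/Flink code; class and method names are made up for illustration):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Sketch: copy only the first {@code snapshotSize} bytes of a live file
 * (e.g. the MANIFEST length reported by DB.GetLiveFiles()), since the
 * file may keep growing after the synchronization phase.
 */
public class PrefixCopy {
    public static void copyPrefix(Path src, Path dst, long snapshotSize) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long copied = 0;
            while (copied < snapshotSize) {
                long n = in.transferTo(copied, snapshotSize - copied, out);
                if (n <= 0) {
                    break; // source turned out shorter than the reported size
                }
                copied += n;
            }
        }
    }
}
```

The same idea applies whether the destination is a local checkpoint directory or a remote one; a fast-duplicate-capable store (S3 copyObject, etc.) would replace the byte copy entirely.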
Re: [DISCUSS] FLIP-XXX Apicurio-avro format
Hi David, Thanks for the FLIP. +1 for it. I have a minor comment. Can you please elaborate more on mechanisms in place to ensure data consistency and integrity, particularly in the event of schema conflicts? Since each message includes a schema ID for inbound and outbound messages, can you elaborate more on message consistency in the context of schema evolution? Regards, Jeyhun On Wed, Mar 20, 2024 at 4:34 PM David Radley wrote: > Thank you very much for your feedback Mark. I have made the changes in the > latest google document. On reflection I agree with you that the > globalIdPlacement format configuration should apply to the deserialization > as well, so it is declarative. I am also going to have a new configuration > option to work with content IDs as well as global IDs. In line with the > deser Apicurio IdHandler and headerHandlers. > > kind regards, David. > > > On 2024/03/20 15:18:37 Mark Nuttall wrote: > > +1 to this > > > > A few small comments: > > > > Currently, if users have Avro schemas in an Apicurio Registry (an open > source Apache 2 licensed schema registry), then the natural way to work > with those Avro flows is to use the schemas in the Apicurio Repository. > > 'those Avro flows' ... this is the first reference to flows. > > > > The new format will use the global Id to look up the Avro schema that > the message was written during deserialization. > > I get the point, phrasing is awkward. Probably you're more interested in > content than word polish at this point though. > > > > The Avro Schema Registry (apicurio-avro) format > > The Confluent format is called avro-confluent; this should be > avro-apicurio > > > > How to create tables with Apicurio-avro format > > s/Apicurio-avro/avro-apicurio/g > > > > HEADER – globalId is put in the header > > LEGACY– global Id is put in the message as a long > > CONFLUENT - globalId is put in the message as an int. > > Please could we specify 'four-byte int' and 'eight-byte long' ? 
> > > > For a Kafka source the globalId will be looked for in this order: > > - In the header > > - After a magic byte as an int > > - After a magic byte as a long. > > but apicurio-avro.globalid-placement has a default value of HEADER : why > do we have a search order as well? Isn't apicurio-avro.globalid-placement > enough? Don't the two mechanisms conflict? > > > > In addition to the types listed there, Flink supports reading/writing > nullable types. Flink maps nullable types to Avro union(something, null), > where something is the Avro type converted from Flink type. > > Is that definitely the right way round? I know we've had multiple > conversations about how unions work with Flink > > > > This is because the writer schema is expanded, but this could not > complete if there are circularities. > > I understand your meaning but the sentence is awkward. > > > > The registered schema will be created or if it exists be updated. > > same again > > > > At some stage the lowest Flink level supported by the Kafka connector > will contain the additionalProperties methods in code flink. > > wording > > > > There existing Kafka deserialization for the writer schema passes down > the message body to be deserialised. > > wording > > > > @Override > > public void deserialize(ConsumerRecord message, > Collector out) > > throws IOException { > > Map additionalPropertiesMap = new HashMap<>(); > > for (Header header : message.additionalProperties()) { > > headersMap.put(header.key(), header.value()); > > } > > deserializationSchema.deserialize(message.value(), headersMap, > out); > > } > > This fails to compile at headersMap. > > > > The input stream and additionalProperties will be sent so the Apicurio > SchemaCoder which will try getting the globalId from the headers, then 4 > bytes from the payload then 8 bytes from the payload. > > I'm still stuck on apicurio-avro.globalid-placement having a default > value of HEADER . 
Should we try all three, or fail if this config param has > a wrong value? > > > > Other considerations > > The implementation does not use the Apicurio deser libraries, > > Please can we refer to them as SerDes; this is the term used within the > documentation that you link to > > > > > > On 2024/03/20 10:09:08 David Radley wrote: > > > Hi, > > > As per the FLIP process I would like to raise a FLIP, but do not have > authority, so have created a google doc for the Flip to introduce a new > Apicurio Avro format. The document is > https://docs.google.com/document/d/14LWZPVFQ7F9mryJPdKXb4l32n7B0iWYkcOdEd1xTC7w/edit?usp=sharing > > > > > > I have prototyped a lot of the content to prove that this approach is > feasible. I look forward to the discussion, > > > Kind regards, David. > > > > > > > > > > > > Unless otherwise stated above: > > > > > > IBM United Kingdom Limited > > > Registered in England and Wales with number 741598 > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
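To make the globalId placement discussion above concrete, here is a hypothetical sketch in which the placement is declarative (driven by configuration) rather than a probing search order, addressing Mark's concern that the HEADER / CONFLUENT / LEGACY fallbacks can conflict. The header name "apicurio.value.globalId" and the 0x0 magic byte are illustrative assumptions, not the confirmed wire format:

```java
import java.nio.ByteBuffer;
import java.util.Map;

/** Sketch: resolve the schema globalId according to a configured placement. */
public class GlobalIdResolver {
    /** HEADER: Kafka header; CONFLUENT: magic byte + 4-byte int; LEGACY: magic byte + 8-byte long. */
    public enum Placement { HEADER, CONFLUENT, LEGACY }

    static final byte MAGIC = 0x0; // assumed magic-byte value

    public static long resolve(Placement placement, Map<String, byte[]> headers, byte[] payload) {
        switch (placement) {
            case HEADER:
                // 8-byte big-endian long carried in a Kafka record header (assumed header name)
                return ByteBuffer.wrap(headers.get("apicurio.value.globalId")).getLong();
            case CONFLUENT:
                checkMagic(payload);
                return ByteBuffer.wrap(payload).getInt(1); // four-byte int after the magic byte
            case LEGACY:
                checkMagic(payload);
                return ByteBuffer.wrap(payload).getLong(1); // eight-byte long after the magic byte
            default:
                throw new IllegalArgumentException("unknown placement: " + placement);
        }
    }

    private static void checkMagic(byte[] payload) {
        if (payload.length < 1 || payload[0] != MAGIC) {
            throw new IllegalArgumentException("payload does not start with the magic byte");
        }
    }
}
```

Note that a pure payload-based search cannot reliably distinguish the int and long encodings (both are followed by opaque Avro bytes), which is one argument for making the placement explicit.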
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Great news! Congratulations! Regards, Jeyhun On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > Congratulations! Thanks for the efforts. > > > Best, > Yuxin > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > Congratulations ! > > > > Bests > > Samrat > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > wrote: > > > > > Congratulations, great work and great news. > > > Best Regards > > > Ahmed Hamdy > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > Congratulations > > > > > > > > > > Best > > > > > Yuan > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > Best Regards, > > > > > > Yu > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > wrote: > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > as a > > > > > > > > sub-project of Apache Flink has completed. We invite you to > > > explore > > > > > > the new > > > > > > > > resources available: > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > - Flink CDC Documentation: > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > completed > > > > > > > > software copyright signing, code repo migration, code > cleanup, > > > > website > > > > > > > > migration, CI migration and github issues migration etc. 
> > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > Gong, > > > > > > Qingsheng > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > contributors > > > > for > > > > > > their > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > slightly > > > > > > > > changed to align with the main Flink project. To report bugs > or > > > > > > suggest new > > > > > > > > features, please open tickets > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that we > > will > > > > no > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > > > feedback > > > > > > and > > > > > > > > contributions are invaluable as we continue to improve Flink > > CDC. > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > CDC! > > > > > > > > > > > > > > > > Best, > > > > > > > > Leonard > > > > > > > > [1] > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Best > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best, > > > > Benchao Li > > > > > > > > > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Regards, Jeyhun On Thu, Mar 21, 2024 at 2:04 PM Márton Balassi wrote: > +1(binding) > > On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: > > > +1(binding) > > > > Best, > > Leonard > > > > > 2024年3月21日 下午5:21,Martijn Visser 写道: > > > > > > +1 (binding) > > > > > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < > > gongzhongqi...@apache.org> > > > wrote: > > > > > >> +1 (non-binding) > > >> > > >> Bests, > > >> Zhongqiang Gong > > >> > > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: > > >> > > >>> Hello devs, > > >>> > > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to > > >>> externalize the Kudu > > >>> connector from the recently retired Apache Bahir project [2] to keep > it > > >>> maintainable and > > >>> make it up to date as well. Discussion thread [3]. > > >>> > > >>> The vote will be open for at least 72 hours (until 2024 March 23 > 14:03 > > >>> UTC) unless there > > >>> are any objections or insufficient votes. > > >>> > > >>> Thanks, > > >>> Ferenc > > >>> > > >>> [1] > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > >>> [2] https://attic.apache.org/projects/bahir.html > > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > >> > > > > >
Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster
No worries, thanks for the reply Gyula. Ah yes, I see how those points you raised make the feature tricky to implement. Could this be considered for a FLIP (or two) in the future? On Wed, Mar 20, 2024 at 2:21 PM Gyula Fóra wrote: > Sorry for the late reply Kevin. > > I think what you are suggesting makes sense, it would be basically a > `last-state` startup mode. This would also help in cases where the current > last-state mechanism fails to locate HA metadata (and the state). > > This is somewhat of a tricky feature to implement: > 1. The operator will need FS plugins and access to the different user envs > (this will not work in many prod environments unfortunately) > 2. Flink doesn't expose a good way to detect the latest checkpoint just by > looking at the FS so we need to figure out something here. Probably some > changes are necessary on Flink core side as well > > Gyula >
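For context on Gyula's second point: completed Flink checkpoints are laid out as chk-<id> directories containing a _metadata file, so a `last-state` startup mode would need roughly the following scan. This is only a sketch of the idea under that layout assumption; robustly handling in-progress or partially deleted checkpoints is exactly why changes on the Flink core side were suggested:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/**
 * Sketch: pick the completed checkpoint with the highest id by scanning
 * a job's checkpoint directory for chk-<id> subdirectories that contain
 * a _metadata file.
 */
public class LatestCheckpointFinder {
    public static Optional<Path> findLatest(Path checkpointDir) throws IOException {
        Path best = null;
        long bestId = -1;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(checkpointDir, "chk-*")) {
            for (Path dir : dirs) {
                if (!Files.exists(dir.resolve("_metadata"))) {
                    continue; // incomplete or already-discarded checkpoint
                }
                long id = Long.parseLong(dir.getFileName().toString().substring(4));
                if (id > bestId) {
                    bestId = id;
                    best = dir;
                }
            }
        }
        return Optional.ofNullable(best);
    }
}
```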
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Hi Lorenzo, Thanks a lot for your comments. Please find my answers below: For the interface `SupportsPartitioning`, why returning `Optional`? > If one decides to implement that, partitions must exist (at maximum, > return and empty list). Returning `Optional` seem just to complicate the > logic of the code using that interface. - The reasoning behind the use of Optional is that sometimes (e.g., in HiveTableSource) the partitioning info is in the catalog. Therefore, we return Optional.empty(), so that the list of partitions is queried from the catalog. I foresee the using code doing something like: "if the source supports > partitioning, get the partitions, but if they don't exist, raise a runtime > exception". Let's simply make that safe at compile time and guarantee the > code that partitions exist. - Yes, if partitions can be found neither from the catalog nor from the interface implementation, we raise an exception at query compile time. Another thing is that you show Hive-like partitioning in your FS > structure, do you think it makes sense to add a note about auto-discovery > of partitions? - Yes, the FLIP contains just an example partitioning for the filesystem connector. Each connector already "knows" about autodiscovery of its partitions, and we rely on this fact. For example, partition discovery is different between Kafka and filesystem sources. So, we do not handle the manual discovery of partitions. Please correct me if I misunderstood your question. In other terms, it looks a bit counterintuitive that the user implementing > the source has to specify which partitions exist statically (and they can > change at runtime), while the source itself knows the data provider and can > directly implement a method `discoverPartitions`. Then Flink would take > care of invoking that method when needed. We utilize the table option SOURCE_MONITOR_INTERVAL to check whether partitions are static or not. 
So, a user still should give Flink a hint about partitions being static or not. With static partitions Flink can do more optimizations. Please let me know if my replies answer your questions or if you have more comments. Regards, Jeyhun On Thu, Mar 21, 2024 at 10:03 AM wrote: > Hello Jeyhun, > I really like the proposal and definitely makes sense to me. > > I have a couple of nits here and there: > > For the interface `SupportsPartitioning`, why returning `Optional`? > If one decides to implement that, partitions must exist (at maximum, > return and empty list). Returning `Optional` seem just to complicate the > logic of the code using that interface. > > I foresee the using code doing something like: "if the source supports > partitioning, get the partitions, but if they don't exist, raise a runtime > exception". Let's simply make that safe at compile time and guarantee the > code that partitions exist. > > Another thing is that you show Hive-like partitioning in your FS > structure, do you think it makes sense to add a note about auto-discovery > of partitions? > > In other terms, it looks a bit counterintuitive that the user implementing > the source has to specify which partitions exist statically (and they can > change at runtime), while the source itself knows the data provider and can > directly implement a method `discoverPartitions`. Then Flink would take > care of invoking that method when needed. > On Mar 15, 2024 at 22:09 +0100, Jeyhun Karimov , > wrote: > > Hi Benchao, > > Thanks for your comments. > > 1. What the parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > could we just utilize one side's pre-partitioned attribute, and let > another side just do the shuffle? > > > > There are two cases we need to consider: > > 1. 
Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance equalize both sources' partitions and parallelism, IFF > both sources implement "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). > Then, we find the intersection of these two partition sets (say > p_intersect) and pushdown these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor
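The static-partition case described above — intersect both sources' partition sets, push the intersection down, and align parallelism with its size — can be sketched as follows. Plain strings stand in for Flink's partition representation, and the actual pushdown calls appear only as comments:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch of case 1: both sources support partition pushdown and the
 * partition set is static, so read only the common partitions and set
 * source/downstream parallelism to their count.
 */
public class PartitionAlignment {
    public static Set<String> commonPartitions(List<String> source1, List<String> source2) {
        Set<String> common = new LinkedHashSet<>(source1);
        common.retainAll(source2);
        // In the real plan this set would be pushed down to both sides:
        //   SupportsPartitionPushDown.applyPartitions(common)  // read only these partitions
        //   SupportsPartitioning.applyPartitionedRead(common)  // partitioned read over them
        return common;
    }

    /** Parallelism applied to both sources and all operators up to their first common ancestor. */
    public static int alignedParallelism(Set<String> commonPartitions) {
        return commonPartitions.size();
    }
}
```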
[jira] [Created] (FLINK-34911) ChangelogRecoveryRescaleITCase failed fatally with 127 exit code
Ryan Skraba created FLINK-34911: --- Summary: ChangelogRecoveryRescaleITCase failed fatally with 127 exit code Key: FLINK-34911 URL: https://issues.apache.org/jira/browse/FLINK-34911 Project: Flink Issue Type: Bug Affects Versions: 1.20.0 Reporter: Ryan Skraba [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58455=logs=a657ddbf-d986-5381-9649-342d9c92e7fb=dc085d4a-05c8-580e-06ab-21f5624dab16=9029] {code:java} Mar 21 01:50:42 01:50:42.553 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.2.2:test (integration-tests) on project flink-tests: Mar 21 01:50:42 01:50:42.553 [ERROR] Mar 21 01:50:42 01:50:42.553 [ERROR] Please refer to /__w/1/s/flink-tests/target/surefire-reports for the individual test results. Mar 21 01:50:42 01:50:42.553 [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. Mar 21 01:50:42 01:50:42.553 [ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? 
Mar 21 01:50:42 01:50:42.553 [ERROR] Command was /bin/sh -c cd '/__w/1/s/flink-tests' && '/usr/lib/jvm/jdk-21.0.1+12/bin/java' '-XX:+UseG1GC' '-Xms256m' '-XX:+IgnoreUnrecognizedVMOptions' '--add-opens=java.base/java.util=ALL-UNNAMED' '--add-opens=java.base/java.io=ALL-UNNAMED' '-Xmx1536m' '-jar' '/__w/1/s/flink-tests/target/surefire/surefirebooter-20240321010847189_810.jar' '/__w/1/s/flink-tests/target/surefire' '2024-03-21T01-08-44_720-jvmRun3' 'surefire-20240321010847189_808tmp' 'surefire_207-20240321010847189_809tmp' Mar 21 01:50:42 01:50:42.553 [ERROR] Error occurred in starting fork, check output in log Mar 21 01:50:42 01:50:42.553 [ERROR] Process Exit Code: 127 Mar 21 01:50:42 01:50:42.553 [ERROR] Crashed tests: Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? 
Mar 21 01:50:42 01:50:42.553 [ERROR] Command was /bin/sh -c cd '/__w/1/s/flink-tests' && '/usr/lib/jvm/jdk-21.0.1+12/bin/java' '-XX:+UseG1GC' '-Xms256m' '-XX:+IgnoreUnrecognizedVMOptions' '--add-opens=java.base/java.util=ALL-UNNAMED' '--add-opens=java.base/java.io=ALL-UNNAMED' '-Xmx1536m' '-jar' '/__w/1/s/flink-tests/target/surefire/surefirebooter-20240321010847189_810.jar' '/__w/1/s/flink-tests/target/surefire' '2024-03-21T01-08-44_720-jvmRun3' 'surefire-20240321010847189_808tmp' 'surefire_207-20240321010847189_809tmp' Mar 21 01:50:42 01:50:42.553 [ERROR] Error occurred in starting fork, check output in log Mar 21 01:50:42 01:50:42.553 [ERROR] Process Exit Code: 127 Mar 21 01:50:42 01:50:42.553 [ERROR] Crashed tests: Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:456) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:418) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:297) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:250) Mar 21 01:50:42 01:50:42.554 [ERROR]at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1240) {code} >From the watchdog, only {{ChangelogRecoveryRescaleITCase}} didn't complete, >specifically parameterized with an {{EmbeddedRocksDBStateBackend}} with >incremental checkpointing enabled. 
The base class ({{ChangelogRecoveryITCaseBase}}) starts a {{MiniClusterWithClientResource}} {code:java} ~/Downloads/CI/logs-cron_jdk21-test_cron_jdk21_tests-1710982836$ cat watchdog| grep "Tests run\|Running org.apache.flink" | grep -o "org.apache.flink[^ ]*$" | sort | uniq -c | sort -n | head 1 org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase 2 org.apache.flink.api.connector.source.lib.NumberSequenceSourceITCase 2 org.apache.flink.api.connector.source.lib.util.GatedRateLimiterTest 2 org.apache.flink.api.connector.source.lib.util.RateLimitedSourceReaderITCase 2 org.apache.flink.api.datastream.DataStreamBatchExecutionITCase 2 org.apache.flink.api.datastream.DataStreamCollectTestITCase{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
The vote is now closed. I'm happy to announce that we have unanimously approved this release. There are 6 approving votes, 3 of which are binding: * Gyula Fora (binding) * Marton Balassi (binding) * Maximilian Michels (binding) * Rui Fan (non-binding) * Alexander Fedulov (non-binding) * Mate Czagany (non-binding) There are no disapproving votes. Thank you all for voting! -Max On Thu, Mar 21, 2024 at 3:49 PM Maximilian Michels wrote: > > +1 (binding) > > 1. Verified the archives, checksums, and signatures > 2. Extracted and inspected the source code for binaries > 3. Compiled and tested the source code via mvn verify > 4. Verified license files / headers > 5. Deployed helm chart to test cluster > 6. Ran example job > 7. Tested autoscaling without resource requirements API > 8. Tested autotuning > > -Max > > On Thu, Mar 21, 2024 at 8:50 AM Márton Balassi > wrote: > > > > +1 (binding) > > > > As per Gyula's suggestion above verified with " > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 ". 
> > > > - Verified Helm repo works as expected, points to correct image tag, build, > > version > > - Verified basic examples + checked operator logs everything looks as > > expected > > - Verified hashes, signatures and source release contains no binaries > > - Ran built-in tests, built jars + docker image from source successfully > > - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 > > > > Best, > > Marton > > > > On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > > > > > Hi, > > > > > > +1 (non-binding) > > > > > > - Verified checksums > > > - Verified signatures > > > - Verified no binaries in source distribution > > > - Verified Apache License and NOTICE files > > > - Executed tests > > > - Built container image > > > - Verified chart version and appVersion matches > > > - Verified Helm chart can be installed with default values > > > - Verify that RC repo works as Helm repo > > > > > > Best Regards, > > > Mate > > > > > > Alexander Fedulov ezt írta (időpont: 2024. > > > márc. 19., K, 23:10): > > > > > > > Hi Max, > > > > > > > > +1 > > > > > > > > - Verified SHA checksums > > > > - Verified GPG signatures > > > > - Verified that the source distributions do not contain binaries > > > > - Verified built-in tests (mvn clean verify) > > > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > > > - Verified that Helm and operator files contain Apache licenses (rg -L > > > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0; .). 
> > > > I am not sure we need to > > > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > > > - Verified that chart and appVersion matches the target release > > > > (91d67d9) > > > > - Verified that Helm chart can be installed from the local Helm folder > > > > without overriding any parameters > > > > - Verified that Helm chart can be installed from the RC repo without > > > > overriding any parameters ( > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > > > ) > > > > - Verified docker container build > > > > > > > > Best, > > > > Alex > > > > > > > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels > > > > wrote: > > > > > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > > > > > >A minor correction is that [3] in the email should point to: > > > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > > > and > > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > > > between those two commits, except for updating the version. So the > > > > > votes are not impacted, especially because votes are casted against > > > > > the source release which, as you pointed out, contains the correct > > > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > > > wrote: > > > > > > > > > > > > Hi Max! 
> > > > > > > > > > > > +1 (binding) > > > > > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > > > - Helm points to correct image > > > > > > - Deployed operator, stateful example and executed upgrade + > > > savepoint > > > > > > redeploy > > > > > > - Verified logs > > > > > > - Flink web PR looks good +1 > > > > > > > > > > > > A minor correction is that [3] in the email should point to: > > > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm > > > chart > > > > > and > > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > > > Thank you for preparing the release! > > > > > > > > > > > > Cheers, > > > > > > Gyula > > > > > > > > > > > > On Mon, Mar 18, 2024
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
+1 (binding) 1. Verified the archives, checksums, and signatures 2. Extracted and inspected the source code for binaries 3. Compiled and tested the source code via mvn verify 4. Verified license files / headers 5. Deployed helm chart to test cluster 6. Ran example job 7. Tested autoscaling without resource requirements API 8. Tested autotuning -Max On Thu, Mar 21, 2024 at 8:50 AM Márton Balassi wrote: > > +1 (binding) > > As per Gyula's suggestion above verified with " > ghcr.io/apache/flink-kubernetes-operator:91d67d9 ". > > - Verified Helm repo works as expected, points to correct image tag, build, > version > - Verified basic examples + checked operator logs everything looks as > expected > - Verified hashes, signatures and source release contains no binaries > - Ran built-in tests, built jars + docker image from source successfully > - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 > > Best, > Marton > > On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > > > Hi, > > > > +1 (non-binding) > > > > - Verified checksums > > - Verified signatures > > - Verified no binaries in source distribution > > - Verified Apache License and NOTICE files > > - Executed tests > > - Built container image > > - Verified chart version and appVersion matches > > - Verified Helm chart can be installed with default values > > - Verify that RC repo works as Helm repo > > > > Best Regards, > > Mate > > > > Alexander Fedulov ezt írta (időpont: 2024. > > márc. 19., K, 23:10): > > > > > Hi Max, > > > > > > +1 > > > > > > - Verified SHA checksums > > > - Verified GPG signatures > > > - Verified that the source distributions do not contain binaries > > > - Verified built-in tests (mvn clean verify) > > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > > - Verified that Helm and operator files contain Apache licenses (rg -L > > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0; .). 
> > > I am not sure we need to > > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > > - Verified that chart and appVersion matches the target release (91d67d9) > > > - Verified that Helm chart can be installed from the local Helm folder > > > without overriding any parameters > > > - Verified that Helm chart can be installed from the RC repo without > > > overriding any parameters ( > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > > ) > > > - Verified docker container build > > > > > > Best, > > > Alex > > > > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels wrote: > > > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > > > >A minor correction is that [3] in the email should point to: > > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > > and > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > > between those two commits, except for updating the version. So the > > > > votes are not impacted, especially because votes are casted against > > > > the source release which, as you pointed out, contains the correct > > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > > wrote: > > > > > > > > > > Hi Max! 
> > > > > > > > > > +1 (binding) > > > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > > - Helm points to correct image > > > > > - Deployed operator, stateful example and executed upgrade + > > savepoint > > > > > redeploy > > > > > - Verified logs > > > > > - Flink web PR looks good +1 > > > > > > > > > > A minor correction is that [3] in the email should point to: > > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm > > chart > > > > and > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > Thank you for preparing the release! > > > > > > > > > > Cheers, > > > > > Gyula > > > > > > > > > > On Mon, Mar 18, 2024 at 8:26 AM Rui Fan <1996fan...@gmail.com> > > wrote: > > > > > > > > > > > Thanks Max for driving this release! > > > > > > > > > > > > +1(non-binding) > > > > > > > > > > > > - Downloaded artifacts from dist ( svn co > > > > > > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > > ) > > > > > > - Verified SHA512 checksums : ( for i in *.tgz; do echo $i; > > sha512sum > > > > > > --check $i.sha512; done ) > > > > > > - Verified GPG signatures : ( $ for i in *.tgz; do echo $i; gpg > > > > --verify > > > > > > $i.asc $i ) > > > > > > - Build the source with java-11 and java-17 ( mvn -T 20 clean > > install > > > > > > -DskipTests ) > > > > > > - Verified
Re: [DISCUSS] FLIP-425: Asynchronous Execution Model
Thanks for your reading and valuable comments! > 1) About locking VS reference counting: I would like to clear out which > mechanism prevents what: The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 state requests on the same key happen in order. Double-locking the same key does not result in deadlocks (thanks to the `previous == record` condition in your pseudo-code), so, the same callback chain can update/read multiple times the same piece of state. On the other side we have the reference counting mechanism that is used to understand whether a record has been fully processed, i.e., all state invocations have been carried out. Here is the question: am I correct if we say that key accounting is needed for out-of-order while reference counting is needed for checkpointing and watermarking? Regarding the "deadlock" of `KeyAccountingUnit`: good catch, we will emphasize this in the FLIP. The `KeyAccountingUnit` is reentrant, so the state requests of the same record can update/read multiple times without deadlock. Regarding the question: records, checkpoint barriers and watermarks can all be regarded as inputs, and this FLIP discusses the *order* between all inputs; in simple terms, the inputs of the same key that arrive first need to be executed first. The `KeyAccountingUnit` and reference counting work together to preserve this order: when the reference counting mechanism recognizes that a record has been fully processed, the record is removed from the `KeyAccountingUnit`. The checkpoint or watermark would only start once all the reference counts of the arrived inputs reach zero. > 2) Number of mails: Do you end up having two mails? Yes, there are two mails in this case. > 3) Would this change something on the consistency guarantees provided? I guess not, as, the lock is held in any case until the value on the state hasn't been updated. Could lead to any inconsistency (most probably the state would be updated to 0). 
Yes, the results of the two cases you mentioned are as you analyzed. The result of the first case is 1, and the result of the second case is 0. No matter which case it is, the next `processElement` with the same key will be executed only after the code in this `processElement` has completely executed. Therefore, it wouldn't lead to inconsistency. > 4) On the local variables/attributes: I'm not sure if I understand your question. In Java, this case (modifying a local variable from inside a lambda) is not allowed[1], but there are ways to get around the limitation. In this case, the modification to `x` may be concurrent, which needs to be handled carefully. > 5) On watermarks: It seems that, in order to achieve a good throughput, out-of-order mode should be used. In the FLIP I could not understand well how many things could go wrong if that one is used. Could you please clarify that? A typical example is the order between "event timer fire" and "the subsequent records of a watermark". Although the semantics of watermarks do not define the sequence between a watermark and subsequent records, an implicit fact in the sync API is that "event timer fire" executes before "the subsequent records of the watermark"; in out-of-order mode (async API), the execution order between them is not guaranteed. There are also some related discussions in FLIP-423[2,3] proposed by Yunfeng Zhou and Xintong Song. [1] https://stackoverflow.com/questions/30026824/modifying-local-variable-from-inside-lambda [2] https://lists.apache.org/thread/djsnybs9whzrt137z3qmxdwn031o93gn [3] https://lists.apache.org/thread/986zxq1k9rv3vkbk39yw16g24o6h83mz 于2024年3月21日周四 19:29写道: > > Thank you everybody for the questions and answers (especially Yanfei Lei), it > was very instructive to go over the discussion. > I am gonna add some questions on top of what happened and add some thoughts > as well below.
> > 1) About locking VS reference counting: > I would like to clear out which mechanism prevents what: > The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 > state requests on the same key happen in order. Double-locking the same key > does not result in deadlocks (thanks to the `previous == record` condition in > your pseudo-code), so, the same callback chain can update/read multiple times > the same piece of state. > On the other side we have the reference counting mechanism that is used to > understand whether a record has been fully processed, i.e., all state > invocations have been carried out. > Here is the question: am I correct if we say that key accounting is needed > for out-of-order while reference counting is needed for checkpointing and > watermarking? > > 2) Number of mails: > To expand on what Jing Ge already asked, in the example code in the FLIP: > > ``` > state.value().then( > val -> { > return state.update(val + 1).then( > empty -> { > out.collect(val + 1); > }); > }); > ``` > Do you end up having two mails?: > > first wrapping `val -> {...}` > second wrapping `empty -> {...}`
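The reentrant key-locking behavior discussed in this thread (the `previous == record` condition that lets the same record re-acquire its key without deadlocking) can be sketched roughly as follows. This is an illustrative sketch only; the class and method names are not Flink's actual implementation:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch: one "occupier" record per key; the same record may
// re-acquire its own key (reentrancy), while other records get queued.
public class KeyAccountingSketch<K> {
    private final Map<K, Object> occupiers = new HashMap<>();
    private final Map<K, Queue<Object>> waiting = new HashMap<>();

    /** Returns true if the record may proceed on this key right now. */
    public boolean tryOccupy(K key, Object record) {
        Object previous = occupiers.putIfAbsent(key, record);
        if (previous == null || previous == record) {
            // Key was free, or re-entered by the same record: no deadlock.
            return true;
        }
        waiting.computeIfAbsent(key, k -> new ArrayDeque<>()).add(record);
        return false;
    }

    /** Called when reference counting says the record is fully processed. */
    public Object release(K key) {
        occupiers.remove(key);
        Queue<Object> q = waiting.get(key);
        Object next = (q == null) ? null : q.poll();
        if (next != null) {
            occupiers.put(key, next); // hand the key to the next waiter
        }
        return next; // the record to resume, if any
    }

    public static void main(String[] args) {
        KeyAccountingSketch<String> ka = new KeyAccountingSketch<>();
        Object r1 = new Object();
        Object r2 = new Object();
        System.out.println(ka.tryOccupy("key", r1)); // true: key was free
        System.out.println(ka.tryOccupy("key", r1)); // true: reentrant, no deadlock
        System.out.println(ka.tryOccupy("key", r2)); // false: r2 queued behind r1
    }
}
```

Note how `release` is exactly the hand-off point described above: the record leaves the key accounting only once the reference count says all its state invocations are done.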
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Jeyhun, Sorry for the delay. And thanks for the explanation, it sounds good to me! Jeyhun Karimov 于2024年3月16日周六 05:09写道: > > Hi Benchao, > > Thanks for your comments. > > 1. What the parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > > could we just utilize one side's pre-partitioned attribute, and let > > another side just do the shuffle? > > > There are two cases we need to consider: > > 1. Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance to equalize both sources' partitions and parallelism, IFF > both sources implement the "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). > Then, we find the intersection of these two partition sets (say > p_intersect) and push down these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor would be equal to MIN(p_s1, > p_s2). > That is, the minimum of the partition size of source1 and the partition size of > source2 will be selected as the parallelism. > Coming back to your example, if source1 parallelism is 127 and source2 > parallelism is 128, then we will first check the partition size of source1 > and source2.
> Say the partition size of source1 is 100 and the partition size of source2 is 90. > Then, we would set the parallelism for source1, source2, and all of their > downstream operators until (and including) the join operator > to 90 (min(100, 90)). > We also plan to implement a cost-based decision instead of the rule-based > one (the one explained above - the MIN rule). > One possible result of the cost-based estimation is to keep the partitions > on one side and perform the shuffling on the other source. > > > > 2. In our current shuffle remove design (FlinkExpandConversionRule), > > we don't consider parallelism, we just remove unnecessary shuffles > > according to the distribution columns. After this FLIP, the > > parallelism may be bundled with the source's partitions, then how will > > this optimization accommodate with FlinkExpandConversionRule, will you > > also change downstream operators' parallelisms if we want to also > > remove subsequent shuffles? > > > > - From my understanding of FlinkExpandConversionRule, its removal logic is > agnostic to operator parallelism. > So, if FlinkExpandConversionRule decides to remove a shuffle operation, > then this FLIP will search for another possible shuffle (the one closest to the > source) to remove. > If there is such an opportunity, this FLIP will remove the shuffle. So, > from my understanding, FlinkExpandConversionRule and this optimization rule > can work together safely. > Please correct me if I misunderstood your question. > > > > Regarding the new optimization rule, have you also considered allowing > > some non-strict mode like FlinkRelDistribution#requireStrict? For > > example, the source is pre-partitioned by columns a, b; if we are > > consuming this source and do an aggregate on a, b, c, can we utilize > > this optimization? > > > - Good point. Yes, there are some cases where non-strict mode will apply.
> For example: > > - pre-partitioned columns and aggregate columns are the same but have a > different order (e.g., the source is pre-partitioned w.r.t. a,b and the aggregate has > a GROUP BY b,a) > - the columns in the Exchange operator are a list-prefix of the pre-partitioned > columns of the source (e.g., the source is pre-partitioned w.r.t. a,b,c and > the Exchange's partition columns are a,b) > > Please let me know if the above answers your questions or if you have any > other comments. > > Regards, > Jeyhun > > On Thu, Mar 14, 2024 at 12:48 PM Benchao Li wrote: > > > Thanks Jeyhun for bringing up this discussion, it is really exciting, > > +1 for the general idea. > > > > We also introduced a similar concept in Flink Batch internally to cope > > with bucketed tables in Hive, it is a very important improvement. > > > > > One thing to note is that for join queries, the parallelism of each join > > > source might be different. This might result in > > > inconsistencies while using the pre-partitioned/pre-divided data (e.g., > > > different mappings of partitions to source operators). > > > Therefore, it is the job of the planner to detect this and
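The two-case parallelism decision Jeyhun describes (intersection size when static partitions and pushdown are available on both sides, otherwise MIN of the partition counts) could be sketched roughly as follows. The names and signatures here are hypothetical, not the FLIP's actual API:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the parallelism decision described in the thread.
public class PrePartitionedPlanner {

    static int decideParallelism(List<String> partitionsS1, List<String> partitionsS2,
                                 boolean staticPartitions, boolean bothSupportPushdown) {
        if (staticPartitions && bothSupportPushdown) {
            // Case 1: push down the partition intersection (p_intersect) to both
            // sources and align source + downstream parallelism to its size.
            Set<String> intersect = new HashSet<>(partitionsS1);
            intersect.retainAll(partitionsS2);
            return intersect.size();
        }
        // Case 2: fall back to MIN(p_s1, p_s2) of the two partition counts.
        return Math.min(partitionsS1.size(), partitionsS2.size());
    }

    public static void main(String[] args) {
        // Static partitions + pushdown on both sides: intersection {b, c} -> 2.
        System.out.println(decideParallelism(
                List.of("a", "b", "c"), List.of("b", "c", "d"), true, true)); // 2
        // Otherwise: min of partition counts, e.g. min(3, 2) = 2.
        System.out.println(decideParallelism(
                List.of("a", "b", "c"), List.of("b", "c"), false, false)); // 2
    }
}
```

In the real planner the chosen number would then also be applied to all downstream operators up to and including the first common ancestor (e.g., the join), as explained above.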
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1(binding) On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: > +1(binding) > > Best, > Leonard > > > 2024年3月21日 下午5:21,Martijn Visser 写道: > > > > +1 (binding) > > > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < > gongzhongqi...@apache.org> > > wrote: > > > >> +1 (non-binding) > >> > >> Bests, > >> Zhongqiang Gong > >> > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: > >> > >>> Hello devs, > >>> > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to > >>> externalize the Kudu > >>> connector from the recently retired Apache Bahir project [2] to keep it > >>> maintainable and > >>> make it up to date as well. Discussion thread [3]. > >>> > >>> The vote will be open for at least 72 hours (until 2024 March 23 14:03 > >>> UTC) unless there > >>> are any objections or insufficient votes. > >>> > >>> Thanks, > >>> Ferenc > >>> > >>> [1] > >>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > >>> [2] https://attic.apache.org/projects/bahir.html > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > >> > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the efforts. Best, Yuxin Samrat Deb 于2024年3月21日周四 20:28写道: > Congratulations ! > > Bests > Samrat > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy wrote: > > > Congratulations, great work and great news. > > Best Regards > > Ahmed Hamdy > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > > > Congratulations, and thanks for the great work! > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > Thanks for driving these efforts! > > > > > > > > Congratulations > > > > > > > > Best > > > > Yuan > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > Best Regards, > > > > > Yu > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > wrote: > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > > > sub-project of Apache Flink has completed. We invite you to > > explore > > > > > the new > > > > > > > resources available: > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > - Flink CDC Documentation: > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > completed > > > > > > > software copyright signing, code repo migration, code cleanup, > > > website > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > > > Qingsheng > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > contributors > > > for > > > > > their > > > > > > > contributions and help during this process! 
> > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > slightly > > > > > > > changed to align with the main Flink project. To report bugs or > > > > > suggest new > > > > > > > features, please open tickets > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that we > will > > > no > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > > feedback > > > > > and > > > > > > > contributions are invaluable as we continue to improve Flink > CDC. > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > > > > > Best, > > > > > > > Leonard > > > > > > > [1] > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Best > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > -- > > > > > > Best, > > > Benchao Li > > > > > >
[jira] [Created] (FLINK-34910) Can not plan window join without projections
Dawid Wysakowicz created FLINK-34910: Summary: Can not plan window join without projections Key: FLINK-34910 URL: https://issues.apache.org/jira/browse/FLINK-34910 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.20.0 When running: {code} @Test def testWindowJoinWithoutProjections(): Unit = { val sql = """ |SELECT * |FROM | TABLE(TUMBLE(TABLE MyTable, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE)) AS L |JOIN | TABLE(TUMBLE(TABLE MyTable2, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE)) AS R |ON L.window_start = R.window_start AND L.window_end = R.window_end AND L.a = R.a """.stripMargin util.verifyRelPlan(sql) } {code} It fails with: {code} FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME_MATERIALIZE(proctime) AS proctime, window_start, window_end, window_time, a0, b0, c0, rowtime0, PROCTIME_MATERIALIZE(proctime0) AS proctime0, window_start0, window_end0, window_time0]) +- FlinkLogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{}]) :- FlinkLogicalTableFunctionScan(invocation=[TUMBLE(DESCRIPTOR($3), 90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, VARCHAR(2147483647) b, BIGINT c, TIMESTAMP(3) *ROWTIME* rowtime, TIMESTAMP_LTZ(3) *PROCTIME* proctime, TIMESTAMP(3) window_start, TIMESTAMP(3) window_end, TIMESTAMP(3) *ROWTIME* window_time)]) : +- FlinkLogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($3, 1000:INTERVAL SECOND)]) : +- FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME() AS proctime]) :+- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, MyTable]], fields=[a, b, c, rowtime]) +- FlinkLogicalTableFunctionScan(invocation=[TUMBLE(DESCRIPTOR(CAST($3):TIMESTAMP(3)), 90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, VARCHAR(2147483647) b, BIGINT c, TIMESTAMP(3) *ROWTIME* rowtime, TIMESTAMP_LTZ(3) *PROCTIME* proctime, TIMESTAMP(3) window_start, TIMESTAMP(3) window_end, TIMESTAMP(3) *ROWTIME* window_time)]) +- 
FlinkLogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($3, 1000:INTERVAL SECOND)]) +- FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME() AS proctime]) +- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, MyTable2]], fields=[a, b, c, rowtime]) Failed to get time attribute index from DESCRIPTOR(CAST($3):TIMESTAMP(3)). This is a bug, please file a JIRA issue. Please check the documentation for the set of currently supported SQL features. {code} In prior versions this had another problem of ambiguous {{rowtime}} column, but this has been fixed by [FLINK-32648]. In versions < 1.19 WindowTableFunctions were incorrectly scoped, because they were not extending from Calcite's SqlWindowTableFunction and the scoping implemented in SqlValidatorImpl#convertFrom was incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34909) OceanBase transaction ID support
xiaotouming created FLINK-34909: --- Summary: OceanBase transaction ID support Key: FLINK-34909 URL: https://issues.apache.org/jira/browse/FLINK-34909 Project: Flink Issue Type: New Feature Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: xiaotouming Fix For: cdc-3.1.0 It should be possible to obtain OceanBase's transaction ID when parsing changes via the Flink DataStream approach. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations ! Bests Samrat On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy wrote: > Congratulations, great work and great news. > Best Regards > Ahmed Hamdy > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > Congratulations, and thanks for the great work! > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > Thanks for driving these efforts! > > > > > > Congratulations > > > > > > Best > > > Yuan > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > Congratulations and look forward to its further development! > > > > > > > > Best Regards, > > > > Yu > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > > > > > Congrattulations! > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > > sub-project of Apache Flink has completed. We invite you to > explore > > > > the new > > > > > > resources available: > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > - Flink CDC Documentation: > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > After Flink community accepted this donation[1], we have > completed > > > > > > software copyright signing, code repo migration, code cleanup, > > website > > > > > > migration, CI migration and github issues migration etc. > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > > Qingsheng > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors > > for > > > > their > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > slightly > > > > > > changed to align with the main Flink project. To report bugs or > > > > suggest new > > > > > > features, please open tickets > > > > > > Apache Jira (https://issues.apache.org/jira). 
Note that we will > > no > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > feedback > > > > and > > > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > > > Best, > > > > > > Leonard > > > > > > [1] > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Best > > > > > > > > > > ConradJam > > > > > > > > > > > > -- > > > > Best, > > Benchao Li > > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1(binding) Best, Leonard > 2024年3月21日 下午5:21,Martijn Visser 写道: > > +1 (binding) > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang > wrote: > >> +1 (non-binding) >> >> Bests, >> Zhongqiang Gong >> >> Ferenc Csaky 于2024年3月20日周三 22:11写道: >> >>> Hello devs, >>> >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to >>> externalize the Kudu >>> connector from the recently retired Apache Bahir project [2] to keep it >>> maintainable and >>> make it up to date as well. Discussion thread [3]. >>> >>> The vote will be open for at least 72 hours (until 2024 March 23 14:03 >>> UTC) unless there >>> are any objections or insufficient votes. >>> >>> Thanks, >>> Ferenc >>> >>> [1] >>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir >>> [2] https://attic.apache.org/projects/bahir.html >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations, great work and great news. Best Regards Ahmed Hamdy On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > Congratulations, and thanks for the great work! > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > Thanks for driving these efforts! > > > > Congratulations > > > > Best > > Yuan > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > Congratulations and look forward to its further development! > > > > > > Best Regards, > > > Yu > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > > > Congrattulations! > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > Hi devs and users, > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > sub-project of Apache Flink has completed. We invite you to explore > > > the new > > > > > resources available: > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > - Flink CDC Documentation: > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > After Flink community accepted this donation[1], we have completed > > > > > software copyright signing, code repo migration, code cleanup, > website > > > > > migration, CI migration and github issues migration etc. > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > Qingsheng > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors > for > > > their > > > > > contributions and help during this process! > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > slightly > > > > > changed to align with the main Flink project. To report bugs or > > > suggest new > > > > > features, please open tickets > > > > > Apache Jira (https://issues.apache.org/jira). Note that we will > no > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. 
Your > feedback > > > and > > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > Best, > > > > > Leonard > > > > > [1] > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > -- > > > > Best > > > > > > > > ConradJam > > > > > > > -- > > Best, > Benchao Li >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations, and thanks for the great work! Yuan Mei 于2024年3月21日周四 18:31写道: > > Thanks for driving these efforts! > > Congratulations > > Best > Yuan > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > Congratulations and look forward to its further development! > > > > Best Regards, > > Yu > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > Congrattulations! > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > Hi devs and users, > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > sub-project of Apache Flink has completed. We invite you to explore > > the new > > > > resources available: > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > - Flink CDC Documentation: > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > After Flink community accepted this donation[1], we have completed > > > > software copyright signing, code repo migration, code cleanup, website > > > > migration, CI migration and github issues migration etc. > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > Qingsheng > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for > > their > > > > contributions and help during this process! > > > > > > > > > > > > For all previous contributors: The contribution process has slightly > > > > changed to align with the main Flink project. To report bugs or > > suggest new > > > > features, please open tickets > > > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your feedback > > and > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! 
> > > > > > > > Best, > > > > Leonard > > > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > -- > > > Best > > > > > > ConradJam > > -- Best, Benchao Li
Re: [DISCUSS] FLIP-425: Asynchronous Execution Model
Thank you everybody for the questions and answers (especially Yanfei Lei), it was very instructive to go over the discussion. I am gonna add some questions on top of what happened and add some thoughts as well below. 1) About locking VS reference counting: I would like to clear out which mechanism prevents what: The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 state requests on the same key happen in order. Double-locking the same key does not result in deadlocks (thanks to the `previous == record` condition in your pseudo-code), so, the same callback chain can update/read multiple times the same piece of state. On the other side we have the reference counting mechanism that is used to understand whether a record has been fully processed, i.e., all state invocations have been carried out. Here is the question: am I correct if we say that key accounting is needed for out-of-order while reference counting is needed for checkpointing and watermarking? 2) Number of mails: To expand on what Jing Ge already asked, in the example code in the FLIP: ``` state.value().then( val -> { return state.update(val + 1).then( empty -> { out.collect(val + 1); }); }); ``` Do you end up having two mails?: • first wrapping `val -> {...}` • second wrapping `empty -> {...}` Did I get it correctly? 3) On the guarantees provided by the async execution framework: Always referring to your example, say that when `val -> {...}` gets called the state value is 0. The callback will update it to 1 and register another callback to collect, while still holding the lock on that piece of state and preventing somebody else from reading the value during the entire process (implementing atomicity for a transaction). Now, say the code changes to: ``` state.value().then( val -> { state.update(val + 1); out.collect(val + 1); }); ``` Would this change something on the consistency guarantees provided? I guess not, as the lock is held in any case until the value in the state has been updated.
This, instead (similar to your example): ``` int x = 0; state.value().then( val -> { x = val + 1; } ); state.update(x); out.collect(x); ``` Could it lead to any inconsistency (most probably the state would be updated to 0)? 4) On the local variables/attributes: The last example above exemplifies one of my concerns: what about the values enclosed in the callbacks? That seems a bit counter-intuitive and brittle from a user perspective. In the example we have a function-level variable, but what about fields: ``` int x = 0; void processElement(...) { state.value().then( val -> { x += val; } ); ... out.collect(x); } ``` What could happen here? Would every callback enclose the current value of the field? I don't know exactly where I am heading, but it seems quite complex/convoluted :) 5) On watermarks: It seems that, in order to achieve good throughput, out-of-order mode should be used. In the FLIP I could not understand well how many things could go wrong if that one is used. Could you please clarify that? Thank you for your availability and your great work! On Mar 19, 2024 at 10:51 +0100, Yanfei Lei , wrote: > Hi everyone, > > Thanks for your valuable discussion and feedback! > > Our discussions have been going on for a while and there have been no > new comments for several days. So I would like to start a vote after > 72 hours. > > Please let me know if you have any concerns, thanks! > > Yanfei Lei 于2024年3月13日周三 12:54写道: > > > > > Hi Jing, > > Thanks for the reply and follow-up. > > > > > > What is the benefit for users to build a chain of mails instead of just > > > > one mail (it is still async)? > > > > Just to make sure we're on the same page, I try to paraphrase your question: > > A `then()` call will be encapsulated as a callback mail. Your question > > is whether we can call `then()` as little as possible to reduce the > > overhead of encapsulating it into a mail. > > > > In general, whether to call `then()` depends on the user's data > > dependencies.
The operations in a chain of `then()` are strictly > > ordered. > > > > > > > > The following is an example without data dependencies, if written in > > the form of a `then` chain: > > stateA.update(1).then(stateB.update(2).then(stateC.update(3))); > > > > The execution order is: > > ``` > > stateA update 1 -> stateB update 2-> stateC update 3 > > ``` > > > > If written in the form without `then()` call, they will be placed in a > > "mail/mailboxDefaultAction", and each state request will still be > > executed asynchronously: > > ``` > > stateA.update(1); > > stateB.update(2); > > stateC.update(3); > > ``` > > > > The order in which they are executed is undefined and may be: > > ``` > > - stateA update 1 -> stateB update 2-> stateC update 3 > > - stateB update 2 -> stateC update 3-> stateA update 1 > > - stateC update 3 -> stateA update 1-> stateB update 2 > > ... > > ``` > > And the final results are "stateA =
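The hazard raised in questions 3) and 4) above (a callback that mutates an enclosed value but only runs once the asynchronous state read completes, after the surrounding code has already read that value) can be mimicked with plain `CompletableFuture`. This is a rough analogy only, not Flink's async state API; the `AtomicInteger` also stands in for the mutable "local" that Java's effectively-final rule forbids capturing directly:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncOrderingHazard {

    // Returns {value read before the async completion, value after it}.
    static int[] run() {
        AtomicInteger x = new AtomicInteger(0);
        // Simulates state.value() completing later, after processElement returned.
        CompletableFuture<Integer> stateValue = new CompletableFuture<>();
        stateValue.thenAccept(val -> x.set(val + 1)); // runs only on completion

        int before = x.get();   // the read happens before the callback fires
        stateValue.complete(5); // the callback now fires and sets x to 6
        return new int[] {before, x.get()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "0 6"
    }
}
```

This is exactly why the straight-line variant in question 3) would most probably update the state to 0: the code after the `then()` call runs before the callback has had a chance to execute.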
[jira] [Created] (FLINK-34908) mysql pipeline to doris and starrocks will lost precision for timestamp
Xin Gong created FLINK-34908: Summary: mysql pipeline to doris and starrocks will lose precision for timestamp Key: FLINK-34908 URL: https://issues.apache.org/jira/browse/FLINK-34908 Project: Flink Issue Type: Improvement Components: Flink CDC Reporter: Xin Gong Fix For: cdc-3.1.0 The Flink CDC pipeline decides the timestamp zone from the pipeline config. I found that mysql2doris and mysql2starrocks specify the datetime format yyyy-MM-dd HH:mm:ss, which causes a loss of datetime precision. I think we should not specify a datetime format and instead just return the LocalDateTime object. -- This message was sent by Atlassian Jira (v8.20.10#820010)
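The described precision loss is easy to reproduce with plain Java time APIs, assuming the sinks format with the seconds-only pattern yyyy-MM-dd HH:mm:ss as stated in the report:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampPrecision {
    static final DateTimeFormatter SECONDS_ONLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formatting with a seconds-only pattern silently drops fractional seconds.
    static String formatted(LocalDateTime ts) {
        return ts.format(SECONDS_ONLY);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2024, 3, 22, 10, 15, 30, 123_456_000);
        System.out.println(formatted(ts)); // 2024-03-22 10:15:30 (precision lost)
        System.out.println(ts);            // 2024-03-22T10:15:30.123456 (precision kept)
    }
}
```

Passing the `LocalDateTime` object through without formatting, as the report suggests, keeps the full sub-second precision.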
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 (non-binding) Best, Jinzhong On Thu, Mar 21, 2024 at 6:15 PM Zakelly Lan wrote: > +1 non-binding > > > Best, > Zakelly > > On Thu, Mar 21, 2024 at 5:34 PM Gyula Fóra wrote: > > > +1 (binding) > > > > Gyula > > > > On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > > > > > +1(binding) > > > > > > Thanks to Weijie for driving this proposal, which solves the problem > > that I > > > raised in FLIP-359. > > > > > > Best, > > > Rui > > > > > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu > > wrote: > > > > > > > +1 (binding) > > > > > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > > > > wrote: > > > > > > > > > +1 (binding) > > > > > > > > > > Best, > > > > > > > > > > Xintong > > > > > > > > > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo < > > guoweijieres...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > > > objection or insufficient votes. > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > > > Weijie > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > > > > > [2] > > https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Best, > > > > Hangxiang. > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Thanks for driving these efforts! Congratulations Best Yuan On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > Congratulations and look forward to its further development! > > Best Regards, > Yu > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > Congrattulations! > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > Hi devs and users, > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > sub-project of Apache Flink has completed. We invite you to explore > the new > > > resources available: > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > - Flink CDC Documentation: > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > After Flink community accepted this donation[1], we have completed > > > software copyright signing, code repo migration, code cleanup, website > > > migration, CI migration and github issues migration etc. > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > Qingsheng > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for > their > > > contributions and help during this process! > > > > > > > > > For all previous contributors: The contribution process has slightly > > > changed to align with the main Flink project. To report bugs or > suggest new > > > features, please open tickets > > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > > longer accept GitHub issues for these purposes. > > > > > > > > > Welcome to explore the new repository and documentation. Your feedback > and > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > Best, > > > Leonard > > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > -- > > Best > > > > ConradJam >
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 non-binding Best, Zakelly On Thu, Mar 21, 2024 at 5:34 PM Gyula Fóra wrote: > +1 (binding) > > Gyula > > On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > > > +1(binding) > > > > Thanks to Weijie for driving this proposal, which solves the problem > that I > > raised in FLIP-359. > > > > Best, > > Rui > > > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu > wrote: > > > > > +1 (binding) > > > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > > wrote: > > > > > > > +1 (binding) > > > > > > > > Best, > > > > > > > > Xintong > > > > > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo < > guoweijieres...@gmail.com> > > > > wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > > objection or insufficient votes. > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > Weijie > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > > > [2] > https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > > > > > > -- > > > Best, > > > Hangxiang. > > > > > >
[jira] [Created] (FLINK-34907) jobRunningTs should be the timestamp that all tasks are running
Rui Fan created FLINK-34907: --- Summary: jobRunningTs should be the timestamp that all tasks are running Key: FLINK-34907 URL: https://issues.apache.org/jira/browse/FLINK-34907 Project: Flink Issue Type: Improvement Components: Autoscaler Reporter: Rui Fan Assignee: Rui Fan Fix For: kubernetes-operator-1.9.0 Currently, we consider the timestamp at which the JobStatus changes to RUNNING as jobRunningTs. But the JobStatus becomes RUNNING as soon as the job starts scheduling, so it doesn't mean that all tasks are running. This makes the isStabilizing check and the restart-time estimation inaccurate. Solution: jobRunningTs should be the timestamp at which all tasks are running. It can be obtained from the SubtasksTimesHeaders REST API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
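The proposed computation is easy to sketch: take the per-subtask RUNNING-transition timestamps (as reported by the SubtasksTimesHeaders REST API) and use their maximum. A minimal illustration; the class and method names below are assumptions for this sketch, not the operator's actual code:

```java
import java.util.List;

public class JobRunningTs {
    /**
     * Returns the timestamp at which *all* tasks are running, i.e. the
     * maximum of the per-subtask RUNNING-transition timestamps.
     * Returns -1 if any subtask has not reached RUNNING yet (timestamp <= 0).
     */
    static long jobRunningTs(List<Long> subtaskRunningTimestamps) {
        long max = -1;
        for (long ts : subtaskRunningTimestamps) {
            if (ts <= 0) {
                return -1; // some task is not running yet
            }
            max = Math.max(max, ts);
        }
        return max;
    }

    public static void main(String[] args) {
        // Job switched to RUNNING early, but the last task only started at t=250.
        System.out.println(jobRunningTs(List.of(120L, 250L, 180L))); // 250
        System.out.println(jobRunningTs(List.of(120L, 0L)));         // -1: not all running
    }
}
```

Using the maximum rather than the JobStatus transition time is exactly what makes stabilization and restart-time estimates line up with when the job is actually processing data.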
Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations
+1 (non-binding) Best, Zhongqiang Gong Alex Nitavsky wrote on Thu, Mar 7, 2024 at 23:09: > Hi everyone, > > I'd like to start a vote on FLIP-402 [1]. It introduces new configuration > options for Apache Flink's ZooKeeper integration for high availability by > reflecting existing Apache Curator configuration options. It has been > discussed in this thread [2]. > > I would like to start a vote. The vote will be open for at least 72 hours > (until March 10th 18:00 GMT) unless there is an objection or > insufficient votes. > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations > [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z > > Thanks > Alex >
[jira] [Created] (FLINK-34906) Don't start autoscaling when some tasks are not running
Rui Fan created FLINK-34906: --- Summary: Don't start autoscaling when some tasks are not running Key: FLINK-34906 URL: https://issues.apache.org/jira/browse/FLINK-34906 Project: Flink Issue Type: Improvement Components: Autoscaler Reporter: Rui Fan Assignee: Rui Fan Fix For: 1.9.0 Attachments: image-2024-03-21-17-40-23-523.png Currently, the autoscaler scales a job when the JobStatus is RUNNING. But the JobStatus becomes RUNNING as soon as the job starts scheduling, so it doesn't mean that all tasks are running, especially when resources are insufficient or the job recovers from a large state. The autoscaler throws an exception and generates an AutoscalerError event when tasks are not ready, such as: !image-2024-03-21-17-40-23-523.png! Solution: only scale jobs whose tasks are all running (some tasks may be finished). -- This message was sent by Atlassian Jira (v8.20.10#820010)
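The proposed guard can be sketched as a small pure function over task states. The enum and method names below are illustrative assumptions for this sketch, not the autoscaler's actual API:

```java
import java.util.List;

public class ScalingGuard {
    // Simplified task-state model for illustration only.
    enum TaskState { CREATED, SCHEDULED, DEPLOYING, RUNNING, FINISHED, FAILED }

    /**
     * A job is ready for autoscaling only if every task is RUNNING or has
     * already FINISHED; a job that is still deploying or failing is not.
     */
    static boolean readyForScaling(List<TaskState> tasks) {
        return tasks.stream()
                .allMatch(s -> s == TaskState.RUNNING || s == TaskState.FINISHED);
    }

    public static void main(String[] args) {
        System.out.println(readyForScaling(
                List.of(TaskState.RUNNING, TaskState.FINISHED)));  // true
        System.out.println(readyForScaling(
                List.of(TaskState.RUNNING, TaskState.DEPLOYING))); // false
    }
}
```

Allowing FINISHED alongside RUNNING matches the ticket's note that some tasks (e.g. bounded sources) may legitimately complete before the job stabilizes.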
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 (binding) Gyula On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > +1(binding) > > Thanks to Weijie for driving this proposal, which solves the problem that I > raised in FLIP-359. > > Best, > Rui > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu wrote: > > > +1 (binding) > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > wrote: > > > > > +1 (binding) > > > > > > Best, > > > > > > Xintong > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo > > > wrote: > > > > > > > Hi everyone, > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > objection or insufficient votes. > > > > > > > > > > > > Best regards, > > > > > > > > Weijie > > > > > > > > > > > > [1] > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > [2] https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > -- > > Best, > > Hangxiang. > > >
Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations
+1 (binding) On Wed, Mar 20, 2024 at 1:19 PM Ferenc Csaky wrote: > +1 (non-binding), thanks for driving this! > > Best, > Ferenc > > > On Wednesday, March 20th, 2024 at 10:57, Yang Wang < > wangyang0...@apache.org> wrote: > > > > > > > +1 (binding) since ZK HA is still widely used. > > > > > > Best, > > Yang > > > > On Thu, Mar 14, 2024 at 6:27 PM Matthias Pohl > > matthias.p...@aiven.io.invalid wrote: > > > > > Nothing to add from my side. Thanks, Alex. > > > > > > +1 (binding) > > > > > > On Thu, Mar 7, 2024 at 4:09 PM Alex Nitavsky alexnitav...@gmail.com > > > wrote: > > > > > > > Hi everyone, > > > > > > > > I'd like to start a vote on FLIP-402 [1]. It introduces new > configuration > > > > options for Apache Flink's ZooKeeper integration for high > availability by > > > > reflecting existing Apache Curator configuration options. It has been > > > > discussed in this thread [2]. > > > > > > > > I would like to start a vote. The vote will be open for at least 72 > > > > hours > > > > (until March 10th 18:00 GMT) unless there is an objection or > > > > insufficient votes. > > > > > > > > [1] > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations > > > > > > > [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z > > > > > > > > Thanks > > > > Alex >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (binding) On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang wrote: > +1 (non-binding) > > Bests, > Zhongqiang Gong > > Ferenc Csaky wrote on Wed, Mar 20, 2024 at 22:11: > > > Hello devs, > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is about to > > externalize the Kudu > > connector from the recently retired Apache Bahir project [2] to keep it > > maintainable and > > make it up to date as well. Discussion thread [3]. > > > > The vote will be open for at least 72 hours (until 2024 March 23 14:03 > > UTC) unless there > > are any objections or insufficient votes. > > > > Thanks, > > Ferenc > > > > [1] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > [2] https://attic.apache.org/projects/bahir.html > > [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Hello Jeyhun, I really like the proposal; it definitely makes sense to me. I have a couple of nits here and there: For the interface `SupportsPartitioning`, why return `Optional`? If one decides to implement it, partitions must exist (at most, return an empty list). Returning `Optional` just seems to complicate the logic of the code using that interface. I foresee the calling code doing something like: "if the source supports partitioning, get the partitions, but if they don't exist, raise a runtime exception". Let's simply make that safe at compile time and guarantee to the calling code that partitions exist. Another thing is that you show Hive-like partitioning in your FS structure; do you think it makes sense to add a note about auto-discovery of partitions? In other terms, it looks a bit counterintuitive that the user implementing the source has to specify statically which partitions exist (and they can change at runtime), while the source itself knows the data provider and could directly implement a method `discoverPartitions`. Then Flink would take care of invoking that method when needed. On Mar 15, 2024 at 22:09 +0100, Jeyhun Karimov wrote: > Hi Benchao, > > Thanks for your comments. > > 1. What parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > > could we just utilize one side's pre-partitioned attribute, and let > > another side just do the shuffle? > > > There are two cases we need to consider: > > 1. Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance to equalize both sources' partitions and parallelism, IFF > both sources implement the "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). 
> Then, we find the intersection of these two partition sets (say > p_intersect) and pushdown these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor would be equal to the MIN(p_s1, > p_s2). > That is, minimum of the partition size of source1 and partition size of > source2 will be selected as the parallelism. > Coming back to your example, if source1 parallelism is 127 and source2 > parallelism is 128, then we will first check the partition size of source1 > and source2. > Say partition size of source1 is 100 and partition size of source2 is 90. > Then, we would set the parallelism for source1, source2, and all of their > downstream operators until (and including) the join operator > to 90 (min(100, 90)). > We also plan to implement a cost based decision instead of the rule-based > one (the ones explained above - MIN rule). > One possible result of the cost based estimation is to keep the partitions > on one side and perform the shuffling on another source. > > > > 2. In our current shuffle remove design (FlinkExpandConversionRule), > > we don't consider parallelism, we just remove unnecessary shuffles > > according to the distribution columns. After this FLIP, the > > parallelism may be bundled with source's partitions, then how will > > this optimization accommodate with FlinkExpandConversionRule, will you > > also change downstream operator's parallelisms if we want to also > > remove subsequent shuffles? 
> > > > - From my understanding of FlinkExpandConversionRule, its removal logic is > agnostic to operator parallelism. > So, if FlinkExpandConversionRule decides to remove a shuffle operation, > then this FLIP will search another possible shuffle (the one closest to the > source) to remove. > If there is such an opportunity, this FLIP will remove the shuffle. So, > from my understanding FlinkExpandConversionRule and this optimization rule > can work together safely. > Please correct me if I misunderstood your question. > > > > Regarding the new optimization rule, have you also considered to allow > > some non-strict mode like FlinkRelDistribution#requireStrict? For > > example, source is pre-partitioned by a, b columns, if we are > > consuming this source, and do a aggregate on a, b, c, can we utilize > > this optimization? > > > - Good point. Yes, there are some cases that non-strict mode will apply. > For example: > > - pre-partitioned columns and aggregate columns are the same
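The two cases described in the thread reduce to simple set arithmetic: with static partitions and pushdown on both sides, the target parallelism is the size of the partition intersection (p_intersect); in all other cases, the MIN rule over the two partition counts applies. A sketch under those assumptions; the class and method names are illustrative, not FLIP-434's actual interfaces:

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionedJoinParallelism {
    /**
     * Case 1: static partitions + SupportsPartitionPushdown on both sides.
     * Both sources are restricted to the common partitions, so the target
     * parallelism is the size of the intersection p_intersect.
     */
    static int staticCaseParallelism(Set<String> partitionsSource1,
                                     Set<String> partitionsSource2) {
        Set<String> intersection = new HashSet<>(partitionsSource1);
        intersection.retainAll(partitionsSource2);
        return intersection.size();
    }

    /** Case 2 (all other cases): MIN rule over the partition counts. */
    static int minRuleParallelism(int partitionCount1, int partitionCount2) {
        return Math.min(partitionCount1, partitionCount2);
    }

    public static void main(String[] args) {
        // Both sides pushed down: only the common partitions p1, p2 are read.
        System.out.println(staticCaseParallelism(
                Set.of("p0", "p1", "p2"), Set.of("p1", "p2", "p3"))); // 2
        // Otherwise: 100 vs. 90 partitions -> both sides run with parallelism 90.
        System.out.println(minRuleParallelism(100, 90)); // 90
    }
}
```

The thread also notes a planned cost-based variant, which could instead keep one side's partitioning and shuffle the other; the rules above are only the rule-based baseline.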
[jira] [Created] (FLINK-34905) The default length of CHAR/BINARY data type of Add column DDL
Qishang Zhong created FLINK-34905: - Summary: The default length of CHAR/BINARY data type of Add column DDL Key: FLINK-34905 URL: https://issues.apache.org/jira/browse/FLINK-34905 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: Qishang Zhong I ran the following DDL in MySQL: {code:java} ALTER TABLE test.products ADD Column1 BINARY NULL; ALTER TABLE test.products ADD Column2 CHAR NULL; {code} and encountered the following error: {code:java} Caused by: java.lang.IllegalArgumentException: Binary string length must be between 1 and 2147483647 (both inclusive). at org.apache.flink.cdc.common.types.BinaryType.<init>(BinaryType.java:53) at org.apache.flink.cdc.common.types.BinaryType.<init>(BinaryType.java:61) at org.apache.flink.cdc.common.types.DataTypes.BINARY(DataTypes.java:42) at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.convertFromColumn(MySqlTypeUtils.java:221) at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.fromDbzColumn(MySqlTypeUtils.java:111) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.toCdcColumn(CustomAlterTableParserListener.java:256) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.lambda$exitAlterByAddColumn$0(CustomAlterTableParserListener.java:126) at io.debezium.connector.mysql.antlr.MySqlAntlrDdlParser.runIfNotNull(MySqlAntlrDdlParser.java:358) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.exitAlterByAddColumn(CustomAlterTableParserListener.java:98) at io.debezium.ddl.parser.mysql.generated.MySqlParser$AlterByAddColumnContext.exitRule(MySqlParser.java:15459) at io.debezium.antlr.ProxyParseTreeListenerUtil.delegateExitRule(ProxyParseTreeListenerUtil.java:64) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomMySqlAntlrDdlParserListener.exitEveryRule(CustomMySqlAntlrDdlParserListener.java:124) at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:48) at 
org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:30) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at io.debezium.antlr.AntlrDdlParser.parse(AntlrDdlParser.java:87) at org.apache.flink.cdc.connectors.mysql.source.MySqlEventDeserializer.deserializeSchemaChangeRecord(MySqlEventDeserializer.java:88) at org.apache.flink.cdc.debezium.event.SourceRecordEventDeserializer.deserialize(SourceRecordEventDeserializer.java:52) at org.apache.flink.cdc.debezium.event.DebeziumEventDeserializationSchema.deserialize(DebeziumEventDeserializationSchema.java:93) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitElement(MySqlRecordEmitter.java:119) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.processElement(MySqlRecordEmitter.java:96) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.processElement(MySqlPipelineRecordEmitter.java:120) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitRecord(MySqlRecordEmitter.java:73) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitRecord(MySqlRecordEmitter.java:46) at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:160) at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:419) at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562) at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746) at
[jira] [Created] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.
ZhengYu Chen created FLINK-34904: Summary: [Feature] submit Flink CDC pipeline job to yarn application cluster. Key: FLINK-34904 URL: https://issues.apache.org/jira/browse/FLINK-34904 Project: Flink Issue Type: Improvement Components: Flink CDC Affects Versions: 3.1.0 Reporter: ZhengYu Chen Fix For: 3.1.0 Support the Flink CDC CLI submitting pipeline jobs to a YARN application cluster. Discussed in FLINK-34853. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations and look forward to its further development! Best Regards, Yu On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > Congratulations! > > Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > > > Hi devs and users, > > > > We are thrilled to announce that the donation of Flink CDC as a > > sub-project of Apache Flink has completed. We invite you to explore the new > > resources available: > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > - Flink CDC Documentation: > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > After Flink community accepted this donation[1], we have completed > > software copyright signing, code repo migration, code cleanup, website > > migration, CI migration and github issues migration etc. > > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > > contributions and help during this process! > > > > > > For all previous contributors: The contribution process has slightly > > changed to align with the main Flink project. To report bugs or suggest new > > features, please open tickets > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > longer accept GitHub issues for these purposes. > > > > > > Welcome to explore the new repository and documentation. Your feedback and > > contributions are invaluable as we continue to improve Flink CDC. > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > Best, > > Leonard > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > -- > Best > > ConradJam
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Bests, Zhongqiang Gong Ferenc Csaky wrote on Wed, Mar 20, 2024 at 22:11: > Hello devs, > > I would like to start a vote about FLIP-439 [1]. The FLIP is about to > externalize the Kudu > connector from the recently retired Apache Bahir project [2] to keep it > maintainable and > make it up to date as well. Discussion thread [3]. > > The vote will be open for at least 72 hours (until 2024 March 23 14:03 > UTC) unless there > are any objections or insufficient votes. > > Thanks, > Ferenc > > [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > [2] https://attic.apache.org/projects/bahir.html > [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - GitHub Repository: https://github.com/apache/flink-cdc > - Flink CDC Documentation: > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > After Flink community accepted this donation[1], we have completed > software copyright signing, code repo migration, code cleanup, website > migration, CI migration and github issues migration etc. > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > contributions and help during this process! > > > For all previous contributors: The contribution process has slightly > changed to align with the main Flink project. To report bugs or suggest new > features, please open tickets > Apache Jira (https://issues.apache.org/jira). Note that we will no > longer accept GitHub issues for these purposes. > > > Welcome to explore the new repository and documentation. Your feedback and > contributions are invaluable as we continue to improve Flink CDC. > > Thanks everyone for your support and happy exploring Flink CDC! > > Best, > Leonard > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > -- Best ConradJam
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
+1 (binding) As per Gyula's suggestion above, verified with "ghcr.io/apache/flink-kubernetes-operator:91d67d9". - Verified Helm repo works as expected, points to correct image tag, build, version - Verified basic examples + checked operator logs, everything looks as expected - Verified hashes, signatures and source release contains no binaries - Ran built-in tests, built jars + docker image from source successfully - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 Best, Marton On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > Hi, > > +1 (non-binding) > > - Verified checksums > - Verified signatures > - Verified no binaries in source distribution > - Verified Apache License and NOTICE files > - Executed tests > - Built container image > - Verified chart version and appVersion matches > - Verified Helm chart can be installed with default values > - Verified that RC repo works as Helm repo > > Best Regards, > Mate > > Alexander Fedulov wrote (on Tue, Mar 19, 2024, 23:10): > > Hi Max, > > +1 > > - Verified SHA checksums > > - Verified GPG signatures > > - Verified that the source distributions do not contain binaries > > - Verified built-in tests (mvn clean verify) > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > - Verified that Helm and operator files contain Apache licenses (rg -L > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0" .). 
> > I am not sure we need to > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > - Verified that chart and appVersion matches the target release (91d67d9) > > - Verified that Helm chart can be installed from the local Helm folder > > without overriding any parameters > > - Verified that Helm chart can be installed from the RC repo without > > overriding any parameters ( > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > ) > > - Verified docker container build > > > > Best, > > Alex > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels wrote: > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > >A minor correction is that [3] in the email should point to: > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > and > > > > everything is correct. It's a typo in the vote email. > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > between those two commits, except for updating the version. So the > > > votes are not impacted, especially because votes are casted against > > > the source release which, as you pointed out, contains the correct > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > wrote: > > > > > > > > Hi Max! > > > > > > > > +1 (binding) > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > - Helm points to correct image > > > > - Deployed operator, stateful example and executed upgrade + > savepoint > > > > redeploy > > > > - Verified logs > > > > - Flink web PR looks good +1 > > > > > > > > A minor correction is that [3] in the email should point to: > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . 
But the helm > chart > > > and > > > > everything is correct. It's a typo in the vote email. > > > > > > > > Thank you for preparing the release! > > > > > > > > Cheers, > > > > Gyula > > > > > > > > On Mon, Mar 18, 2024 at 8:26 AM Rui Fan <1996fan...@gmail.com> > wrote: > > > > > > > > > Thanks Max for driving this release! > > > > > > > > > > +1(non-binding) > > > > > > > > > > - Downloaded artifacts from dist ( svn co > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > ) > > > > > - Verified SHA512 checksums : ( for i in *.tgz; do echo $i; > sha512sum > > > > > --check $i.sha512; done ) > > > > > - Verified GPG signatures : ( $ for i in *.tgz; do echo $i; gpg > > > --verify > > > > > $i.asc $i ) > > > > > - Build the source with java-11 and java-17 ( mvn -T 20 clean > install > > > > > -DskipTests ) > > > > > - Verified the license header during build the source > > > > > - Verified that chart and appVersion matches the target release > (less > > > the > > > > > index.yaml and Chart.yaml ) > > > > > - RC repo works as Helm repo( helm repo add > > > flink-operator-repo-1.8.0-rc1 > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > ) > > > > > - Verified Helm chart can be installed ( helm install > > > > > flink-kubernetes-operator > > > > > flink-operator-repo-1.8.0-rc1/flink-kubernetes-operator --set > > > > > webhook.create=false ) > > > > > - Submitted the autoscaling demo, the autoscaler works well with > > > *memory > > >
[jira] [Created] (FLINK-34903) Add mysql-pipeline-connector with table.exclude.list option to exclude unnecessary tables
shiyuyang created FLINK-34903: - Summary: Add mysql-pipeline-connector with table.exclude.list option to exclude unnecessary tables Key: FLINK-34903 URL: https://issues.apache.org/jira/browse/FLINK-34903 Project: Flink Issue Type: Improvement Components: Flink CDC Reporter: shiyuyang Fix For: cdc-3.1.0 When using the MySQL pipeline connector for whole-database synchronization, users currently cannot exclude unnecessary tables. Referring to Debezium's parameters: if *table.include.list* is declared, the *table.exclude.list* parameter does not take effect, and the tables specified in the MySQL pipeline connector's tables parameter are effectively added to Debezium's *table.include.list*. In summary, the MySQL pipeline connector should expose a *table.exclude.list* parameter so that users can exclude tables, because the current setup offers no way to exclude unnecessary tables once others are included through the tables parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Bests, Samrat On Thu, Mar 21, 2024 at 12:55 PM Hang Ruan wrote: > +1 (non-binding) > > Best, > Hang > > Őrhidi Mátyás wrote on Thu, Mar 21, 2024 at 00:00: > > > +1 (binding) > > > > On Wed, Mar 20, 2024 at 8:37 AM Gabor Somogyi > > > wrote: > > > > > +1 (binding) > > > > > > G > > > > > > > > > On Wed, Mar 20, 2024 at 3:59 PM Gyula Fóra > wrote: > > > > > > > +1 (binding) > > > > > > > > Thanks! > > > > Gyula > > > > > > > > On Wed, Mar 20, 2024 at 3:36 PM Mate Czagany > > wrote: > > > > > > > > > +1 (non-binding) > > > > > > > > > > Thank you, > > > > > Mate > > > > > > > > > > Ferenc Csaky wrote (on Wed, Mar 20, 2024, 15:11): > > > > > > > > > > > Hello devs, > > > > > > > > > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is > about > > to > > > > > > externalize the Kudu > > > > > > connector from the recently retired Apache Bahir project [2] to > > keep > > > it > > > > > > maintainable and > > > > > > make it up to date as well. Discussion thread [3]. > > > > > > > > > > > > The vote will be open for at least 72 hours (until 2024 March 23 > > > 14:03 > > > > > > UTC) unless there > > > > > > are any objections or insufficient votes. > > > > > > > > > > > > Thanks, > > > > > > Ferenc > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > > > > > [2] https://attic.apache.org/projects/bahir.html > > > > > > [3] > > https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the great work! Best, Zhongqiang Gong Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - GitHub Repository: https://github.com/apache/flink-cdc > - Flink CDC Documentation: > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > After Flink community accepted this donation[1], we have completed > software copyright signing, code repo migration, code cleanup, website > migration, CI migration and github issues migration etc. > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > contributions and help during this process! > > > For all previous contributors: The contribution process has slightly > changed to align with the main Flink project. To report bugs or suggest new > features, please open tickets > Apache Jira (https://issues.apache.org/jira). Note that we will no > longer accept GitHub issues for these purposes. > > > Welcome to explore the new repository and documentation. Your feedback and > contributions are invaluable as we continue to improve Flink CDC. > > Thanks everyone for your support and happy exploring Flink CDC! > > Best, > Leonard > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Best, Hang Őrhidi Mátyás wrote on Thu, Mar 21, 2024 at 00:00: > +1 (binding) > > On Wed, Mar 20, 2024 at 8:37 AM Gabor Somogyi > wrote: > > > +1 (binding) > > > > G > > > > > > On Wed, Mar 20, 2024 at 3:59 PM Gyula Fóra wrote: > > > > > +1 (binding) > > > > > > Thanks! > > > Gyula > > > > > > On Wed, Mar 20, 2024 at 3:36 PM Mate Czagany > wrote: > > > > > > > +1 (non-binding) > > > > > > > > Thank you, > > > > Mate > > > > > > > > Ferenc Csaky wrote (on Wed, Mar 20, 2024, 15:11): > > > > > > > > > Hello devs, > > > > > > > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is about > to > > > > > externalize the Kudu > > > > > connector from the recently retired Apache Bahir project [2] to > keep > > it > > > > > maintainable and > > > > > make it up to date as well. Discussion thread [3]. > > > > > > > > > > The vote will be open for at least 72 hours (until 2024 March 23 > > 14:03 > > > > > UTC) unless there > > > > > are any objections or insufficient votes. > > > > > > > > > > Thanks, > > > > > Ferenc > > > > > > > > > > [1] > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > > > > [2] https://attic.apache.org/projects/bahir.html > > > > > [3] > https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > > > > >
[jira] [Created] (FLINK-34902) INSERT INTO column mismatch leads to IndexOutOfBoundsException
Timo Walther created FLINK-34902: Summary: INSERT INTO column mismatch leads to IndexOutOfBoundsException Key: FLINK-34902 URL: https://issues.apache.org/jira/browse/FLINK-34902 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Timo Walther SQL: {code} INSERT INTO t (a, b) SELECT 1; {code} Stack trace: {code} org.apache.flink.table.api.ValidationException: SQL validation failed. Index 1 out of bounds for length 1 at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:200) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:117) at Caused by: java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1 at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64) at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70) at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248) at java.base/java.util.Objects.checkIndex(Objects.java:374) at java.base/java.util.ArrayList.get(ArrayList.java:459) at org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$reorder$1(PreValidateReWriter.scala:355) at org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$reorder$1$adapted(PreValidateReWriter.scala:355) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)