Re: [DISCUSS] Planning Flink 1.20
Thanks Leonard! > I'd like to help you if you need some help like permissions from PMC side, please feel free to ping me. Nice to know. It'll help a lot! Best regards, Weijie Leonard Xu 于2024年3月22日周五 12:09写道: > +1 for the proposed release managers (Weijie Guo, Rui Fan), both the two > candidates are pretty active committers thus I believe they know the > community development process well. The recent releases have four release > managers, and I am also looking forward to having other volunteers > join the management of Flink 1.20. > > +1 for targeting date (feature freeze: June 15, 2024), referring to the > release cycle of recent versions, release cycle of 4 months makes sense to > me. > > > I'd like to help you if you need some help like permissions from PMC side, > please feel free to ping me. > > Best, > Leonard > > > > 2024年3月19日 下午5:35,Rui Fan <1996fan...@gmail.com> 写道: > > > > Hi Weijie, > > > > Thanks for kicking off 1.20! I'd like to join you and participate in the > > 1.20 release. > > > > Best, > > Rui > > > > On Tue, Mar 19, 2024 at 5:30 PM weijie guo > > wrote: > > > >> Hi everyone, > >> > >> With the release announcement of Flink 1.19, it's a good time to kick > off > >> discussion of the next release 1.20. > >> > >> > >> - Release managers > >> > >> > >> I'd like to volunteer as one of the release managers this time. It has > been > >> good practice to have a team of release managers from different > >> backgrounds, so please raise you hand if you'd like to volunteer and get > >> involved. > >> > >> > >> > >> - Timeline > >> > >> > >> Flink 1.19 has been released. With a target release cycle of 4 months, > >> we propose a feature freeze date of *June 15, 2024*. > >> > >> > >> > >> - Collecting features > >> > >> > >> As usual, we've created a wiki page[1] for collecting new features in > 1.20. > >> > >> > >> In addition, we already have a number of FLIPs that have been voted or > are > >> in the process, including pre-works for version 2.0. 
> >> > >> > >> In the meantime, the release management team will be finalized in the > next > >> few days, and we'll continue to create Jira Boards and Sync meetings > >> to make it easy > >> for everyone to get an overview and track progress. > >> > >> > >> > >> Best regards, > >> > >> Weijie > >> > >> > >> > >> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release > >> > >
Re: [DISCUSS] Planning Flink 1.20
+1 for the proposed release managers (Weijie Guo, Rui Fan), both the two candidates are pretty active committers thus I believe they know the community development process well. The recent releases have four release managers, and I am also looking forward to having other volunteers join the management of Flink 1.20. +1 for targeting date (feature freeze: June 15, 2024), referring to the release cycle of recent versions, release cycle of 4 months makes sense to me. I'd like to help you if you need some help like permissions from PMC side, please feel free to ping me. Best, Leonard > 2024年3月19日 下午5:35,Rui Fan <1996fan...@gmail.com> 写道: > > Hi Weijie, > > Thanks for kicking off 1.20! I'd like to join you and participate in the > 1.20 release. > > Best, > Rui > > On Tue, Mar 19, 2024 at 5:30 PM weijie guo > wrote: > >> Hi everyone, >> >> With the release announcement of Flink 1.19, it's a good time to kick off >> discussion of the next release 1.20. >> >> >> - Release managers >> >> >> I'd like to volunteer as one of the release managers this time. It has been >> good practice to have a team of release managers from different >> backgrounds, so please raise you hand if you'd like to volunteer and get >> involved. >> >> >> >> - Timeline >> >> >> Flink 1.19 has been released. With a target release cycle of 4 months, >> we propose a feature freeze date of *June 15, 2024*. >> >> >> >> - Collecting features >> >> >> As usual, we've created a wiki page[1] for collecting new features in 1.20. >> >> >> In addition, we already have a number of FLIPs that have been voted or are >> in the process, including pre-works for version 2.0. >> >> >> In the meantime, the release management team will be finalized in the next >> few days, and we'll continue to create Jira Boards and Sync meetings >> to make it easy >> for everyone to get an overview and track progress. >> >> >> >> Best regards, >> >> Weijie >> >> >> >> [1] https://cwiki.apache.org/confluence/display/FLINK/1.20+Release >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the efforts. On Fri, Mar 22, 2024 at 10:00 AM Yanfei Lei wrote: > Congratulations! > > Best regards, > Yanfei > > Xuannan Su 于2024年3月22日周五 09:21写道: > > > > Congratulations! > > > > Best regards, > > Xuannan > > > > On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang > wrote: > > > > > > Congratulations! > > > > > > Best wishes, > > > Charles Zhang > > > from Apache InLong > > > > > > > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > > > > > Great news! Congratulations! > > > > > > > > Regards, > > > > Jeyhun > > > > > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan > wrote: > > > > > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > > > > > > > Best, > > > > > Yuxin > > > > > > > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > > > > > Congratulations ! > > > > > > > > > > > > Bests > > > > > > Samrat > > > > > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy < > hamdy10...@gmail.com> > > > > > wrote: > > > > > > > > > > > > > Congratulations, great work and great news. > > > > > > > Best Regards > > > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > > > > wrote: > > > > > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > > > > > Best > > > > > > > > > Yuan > > > > > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li > wrote: > > > > > > > > > > > > > > > > > > > Congratulations and look forward to its further > development! > > > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > > Yu > > > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam < > jam.gz...@gmail.com> > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > Congrattulations! 
> > > > > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of > Flink CDC > > > > > as a > > > > > > > > > > > > sub-project of Apache Flink has completed. We invite > you to > > > > > > > explore > > > > > > > > > > the new > > > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > > > > > - GitHub Repository: > https://github.com/apache/flink-cdc > > > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we > have > > > > > > > completed > > > > > > > > > > > > software copyright signing, code repo migration, code > > > > > cleanup, > > > > > > > > website > > > > > > > > > > > > migration, CI migration and github issues migration > etc. > > > > > > > > > > > > Here I am particularly grateful to Hang Ruan, > Zhongqaing > > > > > Gong, > > > > > > > > > > Qingsheng > > > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > > > contributors > > > > > > > > for > > > > > > > > > > their > > > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution > process has > > > > > > > > slightly > > > > > > > > > > > > changed to align with the main Flink project. To > report > > > > bugs > > > > > or > > > > > > > > > > suggest new > > > > > > > > > > > > features, please open tickets > > > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note > that > > > > we > > > > > > will > > > > > > > > no > > > > > > > > > > > > longer accept GitHub issues for these purposes. 
> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and > documentation. > > > > Your > > > > > > > > feedback > > > > > > > > > > and > > > > > > > > > > > > contributions are invaluable as we continue to > improve > > > > Flink > > > > > > CDC. > > > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring > Flink > > > > > CDC! > > > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > > Leonard > > > > > > > > > > > > [1] > > > > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > Best > > > > > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > > > Best, > > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- Best, Hangxiang.
Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
Hi Jeyhun, Thanks for your thoughtful feedback! > Why don't we consider an option where checkpoint directory just contains > metadata. So, we do not need to copy the data all the time from working > directory to the checkpointing directory. > Basically, when checkpointing, 1) we mark files in working directories as > "read-only", 2) optionally change the working directory, and 3) update the > metadata in the checkpoint directory (that points to some files in the > working directory). The method you described is essentially consistent with the "Mid/long term follow up work"[1] outlined in this FLIP. Under this approach, the ForStDB (TM side) needs to manage the lifecycle of both state files and checkpoint files, so that the checkpoint files can reuse state db files. For instance, files that are still referenced by the checkpoint but no longer used by ForStDB should be guaranteed not to be deleted by ForStDB. However, this approach conflicts with the current mechanism where the JobManager manages the lifecycle of checkpoint files, and we need to refine it step by step. As outlined in FLIP-423[2], we will introduce this "zero-copy" faster checkpointing & restoring at milestone-2. [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046865#FLIP428:FaultTolerance/RescaleIntegrationforDisaggregatedState-Mid/longtermfollowupwork [2] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046855#FLIP423:DisaggregatedStateStorageandManagement(UmbrellaFLIP)-RoadMap+LaunchingPlan > This is more of a technical question about RocksDB. Do we have a > guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as > well? I think we don't need to consider WAL; in fact, we can even disable WAL. The reasons are: (1) we will set flush_memtable=true when DB.GetLiveFiles() is invoked, ensuring that MemTable data could also be persisted. 
(2) DB.GetLiveFiles() is called during the snapshot synchronization phase, during which there are no concurrent state read/write operations. > As far as I understand, DB.GetLiveFiles() acquires the global mutex > lock. I am wondering if RocksDB's optimistic transactions can be of any help > in this situation? As mentioned above, there is no concurrent read/write during GetLiveFiles(), and since GetLiveFiles() is a pure in-memory operation, I'm not particularly worried about the mutex lock impacting performance. Best, Jinzhong On Fri, Mar 22, 2024 at 6:36 AM Jeyhun Karimov wrote: > Hi Jinzhong, > > Thanks for the FLIP. +1 for it. > > I have a few questions: > > - Why don't we consider an option where checkpoint directory just contains > metadata. So, we do not need to copy the data all the time from working > directory to the checkpointing directory. > Basically, when checkpointing, 1) we mark files in working directories as > "read-only", 2) optionally change the working directory, and 3) update the > metadata in the checkpoint directory (that points to some files in the > working directory). > > - This is more of a technical question about RocksDB. Do we have a > guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as > well? > > - As far as I understand, DB.GetLiveFiles() acquires the global mutex > lock. I am wondering if RocksDB's optimistic transactions can be of any help > in this situation? > > Regards, > Jeyhun > > On Wed, Mar 20, 2024 at 1:35 PM Jinzhong Li > wrote: > > > Hi Yue, > > > > Thanks for your feedback! > > > > > 1. If we choose Option-3 for ForSt , how would we handle Manifest File > > > ? Should we take a snapshot of the Manifest during the synchronization > > phase? 
> > > > IIUC, the GetLiveFiles() API in Option-3 can also capture the fileInfo of > > Manifest files, and this API also returns the manifest file size, which > > means this API could take a snapshot of the Manifest FileInfo (filename + > > fileSize) during the synchronization phase. > > You could refer to the rocksdb source code[1] to verify this. > > > > > > > However, many distributed storage systems do not support the > > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > > directly read and write remote files. Can we not copy or Fast duplicate > > > these files, but instead directly reuse and reference these remote > > > files? I think this can reduce file download time and may be more > useful > > > for most users who use HDFS (do not support Fast Duplicate)? > > > > Firstly, as far as I know, most remote file systems support the > > FastDuplicate, e.g. S3 copyObject/Azure Blob Storage copyBlob/OSS > > copyObject, and the HDFS indeed does not support FastDuplicate. > > > > Actually, we have considered the design which reuses remote files. And > that > > is what we want to implement in the coming future, where both checkpoints > > and restores can reuse existing files residing on the remote state > storage. > > However, this design conflicts with the current file management system in > > Flink. At present, remote state files are managed by the ForStDB (TaskManager side), while checkpoint files are managed by the JobManager, which is a major hindrance to file reuse.
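The trade-off discussed in this thread — a cheap server-side "fast duplicate" on stores like S3/OSS/Azure Blob versus a full byte copy on file systems like HDFS that lack it — can be sketched as follows. This is an illustrative sketch only, not Flink/ForSt code; the function name and the local-disk stand-in for a remote store are invented for the example.

```python
# Sketch: during a checkpoint, each live state file is duplicated into the
# checkpoint directory. "fast-duplicate" stands in for a server-side copy
# (e.g. S3 copyObject / OSS copyObject / Azure Blob copyBlob), where no bytes
# move through the client; the fallback streams the bytes, which is why
# directly reusing remote files is attractive on HDFS.
import shutil
from pathlib import Path

def checkpoint_file(src: Path, dst: Path, supports_fast_duplicate: bool) -> str:
    """Duplicate src into the checkpoint dir; return the strategy used."""
    dst.parent.mkdir(parents=True, exist_ok=True)
    if supports_fast_duplicate:
        # Stand-in for a server-side copy on an object store.
        shutil.copyfile(src, dst)
        return "fast-duplicate"
    # Byte-level fallback for stores without server-side copy (e.g. HDFS).
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return "byte-copy"
```

Either way the checkpoint directory ends up with its own copy, which is what keeps JobManager-owned checkpoint files independent of TaskManager-owned state files until the milestone-2 file-reuse work lands.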
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Regards, Yuepeng Pan 在 2024-03-22 04:11:32,"Jeyhun Karimov" 写道: >+1 (non-binding) > >Regards, >Jeyhun > >On Thu, Mar 21, 2024 at 2:04 PM Márton Balassi >wrote: > >> +1(binding) >> >> On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: >> >> > +1(binding) >> > >> > Best, >> > Leonard >> > >> > > 2024年3月21日 下午5:21,Martijn Visser 写道: >> > > >> > > +1 (binding) >> > > >> > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < >> > gongzhongqi...@apache.org> >> > > wrote: >> > > >> > >> +1 (non-binding) >> > >> >> > >> Bests, >> > >> Zhongqiang Gong >> > >> >> > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: >> > >> >> > >>> Hello devs, >> > >>> >> > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to >> > >>> externalize the Kudu >> > >>> connector from the recently retired Apache Bahir project [2] to keep >> it >> > >>> maintainable and >> > >>> make it up to date as well. Discussion thread [3]. >> > >>> >> > >>> The vote will be open for at least 72 hours (until 2024 March 23 >> 14:03 >> > >>> UTC) unless there >> > >>> are any objections or insufficient votes. >> > >>> >> > >>> Thanks, >> > >>> Ferenc >> > >>> >> > >>> [1] >> > >>> >> > >> >> > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir >> > >>> [2] https://attic.apache.org/projects/bahir.html >> > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >> > >> >> > >> > >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best regards, Yanfei Xuannan Su 于2024年3月22日周五 09:21写道: > > Congratulations! > > Best regards, > Xuannan > > On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang wrote: > > > > Congratulations! > > > > Best wishes, > > Charles Zhang > > from Apache InLong > > > > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > > > Great news! Congratulations! > > > > > > Regards, > > > Jeyhun > > > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > > > > Best, > > > > Yuxin > > > > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > > > Congratulations ! > > > > > > > > > > Bests > > > > > Samrat > > > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > > > wrote: > > > > > > > > > > > Congratulations, great work and great news. > > > > > > Best Regards > > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > > wrote: > > > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > > > Best > > > > > > > > Yuan > > > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > > Yu > > > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > > > wrote: > > > > > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > > > as a > > > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you > > > > > > > > > > > to > > > > > > explore > > > > > > > > > the new > > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > > > completed > > > > > > > > > > > software copyright signing, code repo migration, code > > > > cleanup, > > > > > > > website > > > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > > > Gong, > > > > > > > > > Qingsheng > > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > > contributors > > > > > > > for > > > > > > > > > their > > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process > > > > > > > > > > > has > > > > > > > slightly > > > > > > > > > > > changed to align with the main Flink project. To report > > > bugs > > > > or > > > > > > > > > suggest new > > > > > > > > > > > features, please open tickets > > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > > > we > > > > > will > > > > > > > no > > > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > > > Your > > > > > > > feedback > > > > > > > > > and > > > > > > > > > > > contributions are invaluable as we continue to improve > > > Flink > > > > > CDC. > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > > > CDC! 
> > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Leonard > > > > > > > > > > > [1] > > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Best > > > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > > > Best, > > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best regards, Xuannan On Fri, Mar 22, 2024 at 9:17 AM Charles Zhang wrote: > > Congratulations! > > Best wishes, > Charles Zhang > from Apache InLong > > > Jeyhun Karimov 于2024年3月22日周五 04:16写道: > > > Great news! Congratulations! > > > > Regards, > > Jeyhun > > > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > > > Congratulations! Thanks for the efforts. > > > > > > > > > Best, > > > Yuxin > > > > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > > > Congratulations ! > > > > > > > > Bests > > > > Samrat > > > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > > wrote: > > > > > > > > > Congratulations, great work and great news. > > > > > Best Regards > > > > > Ahmed Hamdy > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > > wrote: > > > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > > > Congratulations > > > > > > > > > > > > > > Best > > > > > > > Yuan > > > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > > > Best Regards, > > > > > > > > Yu > > > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > > wrote: > > > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > > as a > > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you to > > > > > explore > > > > > > > > the new > > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > > completed > > > > > > > > > > software copyright signing, code repo migration, code > > > cleanup, > > > > > > website > > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > > Gong, > > > > > > > > Qingsheng > > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > > contributors > > > > > > for > > > > > > > > their > > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > > > slightly > > > > > > > > > > changed to align with the main Flink project. To report > > bugs > > > or > > > > > > > > suggest new > > > > > > > > > > features, please open tickets > > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > > we > > > > will > > > > > > no > > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > > Your > > > > > > feedback > > > > > > > > and > > > > > > > > > > contributions are invaluable as we continue to improve > > Flink > > > > CDC. > > > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > > CDC! 
> > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > Leonard > > > > > > > > > > [1] > > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > Best > > > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > > > Best, > > > > > > Benchao Li > > > > > > > > > > > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Best wishes, Charles Zhang from Apache InLong Jeyhun Karimov 于2024年3月22日周五 04:16写道: > Great news! Congratulations! > > Regards, > Jeyhun > > On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > > > Congratulations! Thanks for the efforts. > > > > > > Best, > > Yuxin > > > > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > > > Congratulations ! > > > > > > Bests > > > Samrat > > > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > > wrote: > > > > > > > Congratulations, great work and great news. > > > > Best Regards > > > > Ahmed Hamdy > > > > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li > wrote: > > > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > > > Congratulations > > > > > > > > > > > > Best > > > > > > Yuan > > > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > > > Best Regards, > > > > > > > Yu > > > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > > wrote: > > > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > > as a > > > > > > > > > sub-project of Apache Flink has completed. 
We invite you to > > > > explore > > > > > > > the new > > > > > > > > > resources available: > > > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > > - Flink CDC Documentation: > > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > > completed > > > > > > > > > software copyright signing, code repo migration, code > > cleanup, > > > > > website > > > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > > Gong, > > > > > > > Qingsheng > > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > > contributors > > > > > for > > > > > > > their > > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > > slightly > > > > > > > > > changed to align with the main Flink project. To report > bugs > > or > > > > > > > suggest new > > > > > > > > > features, please open tickets > > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that > we > > > will > > > > > no > > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. > Your > > > > > feedback > > > > > > > and > > > > > > > > > contributions are invaluable as we continue to improve > Flink > > > CDC. > > > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > > CDC! 
> > > > > > > > > > > > > > > > > > Best, > > > > > > > > > Leonard > > > > > > > > > [1] > > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best > > > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > > > Best, > > > > > Benchao Li > > > > > > > > > > > > > > >
Re: [DISCUSS] FLIP-428: Fault Tolerance/Rescale Integration for Disaggregated State
Hi Jinzhong, Thanks for the FLIP. +1 for it. I have a few questions: - Why don't we consider an option where checkpoint directory just contains metadata. So, we do not need to copy the data all the time from working directory to the checkpointing directory. Basically, when checkpointing, 1) we mark files in working directories as "read-only", 2) optionally change the working directory, and 3) update the metadata in the checkpoint directory (that points to some files in the working directory). - This is more of a technical question about RocksDB. Do we have a guarantee that when calling DB.GetLiveFiles(), the WAL will be emptied as well? - As far as I understand, DB.GetLiveFiles() acquires the global mutex lock. I am wondering if RocksDB's optimistic transactions can be of any help in this situation? Regards, Jeyhun On Wed, Mar 20, 2024 at 1:35 PM Jinzhong Li wrote: > Hi Yue, > > Thanks for your feedback! > > > 1. If we choose Option-3 for ForSt , how would we handle Manifest File > > ? Should we take a snapshot of the Manifest during the synchronization > phase? > > IIUC, the GetLiveFiles() API in Option-3 can also capture the fileInfo of > Manifest files, and this API also returns the manifest file size, which > means this API could take a snapshot of the Manifest FileInfo (filename + > fileSize) during the synchronization phase. > You could refer to the rocksdb source code[1] to verify this. > > > > However, many distributed storage systems do not support the > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > directly read and write remote files. Can we not copy or Fast duplicate > > these files, but instead directly reuse and reference these remote > > files? I think this can reduce file download time and may be more useful > > for most users who use HDFS (do not support Fast Duplicate)? > > Firstly, as far as I know, most remote file systems support the > FastDuplicate, e.g. 
S3 copyObject/Azure Blob Storage copyBlob/OSS > copyObject, and the HDFS indeed does not support FastDuplicate. > > Actually,we have considered the design which reuses remote files. And that > is what we want to implement in the coming future, where both checkpoints > and restores can reuse existing files residing on the remote state storage. > However, this design conflicts with the current file management system in > Flink. At present, remote state files are managed by the ForStDB > (TaskManager side), while checkpoint files are managed by the JobManager, > which is a major hindrance to file reuse. For example, issues could arise > if a TM reuses a checkpoint file that is subsequently deleted by the JM. > Therefore, as mentioned in FLIP-423[2], our roadmap is to first integrate > checkpoint/restore mechanisms with existing framework at milestone-1. > Then, at milestone-2, we plan to introduce TM State Ownership and Faster > Checkpointing mechanisms, which will allow both checkpointing and restoring > to directly reuse remote files, thus achieving faster checkpointing and > restoring. > > [1] > > https://github.com/facebook/rocksdb/blob/6ddfa5f06140c8d0726b561e16dc6894138bcfa0/db/db_filesnapshot.cc#L77 > [2] > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=293046855#FLIP423:DisaggregatedStateStorageandManagement(UmbrellaFLIP)-RoadMap+LaunchingPlan > > Best, > Jinzhong > > > > > > > > On Wed, Mar 20, 2024 at 4:01 PM yue ma wrote: > > > Hi Jinzhong > > > > Thank you for initiating this FLIP. > > > > I have just some minor question: > > > > 1. If we choice Option-3 for ForSt , how would we handle Manifest File > > ? Should we take snapshot of the Manifest during the synchronization > phase? > > Otherwise, may the Manifest and MetaInfo information be inconsistent > during > > recovery? > > 2. For the Restore Operation , we need Fast Duplicate Checkpoint Files > to > > Working Dir . 
However, many distributed storage systems do not support > the > > ability of Fast Duplicate (such as HDFS). But ForSt has the ability to > > directly read and write remote files. Can we not copy or Fast duplicate > > these files, but instand of directly reuse and. reference these remote > > files? I think this can reduce file download time and may be more useful > > for most users who use HDFS (do not support Fast Duplicate)? > > > > -- > > Best, > > Yue > > >
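A side note on the GetLiveFiles() point discussed above: the API reports the MANIFEST's size at snapshot time precisely because that file keeps growing afterwards, so an uploader should copy only the reported prefix. A minimal, hypothetical sketch of such a prefix copy (not ForSt/Flink code; class and method names are made up for illustration):

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/**
 * Sketch: copy only the first {@code snapshotSize} bytes of a live file
 * (e.g. the MANIFEST length reported by DB.GetLiveFiles()), since the
 * file may keep growing after the synchronization phase.
 */
public class PrefixCopy {
    public static void copyPrefix(Path src, Path dst, long snapshotSize) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst,
                     StandardOpenOption.CREATE, StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long copied = 0;
            while (copied < snapshotSize) {
                long n = in.transferTo(copied, snapshotSize - copied, out);
                if (n <= 0) {
                    break; // source turned out shorter than the reported size
                }
                copied += n;
            }
        }
    }
}
```

The same idea applies whether the destination is a local checkpoint directory or a remote one; a fast-duplicate-capable store (S3 copyObject, etc.) would replace the byte copy entirely.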
Re: [DISCUSS] FLIP-XXX Apicurio-avro format
Hi David, Thanks for the FLIP. +1 for it. I have a minor comment. Can you please elaborate more on mechanisms in place to ensure data consistency and integrity, particularly in the event of schema conflicts? Since each message includes a schema ID for inbound and outbound messages, can you elaborate more on message consistency in the context of schema evolution? Regards, Jeyhun On Wed, Mar 20, 2024 at 4:34 PM David Radley wrote: > Thank you very much for your feedback Mark. I have made the changes in the > latest google document. On reflection I agree with you that the > globalIdPlacement format configuration should apply to the deserialization > as well, so it is declarative. I am also going to have a new configuration > option to work with content IDs as well as global IDs. In line with the > deser Apicurio IdHandler and headerHandlers. > > kind regards, David. > > > On 2024/03/20 15:18:37 Mark Nuttall wrote: > > +1 to this > > > > A few small comments: > > > > Currently, if users have Avro schemas in an Apicurio Registry (an open > source Apache 2 licensed schema registry), then the natural way to work > with those Avro flows is to use the schemas in the Apicurio Repository. > > 'those Avro flows' ... this is the first reference to flows. > > > > The new format will use the global Id to look up the Avro schema that > the message was written during deserialization. > > I get the point, phrasing is awkward. Probably you're more interested in > content than word polish at this point though. > > > > The Avro Schema Registry (apicurio-avro) format > > The Confluent format is called avro-confluent; this should be > avro-apicurio > > > > How to create tables with Apicurio-avro format > > s/Apicurio-avro/avro-apicurio/g > > > > HEADER – globalId is put in the header > > LEGACY– global Id is put in the message as a long > > CONFLUENT - globalId is put in the message as an int. > > Please could we specify 'four-byte int' and 'eight-byte long' ? 
> > > > For a Kafka source the globalId will be looked for in this order: > > - In the header > > - After a magic byte as an int > > - After a magic byte as a long. > > but apicurio-avro.globalid-placement has a default value of HEADER : why > do we have a search order as well? Isn't apicurio-avro.globalid-placement > enough? Don't the two mechanisms conflict? > > > > In addition to the types listed there, Flink supports reading/writing > nullable types. Flink maps nullable types to Avro union(something, null), > where something is the Avro type converted from Flink type. > > Is that definitely the right way round? I know we've had multiple > conversations about how unions work with Flink > > > > This is because the writer schema is expanded, but this could not > complete if there are circularities. > > I understand your meaning but the sentence is awkward. > > > > The registered schema will be created or if it exists be updated. > > same again > > > > At some stage the lowest Flink level supported by the Kafka connector > will contain the additionalProperties methods in code flink. > > wording > > > > There existing Kafka deserialization for the writer schema passes down > the message body to be deserialised. > > wording > > > > @Override > > public void deserialize(ConsumerRecord message, > Collector out) > > throws IOException { > > Map additionalPropertiesMap = new HashMap<>(); > > for (Header header : message.additionalProperties()) { > > headersMap.put(header.key(), header.value()); > > } > > deserializationSchema.deserialize(message.value(), headersMap, > out); > > } > > This fails to compile at headersMap. > > > > The input stream and additionalProperties will be sent so the Apicurio > SchemaCoder which will try getting the globalId from the headers, then 4 > bytes from the payload then 8 bytes from the payload. > > I'm still stuck on apicurio-avro.globalid-placement having a default > value of HEADER . 
Should we try all three, or fail if this config param has > a wrong value? > > > > Other considerations > > The implementation does not use the Apicurio deser libraries, > > Please can we refer to them as SerDes; this is the term used within the > documentation that you link to > > > > > > On 2024/03/20 10:09:08 David Radley wrote: > > > Hi, > > > As per the FLIP process I would like to raise a FLIP, but do not have > authority, so have created a google doc for the Flip to introduce a new > Apicurio Avro format. The document is > https://docs.google.com/document/d/14LWZPVFQ7F9mryJPdKXb4l32n7B0iWYkcOdEd1xTC7w/edit?usp=sharing > > > > > > I have prototyped a lot of the content to prove that this approach is > feasible. I look forward to the discussion, > > > Kind regards, David. > > > > > > > > > > > > Unless otherwise stated above: > > > > > > IBM United Kingdom Limited > > > Registered in England and Wales with number 741598 > > > Registered office: PO Box 41, North Harbour, Portsmouth, Hants. PO6
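To make the globalId placement discussion above concrete, here is a hypothetical sketch in which the placement is declarative (driven by configuration) rather than a probing search order, addressing Mark's concern that the HEADER / CONFLUENT / LEGACY fallbacks can conflict. The header name "apicurio.value.globalId" and the 0x0 magic byte are illustrative assumptions, not the confirmed wire format:

```java
import java.nio.ByteBuffer;
import java.util.Map;

/** Sketch: resolve the schema globalId according to a configured placement. */
public class GlobalIdResolver {
    /** HEADER: Kafka header; CONFLUENT: magic byte + 4-byte int; LEGACY: magic byte + 8-byte long. */
    public enum Placement { HEADER, CONFLUENT, LEGACY }

    static final byte MAGIC = 0x0; // assumed magic-byte value

    public static long resolve(Placement placement, Map<String, byte[]> headers, byte[] payload) {
        switch (placement) {
            case HEADER:
                // 8-byte big-endian long carried in a Kafka record header (assumed header name)
                return ByteBuffer.wrap(headers.get("apicurio.value.globalId")).getLong();
            case CONFLUENT:
                checkMagic(payload);
                return ByteBuffer.wrap(payload).getInt(1); // four-byte int after the magic byte
            case LEGACY:
                checkMagic(payload);
                return ByteBuffer.wrap(payload).getLong(1); // eight-byte long after the magic byte
            default:
                throw new IllegalArgumentException("unknown placement: " + placement);
        }
    }

    private static void checkMagic(byte[] payload) {
        if (payload.length < 1 || payload[0] != MAGIC) {
            throw new IllegalArgumentException("payload does not start with the magic byte");
        }
    }
}
```

Note that a pure payload-based search cannot reliably distinguish the int and long encodings (both are followed by opaque Avro bytes), which is one argument for making the placement explicit.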
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Great news! Congratulations! Regards, Jeyhun On Thu, Mar 21, 2024 at 2:00 PM Yuxin Tan wrote: > Congratulations! Thanks for the efforts. > > > Best, > Yuxin > > > Samrat Deb 于2024年3月21日周四 20:28写道: > > > Congratulations ! > > > > Bests > > Samrat > > > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy > wrote: > > > > > Congratulations, great work and great news. > > > Best Regards > > > Ahmed Hamdy > > > > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > > > > > Congratulations, and thanks for the great work! > > > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > > > Thanks for driving these efforts! > > > > > > > > > > Congratulations > > > > > > > > > > Best > > > > > Yuan > > > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > > > Best Regards, > > > > > > Yu > > > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > > wrote: > > > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC > as a > > > > > > > > sub-project of Apache Flink has completed. We invite you to > > > explore > > > > > > the new > > > > > > > > resources available: > > > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > > - Flink CDC Documentation: > > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > > completed > > > > > > > > software copyright signing, code repo migration, code > cleanup, > > > > website > > > > > > > > migration, CI migration and github issues migration etc. 
> > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing > Gong, > > > > > > Qingsheng > > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > > contributors > > > > for > > > > > > their > > > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > > slightly > > > > > > > > changed to align with the main Flink project. To report bugs > or > > > > > > suggest new > > > > > > > > features, please open tickets > > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that we > > will > > > > no > > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > > > feedback > > > > > > and > > > > > > > > contributions are invaluable as we continue to improve Flink > > CDC. > > > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink > CDC! > > > > > > > > > > > > > > > > Best, > > > > > > > > Leonard > > > > > > > > [1] > > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > Best > > > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > Best, > > > > Benchao Li > > > > > > > > > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Regards, Jeyhun On Thu, Mar 21, 2024 at 2:04 PM Márton Balassi wrote: > +1(binding) > > On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: > > > +1(binding) > > > > Best, > > Leonard > > > > > 2024年3月21日 下午5:21,Martijn Visser 写道: > > > > > > +1 (binding) > > > > > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < > > gongzhongqi...@apache.org> > > > wrote: > > > > > >> +1 (non-binding) > > >> > > >> Bests, > > >> Zhongqiang Gong > > >> > > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: > > >> > > >>> Hello devs, > > >>> > > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to > > >>> externalize the Kudu > > >>> connector from the recently retired Apache Bahir project [2] to keep > it > > >>> maintainable and > > >>> make it up to date as well. Discussion thread [3]. > > >>> > > >>> The vote will be open for at least 72 hours (until 2024 March 23 > 14:03 > > >>> UTC) unless there > > >>> are any objections or insufficient votes. > > >>> > > >>> Thanks, > > >>> Ferenc > > >>> > > >>> [1] > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > >>> [2] https://attic.apache.org/projects/bahir.html > > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > >> > > > > >
Re: Flink Kubernetes Operator Failing Over FlinkDeployments to a New Cluster
No worries, thanks for the reply Gyula. Ah yes, I see how those points you raised make the feature tricky to implement. Could this be considered for a FLIP (or two) in the future? On Wed, Mar 20, 2024 at 2:21 PM Gyula Fóra wrote: > Sorry for the late reply Kevin. > > I think what you are suggesting makes sense, it would be basically a > `last-state` startup mode. This would also help in cases where the current > last-state mechanism fails to locate HA metadata (and the state). > > This is somewhat of a tricky feature to implement: > 1. The operator will need FS plugins and access to the different user envs > (this will not work in many prod environments unfortunately) > 2. Flink doesn't expose a good way to detect the latest checkpoint just by > looking at the FS so we need to figure out something here. Probably some > changes are necessary on Flink core side as well > > Gyula >
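For context on Gyula's second point: completed Flink checkpoints are laid out as chk-<id> directories containing a _metadata file, so a `last-state` startup mode would need roughly the following scan. This is only a sketch of the idea under that layout assumption; robustly handling in-progress or partially deleted checkpoints is exactly why changes on the Flink core side were suggested:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

/**
 * Sketch: pick the completed checkpoint with the highest id by scanning
 * a job's checkpoint directory for chk-<id> subdirectories that contain
 * a _metadata file.
 */
public class LatestCheckpointFinder {
    public static Optional<Path> findLatest(Path checkpointDir) throws IOException {
        Path best = null;
        long bestId = -1;
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(checkpointDir, "chk-*")) {
            for (Path dir : dirs) {
                if (!Files.exists(dir.resolve("_metadata"))) {
                    continue; // incomplete or already-discarded checkpoint
                }
                long id = Long.parseLong(dir.getFileName().toString().substring(4));
                if (id > bestId) {
                    bestId = id;
                    best = dir;
                }
            }
        }
        return Optional.ofNullable(best);
    }
}
```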
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Hi Lorenzo, Thanks a lot for your comments. Please find my answers below: For the interface `SupportsPartitioning`, why returning `Optional`? > If one decides to implement that, partitions must exist (at maximum, > return and empty list). Returning `Optional` seem just to complicate the > logic of the code using that interface. - The reasoning behind the use of Optional is that sometimes (e.g., in HiveTableSource) the partitioning info is in the catalog. Therefore, we return Optional.empty(), so that the list of partitions is queried from the catalog. I foresee the using code doing something like: "if the source supports > partitioning, get the partitions, but if they don't exist, raise a runtime > exception". Let's simply make that safe at compile time and guarantee the > code that partitions exist. - Yes, if partitions can be found neither from the catalog nor from the interface implementation, we raise an exception at query compile time. Another thing is that you show Hive-like partitioning in your FS > structure, do you think it makes sense to add a note about auto-discovery > of partitions? - Yes, the FLIP contains just an example partitioning for the filesystem connector. Each connector already "knows" about autodiscovery of its partitions, and we rely on this fact. For example, partition discovery is different between Kafka and filesystem sources. So, we do not handle the manual discovery of partitions. Please correct me if I misunderstood your question. In other terms, it looks a bit counterintuitive that the user implementing > the source has to specify which partitions exist statically (and they can > change at runtime), while the source itself knows the data provider and can > directly implement a method `discoverPartitions`. Then Flink would take > care of invoking that method when needed. We utilize the table option SOURCE_MONITOR_INTERVAL to check whether partitions are static or not. 
So, a user still should give Flink a hint about partitions being static or not. With static partitions Flink can do more optimizations. Please let me know if my replies answer your questions or if you have more comments. Regards, Jeyhun On Thu, Mar 21, 2024 at 10:03 AM wrote: > Hello Jeyhun, > I really like the proposal and definitely makes sense to me. > > I have a couple of nits here and there: > > For the interface `SupportsPartitioning`, why returning `Optional`? > If one decides to implement that, partitions must exist (at maximum, > return and empty list). Returning `Optional` seem just to complicate the > logic of the code using that interface. > > I foresee the using code doing something like: "if the source supports > partitioning, get the partitions, but if they don't exist, raise a runtime > exception". Let's simply make that safe at compile time and guarantee the > code that partitions exist. > > Another thing is that you show Hive-like partitioning in your FS > structure, do you think it makes sense to add a note about auto-discovery > of partitions? > > In other terms, it looks a bit counterintuitive that the user implementing > the source has to specify which partitions exist statically (and they can > change at runtime), while the source itself knows the data provider and can > directly implement a method `discoverPartitions`. Then Flink would take > care of invoking that method when needed. > On Mar 15, 2024 at 22:09 +0100, Jeyhun Karimov , > wrote: > > Hi Benchao, > > Thanks for your comments. > > 1. What the parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > could we just utilize one side's pre-partitioned attribute, and let > another side just do the shuffle? > > > > There are two cases we need to consider: > > 1. 
Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance equalize both sources' partitions and parallelism, IFF > both sources implement "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). > Then, we find the intersection of these two partition sets (say > p_intersect) and pushdown these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor
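The static-partition case described above — intersect both sources' partition sets, push the intersection down, and align parallelism with its size — can be sketched as follows. Plain strings stand in for Flink's partition representation, and the actual pushdown calls appear only as comments:

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Sketch of case 1: both sources support partition pushdown and the
 * partition set is static, so read only the common partitions and set
 * source/downstream parallelism to their count.
 */
public class PartitionAlignment {
    public static Set<String> commonPartitions(List<String> source1, List<String> source2) {
        Set<String> common = new LinkedHashSet<>(source1);
        common.retainAll(source2);
        // In the real plan this set would be pushed down to both sides:
        //   SupportsPartitionPushDown.applyPartitions(common)  // read only these partitions
        //   SupportsPartitioning.applyPartitionedRead(common)  // partitioned read over them
        return common;
    }

    /** Parallelism applied to both sources and all operators up to their first common ancestor. */
    public static int alignedParallelism(Set<String> commonPartitions) {
        return commonPartitions.size();
    }
}
```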
[jira] [Created] (FLINK-34911) ChangelogRecoveryRescaleITCase failed fatally with 127 exit code
Ryan Skraba created FLINK-34911: --- Summary: ChangelogRecoveryRescaleITCase failed fatally with 127 exit code Key: FLINK-34911 URL: https://issues.apache.org/jira/browse/FLINK-34911 Project: Flink Issue Type: Bug Affects Versions: 1.20.0 Reporter: Ryan Skraba [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=58455=logs=a657ddbf-d986-5381-9649-342d9c92e7fb=dc085d4a-05c8-580e-06ab-21f5624dab16=9029] {code:java} Mar 21 01:50:42 01:50:42.553 [ERROR] Failed to execute goal org.apache.maven.plugins:maven-surefire-plugin:3.2.2:test (integration-tests) on project flink-tests: Mar 21 01:50:42 01:50:42.553 [ERROR] Mar 21 01:50:42 01:50:42.553 [ERROR] Please refer to /__w/1/s/flink-tests/target/surefire-reports for the individual test results. Mar 21 01:50:42 01:50:42.553 [ERROR] Please refer to dump files (if any exist) [date].dump, [date]-jvmRun[N].dump and [date].dumpstream. Mar 21 01:50:42 01:50:42.553 [ERROR] ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? 
Mar 21 01:50:42 01:50:42.553 [ERROR] Command was /bin/sh -c cd '/__w/1/s/flink-tests' && '/usr/lib/jvm/jdk-21.0.1+12/bin/java' '-XX:+UseG1GC' '-Xms256m' '-XX:+IgnoreUnrecognizedVMOptions' '--add-opens=java.base/java.util=ALL-UNNAMED' '--add-opens=java.base/java.io=ALL-UNNAMED' '-Xmx1536m' '-jar' '/__w/1/s/flink-tests/target/surefire/surefirebooter-20240321010847189_810.jar' '/__w/1/s/flink-tests/target/surefire' '2024-03-21T01-08-44_720-jvmRun3' 'surefire-20240321010847189_808tmp' 'surefire_207-20240321010847189_809tmp' Mar 21 01:50:42 01:50:42.553 [ERROR] Error occurred in starting fork, check output in log Mar 21 01:50:42 01:50:42.553 [ERROR] Process Exit Code: 127 Mar 21 01:50:42 01:50:42.553 [ERROR] Crashed tests: Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.maven.surefire.booter.SurefireBooterForkException: ExecutionException The forked VM terminated without properly saying goodbye. VM crash or System.exit called? 
Mar 21 01:50:42 01:50:42.553 [ERROR] Command was /bin/sh -c cd '/__w/1/s/flink-tests' && '/usr/lib/jvm/jdk-21.0.1+12/bin/java' '-XX:+UseG1GC' '-Xms256m' '-XX:+IgnoreUnrecognizedVMOptions' '--add-opens=java.base/java.util=ALL-UNNAMED' '--add-opens=java.base/java.io=ALL-UNNAMED' '-Xmx1536m' '-jar' '/__w/1/s/flink-tests/target/surefire/surefirebooter-20240321010847189_810.jar' '/__w/1/s/flink-tests/target/surefire' '2024-03-21T01-08-44_720-jvmRun3' 'surefire-20240321010847189_808tmp' 'surefire_207-20240321010847189_809tmp' Mar 21 01:50:42 01:50:42.553 [ERROR] Error occurred in starting fork, check output in log Mar 21 01:50:42 01:50:42.553 [ERROR] Process Exit Code: 127 Mar 21 01:50:42 01:50:42.553 [ERROR] Crashed tests: Mar 21 01:50:42 01:50:42.553 [ERROR] org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.awaitResultsDone(ForkStarter.java:456) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.runSuitesForkPerTestSet(ForkStarter.java:418) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:297) Mar 21 01:50:42 01:50:42.553 [ERROR]at org.apache.maven.plugin.surefire.booterclient.ForkStarter.run(ForkStarter.java:250) Mar 21 01:50:42 01:50:42.554 [ERROR]at org.apache.maven.plugin.surefire.AbstractSurefireMojo.executeProvider(AbstractSurefireMojo.java:1240) {code} >From the watchdog, only {{ChangelogRecoveryRescaleITCase}} didn't complete, >specifically parameterized with an {{EmbeddedRocksDBStateBackend}} with >incremental checkpointing enabled. 
The base class ({{ChangelogRecoveryITCaseBase}}) starts a {{MiniClusterWithClientResource}} {code:java} ~/Downloads/CI/logs-cron_jdk21-test_cron_jdk21_tests-1710982836$ cat watchdog| grep "Tests run\|Running org.apache.flink" | grep -o "org.apache.flink[^ ]*$" | sort | uniq -c | sort -n | head 1 org.apache.flink.test.checkpointing.ChangelogRecoveryRescaleITCase 2 org.apache.flink.api.connector.source.lib.NumberSequenceSourceITCase 2 org.apache.flink.api.connector.source.lib.util.GatedRateLimiterTest 2 org.apache.flink.api.connector.source.lib.util.RateLimitedSourceReaderITCase 2 org.apache.flink.api.datastream.DataStreamBatchExecutionITCase 2 org.apache.flink.api.datastream.DataStreamCollectTestITCase{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
The vote is now closed. I'm happy to announce that we have unanimously approved this release. There are 6 approving votes, 3 of which are binding: * Gyula Fora (binding) * Marton Balassi (binding) * Maximilian Michels (binding) * Rui Fan (non-binding) * Alexander Fedulov (non-binding) * Mate Czagany (non-binding) There are no disapproving votes. Thank you all for voting! -Max On Thu, Mar 21, 2024 at 3:49 PM Maximilian Michels wrote: > > +1 (binding) > > 1. Verified the archives, checksums, and signatures > 2. Extracted and inspected the source code for binaries > 3. Compiled and tested the source code via mvn verify > 4. Verified license files / headers > 5. Deployed helm chart to test cluster > 6. Ran example job > 7. Tested autoscaling without resource requirements API > 8. Tested autotuning > > -Max > > On Thu, Mar 21, 2024 at 8:50 AM Márton Balassi > wrote: > > > > +1 (binding) > > > > As per Gyula's suggestion above verified with " > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 ". 
> > > > - Verified Helm repo works as expected, points to correct image tag, build, > > version > > - Verified basic examples + checked operator logs everything looks as > > expected > > - Verified hashes, signatures and source release contains no binaries > > - Ran built-in tests, built jars + docker image from source successfully > > - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 > > > > Best, > > Marton > > > > On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > > > > > Hi, > > > > > > +1 (non-binding) > > > > > > - Verified checksums > > > - Verified signatures > > > - Verified no binaries in source distribution > > > - Verified Apache License and NOTICE files > > > - Executed tests > > > - Built container image > > > - Verified chart version and appVersion matches > > > - Verified Helm chart can be installed with default values > > > - Verify that RC repo works as Helm repo > > > > > > Best Regards, > > > Mate > > > > > > Alexander Fedulov ezt írta (időpont: 2024. > > > márc. 19., K, 23:10): > > > > > > > Hi Max, > > > > > > > > +1 > > > > > > > > - Verified SHA checksums > > > > - Verified GPG signatures > > > > - Verified that the source distributions do not contain binaries > > > > - Verified built-in tests (mvn clean verify) > > > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > > > - Verified that Helm and operator files contain Apache licenses (rg -L > > > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0; .). 
> > > > I am not sure we need to > > > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > > > - Verified that chart and appVersion matches the target release > > > > (91d67d9) > > > > - Verified that Helm chart can be installed from the local Helm folder > > > > without overriding any parameters > > > > - Verified that Helm chart can be installed from the RC repo without > > > > overriding any parameters ( > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > > > ) > > > > - Verified docker container build > > > > > > > > Best, > > > > Alex > > > > > > > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels > > > > wrote: > > > > > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > > > > > >A minor correction is that [3] in the email should point to: > > > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > > > and > > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > > > between those two commits, except for updating the version. So the > > > > > votes are not impacted, especially because votes are casted against > > > > > the source release which, as you pointed out, contains the correct > > > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > > > wrote: > > > > > > > > > > > > Hi Max! 
> > > > > > > > > > > > +1 (binding) > > > > > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > > > - Helm points to correct image > > > > > > - Deployed operator, stateful example and executed upgrade + > > > savepoint > > > > > > redeploy > > > > > > - Verified logs > > > > > > - Flink web PR looks good +1 > > > > > > > > > > > > A minor correction is that [3] in the email should point to: > > > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm > > > chart > > > > > and > > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > > > Thank you for preparing the release! > > > > > > > > > > > > Cheers, > > > > > > Gyula > > > > > > > > > > > > On Mon, Mar 18, 2024
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
+1 (binding) 1. Verified the archives, checksums, and signatures 2. Extracted and inspected the source code for binaries 3. Compiled and tested the source code via mvn verify 4. Verified license files / headers 5. Deployed helm chart to test cluster 6. Ran example job 7. Tested autoscaling without resource requirements API 8. Tested autotuning -Max On Thu, Mar 21, 2024 at 8:50 AM Márton Balassi wrote: > > +1 (binding) > > As per Gyula's suggestion above verified with " > ghcr.io/apache/flink-kubernetes-operator:91d67d9 ". > > - Verified Helm repo works as expected, points to correct image tag, build, > version > - Verified basic examples + checked operator logs everything looks as > expected > - Verified hashes, signatures and source release contains no binaries > - Ran built-in tests, built jars + docker image from source successfully > - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 > > Best, > Marton > > On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > > > Hi, > > > > +1 (non-binding) > > > > - Verified checksums > > - Verified signatures > > - Verified no binaries in source distribution > > - Verified Apache License and NOTICE files > > - Executed tests > > - Built container image > > - Verified chart version and appVersion matches > > - Verified Helm chart can be installed with default values > > - Verify that RC repo works as Helm repo > > > > Best Regards, > > Mate > > > > Alexander Fedulov ezt írta (időpont: 2024. > > márc. 19., K, 23:10): > > > > > Hi Max, > > > > > > +1 > > > > > > - Verified SHA checksums > > > - Verified GPG signatures > > > - Verified that the source distributions do not contain binaries > > > - Verified built-in tests (mvn clean verify) > > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > > - Verified that Helm and operator files contain Apache licenses (rg -L > > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0; .). 
> > > I am not sure we need to > > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > > - Verified that chart and appVersion matches the target release (91d67d9) > > > - Verified that Helm chart can be installed from the local Helm folder > > > without overriding any parameters > > > - Verified that Helm chart can be installed from the RC repo without > > > overriding any parameters ( > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > > ) > > > - Verified docker container build > > > > > > Best, > > > Alex > > > > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels wrote: > > > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > > > >A minor correction is that [3] in the email should point to: > > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > > and > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > > between those two commits, except for updating the version. So the > > > > votes are not impacted, especially because votes are casted against > > > > the source release which, as you pointed out, contains the correct > > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > > wrote: > > > > > > > > > > Hi Max! 
> > > > > > > > > > +1 (binding) > > > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > > - Helm points to correct image > > > > > - Deployed operator, stateful example and executed upgrade + > > savepoint > > > > > redeploy > > > > > - Verified logs > > > > > - Flink web PR looks good +1 > > > > > > > > > > A minor correction is that [3] in the email should point to: > > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm > > chart > > > > and > > > > > everything is correct. It's a typo in the vote email. > > > > > > > > > > Thank you for preparing the release! > > > > > > > > > > Cheers, > > > > > Gyula > > > > > > > > > > On Mon, Mar 18, 2024 at 8:26 AM Rui Fan <1996fan...@gmail.com> > > wrote: > > > > > > > > > > > Thanks Max for driving this release! > > > > > > > > > > > > +1(non-binding) > > > > > > > > > > > > - Downloaded artifacts from dist ( svn co > > > > > > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > > ) > > > > > > - Verified SHA512 checksums : ( for i in *.tgz; do echo $i; > > sha512sum > > > > > > --check $i.sha512; done ) > > > > > > - Verified GPG signatures : ( $ for i in *.tgz; do echo $i; gpg > > > > --verify > > > > > > $i.asc $i ) > > > > > > - Build the source with java-11 and java-17 ( mvn -T 20 clean > > install > > > > > > -DskipTests ) > > > > > > - Verified
Re: [DISCUSS] FLIP-425: Asynchronous Execution Model
Thanks for your reading and valuable comments! > 1) About locking VS reference counting: I would like to clear out which > mechanism prevents what: The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 state requests on the same key happen in order. Double-locking the same key does not result in deadlocks (thanks to the `previous == record` condition in your pseudo-code), so, the same callback chain can update/read multiple times the same piece of state. On the other side we have the reference counting mechanism that is used to understand whether a record has been fully processed, i.e., all state invocations have been carried out. Here is the question: am I correct if we say that key accounting is needed for out-of-order while reference counting is needed for checkpointing and watermarking? Regarding the "deadlock" of `KeyAccountingUnit`: good catch, we will emphasize this in the FLIP. The `KeyAccountingUnit` is reentrant, so the state requests of the same record can update/read multiple times without deadlock. Regarding the question: records, checkpoint barriers and watermarks can all be regarded as inputs, and this FLIP discusses the *order* between all inputs; in simple terms, the inputs of the same key that arrive first need to be executed first. The `KeyAccountingUnit` and reference counting work together to preserve this order: when the reference counting mechanism recognizes that a record has been fully processed, the record is removed from the `KeyAccountingUnit`. The checkpoint or watermark would only start once all the reference counts of the arrived inputs reach zero. > 2) Number of mails: Do you end up having two mails? Yes, there are two mails in this case. > 3) Would this change something on the consistency guarantees provided? I guess not, as, the lock is held in any case until the value on the state hasn't been updated. Could lead to any inconsistency (most probably the state would be updated to 0). 
Yes, the results of the two cases you mentioned are as you analyzed. The result of the first case is 1, and the result of the second case is 0. No matter which case it is, the next `processElement` with the same key will be executed only after the code in this `processElement` has completely executed. Therefore, it wouldn't lead to inconsistency. > 4) On the local variables/attributes: I'm not sure if I understand your question. In Java, this case (modifying a local variable from inside a lambda) is not allowed[1], but there are ways to get around the limitation. In this case, the modification to `x` may be concurrent, which needs to be handled carefully. > 5) On watermarks: It seems that, in order to achieve a good throughput, out-of-order mode should be used. In the FLIP I could not understand well how many things could go wrong if that one is used. Could you please clarify that? A typical example is the order between "event timer fire" and "the subsequent records of a watermark". Although the semantics of watermarks do not define the sequence between a watermark and subsequent records, an implicit fact in the sync API is that "event timer fire" executes before "the subsequent records of the watermark"; in out-of-order mode (async API), the execution order between them is not guaranteed. There are also some related discussions in FLIP-423[2,3] proposed by Yunfeng Zhou and Xintong Song. [1] https://stackoverflow.com/questions/30026824/modifying-local-variable-from-inside-lambda [2] https://lists.apache.org/thread/djsnybs9whzrt137z3qmxdwn031o93gn [3] https://lists.apache.org/thread/986zxq1k9rv3vkbk39yw16g24o6h83mz 于2024年3月21日周四 19:29写道: > > Thank you everybody for the questions and answers (especially Yanfei Lei), it > was very instructive to go over the discussion. > I am gonna add some questions on top of what happened and add some thoughts > as well below.
> > 1) About locking VS reference counting: > I would like to clear out which mechanism prevents what: > The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 > state requests on the same key happen in order. Double-locking the same key > does not result in deadlocks (thanks to the `previous == record` condition in > your pseudo-code), so, the same callback chain can update/read multiple times > the same piece of state. > On the other side we have the reference counting mechanism that is used to > understand whether a record has been fully processed, i.e., all state > invocations have been carried out. > Here is the question: am I correct if we say that key accounting is needed > for out-of-order while reference counting is needed for checkpointing and > watermarking? > > 2) Number of mails: > To expand on what Jing Ge already asked, in the example code in the FLIP: > > ``` > state.value().then( > val -> { > return state.update(val + 1).then( > empty -> { > out.collect(val + 1); > }); > }); > ``` > Do you end up having two mails?: > > first wrapping `val -> {...}` > second wrapping `empty -> {...}`
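The reentrant key-locking behavior discussed in this thread (the `previous == record` condition that lets the same record re-acquire its key without deadlocking) can be sketched roughly as follows. This is an illustrative sketch only; the class and method names are not Flink's actual implementation:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Illustrative sketch: one "occupier" record per key; the same record may
// re-acquire its own key (reentrancy), while other records get queued.
public class KeyAccountingSketch<K> {
    private final Map<K, Object> occupiers = new HashMap<>();
    private final Map<K, Queue<Object>> waiting = new HashMap<>();

    /** Returns true if the record may proceed on this key right now. */
    public boolean tryOccupy(K key, Object record) {
        Object previous = occupiers.putIfAbsent(key, record);
        if (previous == null || previous == record) {
            // Key was free, or re-entered by the same record: no deadlock.
            return true;
        }
        waiting.computeIfAbsent(key, k -> new ArrayDeque<>()).add(record);
        return false;
    }

    /** Called when reference counting says the record is fully processed. */
    public Object release(K key) {
        occupiers.remove(key);
        Queue<Object> q = waiting.get(key);
        Object next = (q == null) ? null : q.poll();
        if (next != null) {
            occupiers.put(key, next); // hand the key to the next waiter
        }
        return next; // the record to resume, if any
    }

    public static void main(String[] args) {
        KeyAccountingSketch<String> ka = new KeyAccountingSketch<>();
        Object r1 = new Object();
        Object r2 = new Object();
        System.out.println(ka.tryOccupy("key", r1)); // true: key was free
        System.out.println(ka.tryOccupy("key", r1)); // true: reentrant, no deadlock
        System.out.println(ka.tryOccupy("key", r2)); // false: r2 queued behind r1
    }
}
```

Note how `release` is exactly the hand-off point described above: the record leaves the key accounting only once the reference count says all its state invocations are done.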
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Jeyhun, Sorry for the delay. And thanks for the explanation, it sounds good to me! Jeyhun Karimov 于2024年3月16日周六 05:09写道: > > Hi Benchao, > > Thanks for your comments. > > 1. What the parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > > could we just utilize one side's pre-partitioned attribute, and let > > another side just do the shuffle? > > > There are two cases we need to consider: > > 1. Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance to equalize both sources' partitions and parallelism, IFF > both sources implement the "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). > Then, we find the intersection of these two partition sets (say > p_intersect) and push down these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor would be equal to MIN(p_s1, > p_s2). > That is, the minimum of the partition size of source1 and the partition size of > source2 will be selected as the parallelism. > Coming back to your example, if source1 parallelism is 127 and source2 > parallelism is 128, then we will first check the partition size of source1 > and source2.
> Say the partition size of source1 is 100 and the partition size of source2 is 90. > Then, we would set the parallelism for source1, source2, and all of their > downstream operators until (and including) the join operator > to 90 (min(100, 90)). > We also plan to implement a cost-based decision instead of the rule-based > one (the one explained above - the MIN rule). > One possible result of the cost-based estimation is to keep the partitions > on one side and perform the shuffling on the other source. > > > > 2. In our current shuffle remove design (FlinkExpandConversionRule), > > we don't consider parallelism, we just remove unnecessary shuffles > > according to the distribution columns. After this FLIP, the > > parallelism may be bundled with the source's partitions, then how will > > this optimization accommodate with FlinkExpandConversionRule, will you > > also change downstream operators' parallelisms if we want to also > > remove subsequent shuffles? > > > > - From my understanding of FlinkExpandConversionRule, its removal logic is > agnostic to operator parallelism. > So, if FlinkExpandConversionRule decides to remove a shuffle operation, > then this FLIP will search for another possible shuffle (the one closest to the > source) to remove. > If there is such an opportunity, this FLIP will remove the shuffle. So, > from my understanding, FlinkExpandConversionRule and this optimization rule > can work together safely. > Please correct me if I misunderstood your question. > > > > Regarding the new optimization rule, have you also considered allowing > > some non-strict mode like FlinkRelDistribution#requireStrict? For > > example, the source is pre-partitioned by columns a, b; if we are > > consuming this source and do an aggregate on a, b, c, can we utilize > > this optimization? > > > - Good point. Yes, there are some cases where non-strict mode will apply.
> For example: > > - pre-partitioned columns and aggregate columns are the same but have a > different order (e.g., the source is pre-partitioned w.r.t. a,b and the aggregate has > a GROUP BY b,a) > - the columns in the Exchange operator are a list-prefix of the pre-partitioned > columns of the source (e.g., the source is pre-partitioned w.r.t. a,b,c and > the Exchange's partition columns are a,b) > > Please let me know if the above answers your questions or if you have any > other comments. > > Regards, > Jeyhun > > On Thu, Mar 14, 2024 at 12:48 PM Benchao Li wrote: > > > Thanks Jeyhun for bringing up this discussion, it is really exciting, > > +1 for the general idea. > > > > We also introduced a similar concept in Flink Batch internally to cope > > with bucketed tables in Hive, it is a very important improvement. > > > > > One thing to note is that for join queries, the parallelism of each join > > > source might be different. This might result in > > > inconsistencies while using the pre-partitioned/pre-divided data (e.g., > > > different mappings of partitions to source operators). > > > Therefore, it is the job of the planner to detect this and
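The two-case parallelism decision Jeyhun describes (intersection size when static partitions and pushdown are available on both sides, otherwise MIN of the partition counts) could be sketched roughly as follows. The names and signatures here are hypothetical, not the FLIP's actual API:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of the parallelism decision described in the thread.
public class PrePartitionedPlanner {

    static int decideParallelism(List<String> partitionsS1, List<String> partitionsS2,
                                 boolean staticPartitions, boolean bothSupportPushdown) {
        if (staticPartitions && bothSupportPushdown) {
            // Case 1: push down the partition intersection (p_intersect) to both
            // sources and align source + downstream parallelism to its size.
            Set<String> intersect = new HashSet<>(partitionsS1);
            intersect.retainAll(partitionsS2);
            return intersect.size();
        }
        // Case 2: fall back to MIN(p_s1, p_s2) of the two partition counts.
        return Math.min(partitionsS1.size(), partitionsS2.size());
    }

    public static void main(String[] args) {
        // Static partitions + pushdown on both sides: intersection {b, c} -> 2.
        System.out.println(decideParallelism(
                List.of("a", "b", "c"), List.of("b", "c", "d"), true, true)); // 2
        // Otherwise: min of partition counts, e.g. min(3, 2) = 2.
        System.out.println(decideParallelism(
                List.of("a", "b", "c"), List.of("b", "c"), false, false)); // 2
    }
}
```

In the real planner the chosen number would then also be applied to all downstream operators up to and including the first common ancestor (e.g., the join), as explained above.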
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1(binding) On Thu, Mar 21, 2024 at 1:24 PM Leonard Xu wrote: > +1(binding) > > Best, > Leonard > > > 2024年3月21日 下午5:21,Martijn Visser 写道: > > > > +1 (binding) > > > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang < > gongzhongqi...@apache.org> > > wrote: > > > >> +1 (non-binding) > >> > >> Bests, > >> Zhongqiang Gong > >> > >> Ferenc Csaky 于2024年3月20日周三 22:11写道: > >> > >>> Hello devs, > >>> > >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to > >>> externalize the Kudu > >>> connector from the recently retired Apache Bahir project [2] to keep it > >>> maintainable and > >>> make it up to date as well. Discussion thread [3]. > >>> > >>> The vote will be open for at least 72 hours (until 2024 March 23 14:03 > >>> UTC) unless there > >>> are any objections or insufficient votes. > >>> > >>> Thanks, > >>> Ferenc > >>> > >>> [1] > >>> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > >>> [2] https://attic.apache.org/projects/bahir.html > >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > >> > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the efforts. Best, Yuxin Samrat Deb 于2024年3月21日周四 20:28写道: > Congratulations ! > > Bests > Samrat > > On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy wrote: > > > Congratulations, great work and great news. > > Best Regards > > Ahmed Hamdy > > > > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > > > Congratulations, and thanks for the great work! > > > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > > > Thanks for driving these efforts! > > > > > > > > Congratulations > > > > > > > > Best > > > > Yuan > > > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > > > Congratulations and look forward to its further development! > > > > > > > > > > Best Regards, > > > > > Yu > > > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam > wrote: > > > > > > > > > > > > Congrattulations! > > > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > > > sub-project of Apache Flink has completed. We invite you to > > explore > > > > > the new > > > > > > > resources available: > > > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > > - Flink CDC Documentation: > > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > > > After Flink community accepted this donation[1], we have > > completed > > > > > > > software copyright signing, code repo migration, code cleanup, > > > website > > > > > > > migration, CI migration and github issues migration etc. > > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > > > Qingsheng > > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other > contributors > > > for > > > > > their > > > > > > > contributions and help during this process! 
> > > > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > > slightly > > > > > > > changed to align with the main Flink project. To report bugs or > > > > > suggest new > > > > > > > features, please open tickets > > > > > > > Apache Jira (https://issues.apache.org/jira). Note that we > will > > > no > > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > > feedback > > > > > and > > > > > > > contributions are invaluable as we continue to improve Flink > CDC. > > > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > > > > > Best, > > > > > > > Leonard > > > > > > > [1] > > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > Best > > > > > > > > > > > > ConradJam > > > > > > > > > > > > > > > > > -- > > > > > > Best, > > > Benchao Li > > > > > >
[jira] [Created] (FLINK-34910) Can not plan window join without projections
Dawid Wysakowicz created FLINK-34910: Summary: Can not plan window join without projections Key: FLINK-34910 URL: https://issues.apache.org/jira/browse/FLINK-34910 Project: Flink Issue Type: Bug Components: Table SQL / Planner Affects Versions: 1.19.0 Reporter: Dawid Wysakowicz Assignee: Dawid Wysakowicz Fix For: 1.20.0 When running: {code} @Test def testWindowJoinWithoutProjections(): Unit = { val sql = """ |SELECT * |FROM | TABLE(TUMBLE(TABLE MyTable, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE)) AS L |JOIN | TABLE(TUMBLE(TABLE MyTable2, DESCRIPTOR(rowtime), INTERVAL '15' MINUTE)) AS R |ON L.window_start = R.window_start AND L.window_end = R.window_end AND L.a = R.a """.stripMargin util.verifyRelPlan(sql) } {code} It fails with: {code} FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME_MATERIALIZE(proctime) AS proctime, window_start, window_end, window_time, a0, b0, c0, rowtime0, PROCTIME_MATERIALIZE(proctime0) AS proctime0, window_start0, window_end0, window_time0]) +- FlinkLogicalCorrelate(correlation=[$cor0], joinType=[inner], requiredColumns=[{}]) :- FlinkLogicalTableFunctionScan(invocation=[TUMBLE(DESCRIPTOR($3), 90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, VARCHAR(2147483647) b, BIGINT c, TIMESTAMP(3) *ROWTIME* rowtime, TIMESTAMP_LTZ(3) *PROCTIME* proctime, TIMESTAMP(3) window_start, TIMESTAMP(3) window_end, TIMESTAMP(3) *ROWTIME* window_time)]) : +- FlinkLogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($3, 1000:INTERVAL SECOND)]) : +- FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME() AS proctime]) :+- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, MyTable]], fields=[a, b, c, rowtime]) +- FlinkLogicalTableFunctionScan(invocation=[TUMBLE(DESCRIPTOR(CAST($3):TIMESTAMP(3)), 90:INTERVAL MINUTE)], rowType=[RecordType(INTEGER a, VARCHAR(2147483647) b, BIGINT c, TIMESTAMP(3) *ROWTIME* rowtime, TIMESTAMP_LTZ(3) *PROCTIME* proctime, TIMESTAMP(3) window_start, TIMESTAMP(3) window_end, TIMESTAMP(3) *ROWTIME* window_time)]) +- 
FlinkLogicalWatermarkAssigner(rowtime=[rowtime], watermark=[-($3, 1000:INTERVAL SECOND)]) +- FlinkLogicalCalc(select=[a, b, c, rowtime, PROCTIME() AS proctime]) +- FlinkLogicalTableSourceScan(table=[[default_catalog, default_database, MyTable2]], fields=[a, b, c, rowtime]) Failed to get time attribute index from DESCRIPTOR(CAST($3):TIMESTAMP(3)). This is a bug, please file a JIRA issue. Please check the documentation for the set of currently supported SQL features. {code} In prior versions this had another problem of ambiguous {{rowtime}} column, but this has been fixed by [FLINK-32648]. In versions < 1.19 WindowTableFunctions were incorrectly scoped, because they were not extending from Calcite's SqlWindowTableFunction and the scoping implemented in SqlValidatorImpl#convertFrom was incorrect. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34909) OceanBase transaction ID support
xiaotouming created FLINK-34909: --- Summary: OceanBase transaction ID support Key: FLINK-34909 URL: https://issues.apache.org/jira/browse/FLINK-34909 Project: Flink Issue Type: New Feature Components: Flink CDC Affects Versions: cdc-3.1.0 Reporter: xiaotouming Fix For: cdc-3.1.0 It should be possible to obtain OceanBase's transaction ID when parsing changes via the Flink DataStream approach. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations ! Bests Samrat On Thu, 21 Mar 2024 at 5:52 PM, Ahmed Hamdy wrote: > Congratulations, great work and great news. > Best Regards > Ahmed Hamdy > > > On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > > > Congratulations, and thanks for the great work! > > > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > > > Thanks for driving these efforts! > > > > > > Congratulations > > > > > > Best > > > Yuan > > > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > > > Congratulations and look forward to its further development! > > > > > > > > Best Regards, > > > > Yu > > > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > > > > > Congrattulations! > > > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > > > Hi devs and users, > > > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > > sub-project of Apache Flink has completed. We invite you to > explore > > > > the new > > > > > > resources available: > > > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > > - Flink CDC Documentation: > > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > > > After Flink community accepted this donation[1], we have > completed > > > > > > software copyright signing, code repo migration, code cleanup, > > website > > > > > > migration, CI migration and github issues migration etc. > > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > > Qingsheng > > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors > > for > > > > their > > > > > > contributions and help during this process! > > > > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > > slightly > > > > > > changed to align with the main Flink project. To report bugs or > > > > suggest new > > > > > > features, please open tickets > > > > > > Apache Jira (https://issues.apache.org/jira). 
Note that we will > > no > > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your > > feedback > > > > and > > > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > > > Best, > > > > > > Leonard > > > > > > [1] > > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > > > > > -- > > > > > Best > > > > > > > > > > ConradJam > > > > > > > > > > > > -- > > > > Best, > > Benchao Li > > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1(binding) Best, Leonard > 2024年3月21日 下午5:21,Martijn Visser 写道: > > +1 (binding) > > On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang > wrote: > >> +1 (non-binding) >> >> Bests, >> Zhongqiang Gong >> >> Ferenc Csaky 于2024年3月20日周三 22:11写道: >> >>> Hello devs, >>> >>> I would like to start a vote about FLIP-439 [1]. The FLIP is about to >>> externalize the Kudu >>> connector from the recently retired Apache Bahir project [2] to keep it >>> maintainable and >>> make it up to date as well. Discussion thread [3]. >>> >>> The vote will be open for at least 72 hours (until 2024 March 23 14:03 >>> UTC) unless there >>> are any objections or insufficient votes. >>> >>> Thanks, >>> Ferenc >>> >>> [1] >>> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir >>> [2] https://attic.apache.org/projects/bahir.html >>> [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >>
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations, great work and great news. Best Regards Ahmed Hamdy On Thu, 21 Mar 2024 at 11:41, Benchao Li wrote: > Congratulations, and thanks for the great work! > > Yuan Mei 于2024年3月21日周四 18:31写道: > > > > Thanks for driving these efforts! > > > > Congratulations > > > > Best > > Yuan > > > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > > > Congratulations and look forward to its further development! > > > > > > Best Regards, > > > Yu > > > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > > > Congrattulations! > > > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > > > Hi devs and users, > > > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > > sub-project of Apache Flink has completed. We invite you to explore > > > the new > > > > > resources available: > > > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > > - Flink CDC Documentation: > > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > > > After Flink community accepted this donation[1], we have completed > > > > > software copyright signing, code repo migration, code cleanup, > website > > > > > migration, CI migration and github issues migration etc. > > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > > Qingsheng > > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors > for > > > their > > > > > contributions and help during this process! > > > > > > > > > > > > > > > For all previous contributors: The contribution process has > slightly > > > > > changed to align with the main Flink project. To report bugs or > > > suggest new > > > > > features, please open tickets > > > > > Apache Jira (https://issues.apache.org/jira). Note that we will > no > > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > > > > Welcome to explore the new repository and documentation. 
Your > feedback > > > and > > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > > > > > Best, > > > > > Leonard > > > > > [1] > https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > > > > > -- > > > > Best > > > > > > > > ConradJam > > > > > > > -- > > Best, > Benchao Li >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations, and thanks for the great work! Yuan Mei 于2024年3月21日周四 18:31写道: > > Thanks for driving these efforts! > > Congratulations > > Best > Yuan > > On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > > > Congratulations and look forward to its further development! > > > > Best Regards, > > Yu > > > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > > > Congrattulations! > > > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > > > Hi devs and users, > > > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > > sub-project of Apache Flink has completed. We invite you to explore > > the new > > > > resources available: > > > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > > - Flink CDC Documentation: > > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > > > After Flink community accepted this donation[1], we have completed > > > > software copyright signing, code repo migration, code cleanup, website > > > > migration, CI migration and github issues migration etc. > > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > > Qingsheng > > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for > > their > > > > contributions and help during this process! > > > > > > > > > > > > For all previous contributors: The contribution process has slightly > > > > changed to align with the main Flink project. To report bugs or > > suggest new > > > > features, please open tickets > > > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > > > longer accept GitHub issues for these purposes. > > > > > > > > > > > > Welcome to explore the new repository and documentation. Your feedback > > and > > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > > > Thanks everyone for your support and happy exploring Flink CDC! 
> > > > > > > > Best, > > > > Leonard > > > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > > > > > -- > > > Best > > > > > > ConradJam > > -- Best, Benchao Li
Re: [DISCUSS] FLIP-425: Asynchronous Execution Model
Thank you everybody for the questions and answers (especially Yanfei Lei), it was very instructive to go over the discussion. I am gonna add some questions on top of what happened and add some thoughts as well below. 1) About locking VS reference counting: I would like to clear out which mechanism prevents what: The `KeyAccountingUnit` implements locking behavior on keys and ensures 2 state requests on the same key happen in order. Double-locking the same key does not result in deadlocks (thanks to the `previous == record` condition in your pseudo-code), so, the same callback chain can update/read multiple times the same piece of state. On the other side we have the reference counting mechanism that is used to understand whether a record has been fully processed, i.e., all state invocations have been carried out. Here is the question: am I correct if we say that key accounting is needed for out-of-order while reference counting is needed for checkpointing and watermarking? 2) Number of mails: To expand on what Jing Ge already asked, in the example code in the FLIP: ``` state.value().then( val -> { return state.update(val + 1).then( empty -> { out.collect(val + 1); }); }); ``` Do you end up having two mails?: • first wrapping `val -> {...}` • second wrapping `empty -> {...}` Did I get it correctly? 3) On the guarantees provided by the async execution framework: Always referring to your example, say that when `val -> {...}` gets called the state value is 0. The callback will update it to 1 and register another callback to collect, while still holding the lock on that piece of state and preventing somebody else from reading the value during the entire process (implementing atomicity for a transaction). Now, say the code changes to: ``` state.value().then( val -> { state.update(val + 1); out.collect(val + 1); }); ``` Would this change something on the consistency guarantees provided? I guess not, as the lock is held in any case until the value in the state has been updated.
This, instead (similar to your example): ``` int x = 0; state.value().then( val -> { x = val + 1; } ); state.update(x); out.collect(x); ``` Could it lead to any inconsistency (most probably the state would be updated to 0)? 4) On the local variables/attributes: The last example above exemplifies one of my concerns: what about the values enclosed in the callbacks? That seems a bit counter-intuitive and brittle from a user perspective. In the example we have a function-level variable, but what about fields: ``` int x = 0; void processElement(...) { state.value().then( val -> { x += val; } ); ... out.collect(x); } ``` What could happen here? Would every callback enclose the current value of the field? I don't know exactly where I am heading, but it seems quite complex/convoluted :) 5) On watermarks: It seems that, in order to achieve good throughput, out-of-order mode should be used. In the FLIP I could not understand well how many things could go wrong if that one is used. Could you please clarify that? Thank you for your availability and your great work! On Mar 19, 2024 at 10:51 +0100, Yanfei Lei , wrote: > Hi everyone, > > Thanks for your valuable discussion and feedback! > > Our discussions have been going on for a while and there have been no > new comments for several days. So I would like to start a vote after > 72 hours. > > Please let me know if you have any concerns, thanks! > > Yanfei Lei 于2024年3月13日周三 12:54写道: > > > > > Hi Jing, > > Thanks for the reply and follow-up. > > > > > > What is the benefit for users to build a chain of mails instead of just > > > > one mail (it is still async)? > > > > Just to make sure we're on the same page, I try to paraphrase your question: > > A `then()` call will be encapsulated as a callback mail. Your question > > is whether we can call `then()` as little as possible to reduce the > > overhead of encapsulating it into a mail. > > > > In general, whether to call `then()` depends on the user's data > > dependencies.
The operations in a chain of `then()` are strictly > > ordered. > > > > > > > > The following is an example without data dependencies, if written in > > the form of a `then` chain: > > stateA.update(1).then(stateB.update(2).then(stateC.update(3))); > > > > The execution order is: > > ``` > > stateA update 1 -> stateB update 2-> stateC update 3 > > ``` > > > > If written in the form without `then()` call, they will be placed in a > > "mail/mailboxDefaultAction", and each state request will still be > > executed asynchronously: > > ``` > > stateA.update(1); > > stateB.update(2); > > stateC.update(3); > > ``` > > > > The order in which they are executed is undefined and may be: > > ``` > > - stateA update 1 -> stateB update 2-> stateC update 3 > > - stateB update 2 -> stateC update 3-> stateA update 1 > > - stateC update 3 -> stateA update 1-> stateB update 2 > > ... > > ``` > > And the final results are "stateA =
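The hazard raised in questions 3) and 4) above (a callback that mutates an enclosed value but only runs once the asynchronous state read completes, after the surrounding code has already read that value) can be mimicked with plain `CompletableFuture`. This is a rough analogy only, not Flink's async state API; the `AtomicInteger` also stands in for the mutable "local" that Java's effectively-final rule forbids capturing directly:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicInteger;

public class AsyncOrderingHazard {

    // Returns {value read before the async completion, value after it}.
    static int[] run() {
        AtomicInteger x = new AtomicInteger(0);
        // Simulates state.value() completing later, after processElement returned.
        CompletableFuture<Integer> stateValue = new CompletableFuture<>();
        stateValue.thenAccept(val -> x.set(val + 1)); // runs only on completion

        int before = x.get();   // the read happens before the callback fires
        stateValue.complete(5); // the callback now fires and sets x to 6
        return new int[] {before, x.get()};
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]); // prints "0 6"
    }
}
```

This is exactly why the straight-line variant in question 3) would most probably update the state to 0: the code after the `then()` call runs before the callback has had a chance to execute.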
[jira] [Created] (FLINK-34908) mysql pipeline to doris and starrocks will lost precision for timestamp
Xin Gong created FLINK-34908: Summary: mysql pipeline to doris and starrocks will lose precision for timestamp Key: FLINK-34908 URL: https://issues.apache.org/jira/browse/FLINK-34908 Project: Flink Issue Type: Improvement Components: Flink CDC Reporter: Xin Gong Fix For: cdc-3.1.0 The Flink CDC pipeline decides the timestamp zone from the pipeline config. I found that mysql2doris and mysql2starrocks specify the datetime format yyyy-MM-dd HH:mm:ss, which causes a loss of datetime precision. I think we should not specify a datetime format and instead just return the LocalDateTime object. -- This message was sent by Atlassian Jira (v8.20.10#820010)
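The described precision loss is easy to reproduce with plain Java time APIs, assuming the sinks format with the seconds-only pattern yyyy-MM-dd HH:mm:ss as stated in the report:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampPrecision {
    static final DateTimeFormatter SECONDS_ONLY =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Formatting with a seconds-only pattern silently drops fractional seconds.
    static String formatted(LocalDateTime ts) {
        return ts.format(SECONDS_ONLY);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2024, 3, 22, 10, 15, 30, 123_456_000);
        System.out.println(formatted(ts)); // 2024-03-22 10:15:30 (precision lost)
        System.out.println(ts);            // 2024-03-22T10:15:30.123456 (precision kept)
    }
}
```

Passing the `LocalDateTime` object through without formatting, as the report suggests, keeps the full sub-second precision.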
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 (non-binding) Best, Jinzhong On Thu, Mar 21, 2024 at 6:15 PM Zakelly Lan wrote: > +1 non-binding > > > Best, > Zakelly > > On Thu, Mar 21, 2024 at 5:34 PM Gyula Fóra wrote: > > > +1 (binding) > > > > Gyula > > > > On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > > > > > +1(binding) > > > > > > Thanks to Weijie for driving this proposal, which solves the problem > > that I > > > raised in FLIP-359. > > > > > > Best, > > > Rui > > > > > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu > > wrote: > > > > > > > +1 (binding) > > > > > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > > > > wrote: > > > > > > > > > +1 (binding) > > > > > > > > > > Best, > > > > > > > > > > Xintong > > > > > > > > > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo < > > guoweijieres...@gmail.com> > > > > > wrote: > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > > > objection or insufficient votes. > > > > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > > > Weijie > > > > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > > > > > [2] > > https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Best, > > > > Hangxiang. > > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Thanks for driving these efforts! Congratulations Best Yuan On Thu, Mar 21, 2024 at 4:35 PM Yu Li wrote: > Congratulations and look forward to its further development! > > Best Regards, > Yu > > On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > > > Congrattulations! > > > > Leonard Xu 于2024年3月20日周三 21:36写道: > > > > > Hi devs and users, > > > > > > We are thrilled to announce that the donation of Flink CDC as a > > > sub-project of Apache Flink has completed. We invite you to explore > the new > > > resources available: > > > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > > - Flink CDC Documentation: > > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > > > After Flink community accepted this donation[1], we have completed > > > software copyright signing, code repo migration, code cleanup, website > > > migration, CI migration and github issues migration etc. > > > Here I am particularly grateful to Hang Ruan, Zhongqaing Gong, > Qingsheng > > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for > their > > > contributions and help during this process! > > > > > > > > > For all previous contributors: The contribution process has slightly > > > changed to align with the main Flink project. To report bugs or > suggest new > > > features, please open tickets > > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > > longer accept GitHub issues for these purposes. > > > > > > > > > Welcome to explore the new repository and documentation. Your feedback > and > > > contributions are invaluable as we continue to improve Flink CDC. > > > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > > > Best, > > > Leonard > > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > > > > > -- > > Best > > > > ConradJam >
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 non-binding Best, Zakelly On Thu, Mar 21, 2024 at 5:34 PM Gyula Fóra wrote: > +1 (binding) > > Gyula > > On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > > > +1(binding) > > > > Thanks to Weijie for driving this proposal, which solves the problem > that I > > raised in FLIP-359. > > > > Best, > > Rui > > > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu > wrote: > > > > > +1 (binding) > > > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > > wrote: > > > > > > > +1 (binding) > > > > > > > > Best, > > > > > > > > Xintong > > > > > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo < > guoweijieres...@gmail.com> > > > > wrote: > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > > objection or insufficient votes. > > > > > > > > > > > > > > > Best regards, > > > > > > > > > > Weijie > > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > > > [2] > https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > > > > > > -- > > > Best, > > > Hangxiang. > > > > > >
[jira] [Created] (FLINK-34907) jobRunningTs should be the timestamp that all tasks are running
Rui Fan created FLINK-34907: --- Summary: jobRunningTs should be the timestamp that all tasks are running Key: FLINK-34907 URL: https://issues.apache.org/jira/browse/FLINK-34907 Project: Flink Issue Type: Improvement Components: Autoscaler Reporter: Rui Fan Assignee: Rui Fan Fix For: kubernetes-operator-1.9.0 Currently, we consider the timestamp at which the JobStatus changes to RUNNING as jobRunningTs. But the JobStatus becomes RUNNING as soon as the job starts scheduling, so it doesn't mean that all tasks are running. This makes the isStabilizing check and the restart-time estimation inaccurate. Solution: jobRunningTs should be the timestamp at which all tasks are running. It can be obtained from the SubtasksTimesHeaders REST API. -- This message was sent by Atlassian Jira (v8.20.10#820010)
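The proposed computation is easy to sketch: take the per-subtask RUNNING-transition timestamps (as reported by the SubtasksTimesHeaders REST API) and use their maximum. A minimal illustration; the class and method names below are assumptions for this sketch, not the operator's actual code:

```java
import java.util.List;

public class JobRunningTs {
    /**
     * Returns the timestamp at which *all* tasks are running, i.e. the
     * maximum of the per-subtask RUNNING-transition timestamps.
     * Returns -1 if any subtask has not reached RUNNING yet (timestamp <= 0).
     */
    static long jobRunningTs(List<Long> subtaskRunningTimestamps) {
        long max = -1;
        for (long ts : subtaskRunningTimestamps) {
            if (ts <= 0) {
                return -1; // some task is not running yet
            }
            max = Math.max(max, ts);
        }
        return max;
    }

    public static void main(String[] args) {
        // Job switched to RUNNING early, but the last task only started at t=250.
        System.out.println(jobRunningTs(List.of(120L, 250L, 180L))); // 250
        System.out.println(jobRunningTs(List.of(120L, 0L)));         // -1: not all running
    }
}
```

Using the maximum rather than the JobStatus transition time is exactly what makes stabilization and restart-time estimates line up with when the job is actually processing data.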
Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations
+1 (non-binding) Best, Zhongqiang Gong Alex Nitavsky wrote on Thu, Mar 7, 2024 at 23:09: > Hi everyone, > > I'd like to start a vote on FLIP-402 [1]. It introduces new configuration > options for Apache Flink's ZooKeeper integration for high availability by > reflecting existing Apache Curator configuration options. It has been > discussed in this thread [2]. > > I would like to start a vote. The vote will be open for at least 72 hours > (until March 10th 18:00 GMT) unless there is an objection or > insufficient votes. > > [1] > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations > [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z > > Thanks > Alex >
[jira] [Created] (FLINK-34906) Don't start autoscaling when some tasks are not running
Rui Fan created FLINK-34906: --- Summary: Don't start autoscaling when some tasks are not running Key: FLINK-34906 URL: https://issues.apache.org/jira/browse/FLINK-34906 Project: Flink Issue Type: Improvement Components: Autoscaler Reporter: Rui Fan Assignee: Rui Fan Fix For: 1.9.0 Attachments: image-2024-03-21-17-40-23-523.png Currently, the autoscaler scales a job when the JobStatus is RUNNING. But the JobStatus becomes RUNNING as soon as the job starts scheduling, so it doesn't mean that all tasks are running, especially when resources are insufficient or the job recovers from a large state. The autoscaler throws an exception and generates an AutoscalerError event when tasks are not ready, such as: !image-2024-03-21-17-40-23-523.png! Solution: only scale jobs whose tasks are all running (some tasks may be finished). -- This message was sent by Atlassian Jira (v8.20.10#820010)
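The proposed guard can be sketched as a small pure function over task states. The enum and method names below are illustrative assumptions for this sketch, not the autoscaler's actual API:

```java
import java.util.List;

public class ScalingGuard {
    // Simplified task-state model for illustration only.
    enum TaskState { CREATED, SCHEDULED, DEPLOYING, RUNNING, FINISHED, FAILED }

    /**
     * A job is ready for autoscaling only if every task is RUNNING or has
     * already FINISHED; a job that is still deploying or failing is not.
     */
    static boolean readyForScaling(List<TaskState> tasks) {
        return tasks.stream()
                .allMatch(s -> s == TaskState.RUNNING || s == TaskState.FINISHED);
    }

    public static void main(String[] args) {
        System.out.println(readyForScaling(
                List.of(TaskState.RUNNING, TaskState.FINISHED)));  // true
        System.out.println(readyForScaling(
                List.of(TaskState.RUNNING, TaskState.DEPLOYING))); // false
    }
}
```

Allowing FINISHED alongside RUNNING matches the ticket's note that some tasks (e.g. bounded sources) may legitimately complete before the job stabilizes.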
Re: [VOTE] FLIP-433: State Access on DataStream API V2
+1 (binding) Gyula On Thu, Mar 21, 2024 at 3:33 AM Rui Fan <1996fan...@gmail.com> wrote: > +1(binding) > > Thanks to Weijie for driving this proposal, which solves the problem that I > raised in FLIP-359. > > Best, > Rui > > On Thu, Mar 21, 2024 at 10:10 AM Hangxiang Yu wrote: > > > +1 (binding) > > > > On Thu, Mar 21, 2024 at 10:04 AM Xintong Song > > wrote: > > > > > +1 (binding) > > > > > > Best, > > > > > > Xintong > > > > > > > > > > > > On Wed, Mar 20, 2024 at 8:30 PM weijie guo > > > wrote: > > > > > > > Hi everyone, > > > > > > > > > > > > Thanks for all the feedback about the FLIP-433: State Access on > > > > DataStream API V2 [1]. The discussion thread is here [2]. > > > > > > > > > > > > The vote will be open for at least 72 hours unless there is an > > > > objection or insufficient votes. > > > > > > > > > > > > Best regards, > > > > > > > > Weijie > > > > > > > > > > > > [1] > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-433%3A+State+Access+on+DataStream+API+V2 > > > > > > > > [2] https://lists.apache.org/thread/8ghzqkvt99p4k7hjqxzwhqny7zb7xrwo > > > > > > > > > > > > > -- > > Best, > > Hangxiang. > > >
Re: [VOTE] FLIP-402: Extend ZooKeeper Curator configurations
+1 (binding) On Wed, Mar 20, 2024 at 1:19 PM Ferenc Csaky wrote: > +1 (non-binding), thanks for driving this! > > Best, > Ferenc > > > On Wednesday, March 20th, 2024 at 10:57, Yang Wang < > wangyang0...@apache.org> wrote: > > > > > > > +1 (binding) since ZK HA is still widely used. > > > > > > Best, > > Yang > > > > On Thu, Mar 14, 2024 at 6:27 PM Matthias Pohl > > matthias.p...@aiven.io.invalid wrote: > > > > > Nothing to add from my side. Thanks, Alex. > > > > > > +1 (binding) > > > > > > On Thu, Mar 7, 2024 at 4:09 PM Alex Nitavsky alexnitav...@gmail.com > > > wrote: > > > > > > > Hi everyone, > > > > > > > > I'd like to start a vote on FLIP-402 [1]. It introduces new > configuration > > > > options for Apache Flink's ZooKeeper integration for high > availability by > > > > reflecting existing Apache Curator configuration options. It has been > > > > discussed in this thread [2]. > > > > > > > > I would like to start a vote. The vote will be open for at least 72 > > > > hours > > > > (until March 10th 18:00 GMT) unless there is an objection or > > > > insufficient votes. > > > > > > > > [1] > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-402%3A+Extend+ZooKeeper+Curator+configurations > > > > > > > [2] https://lists.apache.org/thread/gqgs2jlq6bmg211gqtgdn8q5hp5v9l1z > > > > > > > > Thanks > > > > Alex >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (binding) On Thu, Mar 21, 2024 at 8:01 AM gongzhongqiang wrote: > +1 (non-binding) > > Bests, > Zhongqiang Gong > > Ferenc Csaky wrote on Wed, Mar 20, 2024 at 22:11: > > > Hello devs, > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is about to > > externalize the Kudu > > connector from the recently retired Apache Bahir project [2] to keep it > > maintainable and > > make it up to date as well. Discussion thread [3]. > > > > The vote will be open for at least 72 hours (until 2024 March 23 14:03 > > UTC) unless there > > are any objections or insufficient votes. > > > > Thanks, > > Ferenc > > > > [1] > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > [2] https://attic.apache.org/projects/bahir.html > > [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz >
Re: [DISCUSS] FLIP-434: Support optimizations for pre-partitioned data sources
Hello Jeyhun, I really like the proposal; it definitely makes sense to me. I have a couple of nits here and there: For the interface `SupportsPartitioning`, why return `Optional`? If one decides to implement it, partitions must exist (at most, return an empty list). Returning `Optional` just seems to complicate the logic of the code using that interface. I foresee the calling code doing something like: "if the source supports partitioning, get the partitions, but if they don't exist, raise a runtime exception". Let's simply make that safe at compile time and guarantee to the calling code that partitions exist. Another thing is that you show Hive-like partitioning in your FS structure; do you think it makes sense to add a note about auto-discovery of partitions? In other terms, it looks a bit counterintuitive that the user implementing the source has to specify statically which partitions exist (and they can change at runtime), while the source itself knows the data provider and could directly implement a method `discoverPartitions`. Then Flink would take care of invoking that method when needed. On Mar 15, 2024 at 22:09 +0100, Jeyhun Karimov wrote: > Hi Benchao, > > Thanks for your comments. > > 1. What parallelism would you take? E.g., 128 + 256 => 128? What > > if we cannot have a good greatest common divisor, like 127 + 128, > > could we just utilize one side's pre-partitioned attribute, and let > > another side just do the shuffle? > > > There are two cases we need to consider: > > 1. Static Partition (no partitions are added during the query execution) is > enabled AND both sources implement "SupportsPartitionPushdown" > > In this case, we are sure that no new partitions will be added at runtime. > So, we have a chance to equalize both sources' partitions and parallelism, IFF > both sources implement the "SupportsPartitionPushdown" interface. > To achieve so, first we will fetch the existing partitions from source1 > (say p_s1) and from source2 (say p_s2). 
> Then, we find the intersection of these two partition sets (say > p_intersect) and pushdown these partitions: > > SupportsPartitionPushDown::applyPartitions(p_intersect) // make sure that > only specific partitions are read > SupportsPartitioning::applyPartitionedRead(p_intersect) // partitioned read > with filtered partitions > > Lastly, we need to change the parallelism of 1) source1, 2) source2, and 3) > all of their downstream operators until (and including) their first common > ancestor (e.g., join) to be equal to the number of partitions (size of > p_intersect). > > 2. All other cases > > In all other cases, the parallelism of both sources and their downstream > operators until their common ancestor would be equal to the MIN(p_s1, > p_s2). > That is, minimum of the partition size of source1 and partition size of > source2 will be selected as the parallelism. > Coming back to your example, if source1 parallelism is 127 and source2 > parallelism is 128, then we will first check the partition size of source1 > and source2. > Say partition size of source1 is 100 and partition size of source2 is 90. > Then, we would set the parallelism for source1, source2, and all of their > downstream operators until (and including) the join operator > to 90 (min(100, 90)). > We also plan to implement a cost based decision instead of the rule-based > one (the ones explained above - MIN rule). > One possible result of the cost based estimation is to keep the partitions > on one side and perform the shuffling on another source. > > > > 2. In our current shuffle remove design (FlinkExpandConversionRule), > > we don't consider parallelism, we just remove unnecessary shuffles > > according to the distribution columns. After this FLIP, the > > parallelism may be bundled with source's partitions, then how will > > this optimization accommodate with FlinkExpandConversionRule, will you > > also change downstream operator's parallelisms if we want to also > > remove subsequent shuffles? 
> > > > - From my understanding of FlinkExpandConversionRule, its removal logic is > agnostic to operator parallelism. > So, if FlinkExpandConversionRule decides to remove a shuffle operation, > then this FLIP will search another possible shuffle (the one closest to the > source) to remove. > If there is such an opportunity, this FLIP will remove the shuffle. So, > from my understanding FlinkExpandConversionRule and this optimization rule > can work together safely. > Please correct me if I misunderstood your question. > > > > Regarding the new optimization rule, have you also considered to allow > > some non-strict mode like FlinkRelDistribution#requireStrict? For > > example, source is pre-partitioned by a, b columns, if we are > > consuming this source, and do a aggregate on a, b, c, can we utilize > > this optimization? > > > - Good point. Yes, there are some cases that non-strict mode will apply. > For example: > > - pre-partitioned columns and aggregate columns are the same
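The two cases described in the thread reduce to simple set arithmetic: with static partitions and pushdown on both sides, the target parallelism is the size of the partition intersection (p_intersect); in all other cases, the MIN rule over the two partition counts applies. A sketch under those assumptions; the class and method names are illustrative, not FLIP-434's actual interfaces:

```java
import java.util.HashSet;
import java.util.Set;

public class PartitionedJoinParallelism {
    /**
     * Case 1: static partitions + SupportsPartitionPushdown on both sides.
     * Both sources are restricted to the common partitions, so the target
     * parallelism is the size of the intersection p_intersect.
     */
    static int staticCaseParallelism(Set<String> partitionsSource1,
                                     Set<String> partitionsSource2) {
        Set<String> intersection = new HashSet<>(partitionsSource1);
        intersection.retainAll(partitionsSource2);
        return intersection.size();
    }

    /** Case 2 (all other cases): MIN rule over the partition counts. */
    static int minRuleParallelism(int partitionCount1, int partitionCount2) {
        return Math.min(partitionCount1, partitionCount2);
    }

    public static void main(String[] args) {
        // Both sides pushed down: only the common partitions p1, p2 are read.
        System.out.println(staticCaseParallelism(
                Set.of("p0", "p1", "p2"), Set.of("p1", "p2", "p3"))); // 2
        // Otherwise: 100 vs. 90 partitions -> both sides run with parallelism 90.
        System.out.println(minRuleParallelism(100, 90)); // 90
    }
}
```

The thread also notes a planned cost-based variant, which could instead keep one side's partitioning and shuffle the other; the rules above are only the rule-based baseline.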
[jira] [Created] (FLINK-34905) The default length of CHAR/BINARY data type of Add column DDL
Qishang Zhong created FLINK-34905: - Summary: The default length of CHAR/BINARY data type of Add column DDL Key: FLINK-34905 URL: https://issues.apache.org/jira/browse/FLINK-34905 Project: Flink Issue Type: Bug Components: Flink CDC Reporter: Qishang Zhong I ran the following DDL in MySQL: {code:java} ALTER TABLE test.products ADD Column1 BINARY NULL; ALTER TABLE test.products ADD Column2 CHAR NULL; {code} and encountered the following error: {code:java} Caused by: java.lang.IllegalArgumentException: Binary string length must be between 1 and 2147483647 (both inclusive). at org.apache.flink.cdc.common.types.BinaryType.<init>(BinaryType.java:53) at org.apache.flink.cdc.common.types.BinaryType.<init>(BinaryType.java:61) at org.apache.flink.cdc.common.types.DataTypes.BINARY(DataTypes.java:42) at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.convertFromColumn(MySqlTypeUtils.java:221) at org.apache.flink.cdc.connectors.mysql.utils.MySqlTypeUtils.fromDbzColumn(MySqlTypeUtils.java:111) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.toCdcColumn(CustomAlterTableParserListener.java:256) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.lambda$exitAlterByAddColumn$0(CustomAlterTableParserListener.java:126) at io.debezium.connector.mysql.antlr.MySqlAntlrDdlParser.runIfNotNull(MySqlAntlrDdlParser.java:358) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomAlterTableParserListener.exitAlterByAddColumn(CustomAlterTableParserListener.java:98) at io.debezium.ddl.parser.mysql.generated.MySqlParser$AlterByAddColumnContext.exitRule(MySqlParser.java:15459) at io.debezium.antlr.ProxyParseTreeListenerUtil.delegateExitRule(ProxyParseTreeListenerUtil.java:64) at org.apache.flink.cdc.connectors.mysql.source.parser.CustomMySqlAntlrDdlParserListener.exitEveryRule(CustomMySqlAntlrDdlParserListener.java:124) at org.antlr.v4.runtime.tree.ParseTreeWalker.exitRule(ParseTreeWalker.java:48) at 
org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:30) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at org.antlr.v4.runtime.tree.ParseTreeWalker.walk(ParseTreeWalker.java:28) at io.debezium.antlr.AntlrDdlParser.parse(AntlrDdlParser.java:87) at org.apache.flink.cdc.connectors.mysql.source.MySqlEventDeserializer.deserializeSchemaChangeRecord(MySqlEventDeserializer.java:88) at org.apache.flink.cdc.debezium.event.SourceRecordEventDeserializer.deserialize(SourceRecordEventDeserializer.java:52) at org.apache.flink.cdc.debezium.event.DebeziumEventDeserializationSchema.deserialize(DebeziumEventDeserializationSchema.java:93) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitElement(MySqlRecordEmitter.java:119) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.processElement(MySqlRecordEmitter.java:96) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlPipelineRecordEmitter.processElement(MySqlPipelineRecordEmitter.java:120) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitRecord(MySqlRecordEmitter.java:73) at org.apache.flink.cdc.connectors.mysql.source.reader.MySqlRecordEmitter.emitRecord(MySqlRecordEmitter.java:46) at org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:160) at org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:419) at org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:68) at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65) at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:562) at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231) at org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:858) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:807) at org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:953) at org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:932) at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:746) at
[jira] [Created] (FLINK-34904) [Feature] submit Flink CDC pipeline job to yarn application cluster.
ZhengYu Chen created FLINK-34904: Summary: [Feature] submit Flink CDC pipeline job to yarn application cluster. Key: FLINK-34904 URL: https://issues.apache.org/jira/browse/FLINK-34904 Project: Flink Issue Type: Improvement Components: Flink CDC Affects Versions: 3.1.0 Reporter: ZhengYu Chen Fix For: 3.1.0 Support the Flink CDC CLI submitting pipeline jobs to a YARN application cluster. Discussed in FLINK-34853. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations and look forward to its further development! Best Regards, Yu On Thu, 21 Mar 2024 at 15:54, ConradJam wrote: > > Congratulations! > > Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > > > Hi devs and users, > > > > We are thrilled to announce that the donation of Flink CDC as a > > sub-project of Apache Flink has completed. We invite you to explore the new > > resources available: > > > > - GitHub Repository: https://github.com/apache/flink-cdc > > - Flink CDC Documentation: > > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > > > After Flink community accepted this donation[1], we have completed > > software copyright signing, code repo migration, code cleanup, website > > migration, CI migration and github issues migration etc. > > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > > contributions and help during this process! > > > > > > For all previous contributors: The contribution process has slightly > > changed to align with the main Flink project. To report bugs or suggest new > > features, please open tickets > > Apache Jira (https://issues.apache.org/jira). Note that we will no > > longer accept GitHub issues for these purposes. > > > > > > Welcome to explore the new repository and documentation. Your feedback and > > contributions are invaluable as we continue to improve Flink CDC. > > > > Thanks everyone for your support and happy exploring Flink CDC! > > > > Best, > > Leonard > > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > > > > > -- > Best > > ConradJam
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Bests, Zhongqiang Gong Ferenc Csaky wrote on Wed, Mar 20, 2024 at 22:11: > Hello devs, > > I would like to start a vote about FLIP-439 [1]. The FLIP is about to > externalize the Kudu > connector from the recently retired Apache Bahir project [2] to keep it > maintainable and > make it up to date as well. Discussion thread [3]. > > The vote will be open for at least 72 hours (until 2024 March 23 14:03 > UTC) unless there > are any objections or insufficient votes. > > Thanks, > Ferenc > > [1] > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > [2] https://attic.apache.org/projects/bahir.html > [3] https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - GitHub Repository: https://github.com/apache/flink-cdc > - Flink CDC Documentation: > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > After Flink community accepted this donation[1], we have completed > software copyright signing, code repo migration, code cleanup, website > migration, CI migration and github issues migration etc. > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > contributions and help during this process! > > > For all previous contributors: The contribution process has slightly > changed to align with the main Flink project. To report bugs or suggest new > features, please open tickets > Apache Jira (https://issues.apache.org/jira). Note that we will no > longer accept GitHub issues for these purposes. > > > Welcome to explore the new repository and documentation. Your feedback and > contributions are invaluable as we continue to improve Flink CDC. > > Thanks everyone for your support and happy exploring Flink CDC! > > Best, > Leonard > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > > -- Best ConradJam
Re: [VOTE] Apache Flink Kubernetes Operator Release 1.8.0, release candidate #1
+1 (binding) As per Gyula's suggestion above, verified with "ghcr.io/apache/flink-kubernetes-operator:91d67d9". - Verified Helm repo works as expected, points to correct image tag, build, version - Verified basic examples + checked operator logs, everything looks as expected - Verified hashes, signatures and source release contains no binaries - Ran built-in tests, built jars + docker image from source successfully - Upgraded the operator and the CRD from 1.7.0 to 1.8.0 Best, Marton On Wed, Mar 20, 2024 at 9:10 PM Mate Czagany wrote: > Hi, > > +1 (non-binding) > > - Verified checksums > - Verified signatures > - Verified no binaries in source distribution > - Verified Apache License and NOTICE files > - Executed tests > - Built container image > - Verified chart version and appVersion matches > - Verified Helm chart can be installed with default values > - Verified that RC repo works as Helm repo > > Best Regards, > Mate > > Alexander Fedulov wrote (on Tue, Mar 19, 2024, 23:10): > > Hi Max, > > +1 > > - Verified SHA checksums > > - Verified GPG signatures > > - Verified that the source distributions do not contain binaries > > - Verified built-in tests (mvn clean verify) > > - Verified build with Java 11 (mvn clean install -DskipTests -T 1C) > > - Verified that Helm and operator files contain Apache licenses (rg -L > > --files-without-match "http://www.apache.org/licenses/LICENSE-2.0" .). 
> > I am not sure we need to > > include ./examples/flink-beam-example/dependency-reduced-pom.xml > > and ./flink-autoscaler-standalone/dependency-reduced-pom.xml though > > - Verified that chart and appVersion matches the target release (91d67d9) > > - Verified that Helm chart can be installed from the local Helm folder > > without overriding any parameters > > - Verified that Helm chart can be installed from the RC repo without > > overriding any parameters ( > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1 > > ) > > - Verified docker container build > > > > Best, > > Alex > > > > > > On Mon, 18 Mar 2024 at 20:50, Maximilian Michels wrote: > > > > > @Rui @Gyula Thanks for checking the release! > > > > > > >A minor correction is that [3] in the email should point to: > > > >ghcr.io/apache/flink-kubernetes-operator:91d67d9 . But the helm chart > > and > > > > everything is correct. It's a typo in the vote email. > > > > > > Good catch. Indeed, for the linked Docker image 8938658 points to > > > HEAD^ of the rc branch, 91d67d9 is the HEAD. There are no code changes > > > between those two commits, except for updating the version. So the > > > votes are not impacted, especially because votes are casted against > > > the source release which, as you pointed out, contains the correct > > > image ref. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 18, 2024 at 9:54 AM Gyula Fóra > wrote: > > > > > > > > Hi Max! > > > > > > > > +1 (binding) > > > > > > > > - Verified source release, helm chart + checkpoints / signatures > > > > - Helm points to correct image > > > > - Deployed operator, stateful example and executed upgrade + > savepoint > > > > redeploy > > > > - Verified logs > > > > - Flink web PR looks good +1 > > > > > > > > A minor correction is that [3] in the email should point to: > > > > ghcr.io/apache/flink-kubernetes-operator:91d67d9 . 
But the helm > chart > > > and > > > > everything is correct. It's a typo in the vote email. > > > > > > > > Thank you for preparing the release! > > > > > > > > Cheers, > > > > Gyula > > > > > > > > On Mon, Mar 18, 2024 at 8:26 AM Rui Fan <1996fan...@gmail.com> > wrote: > > > > > > > > > Thanks Max for driving this release! > > > > > > > > > > +1(non-binding) > > > > > > > > > > - Downloaded artifacts from dist ( svn co > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > ) > > > > > - Verified SHA512 checksums : ( for i in *.tgz; do echo $i; > sha512sum > > > > > --check $i.sha512; done ) > > > > > - Verified GPG signatures : ( $ for i in *.tgz; do echo $i; gpg > > > --verify > > > > > $i.asc $i ) > > > > > - Build the source with java-11 and java-17 ( mvn -T 20 clean > install > > > > > -DskipTests ) > > > > > - Verified the license header during build the source > > > > > - Verified that chart and appVersion matches the target release > (less > > > the > > > > > index.yaml and Chart.yaml ) > > > > > - RC repo works as Helm repo( helm repo add > > > flink-operator-repo-1.8.0-rc1 > > > > > > > > > > > > > > > > https://dist.apache.org/repos/dist/dev/flink/flink-kubernetes-operator-1.8.0-rc1/ > > > > > ) > > > > > - Verified Helm chart can be installed ( helm install > > > > > flink-kubernetes-operator > > > > > flink-operator-repo-1.8.0-rc1/flink-kubernetes-operator --set > > > > > webhook.create=false ) > > > > > - Submitted the autoscaling demo, the autoscaler works well with > > > *memory > > >
[jira] [Created] (FLINK-34903) Add mysql-pipeline-connector with table.exclude.list option to exclude unnecessary tables
shiyuyang created FLINK-34903: - Summary: Add mysql-pipeline-connector with table.exclude.list option to exclude unnecessary tables Key: FLINK-34903 URL: https://issues.apache.org/jira/browse/FLINK-34903 Project: Flink Issue Type: Improvement Components: Flink CDC Reporter: shiyuyang Fix For: cdc-3.1.0 When using the MySQL pipeline connector for whole-database synchronization, users currently cannot exclude unnecessary tables. Referring to Debezium's parameters: if *table.include.list* is declared, the *table.exclude.list* parameter does not take effect, and the tables specified in the MySQL pipeline connector's tables parameter are effectively added to Debezium's *table.include.list*. In summary, the MySQL pipeline connector should expose a *table.exclude.list* parameter so that users can exclude tables, because the current setup offers no way to exclude unnecessary tables once others are included through the tables parameter. -- This message was sent by Atlassian Jira (v8.20.10#820010)
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Bests, Samrat On Thu, Mar 21, 2024 at 12:55 PM Hang Ruan wrote: > +1 (non-binding) > > Best, > Hang > > Őrhidi Mátyás wrote on Thu, Mar 21, 2024 at 00:00: > > > +1 (binding) > > > > On Wed, Mar 20, 2024 at 8:37 AM Gabor Somogyi > > > wrote: > > > > > +1 (binding) > > > > > > G > > > > > > > > > On Wed, Mar 20, 2024 at 3:59 PM Gyula Fóra > wrote: > > > > > > > +1 (binding) > > > > > > > > Thanks! > > > > Gyula > > > > > > > > On Wed, Mar 20, 2024 at 3:36 PM Mate Czagany > > wrote: > > > > > > > > > +1 (non-binding) > > > > > > > > > > Thank you, > > > > > Mate > > > > > > > > > > Ferenc Csaky wrote (on Wed, Mar 20, 2024, 15:11): > > > > > > > > > > > Hello devs, > > > > > > > > > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is > about > > to > > > > > > externalize the Kudu > > > > > > connector from the recently retired Apache Bahir project [2] to > > keep > > > it > > > > > > maintainable and > > > > > > make it up to date as well. Discussion thread [3]. > > > > > > > > > > > > The vote will be open for at least 72 hours (until 2024 March 23 > > > 14:03 > > > > > > UTC) unless there > > > > > > are any objections or insufficient votes. > > > > > > > > > > > > Thanks, > > > > > > Ferenc > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > > > > > [2] https://attic.apache.org/projects/bahir.html > > > > > > [3] > > https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > > > > > > > >
Re: [ANNOUNCE] Donation Flink CDC into Apache Flink has Completed
Congratulations! Thanks for the great work! Best, Zhongqiang Gong Leonard Xu wrote on Wed, Mar 20, 2024 at 21:36: > Hi devs and users, > > We are thrilled to announce that the donation of Flink CDC as a > sub-project of Apache Flink has completed. We invite you to explore the new > resources available: > > - GitHub Repository: https://github.com/apache/flink-cdc > - Flink CDC Documentation: > https://nightlies.apache.org/flink/flink-cdc-docs-stable > > After Flink community accepted this donation[1], we have completed > software copyright signing, code repo migration, code cleanup, website > migration, CI migration and github issues migration etc. > Here I am particularly grateful to Hang Ruan, Zhongqiang Gong, Qingsheng > Ren, Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their > contributions and help during this process! > > > For all previous contributors: The contribution process has slightly > changed to align with the main Flink project. To report bugs or suggest new > features, please open tickets > Apache Jira (https://issues.apache.org/jira). Note that we will no > longer accept GitHub issues for these purposes. > > > Welcome to explore the new repository and documentation. Your feedback and > contributions are invaluable as we continue to improve Flink CDC. > > Thanks everyone for your support and happy exploring Flink CDC! > > Best, > Leonard > [1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob > >
Re: [VOTE] FLIP-439: Externalize Kudu Connector from Bahir
+1 (non-binding) Best, Hang Őrhidi Mátyás wrote on Thu, Mar 21, 2024 at 00:00: > +1 (binding) > > On Wed, Mar 20, 2024 at 8:37 AM Gabor Somogyi > wrote: > > > +1 (binding) > > > > G > > > > > > On Wed, Mar 20, 2024 at 3:59 PM Gyula Fóra wrote: > > > > > +1 (binding) > > > > > > Thanks! > > > Gyula > > > > > > On Wed, Mar 20, 2024 at 3:36 PM Mate Czagany > wrote: > > > > > > > +1 (non-binding) > > > > > > > > Thank you, > > > > Mate > > > > > > > > Ferenc Csaky wrote (on Wed, Mar 20, 2024, 15:11): > > > > > > > > > Hello devs, > > > > > > > > > > I would like to start a vote about FLIP-439 [1]. The FLIP is about > to > > > > > externalize the Kudu > > > > > connector from the recently retired Apache Bahir project [2] to > keep > > it > > > > > maintainable and > > > > > make it up to date as well. Discussion thread [3]. > > > > > > > > > > The vote will be open for at least 72 hours (until 2024 March 23 > > 14:03 > > > > > UTC) unless there > > > > > are any objections or insufficient votes. > > > > > > > > > > Thanks, > > > > > Ferenc > > > > > > > > > > [1] > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-439%3A+Externalize+Kudu+Connector+from+Bahir > > > > > [2] https://attic.apache.org/projects/bahir.html > > > > > [3] > https://lists.apache.org/thread/oydhcfkco2kqp4hdd1glzy5vkw131rkz > > > > > >
[jira] [Created] (FLINK-34902) INSERT INTO column mismatch leads to IndexOutOfBoundsException
Timo Walther created FLINK-34902: Summary: INSERT INTO column mismatch leads to IndexOutOfBoundsException Key: FLINK-34902 URL: https://issues.apache.org/jira/browse/FLINK-34902 Project: Flink Issue Type: Bug Components: Table SQL / Planner Reporter: Timo Walther SQL: {code} INSERT INTO t (a, b) SELECT 1; {code} Stack trace: {code} org.apache.flink.table.api.ValidationException: SQL validation failed. Index 1 out of bounds for length 1 at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.org$apache$flink$table$planner$calcite$FlinkPlannerImpl$$validate(FlinkPlannerImpl.scala:200) at org.apache.flink.table.planner.calcite.FlinkPlannerImpl.validate(FlinkPlannerImpl.scala:117) at Caused by: java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1 at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:64) at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:70) at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:248) at java.base/java.util.Objects.checkIndex(Objects.java:374) at java.base/java.util.ArrayList.get(ArrayList.java:459) at org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$reorder$1(PreValidateReWriter.scala:355) at org.apache.flink.table.planner.calcite.PreValidateReWriter$.$anonfun$reorder$1$adapted(PreValidateReWriter.scala:355) {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)