Re: Flink kinesis connector 4.3.0 release estimated date

2024-05-22 Thread Leonard Xu
Hey, Vararu

The kinesis connector 4.3.0 release is in the vote phase, and we hope to finalize 
the release this week if everything goes well.


Best,
Leonard


> On May 22, 2024, at 11:51 PM, Vararu, Vadim wrote:
> 
> Hi guys,
>  
> Any idea when the 4.3.0 kinesis connector is estimated to be released?
>  
> Cheers,
> Vadim.



Re: [ANNOUNCE] Apache Flink CDC 3.1.0 released

2024-05-17 Thread Leonard Xu
Congratulations!

Thanks Qingsheng for the great work and all contributors involved!

Best,
Leonard


> On May 17, 2024, at 5:32 PM, Qingsheng Ren wrote:
> 
> The Apache Flink community is very happy to announce the release of
> Apache Flink CDC 3.1.0.
> 
> Apache Flink CDC is a distributed data integration tool for real-time
> and batch data, bringing the simplicity and elegance of YAML-based
> data integration to describing the data movement and transformation
> in a data pipeline.
> 
> Please check out the release blog post for an overview of the release:
> https://flink.apache.org/2024/05/17/apache-flink-cdc-3.1.0-release-announcement/
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html
> 
> Maven artifacts for Flink CDC can be found at:
> https://search.maven.org/search?q=g:org.apache.flink%20cdc
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12354387
> 
> We would like to thank all contributors of the Apache Flink community
> who made this release possible!
> 
> Regards,
> Qingsheng Ren



[ANNOUNCE] Donation of Flink CDC into Apache Flink has Completed

2024-03-20 Thread Leonard Xu
Hi devs and users,

We are thrilled to announce that the donation of Flink CDC as a sub-project of 
Apache Flink has completed. We invite you to explore the new resources 
available:

- GitHub Repository: https://github.com/apache/flink-cdc
- Flink CDC Documentation: 
https://nightlies.apache.org/flink/flink-cdc-docs-stable

After the Flink community accepted this donation [1], we completed the software 
copyright signing, code repo migration, code cleanup, website migration, CI 
migration, GitHub issues migration, etc. 
I am particularly grateful to Hang Ruan, Zhongqaing Gong, Qingsheng Ren, 
Jiabao Sun, LvYanquan, loserwang1024 and other contributors for their 
contributions and help during this process!


For all previous contributors: the contribution process has changed slightly to 
align with the main Flink project. To report bugs or suggest new features, 
please open tickets in 
Apache Jira (https://issues.apache.org/jira). Note that we will no longer 
accept GitHub issues for these purposes.


We welcome you to explore the new repository and documentation. Your feedback and 
contributions are invaluable as we continue to improve Flink CDC.

Thanks to everyone for your support, and happy exploring Flink CDC!

Best,
Leonard
[1] https://lists.apache.org/thread/cw29fhsp99243yfo95xrkw82s5s418ob



Re: [ANNOUNCE] Apache Flink 1.19.0 released

2024-03-18 Thread Leonard Xu
Congratulations, and thanks to the release managers and everyone involved for the great work!


Best,
Leonard

> On Mar 18, 2024, at 4:32 PM, Jingsong Li wrote:
> 
> Congratulations!
> 
> On Mon, Mar 18, 2024 at 4:30 PM Rui Fan <1996fan...@gmail.com> wrote:
>> 
>> Congratulations, thanks for the great work!
>> 
>> Best,
>> Rui
>> 
>> On Mon, Mar 18, 2024 at 4:26 PM Lincoln Lee  wrote:
>>> 
>>> The Apache Flink community is very happy to announce the release of Apache 
>>> Flink 1.19.0, which is the first release for the Apache Flink 1.19 series.
>>> 
>>> Apache Flink® is an open-source stream processing framework for 
>>> distributed, high-performing, always-available, and accurate data streaming 
>>> applications.
>>> 
>>> The release is available for download at:
>>> https://flink.apache.org/downloads.html
>>> 
>>> Please check out the release blog post for an overview of the improvements 
>>> for this release:
>>> https://flink.apache.org/2024/03/18/announcing-the-release-of-apache-flink-1.19/
>>> 
>>> The full release notes are available in Jira:
>>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353282
>>> 
>>> We would like to thank all contributors of the Apache Flink community who 
>>> made this release possible!
>>> 
>>> 
>>> Best,
>>> Yun, Jing, Martijn and Lincoln



Re: Unsubscribe

2024-02-22 Thread Leonard Xu
You can send an email with any content to user-zh-unsubscr...@flink.apache.org to 
unsubscribe from the user-zh@flink.apache.org mailing list. For mailing-list 
subscription management, see [1].

Best,
[1] https://flink.apache.org/zh/what-is-flink/community/

> On Feb 20, 2024, at 4:36 PM, 任香帅 wrote:
> 
> Unsubscribe



Re: [ANNOUNCE] Apache Flink 1.18.1 released

2024-01-21 Thread Leonard Xu
Thanks Jing for driving the release, nice work!

Thanks to all who were involved in this release!

Best,
Leonard

> On Jan 20, 2024, at 12:01 AM, Jing Ge wrote:
> 
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.18.1, which is the first bugfix release for the Apache Flink 1.18
> series.
> 
> Apache Flink® is an open-source stream processing framework for
> distributed, high-performing, always-available, and accurate data streaming
> applications.
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html
> 
> Please check out the release blog post for an overview of the improvements
> for this bugfix release:
> https://flink.apache.org/2024/01/19/apache-flink-1.18.1-release-announcement/
> 
> Please note: users that use state compression should not migrate to 1.18.1
> (nor 1.18.0) due to a critical bug that could lead to data loss. Please
> refer to FLINK-34063 for more information.
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353640
> 
> We would like to thank all contributors of the Apache Flink community who
> made this release possible! Special thanks to @Qingsheng Ren @Leonard Xu
> @Xintong Song @Matthias Pohl @Martijn Visser for the support during this
> release.
> 
> A Jira task series based on the Flink release wiki has been created for
> 1.18.1 release. Tasks that need to be done by PMC have been explicitly
> created separately. It will be convenient for the release manager to reach
> out to PMC for those tasks. Any future patch release could consider cloning
> it and follow the standard release process.
> https://issues.apache.org/jira/browse/FLINK-33824
> 
> Feel free to reach out to the release managers (or respond to this thread)
> with feedback on the release process. Our goal is to constantly improve the
> release process. Feedback on what could be improved or things that didn't
> go so well are appreciated.
> 
> Regards,
> Jing



Re: [ANNOUNCE] Apache Flink 1.17.2 released

2023-11-28 Thread Leonard Xu

Thanks Yun for driving the release.
Thanks a lot to everyone who contributed bug fixes and other 
improvements!

Best,
Leonard


> On Nov 29, 2023, at 1:05 PM, Yun Tang wrote:
> 
> The Apache Flink community is very happy to announce the release of Apache 
> Flink 1.17.2, which is the second bugfix release for the Apache Flink 1.17 
> series.
> 
> Apache Flink® is a framework and distributed processing engine for stateful 
> computations over unbounded and bounded data streams. Flink has been designed 
> to run in all common cluster environments, perform computations at in-memory 
> speed and at any scale.
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html 
> 
> 
> Please check out the release blog post for an overview of the improvements 
> for this bugfix release: 
> https://flink.apache.org/2023/11/29/apache-flink-1.17.2-release-announcement/ 
> 
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353260
> 
> 
> We would like to thank all contributors of the Apache Flink community who 
> made this release possible!
> 
> 
> Feel free to reach out to the release managers (or respond to this thread) 
> with feedback on the release process. Our goal is to constantly improve the 
> release process. Feedback on what could be improved or things that didn't go 
> so well are appreciated.
> 
> 
> Regards,
> Release Manager



Re: dependency error with latest Kafka connector

2023-11-24 Thread Leonard Xu

> built a fat uber jar from quickstart with Flink 1.18.0 for 
> flink-streaming-java and flink-clients, and flink-connector-kafka version 
> 3.0.1-1.18
> then submitted to local Flink cluster 1.18.0. Things worked as expected and 
> the job ran fine.
Hey @Gordon,
I guess things work as expected when you submit your fat-jar job to the 
cluster because flink-connector-base (1.18.0 in this case) is included 
in the flink-dist jar [1], which appears on your classpath. However, you may hit the 
issue when running in a local IDE environment; maybe you can run a local test to 
verify this.

In the end, I think we need to backport FLINK-30400 to the Flink Kafka 
connector 3.0 branch and prepare a 3.0.2 soon.

Best,
Leonard
[1] 
https://github.com/apache/flink/blob/977463cce3ea0f88e2f184c30720bf4e8e97fd4a/flink-dist/pom.xml#L156

Re: dependency error with latest Kafka connector

2023-11-23 Thread Leonard Xu
Hi Günterh,

This looks like a bug to me: the 3.0.1-1.18 Flink Kafka connector uses Flink 1.17 
dependencies, which leads to your issue.

I guess we need to propose a new release of the Kafka connector to fix this issue.

CC: Gordon, Danny, Martijn

Best,
Leonard

> On Nov 14, 2023, at 6:53 PM, Alexey Novakov via user wrote:
> 
> Hi Günterh,
> 
> It looks like a problem with the Kafka connector release. 
> https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka/3.0.1-1.18
>  
> 
> Compile dependencies are still pointing to Flink 1.17.
> 
> Release person is already contacted about this or will be contacted soon.
> 
> Best regards,
> Alexey
> 
> On Mon, Nov 13, 2023 at 10:42 PM guenterh.lists wrote:
> Hello
> 
> I'm getting a dependency error when using the latest Kafka connector in 
> a Scala project.
> 
> With the 1.17.1 Kafka connector, compilation is OK.
> 
> With
> 
> "org.apache.flink" % "flink-connector-kafka" % "3.0.1-1.18"
> 
> I get
> [error] (update) sbt.librarymanagement.ResolveException: Error 
> downloading org.apache.flink:flink-connector-base:
> [error]   Not found
> [error]   Not found
> [error]   not found: 
> /home/swissbib/.ivy2/local/org.apache.flink/flink-connector-base/ivys/ivy.xml
> [error]   not found: 
> https://repo1.maven.org/maven2/org/apache/flink/flink-connector-base//flink-connector-base-.pom
>  
> 
> 
> Seems Maven packaging is not correct.
> 
> My sbt build file:
> 
> ThisBuild / scalaVersion := "3.3.0"
> val flinkVersion = "1.18.0"
> val postgresVersion = "42.2.2"
> 
> lazy val root = (project in file(".")).settings(
>   name := "flink-scala-proj",
>   libraryDependencies ++= Seq(
>     "org.flinkextended" %% "flink-scala-api" % "1.17.1_1.1.0",
>     "org.apache.flink" % "flink-clients" % flinkVersion % Provided,
>     "org.apache.flink" % "flink-connector-files" % flinkVersion % Provided,
> 
>     "org.apache.flink" % "flink-connector-kafka" % "1.17.1",
>     //"org.apache.flink" % "flink-connector-kafka" % "3.0.1-1.18",
> 
>     //"org.apache.flink" % "flink-connector-jdbc" % "3.1.1-1.17",
>     //"org.postgresql" % "postgresql" % postgresVersion,
>     "org.apache.flink" % "flink-connector-files" % flinkVersion % Provided,
>     //"org.apache.flink" % "flink-connector-base" % flinkVersion % Provided
>   )
> )
> 
> 
> 
> Thanks!
> 
> -- 
> Günter Hipler
> https://openbiblio.social/@vog61 
> https://twitter.com/vog61 
> 



Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Leonard Xu
Congratulations, well done!

Best,
Leonard

On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee  wrote:

> Thanks for the great work! Congrats all!
>
> Best,
> Lincoln Lee
>
>
> On Fri, Oct 27, 2023 at 00:16, Jing Ge wrote:
>
> > The Apache Flink community is very happy to announce the release of
> Apache
> > Flink 1.18.0, which is the first release for the Apache Flink 1.18
> series.
> >
> > Apache Flink® is an open-source unified stream and batch data processing
> > framework for distributed, high-performing, always-available, and
> accurate
> > data applications.
> >
> > The release is available for download at:
> > https://flink.apache.org/downloads.html
> >
> > Please check out the release blog post for an overview of the
> improvements
> > for this release:
> >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> >
> > The full release notes are available in Jira:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12352885
> >
> > We would like to thank all contributors of the Apache Flink community who
> > made this release possible!
> >
> > Best regards,
> > Konstantin, Qingsheng, Sergey, and Jing
> >
>


Re: [ANNOUNCE] Apache Flink has won the 2023 SIGMOD Systems Award

2023-07-06 Thread Leonard Xu
Congrats to all!

It will be helpful for promoting Apache Flink if we can add a page to our website 
like other projects have [2]. I’ve created an issue [1] to improve this.


Best,
Leonard

[1] https://issues.apache.org/jira/browse/FLINK-32555 
[2] https://spark.apache.org/news/sigmod-system-award.html

Re: Unsubscribe

2023-06-14 Thread Leonard Xu
Please send an email to user-unsubscr...@flink.apache.org if you want to 
unsubscribe from the user@flink.apache.org mailing list; you can refer to [1][2] for 
more details.

Best,
Leonard
[1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

> On Jun 15, 2023, at 10:40 AM, yanglele via user  wrote:
> 
> 
> 
> Unsubscribe
> 
> 
> 
> 
> 
> 
> 
> 
> - Original message -
> 
> From: Robin Cassan via user
> 
> Sent: 2023-06-14 23:13:09
> 
> To: Gyula Fóra
> 
> Cc: user
> 
> Subject: Re: Kubernetes operator: config for taskmanager.memory.process.size 
> ignored
> 
> Thanks again, maybe the jvm overhead param will act as the margin I want, 
> I'll try that :)
> Robin
> 
> 
> On Wed, Jun 14, 2023 at 3:28 PM, Gyula Fóra wrote:
> Again, this has absolutely nothing to do with the Kubernetes Operator, but 
> simply how Flink Kubernetes Memory configs work:
> https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/memory/mem_tuning/#configure-memory-for-containers
>  
> 
>  
> 
>  
> You can probably play around with:  jobmanager.memory.jvm-overhead.fraction
> 
>  
> You can set a larger memory size in the TM spec and increase the jvm overhead 
> fraction.
> 
>  
> Gyula
> 
>  
> On Wed, Jun 14, 2023 at 2:46 PM Robin Cassan wrote:
>  
> Thanks Gyula for your answer! I'm wondering about your claim:
>  > In Flink kubernetes the process is the pod so pod memory is always 
> equal to process memory
> Why should the flink TM process use the whole container (and so, the whole 
> pod) memory?
> 
>
> Before migrating to the k8s operator, we still used Flink on kubernetes 
> (without the operator) and left a little bit of margin between the process 
> memory and the pod memory, which helped stability. It looks like it cannot be 
> done with the k8s operator though, and I wonder why this granularity was 
> removed from the settings.
> 
>
> Robin
> 
>
> On Wed, Jun 14, 2023 at 12:20 PM, Gyula Fóra wrote:
>
> Basically what happens is that whatever you set to the 
> spec.taskManager.resource.memory will be set in the config as process memory.
> In Flink kubernetes the process is the pod so pod memory is always equal to 
> process memory.
> 
>
> So basically the spec is a config shorthand, there is no reason to override 
> it as you won't get a different behaviour at the end of the day.
> 
>
> Gyula
> 
>
> On Wed, Jun 14, 2023 at 11:55 AM Robin Cassan via user wrote:
>
> Hello all!
> 
>  
> I am using the flink kubernetes operator and I would like to set the value 
> for `taskmanager.memory.process.size`. I set the desired value in the 
> flinkdeployment resource specs (here, I want 55gb), however it looks like the 
> value that is effectively passed to the taskmanager is the same as the pod 
> memory setting (which is set to 59gb).
> 
>  
> For example, this flinkdeployment configuration:
>  
> ```
> Spec:
>   Flink Configuration:
>     taskmanager.memory.process.size: 55gb
>   Task Manager:
>     Resource:
>       Cpu: 6
>       Memory: 59Gb
> ```
> will create a pod with 59Gb total memory (as expected) but will also give 
> 59Gb to the memory.process.size instead of 55Gb, as seen in this TM log: 
> `Loading configuration property: taskmanager.memory.process.size, 59Gb`
>  
> 
>  Maybe this part of the flink k8s operator code is responsible:
>  
> https://github.com/apache/flink-kubernetes-operator/blob/d43e1ca9050e83b492b2e16b0220afdba4ffa646/flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/config/FlinkConfigBuilder.java#L393
>  
> 
>   
> 
>  
> If so, I wonder what the rationale is for forcing the flink process memory to 
> be the same as the pod memory?
> Is there a way to bypass that, for example by setting the desired 
> process.memory configuration differently?
> 
>  
> Thanks!
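
A hedged sketch of the workaround discussed above, extending the FlinkDeployment spec 
quoted earlier — keep the full pod size as the process memory and enlarge the JVM 
overhead fraction to regain headroom (taskmanager.memory.jvm-overhead.fraction is the 
taskmanager-side counterpart of the option Gyula names; 0.15 is an illustrative value):

Spec:
  Flink Configuration:
    taskmanager.memory.jvm-overhead.fraction: '0.15'
  Task Manager:
    Resource:
      Cpu: 6
      Memory: 59Gb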



[ANNOUNCE] Apache flink-connector-pulsar v3.0.1 released

2023-06-07 Thread Leonard Xu
The Apache Flink community is very happy to announce the release of Apache 
flink-connector-pulsar v3.0.1. 
This release is compatible with the Flink 1.16.x series.

Apache Flink® is an open-source stream processing framework for distributed, 
high-performing, always-available, and accurate data streaming applications.

The release is available for download at:
https://flink.apache.org/downloads.html

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352640

We would like to thank all contributors of the Apache Flink community who made 
this release possible!

Regards,
Leonard

Re: Question

2023-05-22 Thread Leonard Xu
(1) Check whether another job or synchronization tool is using the corresponding server-id.
(2) You can try generating the server-id from the machine IP plus a timestamp, which avoids conflicts as much as possible.
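
For example, a minimal sketch of declaring a server-id range on a MySQL CDC source 
table (the schema and connection options below are illustrative; the range width 
should be at least the source parallelism):

CREATE TABLE orders_src (
  order_id BIGINT,
  price DECIMAL(10, 2)
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flink_user',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'orders',
  -- an 8-id range, enough for parallelism <= 8; widen it for more source tasks
  'server-id' = '5400-5407'
);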

Best,
雪尽

> On May 22, 2023, at 3:34 PM, 曹明勤  wrote:
> 
> In the flink-cdc-mysql task I submitted, Flink was required to synchronize 
> data of multiple tables, but I encountered the problem of server-id 
> duplication. I tried to set a random number, but server-id has a range of 
> values, and random numbers can be repeated. The official documentation 
> advised me to set server-id to a range, such as 5400-6400, and set flink's 
> parallelism. I did all of this, but when I synchronized a large number of 
> tables, I still had the problem of server-id duplication, which caused the 
> task submission to fail. What do I need to set up to avoid this error?



Re: checkpoint Kafka Offset commit failed

2023-05-04 Thread Leonard Xu
You can send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe 
from the user-zh@flink.apache.org mailing list. For mailing-list subscription management, see [1].

Best,
Leonard
[1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8

> On May 4, 2023, at 9:00 PM, wuzhongxiu wrote:
> 
> Unsubscribe
> 
> go574...@163.com
> 
> 
> 
> 
>  Original message 
> | From | zhan...@eastcom-sw.com |
> | Date | 2023-05-04 14:54 |
> | To | user-zh |
> | Cc | |
> | Subject | checkpoint Kafka Offset commit failed |
> Hi, in Flink (1.14, 1.16), committing Kafka offsets on each checkpoint (10s interval) reports "The coordinator is not 
> available".
> 
> The Kafka cluster logs all look normal, offsets can be committed correctly by hand, and commits work again after restarting the Flink job, but they fail again after running for a while. Are there any parameters that could be tuned?
> 
> The Flink logs are as follows:
> 2023-05-04 11:31:02,636 WARN  
> org.apache.flink.connector.kafka.source.reader.KafkaSourceReader [] - Failed 
> to commit consumer offsets for checkpoint 69153
> org.apache.kafka.clients.consumer.RetriableCommitFailedException: Offset 
> commit failed with a retriable exception. You should retry committing the 
> latest consumed offsets.
> Caused by: org.apache.kafka.common.errors.CoordinatorNotAvailableException: 
> The coordinator is not available.



Re: Unsubscribe

2023-05-04 Thread Leonard Xu


If you want to unsubscribe from the user-zh@flink.apache.org mailing list, please send an email 
with any content to user-zh-unsubscr...@flink.apache.org; see [1].

[1] https://flink.apache.org/zh/community/

> On Apr 21, 2023, at 10:52 AM, 琴师 <1129656...@qq.com.INVALID> wrote:
> 
> Unsubscribe
> 
> 
> 琴师
> 1129656...@qq.com
> 
> 
> 
> 



Re: Unsubscribe

2023-04-18 Thread Leonard Xu

You can send an email with any content to user-unsubscr...@flink.apache.org to unsubscribe from the user@flink.apache.org 
mailing list. Subscribing to and unsubscribing from the other mailing lists works similarly; for subscription management, see [1].

Best,
Leonard Xu
[1] https://flink.apache.org/community.html#how-to-subscribe-to-a-mailing-list

> On Apr 18, 2023, at 2:53 PM, wangw...@sina.cn wrote:
> 
> Unsubscribe




Re: Issue with the flink version 1.10.1

2023-03-27 Thread Leonard Xu
Hi, Kiran

To be honest, both 1.10 and 1.9 are pretty old versions; it's hard to fix and 
release a bugfix version for 1.10.1 even if the community helps troubleshoot your 
issue. 
So, could you try the latest versions like Flink 1.16.1 or 1.17.0?

Best,
Leonard

> On Mar 27, 2023, at 8:28 PM, Kiran Kumar Kathe  
> wrote:
> 
> When I submit a job using Flink version 1.10.1, it does not update the 
> jobs that are running and completed successfully in the Web UI of the YARN 
> resource manager. But when I use Flink version 1.9.3 it works fine and 
> I am able to see the jobs that are running and completed in the 
> YARN resource manager Web UI. To find out why this is happening, I tried 
> replacing the application folders, and when I use the 
> flink_dist jar of version 1.9.3 in the lib folder in place of the 1.10.1 flink_dist, it 
> runs fine and I am able to see the jobs running and completed. Is this 
> the right way? If not, will I face any compatibility issues in the future with this 
> change of the flink_dist jar in the lib folder?



Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Leonard Xu
Congratulations!


Best,
Leonard

> On Mar 27, 2023, at 5:23 PM, Yu Li  wrote:
> 
> Dear Flinkers,
> 
> As you may have noticed, we are pleased to announce that Flink Table Store 
> has joined the Apache Incubator as a separate project called Apache 
> Paimon(incubating) [1] [2] [3]. The new project still aims at building a 
> streaming data lake platform for high-speed data ingestion, change data 
> tracking and efficient real-time analytics, with the vision of supporting a 
> larger ecosystem and establishing a vibrant and neutral open source community.
> 
> We would like to thank everyone for their great support and efforts for the 
> Flink Table Store project, and warmly welcome everyone to join the 
> development and activities of the new project. Apache Flink will continue to 
> be one of the first-class citizens supported by Paimon, and we believe that 
> the Flink and Paimon communities will maintain close cooperation.
> 
> Best Regards,
> Yu (on behalf of the Apache Flink PMC and Apache Paimon PPMC)
> 
> [1] https://paimon.apache.org/ 
> [2] https://github.com/apache/incubator-paimon 
> 
> [3] https://cwiki.apache.org/confluence/display/INCUBATOR/PaimonProposal 
> 


Re: Configuration parsing exception on startup after introducing flink-sql-connector-oracle-cdc-2.3.0.jar into a project

2023-03-25 Thread Leonard Xu
The flink-sql-connector-xx artifacts are all uber jars and should not be pulled into a project 
directly; in your project you should depend on flink-connector-xx and manage the dependencies yourself.
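
For example, a minimal sketch of the Maven dependency (the coordinates are assumed 
from the 2.3.0 CDC release line, which was published under the com.ververica group; 
your build then manages the transitive dependencies itself):

<dependency>
  <!-- thin connector jar instead of the flink-sql-connector-* uber jar -->
  <groupId>com.ververica</groupId>
  <artifactId>flink-connector-oracle-cdc</artifactId>
  <version>2.3.0</version>
</dependency>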


Best,
Leonard

> On Mar 25, 2023, at 3:25 PM, casel.chen  wrote:
> 
> After introducing flink-sql-connector-oracle-cdc-2.3.0.jar into my project, the following exception is thrown 
> during startup. I checked and the jar contains the class oracle.xml.jaxp.JXDocumentBuilderFactory. Is there any way to solve this?
> 
> 
> ERROR StatusLogger Caught javax.xml.parsers.ParserConfigurationException 
> setting feature http://xml.org/sax/features/external-general-entities to 
> false on DocumentBuilderFactory 
> oracle.xml.jaxp.JXDocumentBuilderFactory@68dc098b: 
> javax.xml.parsers.ParserConfigurationException
> javax.xml.parsers.ParserConfigurationException
> at 
> oracle.xml.jaxp.JXDocumentBuilderFactory.setFeature(JXDocumentBuilderFactory.java:374)
> at 
> org.apache.logging.log4j.core.config.xml.XmlConfiguration.setFeature(XmlConfiguration.java:204)
> at 
> org.apache.logging.log4j.core.config.xml.XmlConfiguration.disableDtdProcessing(XmlConfiguration.java:197)
> at 
> org.apache.logging.log4j.core.config.xml.XmlConfiguration.newDocumentBuilder(XmlConfiguration.java:186)
> at 
> org.apache.logging.log4j.core.config.xml.XmlConfiguration.(XmlConfiguration.java:89)
> at 
> org.apache.logging.log4j.core.config.xml.XmlConfigurationFactory.getConfiguration(XmlConfigurationFactory.java:46)
> at 
> org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:558)
> at 
> org.apache.logging.log4j.core.config.ConfigurationFactory$Factory.getConfiguration(ConfigurationFactory.java:482)
> at 
> org.apache.logging.log4j.core.config.ConfigurationFactory.getConfiguration(ConfigurationFactory.java:322)
> at 
> org.apache.logging.log4j.core.LoggerContext.reconfigure(LoggerContext.java:695)
> 



[ANNOUNCE] Apache Flink 1.17.0 released

2023-03-23 Thread Leonard Xu
The Apache Flink community is very happy to announce the release of Apache 
Flink 1.17.0, which is the first release for the Apache Flink 1.17 series.

Apache Flink® is an open-source unified stream and batch data processing 
framework for distributed, high-performing, always-available, and accurate data 
applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements for 
this release:
https://flink.apache.org/2023/03/23/announcing-the-release-of-apache-flink-1.17/

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12351585

We would like to thank all contributors of the Apache Flink community who made 
this release possible!

Best regards,
Qingsheng, Martijn, Matthias and Leonard


Re: IntervalJoin invisibly becomes a regular Join - why?

2023-03-15 Thread Leonard Xu
> 
> CREATE TEMPORARY VIEW filteredResults AS
> SELECT * from suspiciousOrders WHERE small_ts > large_ts;

It looks like after adding the condition, the final expanded query no longer matches the 
conditions [1] that let the planner recognize it as an interval join. It’s not a bug; an 
interval join is a special case of a regular join, so the result is still correct.

[1] 
https://nightlies.apache.org/flink/flink-docs-release-1.14/zh/docs/dev/datastream/operators/joining/#interval-join
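
For reference, a minimal sketch of a join the planner can recognize as an interval 
join — the join predicate must bound the two time attributes against each other (the 
table and column names follow the thread and are hypothetical):

SELECT *
FROM small_orders s, large_orders l
WHERE s.id = l.id
  AND s.small_ts BETWEEN l.large_ts - INTERVAL '1' HOUR AND l.large_ts;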
 

Re: is there any detrimental side-effect if I set the max parallelism as 32768

2023-03-15 Thread Leonard Xu

> 
> Unsubscribe
Please send any email to user-unsubscr...@flink.apache.org to unsubscribe from the user@flink.apache.org 
mailing list; sending to user@flink.apache.org itself will not unsubscribe you.


> Sent from my iPhone
> 
> 
> -- Original --
> From: Tony Wei 
> Date: Tue,Mar 14,2023 1:11 PM
> To: David Anderson 
> Cc: Hangxiang Yu , user 
> Subject: Re: is there any detrimental side-effect if i set the max 
> parallelism as 32768
> 
> Hi Hangxiang, David,
> 
> Thank you for your replies. Your responses are very helpful.
> 
> Best regards,
> Tony Wei
> 
> On Tue, Mar 14, 2023 at 12:12 PM, David Anderson wrote:
> I believe there is some noticeable overhead if you are using the
> heap-based state backend, but with RocksDB I think the difference is
> negligible.
> 
> David
> 
> > On Tue, Mar 7, 2023 at 11:10 PM Hangxiang Yu wrote:
> >
> > Hi, Tony.
> > "be detrimental to performance" means that some extra space overhead of the 
> > field of the key-group may influence performance.
> > As we know, Flink will write the key group as the prefix of the key to 
> > speed up rescaling.
> > So the format will be like: key group | key len | key | ..
> > You could check the relationship between max parallelism and bytes of key 
> > group as below:
> > --
> > max parallelism   bytes of key group
> >    128               1
> >    32768             2
> > --
> > So I think the cost will be very small if the real key length >> 2 bytes.
> >
> > On Wed, Mar 8, 2023 at 1:06 PM Tony Wei  > > wrote:
> >>
> >> Hi experts,
> >>
> >>> Setting the maximum parallelism to a very large value can be detrimental 
> >>> to performance because some state backends have to keep internal data 
> >>> structures that scale with the number of key-groups (which are the 
> >>> internal implementation mechanism for rescalable state).
> >>>
> >>> Changing the maximum parallelism explicitly when recovery from original 
> >>> job will lead to state incompatibility.
> >>
> >>
> >> I read the section above from Flink official document [1], and I'm 
> >> wondering what the detail is regarding to the side-effect.
> >>
> >> Suppose that I have a Flink SQL job with large state, large parallelism 
> >> and using RocksDB as my state backend.
> >> I would like to set the max parallelism as 32768, so that I don't bother 
> >> if the max parallelism can be divided by the parallelism whenever I want 
> >> to scale my job,
> >> because the number of key groups will not differ too much between each 
> >> subtask.
> >>
> >> I'm wondering if this is a good practice, because based on the official 
> >> document it is not recommended actually.
> >> If possible, I would like to know the detail about this side-effect. Which 
> >> state backend will have this issue? and Why?
> >> Please give me an advice. Thanks in advance.
> >>
> >> Best regards,
> >> Tony Wei
> >>
> >> [1] 
> >> https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/execution/parallel/#setting-the-maximum-parallelism
> >>  
> >> 
> >
> >
> >
> > --
> > Best,
> > Hangxiang.
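
For reference, since the quoted job is a Flink SQL job: a hedged sketch of setting the 
max parallelism through configuration (pipeline.max-parallelism is the documented key; 
the value mirrors the question):

SET 'pipeline.max-parallelism' = '32768';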



[SUMMARY] Flink 1.17 Release Sync 3/7/2023

2023-03-07 Thread Leonard Xu
Hi devs and users,

I'd like to share some highlights from Flink 1.17 release sync on 3/7/2023. 

1.17 Blockers:
  - Currently, there is one blocker issue (FLINK-31351[1]) that needs to be 
resolved before we can create a votable RC1. Our contributors are working hard 
to fix it as soon as possible.

CI Instabilities & other issues:
  - Ubuntu mirror instabilities: FLINK-30921, Matthias is looking into this ticket
  - API backwards compatibility: FLINK-31167 Leonard will check the ticket again

Release management:
   - The release managers have completed the first review pass for the 
release announcement[2], and we are inviting contributors to add more features 
that have already been finalized in version 1.17.

The next release meeting will be on Mar 14, 2023. Feel free to join us!

Google Meet: https://meet.google.com/wcx-fjbt-hhz
Dial-in: https://tel.meet/wcx-fjbt-hhz?pin=1940846765126

Best regards,
Martijn, Qingsheng, Matthias and Leonard
[1] https://issues.apache.org/jira/browse/FLINK-31351
[2] 
https://docs.google.com/document/d/1aao4ATNcDBlDNdZ7VFFfrjrTGfUDcSTsvVzZrSSked4




Re: Unsubscribe

2023-03-02 Thread Leonard Xu
Please send an email to user-unsubscr...@flink.apache.org to unsubscribe.

> On Mar 3, 2023, at 8:42 AM, zhangjunjie  wrote:
> 
> Unsubscribe
> 
> 



Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0 released

2023-01-31 Thread Leonard Xu
Thanks Yanfei for driving the release!


Best,
Leonard

> On Jan 31, 2023, at 3:43 PM, Yun Tang  wrote:
> 
> Thanks Yanfei for driving the frocksdb release!
> 
> Best
> Yun Tang
> From: Yuan Mei 
> Sent: Tuesday, January 31, 2023 15:09
> To: Jing Ge 
> Cc: Yanfei Lei ; d...@flink.apache.org 
> ; user ; 
> user-zh@flink.apache.org 
> Subject: Re: [ANNOUNCE] FRocksDB 6.20.3-ververica-2.0 released
>  
> Thanks Yanfei for driving the release!
> 
> Best
> Yuan
> 
> On Mon, Jan 30, 2023 at 8:46 PM Jing Ge via user wrote:
> Hi Yanfei,
> 
> Thanks for your effort. Looking forward to checking it.
> 
> Best regards,
> Jing
> 
> On Mon, Jan 30, 2023 at 1:42 PM Yanfei Lei wrote:
> It is very happy to announce the release of FRocksDB 6.20.3-ververica-2.0.
> 
> Compiled files for Linux x86, Linux arm, Linux ppc64le, MacOS x86,
> MacOS arm, and Windows are included in FRocksDB 6.20.3-ververica-2.0
> jar, and the FRocksDB in Flink 1.17 would be updated to
> 6.20.3-ververica-2.0.
> 
> Release highlights:
> - [FLINK-30457] Add periodic_compaction_seconds option to RocksJava[1].
> - [FLINK-30321] Upgrade ZLIB of FRocksDB to 1.2.13[2].
> - Avoid expensive ToString() call when not in debug[3].
> - [FLINK-24932] Support build FRocksDB Java on Apple silicon[4].
> 
> Maven artifacts for FRocksDB can be found at:
> https://mvnrepository.com/artifact/com.ververica/frocksdbjni 
> 
> 
> We would like to thank all efforts from the Apache Flink community
> that made this release possible!
> 
> [1] https://issues.apache.org/jira/browse/FLINK-30457 
> 
> [2] https://issues.apache.org/jira/browse/FLINK-30321 
> 
> [3] https://github.com/ververica/frocksdb/pull/55 
> 
> [4] https://issues.apache.org/jira/browse/FLINK-24932 
> 
> 
> Best regards,
> Yanfei
> Ververica(Alibaba)



[SUMMARY] Flink 1.17 Release Sync 12/13/2022

2022-12-13 Thread Leonard Xu
Hi devs and users,

I’d like to share the highlights about the 1.17 release sync on 12/13/2022.

- Release tracking page:
 -  1.17 development is moving forward [1], we have 5 weeks remaining
  - @committers Please continuously update the progress on the 1.17 page

- Externalized Connectors :
  - flink-connector-aws v4.0.0 released
  - flink-connector-pulsar v3.0.0 started the VOTE
  - flink-connector-kafka PR is reviewing

- Blockers:
- Blockers FLINK-28766 and FLINK-29461 have been FIXED
- FLINK-29405 - InputFormatCacheLoaderTest is unstable OPEN 
  Qingsheng will have a look at the PR
- FLINK-26974 - Python EmbeddedThreadDependencyTests.test_add_python_file 
failed on azure OPEN 
  Xingbo is working on this
- FLINK-18356 - flink-table-planner Exit code 137 returned from process 
REOPENED 
  Leonard will ping Godfrey to take a look
- FLINK-27916 - HybridSourceReaderTest.testReader failed with 
AssertionError REOPENED
  Martijn will ping Thomas once more

- How to have monitoring and quality control for the externalized connectors? 
  Martijn will make a proposal and open a dev discussion.

- Will the feature freeze date be affected by COVID and Spring Festival 
holiday?   
  Leonard will discuss with Chinese devs firstly and then make a proposal if 
needed before the next release sync.

The next release sync will be on December 27th, 2022; feel free to join us if 
you are interested!

Google Meet: https://meet.google.com/wcx-fjbt-hhz
Dial-in: https://tel.meet/wcx-fjbt-hhz?pin=1940846765126  

Best regards,
Martijn, Qingsheng, Matthias and Leonard 

[1] https://cwiki.apache.org/confluence/display/FLINK/1.17+Release

Re: Flink CDC can read MySQL change data but cannot insert it into a new MySQL table

2022-11-29 Thread Leonard Xu


> On Nov 4, 2022, at 2:34 PM, 左岩 <13520871...@163.com> wrote:
> 
> tenv.executeSql("xxx);
> env.execute();


Using them together like this is incorrect: tenv.executeSql(...) already submits the job itself, so the trailing env.execute() has nothing to run. You can check the Java docs of these two methods.

Best,
Leonard

Re: Aggregating the latest data from a CDC source with Flink SQL

2022-11-29 Thread Leonard Xu


> On Nov 29, 2022, at 8:32 AM, casel.chen  wrote:
> 
> The business requirement is to compute, in real time, the transaction amount of a MySQL orders table per day and per supplier. 
> The orders table receives updates and deletes. How do I implement this with Flink SQL? Deduplicate to the latest record with an OVER window and then aggregate? When a delete record arrives, will the corresponding price be subtracted? I tried the following Flink SQL but am not sure it's right.


Yes, it will. You can read up on the principles of Flink SQL; a Baidu/Google search turns up plenty of articles.

Best,
Leonard


> 
> 
> select 
>  s.biddate, 
>  s.supplier, 
>  sum(s.price) 
> from 
>  (
>select 
>  * 
>from 
>  (
>select 
>  biddate, 
>  supplier, 
>  price, 
>  ROW_NUMBER() OVER (
>PARTITION BY biddate, 
>supplier 
>ORDER BY 
>  bidtime DESC
>  ) as rownum 
>from 
>  (
>select 
>  bidtime, 
>  date_format(bidtime, 'yyyy-MM-dd-HH') as biddate, 
>  supplier, 
>  price 
>from 
>  orders
>  )
>  ) as t 
>where 
>  t.rownum = 1
>  ) as s 
> group by 
>  s.biddate, 
>  s.supplier
> ;
> 



Re: Time zone problem with timestamp types in debezium-json data

2022-11-24 Thread Leonard Xu
Is the data type in your Oracle database TIMESTAMP or TIMESTAMP WITH LOCAL TIME ZONE? 
I guess it's the latter; if so, simply map it to the TIMESTAMP_LTZ type in Flink SQL.
Oracle's TIMESTAMP WITH LOCAL TIME ZONE type and Flink SQL's TIMESTAMP_LTZ type have 
the same semantics and storage, i.e., epoch millis, and no time zone is stored. In both 
systems, the session time zone is used to render the epoch millis as a readable timestamp string when you view the data.

The command to set the session time zone in Oracle is:
ALTER SESSION SET TIME_ZONE='Asia/Shanghai';

The command to set the session time zone in Flink SQL is:
Flink SQL> SET 'table.local-time-zone' = 'Asia/Shanghai';
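
For example, a minimal sketch of the mapping (the table schema and connector options 
are illustrative):

-- an Oracle TIMESTAMP WITH LOCAL TIME ZONE column maps to TIMESTAMP_LTZ
CREATE TABLE orders_src (
  order_id BIGINT,
  order_time TIMESTAMP_LTZ(3)
) WITH (
  'connector' = 'kafka',
  'format' = 'debezium-json'
  -- remaining connector options omitted
);

-- rendered using the session time zone when queried
SET 'table.local-time-zone' = 'Asia/Shanghai';
SELECT order_time FROM orders_src;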

Best,
Leonard


> On Nov 22, 2022, at 4:32 PM, Kyle Zhang  wrote:
> 
> Hi all,
> We have a scenario where Oracle data is pulled into Kafka via the debezium-oracle-cdc plugin and then analyzed with Flink
> SQL. We've hit a time zone problem: for example, a timestamp column in the database has the value '2022-11-17
> 16:16:44', but Debezium stores it as an int64 without time zone info, so it becomes 1668701804000. After processing with
> FROM_UNIXTIME in Flink SQL it becomes '2022-11-18 00:16:44', 8 hours off, and we have to manually subtract 8 hours.
> Is there a unified way to handle this kind of situation?
> 
> Best



Re: Weird Flink SQL error

2022-11-24 Thread Leonard Xu
Do not trust the line number from the SQL parser exception; you should use ROW in your DDL when you declare a composite row type. Try the 
following:
CREATE TABLE test_content_metrics (
   dt STRING NOT NULL,
   `body` ROW<
   `platform_id` BIGINT,
   `content_id` STRING
   >
) PARTITIONED BY (dt) WITH (
   'connector' = 'filesystem',
   'path' = 'etl/test_content_metrics',
   'format' = 'json'
)
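
For the aggregate-inside-ROW error quoted below, a hedged workaround sketch is to 
compute the aggregate in an inner query first and wrap the result in a row outside 
(the source table name content_events is hypothetical; the column comes from the 
quoted job):

SELECT ROW(vc_sum) AS body
FROM (
  SELECT SUM(view_count) AS vc_sum
  FROM content_events
) t;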

Best,
Leonard


> On Nov 25, 2022, at 11:20 AM, Dan Hill  wrote:
> 
> Also, if I try to do an aggregate inside the ROW, I get an error. I don't 
> get the error if it's not wrapped in a ROW.
> 
> ROW(
> SUM(view_count)
> ) AS body,
> 
>  Caused by: org.apache.flink.table.api.SqlParserException: SQL parse 
> failed. Encountered "SUM" at line 8, column 5.
> Was expecting one of:
>  
> "EXCEPT" ...
> "FETCH" ...
> "FROM" ...
> "INTERSECT" ...
> "LIMIT" ...
> "OFFSET" ...
> "ORDER" ...
> "MINUS" ...
> "UNION" ...
> "," ...
> 
>
> org.apache.flink.table.planner.parse.CalciteParser.parse(CalciteParser.java:56)
>
> org.apache.flink.table.planner.delegation.ParserImpl.parse(ParserImpl.java:98)
>
> org.apache.flink.table.api.internal.TableEnvironmentImpl.executeSql(TableEnvironmentImpl.java:736)
>
> ai.promoted.metrics.logprocessor.job.contentmetrics.ContentMetricsJob.executeSqlFromResource(ContentMetricsJob.java:148)
>[...]
>  Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered 
> "SUM" at line 8, column 5.
> Was expecting one of:
>  
> "EXCEPT" ...
> "FETCH" ...
> "FROM" ...
> "INTERSECT" ...
> "LIMIT" ...
> "OFFSET" ...
> "ORDER" ...
> "MINUS" ...
> "UNION" ...
> "," ...
> 
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:462)
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:225)
>
> org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:140)
>org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:155)
>org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:180)
>[...]
>  Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
> "SUM" at line 8, column 5.
> Was expecting one of:
>  
> "EXCEPT" ...
> "FETCH" ...
> "FROM" ...
> "INTERSECT" ...
> "LIMIT" ...
> "OFFSET" ...
> "ORDER" ...
> "MINUS" ...
> "UNION" ...
> "," ...
> 
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:40981)
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.jj_consume_token(FlinkSqlParserImpl.java:40792)
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.SqlStmtEof(FlinkSqlParserImpl.java:3981)
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.parseSqlStmtEof(FlinkSqlParserImpl.java:273)
>org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:153)
>[...]
> 
> On Thu, Nov 24, 2022 at 6:41 PM Dan Hill wrote:
> Here's the full stack trace.
> 
> => org.apache.flink.table.api.SqlParserException: SQL parse failed. 
> Encountered "." at line 1, column 336.
> Was expecting one of:
> ")" ...
> "," ...
> 
>
> org.apache.flink.table.planner.parse.CalciteParser.parse(CalciteParser.java:56)
>
> org.apache.flink.table.planner.calcite.FlinkPlannerImpl$ToRelContextImpl.expandView(FlinkPlannerImpl.scala:270)
>
> org.apache.calcite.plan.ViewExpanders$1.expandView(ViewExpanders.java:52)
>
> org.apache.flink.table.planner.catalog.SqlCatalogViewTable.convertToRel(SqlCatalogViewTable.java:58)
>
> org.apache.flink.table.planner.plan.schema.ExpandingPreparingTable.expand(ExpandingPreparingTable.java:59)
>[...]
>  Caused by: org.apache.calcite.sql.parser.SqlParseException: Encountered 
> "." at line 1, column 336.
> Was expecting one of:
> ")" ...
> "," ...
> 
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.convertException(FlinkSqlParserImpl.java:462)
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.normalizeException(FlinkSqlParserImpl.java:225)
>
> org.apache.calcite.sql.parser.SqlParser.handleException(SqlParser.java:140)
>org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:155)
>org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:180)
>[...]
>  Caused by: org.apache.flink.sql.parser.impl.ParseException: Encountered 
> "." at line 1, column 336.
> Was expecting one of:
> ")" ...
> "," ...
> 
>
> org.apache.flink.sql.parser.impl.FlinkSqlParserImpl.generateParseException(FlinkSqlParserImpl.java:40981)
>
> 

[SUMMARY] Flink 1.17 Release Sync 11/15/2022

2022-11-15 Thread Leonard Xu
Hi devs and users,

I’d like to share some highlights about the 1.17 release sync on 11/15/2022.

- Release tracking page:
 - The community has collected some great features on the 1.17 page[1]
 - @committers Please continuously update the page in the coming week
 
- JIRA account applications:
  - Martijn updated the issue-tracking workflow [2][3]
  - Users without a JIRA account can follow this doc [2][3] to apply for a JIRA 
account and join as Flink contributors

- Blockers:
- Blocker FLINK-29387 has been fixed
- PR for blocker FLINK-29315 is opened and waiting for review.
- Blocker FLINK-29818 is reopened, Yang Wang is looking into this ticket

- Build stability: a growing number of test stability issues with “Exit code 137” 
errors
    - Matthias and Qingsheng investigated the memory issue: multiple 
Azure agents on one machine were using too many resources
    - We’ve reduced the number of agents from 7 to 5; let’s keep an eye on this 
issue.
    - Leonard offered a workaround for the clickable-link issue in the Slack 
#builds channel

The next release sync will be on November 29th, 2022.

Google Meet: https://meet.google.com/wcx-fjbt-hhz
Dial-in: https://tel.meet/wcx-fjbt-hhz?pin=1940846765126  

Best regards,
Martijn, Qingsheng, Matthias and Leonard 

[1] https://cwiki.apache.org/confluence/display/FLINK/1.17+Release
[2] https://flink.apache.org/community.html
[3] https://flink.apache.org/zh/community.html



Re: [ANNOUNCE] Apache Flink Elasticsearch Connector 3.0.0 released

2022-11-10 Thread Leonard Xu
Thanks Chesnay and Martijn for the great work! I believe the 
flink-connector-shared-utils [1] you built will help Flink connector developers 
a lot.


Best,
Leonard
[1] https://github.com/apache/flink-connector-shared-utils

> On Nov 10, 2022, at 9:53 PM, Martijn Visser wrote:
> 
> Really happy with the first externalized connector for Flink. Thanks a lot to 
> all of you involved!
> 
> On Thu, Nov 10, 2022 at 12:51 PM Chesnay Schepler wrote:
> The Apache Flink community is very happy to announce the release of 
> Apache Flink Elasticsearch Connector 3.0.0.
> 
> Apache Flink® is an open-source stream processing framework for 
> distributed, high-performing, always-available, and accurate data 
> streaming applications.
> 
> The release is available for download at:
> https://flink.apache.org/downloads.html
> 
> This release marks the first time we have released a connector 
> separately from the main Flink release.
> Over time more connectors will be migrated to this release model.
> 
> This release is equivalent to the connector version released alongside 
> Flink 1.16.0 and acts as a drop-in replacement.
> 
> The full release notes are available in Jira:
> https://issues.apache.org/jira/projects/FLINK/versions/12352291
> 
> We would like to thank all contributors of the Apache Flink community 
> who made this release possible!
> 
> Regards,
> Chesnay



Re: UDFs classloading changes in 1.16

2022-11-04 Thread Leonard Xu
Thanks Alexander for reporting this issue, Could you open a jira ticket as well?

CC: Shengkai, please take a look at this ticket; it looks like an incompatible
change.

Best,
Leonard



> On Nov 4, 2022, at 6:15 PM, Alexander Fedulov wrote:
> 
> Hi everyone,
> 
> 1.16 introduced quite a lot of changes with respect to classloading in the 
> Table API. The way UDFs could previously be loaded from JARs in 1.15 does not 
> work in 1.16 anymore - it fails with the ClassNotFound exception when UDFs 
> are used at runtime. 
> 
> Here is a repository with a reproducible example:
> https://github.com/afedulov/udfs-flink-1.16/blob/main/src/test/java/com/example/UDFTest.java
> 
> It works as is (Flink 1.15.2) and fails when switching the dependencies to 
> 1.16.0.
> 
> Here are some of the PRs that, I believe, might be related to the issue:
> https://github.com/apache/flink/pull/20001
> https://github.com/apache/flink/pull/19845
> https://github.com/apache/flink/pull/20211 (fixes a similar issue
> introduced after classloading changes in 1.16)
> 
> How can UDF JARs be loaded in 1.16?
> 
> Best,
> Alexander Fedulov



Re: Flink CDC cannot read data from MySQL

2022-11-02 Thread Leonard Xu
The Flink CDC community does provide a version that supports Flink 1.14; version 2.2.1 will do. It looks like you haven't enabled checkpointing; just enable it:
// enable checkpoint
env.enableCheckpointing(1000);


Best,
Leonard

> On Nov 3, 2022, at 11:34 AM, 左岩 <13520871...@163.com> wrote:
> 
> I'm using Flink 1.14. Since there was no officially matching version, I compiled
> Flink CDC myself. The binlog is enabled and there are no errors, but no MySQL data is read: the IDEA console reports no errors and outputs no data. What could be the reason? (See the attached run logs.)
> public static void main(String[] args) throws Exception {
> Configuration conf = new Configuration();
> conf.setInteger("rest.port", 10041);
> StreamExecutionEnvironment env = 
> StreamExecutionEnvironment.getExecutionEnvironment(conf);
> StreamTableEnvironment tenv = StreamTableEnvironment.create(env);
> 
> env.enableCheckpointing(1000, CheckpointingMode.EXACTLY_ONCE);
> 
> env.getCheckpointConfig().setCheckpointStorage("file:///d:/checkpoint3");
> 
> //StreamTableEnvironment tenv = StreamTableEnvironment.create(env);
> 
> env.setParallelism(4);
> 
> // create the table
> tenv.executeSql("CREATE TABLE flink_t_stu ( " +
> "  userid INT, " +
> "  username string, " +
> "  age string, " +
> "  `partition` INT, " +
> " PRIMARY KEY(userid) NOT ENFORCED " +
> " ) WITH ( " +
> " 'connector' = 'mysql-cdc', " +
> " 'server-id' = '5401-5404', " +
> " 'hostname' = '192.168.0.220', " +
> " 'port' = '3306', " +
> " 'username' = 'root', " +
> " 'password' = 'root', " +
> " 'database-name' = 'zy', " +
> " 'table-name' = 't_stu' " +
> ")");
> 
> // query
> tenv.executeSql("select * from flink_t_stu").print();
> 
> env.execute();
> 
> }
> 



Re: Building wide tables with Flink CDC

2022-11-02 Thread Leonard Xu
Yes. If you build wide tables with a dual-stream join and the TTL is set too short, the historical data in state gets cleaned up; subsequent updates entering the join node then fail to find a match and may emit NULL.

Best,
Leonard

> On Nov 2, 2022, at 11:49 AM, Fei Han wrote:
> 
> Hi all! I have the following question about building wide tables with Flink CDC:
> After starting a job, the fields have values at first, but after running for a while, or across days, some fields become NULL for no apparent reason, while running the same data with other engines works fine.
> For example, on the first day field A has values, but on the second day field A is entirely NULL, although querying with Presto is normal. I guess this might be related to the TTL setting? I set it to 1 day.



Re: Setting the server-id range in Flink CDC 2.2.1

2022-10-31 Thread Leonard Xu

> On Oct 31, 2022, at 4:57 PM, 林影 wrote:
> 
> Hi, Leonard.
> 
> I have a similar question.
> 
> An online Flink application was previously configured with server-id
> 6416-6418 and a parallelism of 3; later, when scaling in, the parallelism was changed to 2. Does the server-id range need to be adjusted in this scenario?

Scaling in needs no adjustment; in your case only the two ids 6416 and 6417 will be used. Only scaling out needs consideration, and if you scale out without enlarging the range, an error message is currently reported.

Best,
Leonard




> 
> On Mon, Oct 31, 2022 at 16:50, casel.chen wrote:
> 
>> 
>> 
>> 
>> 
>> Isn't the server-id range configuration unfriendly to later parallelism changes? Does every change of parallelism require re-adjusting the server-id range? Or should one configure a fairly large server-id range first and adjust the parallelism within that larger range?
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> On 2022-10-31 16:04:32, "Leonard Xu" wrote:
>>> Hi,
>>> 
>> 
>>> What is the source parallelism for your 5 tables? With multiple parallel subtasks you need to set server-id to a range matching the parallelism, e.g. for a parallelism of 4 something like '1101-1104'.
>>> Also, server-id is globally unique; you need to make sure the server-ids you use do not conflict with other jobs or other sync tools.
>>> 
>>> 
>>> Best,
>>> Leonard
>>> 
>>> 
>>>> On Oct 31, 2022, at 4:00 PM, Fei Han wrote:
>>>> 
>>>> Hi all!
>>>> I have set server-id in Flink CDC 2.2.1. There are 5 tables, each with a different server-id range, and I build
>>>> wide tables through Flink CDC. But after the job runs for a while, the following error still appears:
>>>> Caused by: com.github.shyiko.mysql.binlog.network.ServerException: A
>> slave with the same server_uuid/server_id as this slave has connected to
>> the master;
>>>> Could anyone suggest another solution?
>>> 
>> 



Re: Setting the server-id range in Flink CDC 2.2.1

2022-10-31 Thread Leonard Xu

> Isn't the server-id range configuration unfriendly to later parallelism changes? Does every change of parallelism require re-adjusting the server-id range? Or should one configure a fairly large server-id range first and adjust the parallelism within that larger range?

Changing the parallelism after the job has started does require adjusting the range; I suggest designing this into your platform layer, so that users writing SQL understand the effect of the parameters in the WITH clause.

Best,
Leonard


> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> On 2022-10-31 16:04:32, "Leonard Xu" wrote:
>> Hi, 
>> 
>> What is the source parallelism for your 5 tables? With multiple parallel subtasks you need to set server-id to a range matching the parallelism, e.g. for a parallelism of 4 something like '1101-1104'.
>> Also, server-id is globally unique; you need to make sure the server-ids you use do not conflict with other jobs or other sync tools.
>> 
>> 
>> Best,
>> Leonard
>> 
>> 
>>> On Oct 31, 2022, at 4:00 PM, Fei Han wrote:
>>> 
>>> Hi all!
>>> I have set server-id in Flink CDC 2.2.1. There are 5 tables, each with a different server-id range, and I build
>>> wide tables through Flink CDC. But after the job runs for a while, the following error still appears:
>>> Caused by: com.github.shyiko.mysql.binlog.network.ServerException: A slave 
>>> with the same server_uuid/server_id as this slave has connected to the 
>>> master;
>>> Could anyone suggest another solution?
>> 



Re: Now that sql-clients-default.yaml is gone from the Flink SQL client, where should the predefined catalogs be defined?

2022-10-31 Thread Leonard Xu
Hi,

If I remember correctly, the -i parameter lets you specify an initialization SQL file; just add your catalog DDL statements to that file, as shown in the sketch below.
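For example, a minimal sketch (the catalog name and options here are invented for illustration):

-- init.sql: re-create the catalogs that used to live in the YAML file
CREATE CATALOG my_hive_catalog WITH (
  'type' = 'hive',
  'hive-conf-dir' = '/opt/hive/conf'
);
USE CATALOG my_hive_catalog;

-- then start the client with the initialization file:
-- ./bin/sql-client.sh -i init.sql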

Best,
Leonard




> On Oct 31, 2022, at 4:52 PM, casel.chen wrote:
> 
> The sql-clients-default.yaml file can no longer be found in new Flink versions, so where should the previously configured predefined catalogs be defined? Through an initialization SQL file?



Re: OutOfMemoryError (java heap space) on small, local test

2022-10-31 Thread Leonard Xu
Hi, Matt

I’ve checked your job and it is pretty simple. I've CC'ed Xingbo, who is a PyFlink expert,
to take a quick look.


Best,
Leonard

> On Oct 31, 2022, at 11:47 AM, Matt Fysh wrote:
> 
> Hi there,
> 
> I am running a local test with:
> * source = env.from_collection
> * sink = datastream.execute_and_collect
> with a map function between, and two very small data points in the collection
> 
> I'm able to generate an OutOfMemoryError, and due to the nature of this test 
> using simple source and sink, plus not having large data size requirements, I 
> suspect this is due to a bug.
> 
> I'm running v1.13.2 and have created a docker-based reproduction repository 
> here: https://github.com/mattfysh/pyflink-oom
> 
> Please take a look and let me know what you think
> 
> Thanks!
> Matt



Re: Setting the server-id range in Flink CDC 2.2.1

2022-10-31 Thread Leonard Xu
Hi, 

What is the source parallelism for your 5 tables? With multiple parallel subtasks you need to set server-id to a range matching the parallelism, e.g. for a parallelism of 4 something like '1101-1104'; see the sketch below.
Also, server-id is globally unique; you need to make sure the server-ids you use do not conflict with other jobs or other sync tools.
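For illustration only (the table name and connection options are made up), a source running with a parallelism of 4 could be declared like this:

CREATE TABLE orders_src (
  order_id INT,
  amount DECIMAL(10, 2),
  PRIMARY KEY (order_id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'orders',
  -- 4 ids for 4 parallel source subtasks; the range must also be
  -- globally unique across jobs and other sync tools
  'server-id' = '1101-1104'
);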


Best,
Leonard


> On Oct 31, 2022, at 4:00 PM, Fei Han wrote:
> 
> Hi all!
> I have set server-id in Flink CDC 2.2.1. There are 5 tables, each with a different server-id range, and I build
> wide tables through Flink CDC. But after the job runs for a while, the following error still appears:
> Caused by: com.github.shyiko.mysql.binlog.network.ServerException: A slave 
> with the same server_uuid/server_id as this slave has connected to the master;
> Could anyone suggest another solution?



Re: Performing left join between two streams

2022-10-30 Thread Leonard Xu
Hi, Lalwani

Flink does not support an outer join on two data streams yet[1]; you can use the
DataStream API ds1.coGroup(ds2)[2] as a workaround (a sketch follows below). Flink SQL supports outer
joins well, so you can also try the SQL way[3].
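A minimal sketch of the coGroup workaround, assuming left and right are DataStream<Tuple2<String, Integer>> streams defined elsewhere (element types, key fields and the window size are invented for illustration):

import org.apache.flink.api.common.functions.CoGroupFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

// left outer join semantics: emit every left element, with or without a match
DataStream<String> joined = left.coGroup(right)
    .where(l -> l.f0)      // key of the left stream
    .equalTo(r -> r.f0)    // key of the right stream
    .window(TumblingEventTimeWindows.of(Time.minutes(5)))
    .apply(new CoGroupFunction<Tuple2<String, Integer>, Tuple2<String, Integer>, String>() {
        @Override
        public void coGroup(Iterable<Tuple2<String, Integer>> lefts,
                            Iterable<Tuple2<String, Integer>> rights,
                            Collector<String> out) {
            for (Tuple2<String, Integer> l : lefts) {
                boolean matched = false;
                for (Tuple2<String, Integer> r : rights) {
                    matched = true;
                    out.collect(l + " -> " + r);
                }
                if (!matched) {
                    out.collect(l + " -> null"); // no right-side match in this window
                }
            }
        }
    });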
 

Best,
Leonard

[1] https://issues.apache.org/jira/browse/FLINK-4187
[2] https://nightlies.apache.org/flink/flink-docs-master/api/java/org/apache/flink/api/common/functions/CoGroupFunction.html
[3] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/joins/#outer-equi-join
 



> On Oct 28, 2022, at 3:11 PM, Surendra Lalwani via user wrote:
> 
> Hi Team,
> 
> Is it possible in Flink to perform a left outer join between two streams, as is
> possible in Spark? I checked the interval join; it only supports inner join
> as of now.
> 
> Thanks and Regards ,
> Surendra Lalwani
> 
> 
> IMPORTANT NOTICE: This e-mail, including any attachments, may contain 
> confidential information and is intended only for the addressee(s) named 
> above. If you are not the intended recipient(s), you should not disseminate, 
> distribute, or copy this e-mail. Please notify the sender by reply e-mail 
> immediately if you have received this e-mail in error and permanently delete 
> all copies of the original message from your system. E-mail transmission 
> cannot be guaranteed to be secure as it could be intercepted, corrupted, 
> lost, destroyed, arrive late or incomplete, or contain viruses. Company 
> accepts no liability for any damage or loss of confidential information 
> caused by this email or due to any virus transmitted by this email or 
> otherwise.



Re: A question about Flink SQL

2022-10-22 Thread Leonard Xu
Hi, which Flink version are you using? I recall this problem only exists in older versions.
Also, could you paste your SQL?

Best,
Leonard



> On Oct 22, 2022, at 11:11 AM, 邮件帮助中心 wrote:
> 
> Hi all!
> While developing a project recently, we found that when doing a temporal table
> join between a CDC table and a dimension table, the join fields of the two tables must have identical types; otherwise, submission fails with the following error:
> The main method caused an error: Temporal table join requires an equality
> condition on fields of table.
> To work around this, we tried the following:
> 1: Casting the dimension-table field at join time, e.g. JOIN ON CAST(tableA.filedA AS INT) =
> cdc_table_b.fieldB, to keep the join field types of the two tables consistent.
> 2: Creating a view over the dimension table, defining the field types in the view with CAST in the SELECT, and then
> joining the view with the CDC table; in theory the join field types are then identical.
> Unfortunately, neither workaround solved the problem; both report the same error as above (The main method caused an error: Temporal
> table join requires an equality condition on fields of
> table). If the dimension-table join field and the CDC-table join field are defined with the same type in the DDL, submission does not report the error, but a CastException occurs while processing data at runtime. How can this be solved? Many thanks!



Re: [DISCUSS] Reverting sink metric name changes made in 1.15

2022-10-10 Thread Leonard Xu
Thanks Qingsheng for starting this thread.

+1 on reverting sink metric name and releasing 1.15.3 to fix this inconsistent 
behavior.


Best,
Leonard





> On Oct 10, 2022, at 3:06 PM, Jark Wu wrote:
> 
> Thanks for discovering this problem, Qingsheng!
> 
> I'm also +1 for reverting the breaking changes. 
> 
> IIUC, currently, the behavior of "numXXXOut" metrics of the new and old sink 
> is inconsistent. 
> We have to break one of them to have consistent behavior. Sink V2 is an 
> evolving API which is just introduced in 1.15. 
> I think it makes sense to break the unstable API instead of the stable API 
> which many connectors and users depend on.
> 
> Best,
> Jark
> 
> 
> 
> On Mon, 10 Oct 2022 at 11:36, Jingsong Li  > wrote:
> Thanks for driving, Qingsheng.
> 
> +1 for reverting sink metric name.
> 
> We often forget that metric is also one of the important APIs.
> 
> +1 for releasing 1.15.3 to fix this.
> 
> Best,
> Jingsong
> 
> On Sun, Oct 9, 2022 at 11:35 PM Becket Qin wrote:
> >
> > Thanks for raising the discussion, Qingsheng,
> >
> > +1 on reverting the breaking changes.
> >
> > In addition, we might want to release a 1.15.3 to fix this and update the 
> > previous release docs with this known issue, so that users can upgrade to 
> > 1.15.3 when they hit it. It would also be good to add some backwards 
> > compatibility tests on metrics to avoid unintended breaking changes like 
> > this in the future.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Sun, Oct 9, 2022 at 10:35 AM Qingsheng Ren wrote:
> >>
> >> Hi devs and users,
> >>
> >> I’d like to start a discussion about reverting a breaking change about 
> >> sink metrics made in 1.15 by FLINK-26126 [1] and FLINK-26492 [2].
> >>
> >> TL;DR
> >>
> >> All sink metrics with name “numXXXOut” defined in FLIP-33 are replaced by 
> >> “numXXXSend” in FLINK-26126 and FLINK-26492. Considering metric names are 
> >> public APIs, this is a breaking change to end users and not backward 
> >> compatible. Also unfortunately this breaking change was not discussed in 
> >> the mailing list before.
> >>
> >> Background
> >>
> >> As defined previously in FLIP-33 (the FLIP page has been changed so please 
> >> refer to the old version [3] ), metric “numRecordsOut” is used for 
> >> reporting the total number of output records since the sink started 
> >> (number of records written to the external system), and similarly for 
> >> “numRecordsOutPerSecond”, “numBytesOut”, “numBytesOutPerSecond” and 
> >> “numRecordsOutError”. Most sinks are following this naming and definition. 
> >> However, these metrics are ambiguous in the new Sink API as “numXXXOut” 
> >> could be used by the output of SinkWriterOperator for reporting number of 
> >> Committables delivered to SinkCommitterOperator. In order to resolve the 
> >> conflict, FLINK-26126 and FLINK-26492 changed names of these metrics with 
> >> “numXXXSend”.
> >>
> >> Necessity of reverting this change
> >>
> >> - Metric names are actually public API, as end users need to configure 
> >> metric collecting and alerting system with metric names. Users have to 
> >> reset all configurations related to affected metrics.
> >> - This could also affect custom and external sinks not maintained by 
> >> Flink, which might have implemented with numXXXOut metrics.
> >> - The number of records sent to external system is way more important than 
> >> the number of Committables sent to SinkCommitterOperator, as the latter 
> >> one is just an internal implementation of sink. We could have a new metric 
> >> name for the latter one instead.
> >> - We could avoid splitting the project by version (like “plz use numXXXOut 
> >> before 1.15 and use numXXXSend after”) if we revert it ASAP, considering 
> >> 1.16 is still not released for now.
> >>
> >> As a consequence, I’d like to hear from devs and users about your opinion 
> >> on changing these metrics back to “numXXXOut”.
> >>
> >> Looking forward to your reply!
> >>
> >> [1] https://issues.apache.org/jira/browse/FLINK-26126
> >> [2] https://issues.apache.org/jira/browse/FLINK-26492
> >> [3] FLIP-33, version 18:
> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=211883136
> >>
> >> Best,
> >> Qingsheng



Re: Some questions after reading the official Versioned Table docs

2022-08-08 Thread Leonard Xu



> On Aug 8, 2022, at 3:34 PM, 林影 wrote:
> 
> Link first: Versioned Table
> 
> From the documentation, using Upsert-Kafka as the source with debezium or canal as the format can be considered a Versioned
> Table Source.
> 
> 1. Can the connectors provided by Flink CDC also be regarded as a kind of Versioned Table Source?
Yes; once a primary key and a watermark are defined on the CDC stream, it can serve as a versioned table (see the sketch below).
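A minimal DDL sketch (table name, columns and connection options are invented for illustration): a primary key plus a watermark makes the CDC table a versioned table usable in a temporal join:

CREATE TABLE product_changelog (
  product_id INT,
  price DECIMAL(10, 2),
  update_time TIMESTAMP(3),
  PRIMARY KEY (product_id) NOT ENFORCED,      -- (1) primary key
  WATERMARK FOR update_time AS update_time    -- (2) watermark defines the version ordering
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'products'
);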


> 2. When a versioned table is converted to a DataStream, is the result necessarily a retract stream?
Yes; every CDC stream (i.e., changelog stream) becomes a retract stream when converted from the SQL API to the DataStream API.

> 3. Can every versioned table be sent to a sink with retraction capability (such as MySQL/ES/Hudi)?


Yes; as long as the sink supports retraction, it can consume a changelog stream.


Best,
Leonard



Re: accuracy validation of streaming pipeline

2022-05-24 Thread Leonard Xu
Hi, vtygoss

> I'm working on migrating from a full-data pipeline (with Spark) to an
> incremental-data pipeline (with Flink CDC), and I met a problem about accuracy
> validation between the Flink-based and Spark-based pipelines.

Glad to hear that !



> For bounded data, it's simple to validate whether the two result sets are consistent
> or not.
> But for unbounded data and event-driven applications, how can we make sure the
> data stream produced is correct, especially when there are some retract
> functions with high impact, e.g. row_number?
> 
> Is there any document for this problem? Thanks for any suggestions or
> replies.

The validation feature belongs to the data-quality scope from my understanding; it is
usually provided by the platform, e.g. a data integration platform. As the
underlying pipeline engine/tool, Flink CDC should expose more metrics and data
quality checking abilities, but we don't offer them yet, and these
enhancements are on our roadmap. Currently, you can use the Flink source/sink
operators' metrics for rough validation, and you can also compare the record counts
in your source database and sink system multiple times for more accurate
validation, as sketched below.
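For example, a rough count check could be as simple as this (table names are placeholders):

-- on the source database
SELECT COUNT(*) FROM orders;
-- on the sink system, at roughly the same time
SELECT COUNT(*) FROM orders_sink;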

Best,
Leonard



Re: table.local-time-zone not working

2022-05-10 Thread Leonard Xu
I guess the 'values' you mean are the underlying instant values of the TIMESTAMP_LTZ
data type; they are epoch-based and identical across time zones.
That's the epoch semantics. The session time zone does not affect the
underlying value; it only affects the display string in a session. You can try
this in the SQL client, e.g. as sketched below; see also [1].
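For example, a sketch in the SQL client (the printed values are illustrative):

-- the display string follows the session time zone,
-- while the underlying epoch value stays the same
Flink SQL> SET 'table.local-time-zone' = 'UTC';
Flink SQL> SELECT NOW();            -- e.g. 2022-05-10 08:00:00.000

Flink SQL> SET 'table.local-time-zone' = 'America/Los_Angeles';
Flink SQL> SELECT NOW();            -- e.g. 2022-05-10 01:00:00.000 (same instant)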

Best,
Leonard
[1]  
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/timezone/


> On May 10, 2022, at 4:13 PM, Surendra Lalwani wrote:
> 
> I have also tried to set it in flink-conf.yaml, it is visible in Job Manager 
> Configurations but still NOW() and PROCTIME() are returning values in UTC.
> 
> Thanks and Regards ,
> Surendra Lalwani
> 
> 
> On Tue, May 10, 2022 at 1:09 PM Surendra Lalwani wrote:
> Hi Leonard,
> 
> Flink Version is 1.13.6 and I am adding property as follows:
> 
> Configuration tconf = tenv.getConfig().getConfiguration();
>   tconf.setString("table.local-time-zone", "America/Los_Angeles");
>   
> tenv.getConfig().setLocalTimeZone(ZoneId.of("America/Los_Angeles"));
> 
> Thanks and Regards ,
> Surendra Lalwani
> 
> 
> On Tue, May 10, 2022 at 1:07 PM Leonard Xu wrote:
> Hi, Lalwani
> 
> Could you share how you set this property and your Flink version?
> 
> Best,
> Leonard
> 
> 
>> On May 10, 2022, at 3:01 PM, Surendra Lalwani wrote:
>> 
>> Hi Team,
>> 
>> I have tried using this property 
>> table.local-time-zone
>> 
>> But it seems like it is not making any impact, on calling functions like 
>> PROCTIME() and NOW(), it always returns UTC value. Any help would be 
>> appreciated.
>> 
>> Thanks and Regards ,
>> Surendra Lalwani
>> 
>> 
>> IMPORTANT NOTICE: This e-mail, including any attachments, may contain 
>> confidential information and is intended only for the addressee(s) named 
>> above. If you are not the intended recipient(s), you should not disseminate, 
>> distribute, or copy this e-mail. Please notify the sender by reply e-mail 
>> immediately if you have received this e-mail in error and permanently delete 
>> all copies of the original message from your system. E-mail transmission 
>> cannot be guaranteed to be secure as it could be intercepted, corrupted, 
>> lost, destroyed, arrive late or incomplete, or contain viruses. Company 
>> accepts no liability for any damage or loss of confidential information 
>> caused by this email or due to any virus transmitted by this email or 
>> otherwise.
> 
> 
> 
> IMPORTANT NOTICE: This e-mail, including any attachments, may contain 
> confidential information and is intended only for the addressee(s) named 
> above. If you are not the intended recipient(s), you should not disseminate, 
> distribute, or copy this e-mail. Please notify the sender by reply e-mail 
> immediately if you have received this e-mail in error and permanently delete 
> all copies of the original message from your system. E-mail transmission 
> cannot be guaranteed to be secure as it could be intercepted, corrupted, 
> lost, destroyed, arrive late or incomplete, or contain viruses. Company 
> accepts no liability for any damage or loss of confidential information 
> caused by this email or due to any virus transmitted by this email or 
> otherwise.



Re: table.local-time-zone not working

2022-05-10 Thread Leonard Xu
Hi, Lalwani

Could you share how you set this property and your Flink version?

Best,
Leonard


> On May 10, 2022, at 3:01 PM, Surendra Lalwani wrote:
> 
> Hi Team,
> 
> I have tried using this property 
> table.local-time-zone
> 
> But it seems like it is not making any impact, on calling functions like 
> PROCTIME() and NOW(), it always returns UTC value. Any help would be 
> appreciated.
> 
> Thanks and Regards ,
> Surendra Lalwani
> 
> 
> IMPORTANT NOTICE: This e-mail, including any attachments, may contain 
> confidential information and is intended only for the addressee(s) named 
> above. If you are not the intended recipient(s), you should not disseminate, 
> distribute, or copy this e-mail. Please notify the sender by reply e-mail 
> immediately if you have received this e-mail in error and permanently delete 
> all copies of the original message from your system. E-mail transmission 
> cannot be guaranteed to be secure as it could be intercepted, corrupted, 
> lost, destroyed, arrive late or incomplete, or contain viruses. Company 
> accepts no liability for any damage or loss of confidential information 
> caused by this email or due to any virus transmitted by this email or 
> otherwise.



Re: Flink-SQL returning duplicate rows for some records

2022-05-07 Thread Leonard Xu
Hi Joost

Could you share your Flink version and the two records in debezium-json format
that were produced by the two MS SQL UPDATE statements?

Best,
Leonard

> On May 2, 2022, at 9:59 PM, Joost Molenaar wrote:
> 
> Hello all,
> 
> I'm trying to use Flink-SQL to monitor a Kafka topic that's populated by
> Debezium, which is in turn monitoring a MS-SQL CDC table. For some reason,
> Flink-SQL shows a new row when I update the boolean field, but updates the
> row in place when I update the text field, and I'm not understanding why
> this happens. My ultimate goal is to use Flink-SQL to do a join on records
> that come from both sides of a 1:N relation in the foreign database, to
> expose a more ready to consume JSON object to downstream consumers.
> 
> The source table is defined like this in MS-SQL:
> 
>CREATE TABLE todo_list (
>id int IDENTITY NOT NULL,
>done bit NOT NULL DEFAULT 0,
>name varchar(MAX) NOT NULL,
>CONSTRAINT PK_todo_list PRIMARY KEY (id)
>);
> 
> This is the configuration I'm sending to Debezium, note that I'm not
> including the
> JSON-schema in both keys and values:
> 
>{
>"name": "todo-connector",
>"config": {
>"connector.class":
> "io.debezium.connector.sqlserver.SqlServerConnector",
>"tasks.max": "1",
>"database.server.name": "mssql",
>"database.hostname": "10.88.10.1",
>"database.port": "1433",
>"database.user": "sa",
>"database.password": "...",
>"database.dbname": "todo",
>"database.history.kafka.bootstrap.servers": "10.88.10.10:9092",
>"database.history.kafka.topic": "schema-changes.todo",
>"key.converter": "org.apache.kafka.connect.json.JsonConverter",
>"key.converter.schemas.enable": false,
>"value.converter": "org.apache.kafka.connect.json.JsonConverter",
>"value.converter.schemas.enable": false
>}
>}
> 
> So Debezium is publishing events to Kafka with keys like this:
> 
>{"id":3}
> 
> And values like this (whitespace added for readability), this is updating the
> value of the 'name' field:
> 
>{
>  "before": {
>"id": 3,
>"done": false,
>"name": "test"
>  },
>  "after": {
>"id": 3,
>"done": false,
>"name": "test2"
>  },
>  "source": {
>"version": "1.9.0.Final",
>"connector": "sqlserver",
>"name": "mssql",
>"ts_ms": 1651497653043,
>"snapshot": "false",
>"db": "todo",
>"sequence": null,
>"schema": "dbo",
>"table": "todo_list",
>"change_lsn": "0025:0d58:0002",
>"commit_lsn": "0025:0d58:0003",
>"event_serial_no": 2
>  },
>  "op": "u",
>  "ts_ms": 1651497654127,
>  "transaction": null
>}
> 
> (I verified this using a Python script that follows the relevant Kafka topic.)
> 
> Next, I'm trying to follow this CDC stream in Flink by adding the
> Kafka connector
> for Flink SQL, defining a source table and starting a job in the Flink-SQL 
> CLI:
> 
>ADD JAR '/opt/flink/opt/flink-sql-connector-kafka_2.11-1.14.4.jar';
> 
>CREATE TABLE todo_list (
>k_id BIGINT,
>done BOOLEAN,
>name STRING
>)
>WITH (
>'connector'='kafka',
>'topic'='mssql.dbo.todo_list',
>'properties.bootstrap.servers'='10.88.10.10:9092',
>'properties.group.id'='flinksql-todo-list',
>'scan.startup.mode'='earliest-offset',
>'key.format'='json',
>'key.fields-prefix'='k_',
>'key.fields'='k_id',
>'value.format'='debezium-json',
>'value.debezium-json.schema-include'='false',
>'value.fields-include'='EXCEPT_KEY'
>);
> 
>SELECT * FROM todo_list;
> 
> Now, when I perform a query like this in the MS-SQL database:
> 
>UPDATE todo_list SET name='test2' WHERE id=3;
> 
> Now I see that the Flink-SQL client updates the row with id=3 to have the new
> value "test2" for the 'name' field, as I was expecting. However, when I
> update the 'done' field to have a different value, Flink-SQL seems to leave
> the old row with values (3, False, 'test2') intact, and shows a new row with
> values (3, True, 'test2').
> 
> I tried to append a `PRIMARY KEY (k_id) NOT ENFORCED` line between the first
> parentheses in the CREATE TABLE statement, but this seems to make no
> difference, except when running `DESCRIBE todo_list` in Flink-SQL.
> 
> I have no idea why the boolean field would cause different behavior than the
> text field. Am I missing some piece of configuration, are my expectations
> wrong?
> 
> 
> Regards,
> Joost Molenaar



Re: flink table store

2022-04-07 Thread Leonard Xu

The project is open source [1]; the first release is coming soon, so stay tuned.

Best,
Leonard
[1] https://github.com/apache/flink-table-store 




> On Apr 7, 2022, at 9:54 AM, Xianxun Ye wrote:
> 
> Here is the design document for flink table store; you can take a look.
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-188%3A+Introduce+Built-in+Dynamic+Table+Storage
> 
> 
> Best regards,
> 
> 
> Xianxun
> 
> 
> On 04/6/2022 16:56,LuNing Wang wrote:
> Hi,
> 
> Table store is storage; it should be similar to a data lake.
> 
> Best,
> LuNing Wang
> 
> On Wed, Apr 6, 2022 at 16:55, yidan zhao wrote:
> 
> I saw flink table store appear on the official site; what is the difference from the flink table / flink sql stack?
> 



Re: [ANNOUNCE] Apache Flink 1.14.4 released

2022-03-15 Thread Leonard Xu
Thanks a lot to Konstantin for being our release manager, and to everyone who was
involved!

Best,
Leonard

> On Mar 15, 2022, at 9:34 PM, Martijn Visser wrote:
> 
> Thank you Konstantin and everyone who contributed! 



Re: flinkcdc:slave with the same server_uuid/server_id as this slave has connected to the master;

2022-03-14 Thread Leonard Xu
Please see the FAQ document [1]

Best,
Leonard

[1] 
https://github.com/ververica/flink-cdc-connectors/wiki/FAQ(ZH)#q10-%E4%BD%9C%E4%B8%9A%E6%8A%A5%E9%94%99-connectexception-a-slave-with-the-same-server_uuidserver_id-as-this-slave-has-connected-to-the-master%E6%80%8E%E4%B9%88%E5%8A%9E%E5%91%A2

> On Mar 14, 2022, at 7:05 PM, maker_d...@foxmail.com wrote:
> 
> I ran into this problem again after a month; can anyone help solve it now?
> 
> 
> 
> maker_d...@foxmail.com
> 
> From: maker_d...@foxmail.com
> Sent: 2022-02-15 14:13
> To: user-zh@flink.apache.org
> Subject: flinkcdc: slave with the same server_uuid/server_id as this slave has 
> connected to the master;
> flink version:flink-1.13.5
> cdc version:2.1.1
> 
> I encountered the following error while using Flink CDC to synchronize multiple tables:
> org.apache.flink.runtime.JobException: Recovery is suppressed by 
> FixedDelayRestartBackoffTimeStrategy(maxNumberRestartAttempts=3, 
> backoffTimeMS=1)
> at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.handleFailure(ExecutionFailureHandler.java:138)
> at 
> org.apache.flink.runtime.executiongraph.failover.flip1.ExecutionFailureHandler.getFailureHandlingResult(ExecutionFailureHandler.java:82)
> at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.handleTaskFailure(DefaultScheduler.java:216)
> at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.maybeHandleTaskFailure(DefaultScheduler.java:206)
> at 
> org.apache.flink.runtime.scheduler.DefaultScheduler.updateTaskExecutionStateInternal(DefaultScheduler.java:197)
> at 
> org.apache.flink.runtime.scheduler.SchedulerBase.updateTaskExecutionState(SchedulerBase.java:682)
> at 
> org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:79)
> at 
> org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:435)
> at sun.reflect.GeneratedMethodAccessor135.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:305)
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:212)
> at 
> org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:77)
> at 
> org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:158)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:26)
> at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:21)
> at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
> at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
> at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:21)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
> at akka.actor.Actor.aroundReceive(Actor.scala:517)
> at akka.actor.Actor.aroundReceive$(Actor.scala:515)
> at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:225)
> at akka.actor.ActorCell.receiveMessage(ActorCell.scala:592)
> at akka.actor.ActorCell.invoke(ActorCell.scala:561)
> at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:258)
> at akka.dispatch.Mailbox.run(Mailbox.scala:225)
> at akka.dispatch.Mailbox.exec(Mailbox.scala:235)
> at akka.dispatch.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> at 
> akka.dispatch.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> at 
> akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
> Caused by: java.lang.RuntimeException: One or more fetchers have encountered 
> exception
> at 
> org.apache.flink.connector.base.source.reader.fetcher.SplitFetcherManager.checkErrors(SplitFetcherManager.java:223)
> at 
> org.apache.flink.connector.base.source.reader.SourceReaderBase.getNextFetch(SourceReaderBase.java:154)
> at 
> org.apache.flink.connector.base.source.reader.SourceReaderBase.pollNext(SourceReaderBase.java:116)
> at 
> org.apache.flink.streaming.api.operators.SourceOperator.emitNext(SourceOperator.java:294)
> at 
> org.apache.flink.streaming.runtime.io.StreamTaskSourceInput.emitNext(StreamTaskSourceInput.java:69)
> at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:684)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:639)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:650)
> at 
> 

Re: [Table API] [JDBC CDC] Caching Configuration from MySql instance needed in multiple Flink Jobs

2022-02-21 Thread Leonard Xu
Hello, Dan

> On Feb 21, 2022, at 9:11 PM, Dan Serb wrote:
> 1.Have a processor that uses Flink JDBC CDC Connector over the table that 
> stores the information I need. (This is implemented currently - working)

You mean you’ve implemented a Flink JDBC Connector? Maybe the Flink CDC 
Connectors[1] would help you.


> 2.Find a way to store that Stream Source inside a table inside Flink. (I 
> tried with the approach to create a MySql JDBC Catalog – but apparently, I 
> can only create Postgres Catalog programmatically) – This is the question – 
> What api do I need to use to facilitate saving inside Flink in a SQL Table, 
> the data retrieved by the CDC Source?
>  3.The solution from point 2. Needs to be done in a way that I can query that 
> table, for each record I receive in a different Job that has a Kafka Source 
> as the entrypoint.


The Flink JDBC catalog only provides a Postgres implementation, so you need to
implement your own catalog, e.g. a MySQL catalog that provides a CDC table source;
you can encapsulate a mysql-cdc source[2] in your catalog implementation.

> I’m just worried that I might need to reuse this data sets from the sql 
> database in future jobs, so this is why I’d like to have something decoupled 
> and available for the entire cluster.

If you want to reuse the data set to avoid capturing the database table
multiple times, you can send the CDC data to a message queue like Kafka/Pulsar
and then consume the changelogs from the message queue in different Flink jobs,
as sketched below.
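A rough sketch of that pattern (all table names and options are invented): one job mirrors the changelog into Kafka once, and any number of downstream jobs consume it from there:

-- job 1: capture the MySQL table once
CREATE TABLE orders_cdc (
  id INT,
  amount DECIMAL(10, 2),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'orders'
);

-- changelog topic shared by all downstream jobs
CREATE TABLE orders_changelog (
  id INT,
  amount DECIMAL(10, 2),
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'upsert-kafka',
  'topic' = 'orders-changelog',
  'properties.bootstrap.servers' = 'localhost:9092',
  'key.format' = 'json',
  'value.format' = 'json'
);

INSERT INTO orders_changelog SELECT * FROM orders_cdc;
-- jobs 2..n then read from 'orders_changelog' instead of the database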

Hope the above information helps.

Best,
Leonard
[1]https://ververica.github.io/flink-cdc-connectors/master/content/about.html
[2] 
https://github.com/ververica/flink-cdc-connectors/blob/master/flink-connector-mysql-cdc/src/main/java/com/ververica/cdc/connectors/mysql/table/MySqlTableSource.java




Re: CDC using Query

2022-02-07 Thread Leonard Xu
Hello, mohan

> 1. Does flink have any support to track any missed source Jdbc CDC records ? 

The Flink CDC connectors provide exactly-once semantics, which means they won't miss
records. Tip: the Flink JDBC connector only
scans the database once and cannot continuously read a CDC stream.

> 2. What is the equivalent of Kafka consumer groups ?

Different databases have different CDC mechanisms: for MySQL/MariaDB it is the
server id used to mark a replica; for PostgreSQL it is the slot name.


> 3. Delivering to kafka from flink is not exactly once. Is that right ?

No, both the Flink CDC connectors and the Flink Kafka connector provide exactly-once
implementations.

BTW, if your destination is Elasticsearch, the quick start demo[1] may help you.

Best,
Leonard

[1] 
https://ververica.github.io/flink-cdc-connectors/master/content/quickstart/mysql-postgres-tutorial.html


> 
> Thanks
> 
> On Friday, February 4, 2022, mohan radhakrishnan wrote:
> Hello,
> So the JDBC source connector is Kafka's, and the transformation is
> done by Flink (Flink SQL)? But that connector can miss records, I thought.
> Started looking at Flink for this and other use cases.
> Can I see the alternative to Spring Cloud Stream (Kafka Streams)? Since I am
> learning Flink, Kafka Streams' changelog topics, exactly-once delivery and
> DLQs seemed good for our critical push notifications.
> 
> We also needed a  elastic  sink.
> 
> Thanks
> 
> On Friday, February 4, 2022, Dawid Wysakowicz wrote:
> Hi Mohan,
> 
> I don't know much about Kafka Connect, so I will not talk about its features 
> and differences to Flink. Flink on its own does not have a capability to read 
> a CDC stream directly from a DB. However there is the flink-cdc-connectors[1] 
> projects which embeds the standalone Debezium engine inside of Flink's source 
> and can process DB changelog with all processing guarantees that Flink 
> provides.
> 
> As for the idea of processing further with Kafka Streams. Why not process 
> data with Flink? What do you miss in Flink?
> 
> Best,
> 
> Dawid
> 
> [1] https://github.com/ververica/flink-cdc-connectors
> 
> On 04/02/2022 13:55, mohan radhakrishnan wrote:
> Hi,
>  When I was looking for CDC I realized Flink uses Kafka Connector to 
> stream to Flink. The idea is to send it forward to Kafka and consume it using 
> Kafka Streams.
> 
> Are there source DLQs or additional mechanisms to detect failures to read 
> from the DB ?
> 
> We don't want to use Debezium and our CDC is based on queries.
> 
> What mechanisms does Flink have that a Kafka Connect worker does not ? Kafka 
> Connect workers can go down and source data can be lost.
> 
> Does the idea  to send it forward to Kafka and consume it using Kafka Streams 
> make sense ? The checkpointing feature of Flink can help ? I plan to use 
> Kafka Streams for 'Exactly-once Delivery' and changelog topics.
> 
> Could you point out relevant material to read ?
> 
> Thanks,
> Mohan



Re: [ANNOUNCE] Apache Flink 1.14.2 / 1.13.5 / 1.12.7 / 1.11.6 released

2021-12-16 Thread Leonard Xu
I guess this is related to publishers everywhere are updating their artifacts 
in response to the log4shell vulnerability[1].

All we can do and need to do is wait. ☕️

Best,
Leonard
[1] https://issues.sonatype.org/browse/OSSRH-76300 




> On Dec 17, 2021, at 2:21 PM, Jingsong Li wrote:
> 
> Not found in 
> https://repo1.maven.org/maven2/org/apache/flink/flink-table-api-java/
> 
> I guess too many people are publishing versions, resulting in slower Maven
> Central repository synchronization.
> 
> Best,
> Jingsong
> 
> On Fri, Dec 17, 2021 at 2:00 PM casel.chen  wrote:
>> 
>> I can NOT find flink 1.13.5 related jar in maven central repository, did you 
>> upload them onto there already? Thanks!
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>> At 2021-12-17 01:26:19, "Chesnay Schepler"  wrote:
>>> The Apache Flink community has released emergency bugfix versions of
>>> Apache Flink for the 1.11, 1.12, 1.13 and 1.14 series.
>>> 
>>> These releases include a version upgrade for Log4j to address
>>> [CVE-2021-44228](https://nvd.nist.gov/vuln/detail/CVE-2021-44228) and
>>> [CVE-2021-45046](https://nvd.nist.gov/vuln/detail/CVE-2021-45046).
>>> 
>>> We highly recommend all users to upgrade to the respective patch release.
>>> 
>>> The releases are available for download at:
>>> https://flink.apache.org/downloads.html
>>> 
>>> Please check out the release blog post for further details:
>>> https://flink.apache.org/news/2021/12/16/log4j-patch-releases.html
>>> 
>>> 
>>> Regards,
>>> Chesnay
>> 
>> 
>> 
>> 
> 
> 
> 
> -- 
> Best, Jingsong Lee



Re: How to handle historical data with Flink

2021-12-06 Thread Leonard Xu
The MySQL CDC connector supports parallel reading, and the reading process takes no locks. 6 million rows is quite small; we and community users have tested sharded databases/tables at the tens-of-billions scale and they were fine. You can try it yourself.

Best,
Leonard


> On Dec 6, 2021, at 3:54 PM, 张阳 <705503...@qq.com.INVALID> wrote:
> 
> Because the data volume is 6 million rows, I'm worried that initialization will take too long or cause performance problems.
> 
> 
> 
> 
> -- Original message --
> From: "user-zh"
> Sent: Monday, Dec 6, 2021, 2:38 PM
> To: "user-zh"
> Subject: Re: How to handle historical data with Flink
> 
> 
> 
> If your data source is a database, you can try the Flink CDC Connectors[1]; these connectors are hybrid sources that
> first read the full historical data and then the incremental data,
> with a seamless switch between the historical and incremental phases.
> 
> Best,
> Leonard 
> [1] 
> https://ververica.github.io/flink-cdc-connectors/release-2.1/content/connectors/mysql-cdc.html
> 
> 
>  On Dec 2, 2021, at 2:40 PM, 张阳 wrote:
>  The computed metrics involve a large amount of historical data; how can the historical data be aggregated together with today's real-time data?



Re: flink hang: job freeze caused by es_rejected_execution_exception

2021-12-05 Thread Leonard Xu
Hi, ren

I think the root cause is that you didn't set a proper failure handler for the
Elasticsearch connector; the `RetryRejectedExecutionFailureHandler` can resolve
your issue (see the sketch below), and the Elasticsearch connector docs[1] have more information.
You can also set 'connector.failure-handler' to 'retry-rejected' in your
Elasticsearch table DDL if you're using Flink SQL rather than a DataStream
application.
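A rough DataStream sketch (the host, element type and sinkFunction are placeholders standing in for your own setup):

import java.util.Collections;
import java.util.List;
import org.apache.http.HttpHost;
import org.apache.flink.streaming.connectors.elasticsearch.util.RetryRejectedExecutionFailureHandler;
import org.apache.flink.streaming.connectors.elasticsearch7.ElasticsearchSink;

List<HttpHost> httpHosts = Collections.singletonList(new HttpHost("localhost", 9200, "http"));
ElasticsearchSink.Builder<String> builder =
    new ElasticsearchSink.Builder<>(httpHosts, sinkFunction);
// re-adds bulk requests that failed with EsRejectedExecutionException
builder.setFailureHandler(new RetryRejectedExecutionFailureHandler());
// smaller bulks with backoff give the ES cluster room to catch up
builder.setBulkFlushMaxActions(500);
builder.setBulkFlushBackoff(true);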

Btw, please use English in user@flink.apache.org or send Chinese email to 
user...@flink.apache.org for better communication.

Best,
Leonard
[1]https://nightlies.apache.org/flink/flink-docs-release-1.10/dev/connectors/elasticsearch.html



> On Dec 3, 2021, at 2:26 PM, 淘宝龙安 wrote:
> 
> hi, all
>   
> I've hit a problem that is very hard to solve. My scenario is very simple and common: consume data from Kafka and write it to ES. But when the ES cluster load is high and write rejections occur (es_rejected_execution_exception), the whole Flink job hangs, stops consuming data, does not restart, and all checkpoints fail. I found some clues, but still cannot locate the problem.
> 
> Symptom:
> When write rejections occur, Flink logs errors (TaskManager); after that, the whole job freezes and no longer consumes data, regardless of whether the ES cluster is healthy.
> 
> 2021-10-11 08:07:28,804 I/O dispatcher 6 ERROR 
> org.apache.flink.streaming.connectors.elasticsearch.ElasticsearchSinkBase  - 
> Failed Elasticsearch item request: ElasticsearchException[Elasticsearch 
> exception [type=es_rejected_execution_exception, reason=rejected execution of 
> processing of [1477079498][indices:data/write/bulk[s][p]]: request: 
> BulkShardRequest [[acs_period_stl_index_2021_03_30_19_06_21][0]] containing 
> [100] requests, target allocation id: rnOmk_VtSgGAOT8e0dCefQ, primary term: 1 
> on EsThreadPoolExecutor[name = VECS014179/write, queue capacity = 200, 
> org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@640a33fe[Running,
>  pool size = 8, active threads = 8, queued tasks = 226, completed tasks = 
> 1171462808
> ElasticsearchException[Elasticsearch exception 
> [type=es_rejected_execution_exception, reason=rejected execution of 
> processing of [1477079498][indices:data/write/bulk[s][p]]: request: 
> BulkShardRequest [[acs_period_stl_index_2021_03_30_19_06_21][0]] containing 
> [100] requests, target allocation id: rnOmk_VtSgGAOT8e0dCefQ, primary term: 1 
> on EsThreadPoolExecutor[name = VECS014179/write, queue capacity = 200, 
> org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@640a33fe[Running,
>  pool size = 8, active threads = 8, queued tasks = 226, completed tasks = 
> 1171462808
> at 
> org.elasticsearch.ElasticsearchException.innerFromXContent(ElasticsearchException.java:491)
> at 
> org.elasticsearch.ElasticsearchException.fromXContent(ElasticsearchException.java:402)
> at 
> org.elasticsearch.action.bulk.BulkItemResponse.fromXContent(BulkItemResponse.java:139)
> at 
> org.elasticsearch.action.bulk.BulkResponse.fromXContent(BulkResponse.java:199)
> at 
> org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1727)
> at 
> org.elasticsearch.client.RestHighLevelClient.lambda$performRequestAsyncAndParseEntity$10(RestHighLevelClient.java:1520)
> at 
> org.elasticsearch.client.RestHighLevelClient$1.onSuccess(RestHighLevelClient.java:1598)
> at 
> org.elasticsearch.client.RestClient$FailureTrackingResponseListener.onSuccess(RestClient.java:556)
> at org.elasticsearch.client.RestClient$1.completed(RestClient.java:300)
> at org.elasticsearch.client.RestClient$1.completed(RestClient.java:294)
> at 
> com.huster.hidden.org.apache.http.concurrent.BasicFuture.completed(BasicFuture.java:122)
> at 
> com.huster.hidden.org.apache.http.impl.nio.client.DefaultClientExchangeHandlerImpl.responseCompleted(DefaultClientExchangeHandlerImpl.java:181)
> at 
> com.huster.hidden.org.apache.http.nio.protocol.HttpAsyncRequestExecutor.processResponse(HttpAsyncRequestExecutor.java:448)
> at 
> com.huster.hidden.org.apache.http.nio.protocol.HttpAsyncRequestExecutor.inputReady(HttpAsyncRequestExecutor.java:338)
> at 
> com.huster.hidden.org.apache.http.impl.nio.DefaultNHttpClientConnection.consumeInput(DefaultNHttpClientConnection.java:265)
> at 
> com.huster.hidden.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:81)
> at 
> com.huster.hidden.org.apache.http.impl.nio.client.InternalIODispatch.onInputReady(InternalIODispatch.java:39)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.AbstractIODispatch.inputReady(AbstractIODispatch.java:114)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.BaseIOReactor.readable(BaseIOReactor.java:162)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvent(AbstractIOReactor.java:337)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.AbstractIOReactor.processEvents(AbstractIOReactor.java:315)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:276)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104)
> at 
> com.huster.hidden.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591)
> at 

Re: How to handle historical data with Flink

2021-12-05 Thread Leonard Xu
If your data source is a database, you can try the Flink CDC Connectors[1]; these connectors are hybrid sources that
first read the full historical data and then the incremental data,
with a seamless switch between the historical and incremental phases. A minimal sketch follows.
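For illustration only (all names and options are invented); the connector first reads the existing snapshot and then switches to the binlog without gaps:

CREATE TABLE user_source (
  id INT,
  name STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'localhost',
  'port' = '3306',
  'username' = 'flinkuser',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'users',
  -- 'initial' (the default): snapshot of history first, then incremental binlog
  'scan.startup.mode' = 'initial'
);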

Best,
Leonard 
[1] 
https://ververica.github.io/flink-cdc-connectors/release-2.1/content/connectors/mysql-cdc.html


> On Dec 2, 2021, at 2:40 PM, 张阳 wrote:
> 
> The computed metrics involve a large amount of historical data; how can the historical data be aggregated together with today's real-time data?



Re: Unsubscribe

2021-11-23 Thread Leonard Xu
Hi, to unsubscribe, send an email to user-zh-unsubscr...@flink.apache.org; see
https://flink.apache.org/zh/community.html#section


Best

> On Nov 24, 2021, at 14:33, Gauler Tan wrote:
> 
> Hi, I have already sent unsubscribe requests many times; why do I keep receiving a steady stream of emails?
> 
> Thanks



Re: Flink SQL ES7 connector cannot be used

2021-11-22 Thread Leonard Xu
This is a dependency problem. Check whether your environment uses only the SQL connector jar, i.e. flink-sql-connector-elasticsearch7;
a non-DataStream job does not need the flink-connector-elasticsearch7
jar. If that is not the issue, analyze the ES-related dependencies used in your job: use the exception stack to identify the class and then the jar, and check whether some useless jars were added.

Best,
Leonard
 

> On Nov 22, 2021, at 12:30, mispower wrote:
> 
> Hi, may I ask how you eventually solved this problem?
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> At 2021-06-10 10:15:58, "mokaful" <649713...@qq.com> wrote:
>> org.apache.flink.streaming.runtime.tasks.StreamTaskException: Cannot
>> instantiate user function.
>>  at
>> org.apache.flink.streaming.api.graph.StreamConfig.getStreamOperatorFactory(StreamConfig.java:338)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperator(OperatorChain.java:653)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:626)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOperatorChain(OperatorChain.java:616)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.createOutputCollector(OperatorChain.java:566)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.OperatorChain.(OperatorChain.java:181)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:548)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at
>> org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:759)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
>> ~[flink-dist_2.11-1.13.1.jar:1.13.1]
>>  at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_181]
>> Caused by: java.io.InvalidClassException:
>> org.apache.flink.streaming.connectors.elasticsearch.table.Elasticsearch7DynamicSink$AuthRestClientFactory;
>> local class incompatible: stream classdesc serialVersionUID =
>> -2564582543942331131, local class serialVersionUID = -2353232579685349916
>>  at java.io.ObjectStreamClass.initNonProxy(ObjectStreamClass.java:699)
>> ~[?:1.8.0_181]
>>  at 
>> java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1885)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1751)
>> ~[?:1.8.0_181]
>>  at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2042)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> ~[?:1.8.0_181]
>>  at 
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>> ~[?:1.8.0_181]
>>  at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> ~[?:1.8.0_181]
>>  at 
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>> ~[?:1.8.0_181]
>>  at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> ~[?:1.8.0_181]
>>  at 
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:2211)
>> ~[?:1.8.0_181]
>>  at
>> java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:2069)
>> ~[?:1.8.0_181]
>>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1573)
>> ~[?:1.8.0_181]
>>  at 
>> java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:2287)
>> ~[?:1.8.0_181]
>>

Re: Unsubscribe

2021-11-03 Thread Leonard Xu
To unsubscribe from the user-zh@flink.apache.org mailing list, send an email with any content to
user-zh-unsubscr...@flink.apache.org

> On Nov 2, 2021, at 14:15, 李芳奎 wrote:
> 
> Unsubscribe
> 
> felix 
> 
> felix_...@163.com



Re: Flink SQL fails to read ORC files

2021-10-29 Thread Leonard Xu
The image is broken; try pasting the log text directly, or use an image-hosting service.


> On Oct 28, 2021, at 16:54, 陈卓宇 <2572805...@qq.com.INVALID> wrote:
> 
> 
> 
> 
> 
> Flink version: 1.12.2
> JDK: 1.8
> Scenario: Flink SQL reading ORC files from HDFS
> Could anyone tell me what causes this error?
> 
> 
> 宇
>  



Re: Some questions about the flink rabbitmq connector

2021-10-28 Thread Leonard Xu
Hi, Peng

There’s no doubt that RabbitMQ is a good open source community with active 
users. 
I understand what @renqschn means is that Flink RabbitMQ  Connector is one 
connector with few users among the many connectors in the Flink project.  From 
my observation, the connector that is used more in the Flink project should be 
Kafka. Filesystem, JDBC and so on. So, please help us to promote Flink in the 
RabbitMQ community and let more RabbitMQ users know and then use the Flink 
RabbitMQ Connector, which will give the Flink community more motivation to 
improve the Flink RabbitMQ Connector.

Best,
Leonard

> On Oct 29, 2021, at 11:13, Ken Peng wrote:
> 
> I am one of the Forum Moderators for RabbitMQ, which does have a lot of
> active users. :)
> If you have any questions about RMQ please subscribe to its official group
> and ask there.
> rabbitmq-users+subscr...@googlegroups.com
> 
> Regards.
> 
> 
> On Fri, Oct 29, 2021 at 11:09 AM 任庆盛 wrote:
> 
>> Hi,
>> 
>> From the code, the RabbitMQ sink indeed has no semantic guarantee. Since RabbitMQ
>> currently has relatively few users in the community, its maintenance frequency is also relatively low; if you are interested, you are welcome to contribute to the community~
>> 
>> 
>> 
>>> On Oct 28, 2021, at 7:53 PM, wx liao wrote:
>>> 
>>> Hi:
>>> 
>> Sorry to bother you. Our project has recently been using Flink with RabbitMQ on the sink side, but looking at the source code, RMQSink does not seem to guarantee that messages are not lost; I see no use of waitForConfirm() or a confirm
>> listener. May I ask whether the RMQSink part does not guarantee at-least-once? Hope to get an answer, thanks.
>> 
>> 



Re: window join in flink sql

2021-10-28 Thread Leonard Xu

Tip:

The documentation at https://ci.apache.org/projects/flink is no longer updated;
the new documentation site is https://nightlies.apache.org/flink/, please use the new one.

Best,
Leonard


> On Oct 29, 2021, at 10:41, Caizhi Weng wrote:
> 
> Hi!
> 
> Window join in Flink SQL is supported since Flink 1.14, see document [1].
> 
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/queries/window-join/
> On Fri, Oct 29, 2021 at 2:12 AM, Lu Niu wrote:
> Hi, Flink users
> 
> How to express multiple stream window join in flink sql? in datastream api, 
> that's
> stream.join(otherStream)
> .where()
> .equalTo()
> .window()
> .apply()
> (https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/operators/joining/)
> 
> For example, in flinksql, how to join two streams in tumbling window and 
> evaluate a udf joinFunction? 
> 
> Best
> Lu
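For reference, a tumbling-window join in Flink SQL 1.14+ (mentioned in Caizhi's reply above) looks roughly like the following sketch; the table/column names and the myJoinUdf function are invented for illustration, with the UDF simply applied in the SELECT list:

SELECT L.id, myJoinUdf(L.v, R.v)
FROM (
  SELECT * FROM TABLE(
    TUMBLE(TABLE LeftStream, DESCRIPTOR(row_time), INTERVAL '5' MINUTES))
) L
JOIN (
  SELECT * FROM TABLE(
    TUMBLE(TABLE RightStream, DESCRIPTOR(row_time), INTERVAL '5' MINUTES))
) R
ON L.id = R.id
AND L.window_start = R.window_start
AND L.window_end = R.window_end;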



Re: database as stream source issue

2021-10-28 Thread Leonard Xu
Hi, Qihua

The JDBC connector supports the Postgres dialect, but it is implemented as a bounded
source, which means it only captures the snapshot data (the existing records) and
then finishes its work; newly added transaction-log records (known in MySQL as the
binlog) won't be captured. You should receive all the snapshot data if
your program works fine.

BTW, if you want to capture both snapshot and transaction-log events, you can try
the `postgres-cdc` connector[1]; it offers both a SQL API and a DataStream API (a SQL
sketch follows below), and you can refer to the documentation[2] for a quick start.
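A minimal SQL sketch (all names and options are placeholders, loosely mirroring your test_store table):

CREATE TABLE test_store_cdc (
  id INT,
  name STRING,
  address STRING,
  PRIMARY KEY (id) NOT ENFORCED
) WITH (
  'connector' = 'postgres-cdc',
  'hostname' = 'localhost',
  'port' = '5432',
  'username' = 'dbUser',
  'password' = '123',
  'database-name' = 'mydb',
  'schema-name' = 'public',
  'table-name' = 'test_store',
  -- each job needs its own replication slot
  'slot.name' = 'flink_test_store_slot'
);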


Best,
Leonard
[1] https://github.com/ververica/flink-cdc-connectors
[2] 
https://ververica.github.io/flink-cdc-connectors/release-2.0/content/connectors/postgres-cdc.html




> On Oct 28, 2021, at 13:24, Qihua Yang wrote:
> 
> Hi,
> 
> I am trying to use postgres DB as the stream data source and push to kafka 
> topic. Here is how I config database source. Looks like it didn't read out 
> any data. But I didn't see any error from the flink log. 
> I did a test, tried to insert wrong data to database, I saw flink throw below 
> error. Looks like flink try to read data from database.
> java.lang.ClassCastException: class java.lang.Long cannot be cast to class 
> java.lang.Integer (java.lang.Long and java.lang.Integer are in module 
> java.base of loader 'bootstrap')
> 
> I saw  job manager shows switched from DEPLOYING to RUNNING. and switched 
> from RUNNING to FINISHED immediately. 
> Can anyone help understand why?
> Did I config anything wrong? Or I missed anything?
> 
> 2021-10-28 02:49:52.777 [flink-akka.actor.default-dispatcher-3] INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph  - Sink: test-sink 
> (2/2) (7aa24e97a11bbd831941d636910fe84f) switched from DEPLOYING to RUNNING.
> 
> 2021-10-28 02:49:53.245 [flink-akka.actor.default-dispatcher-15] INFO  
> org.apache.flink.runtime.executiongraph.ExecutionGraph  - Source: 
> JdbcTableSource(id, name, address) -> Map -> Map -> Filter (1/2) 
> (558afdc9dd911684833d0e51943eef92) switched from RUNNING to FINISHED.
> 
> val options = JdbcOptions.builder()
> // .setDBUrl("jdbc:derby:memory:mydb")
> .setDBUrl("")
> .setTableName("test_store")
> .setDriverName("org.postgresql.Driver")
> .setUsername("dbUser")
> .setPassword("123")
> .build()
> val readOptions = JdbcReadOptions.builder()
> .setPartitionColumnName("id")
> .setPartitionLowerBound(-1)
> .setPartitionUpperBound(DB_SIZE)
> .setNumPartitions(PARTITIONS)
> //.setFetchSize(0)
> .build()
> val lookupOptions = JdbcLookupOptions.builder()
> .setCacheMaxSize(-1)
> .setCacheExpireMs(CACHE_SIZE)
> .setMaxRetryTimes(2)
> .build()
> val dataSource = JdbcTableSource.builder()
> .setOptions(options)
> .setReadOptions(readOptions)
> .setLookupOptions(lookupOptions)
> .setSchema(storeSchema)
> .build().getDataStream(env)



Re: [ANNOUNCE] Apache Flink 1.13.3 released

2021-10-21 Thread Leonard Xu
Thanks to Chesnay & Martijn and everyone who made this release happen.


> 在 2021年10月21日,20:08,Martijn Visser  写道:
> 
> Thank you Chesnay, Leonard and all contributors! 
> 
> On Thu, 21 Oct 2021 at 13:40, Jingsong Li wrote:
> Thanks, Chesnay & Martijn
> 
> 1.13.3 really solves many problems.
> 
> Best,
> Jingsong
> 
> On Thu, Oct 21, 2021 at 6:46 PM Konstantin Knauf wrote:
> >
> > Thank you, Chesnay & Martijn, for managing this release!
> >
> > On Thu, Oct 21, 2021 at 10:29 AM Chesnay Schepler wrote:
> >
> > > The Apache Flink community is very happy to announce the release of
> > > Apache Flink 1.13.3, which is the third bugfix release for the Apache
> > > Flink 1.13 series.
> > >
> > > Apache Flink® is an open-source stream processing framework for
> > > distributed, high-performing, always-available, and accurate data
> > > streaming applications.
> > >
> > > The release is available for download at:
> > > https://flink.apache.org/downloads.html 
> > > 
> > >
> > > Please check out the release blog post for an overview of the
> > > improvements for this bugfix release:
> > > https://flink.apache.org/news/2021/10/19/release-1.13.3.html 
> > > 
> > >
> > > The full release notes are available in Jira:
> > >
> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12350329
> > >  
> > > 
> > >
> > > We would like to thank all contributors of the Apache Flink community
> > > who made this release possible!
> > >
> > > Regards,
> > > Chesnay
> > >
> > >
> >
> > --
> >
> > Konstantin Knauf
> >
> > https://twitter.com/snntrable 
> >
> > https://github.com/knaufk 
> 
> 
> 
> -- 
> Best, Jingsong Lee



Re: [DISCUSS] Creating an external connector repository

2021-10-18 Thread Leonard Xu
Hi, all

I understand very well that the maintainers of the community want to move the 
connectors to an external system. Indeed, the development and maintenance of the 
connectors require a lot of energy and do not involve the Flink core framework, 
so moving them out can reduce the maintenance pressure on the community side.

I only have one concern. Once we migrate these connectors to external projects, 
how can we ensure they keep their high quality? All the built-in connectors of 
Flink are developed or reviewed by committers, and connector bugs reported via 
JIRA and the mailing lists are currently fixed quickly. How does the Flink 
community ensure the development rhythm of the connectors after the move? In 
other words, are these connectors still first-class citizens of the Flink 
community? And if they are, how do we guarantee that?

Recently, I have maintained a series of CDC connectors in the Flink CDC project 
[1]. My feeling is that it is not easy to develop and maintain connectors. 
Contributors to the Flink CDC project have made some attempts in this area, 
such as building connector integration tests [2] and document management [3]. 
Personally, I don’t have a strong preference between moving the built-in 
connectors out and keeping them. If the final decision of this discussion turns 
out to be moving them out, I’m happy to share our experience and provide help in 
the new connector project.

Best,
Leonard
[1]https://github.com/ververica/flink-cdc-connectors
[2]https://github.com/ververica/flink-cdc-connectors/runs/3902664601
[3]https://ververica.github.io/flink-cdc-connectors/master/

> On Oct 18, 2021, at 19:00, David Morávek wrote:
> 
> We are mostly talking about the freedom this would bring to the connector 
> authors, but we still don't have answers for the important topics:
> 
> - How exactly are we going to maintain the high quality standard of the 
> connectors?
> - What would the connector release cycle look like? Is this going to affect 
> the Flink release cycle?
> - How would the documentation process / generation look like?
> - Not all of the connectors rely solely on the Stable APIs. Moving them 
> outside of the Flink code-base will make any refactoring on the Flink side 
> significantly more complex, as it potentially needs to be reflected in all 
> connectors. There are some possible solutions, such as Gradle's included 
> builds, but we're far away from that. How are we planning to address this?
> - How would we develop connectors against unreleased Flink version? Java 
> snapshots have many limits when used for the cross-repository development.
> - With appropriate tooling, this whole thing is achievable even with the 
> single repository that we already have. It is just a matter of having a more 
> fine-grained build / release process. Have you tried to research this option?
> 
> I'd personally strongly suggest against moving the connectors out of the ASF 
> umbrella. The ASF brings legal guarantees, hard gained trust of the users and 
> high quality standards to the table. I still fail to see any good reason for 
> giving this up. Also this decision would be hard to reverse, because it would 
> most likely require a new donation to the ASF (would this require a consent 
> from all contributors as there is no clear ownership?).
> 
> Best,
> D.
> 
> 
> On Mon, Oct 18, 2021 at 12:12 PM Qingsheng Ren  > wrote:
> Thanks for driving this discussion Arvid! I think this will be one giant leap 
> for the Flink community. Externalizing connectors would give connector developers 
> more freedom in developing, releasing and maintaining, which can attract more 
> developers to contribute their connectors and expand the Flink ecosystem.
> 
> Considering the position for hosting connectors, I prefer to use an 
> individual organization outside the Apache umbrella. If we keep all connectors 
> under Apache, I think there’s not much difference compared with keeping them in 
> the Flink main repo. Connector developers still require permissions from 
> Flink committers to contribute, and the release process should follow Apache 
> rules, which is against our initial motivations for externalizing connectors.
> 
> Using an individual GitHub organization will maximize the freedom provided to 
> developers. An ideal structure in my mind would be like 
> "github.com/flink-connectors/flink-connector-xxx". The newly 
> established flink-extended org might be another choice, but considering the 
> number of connectors, I prefer to use an individual org for connectors to 
> avoid crowding out other repos under flink-extended.
> 
> In the meantime, we need to provide a well-established standard / guideline 
> for contributing connectors, including CI, testing, and docs (maybe we can’t 
> provide resources for running them, but we should give enough guidance on how to 
> set one up) to keep the quality of connectors high. I’m happy to help build 
> these fundamental bricks. Also since 

Re: Unsubscribe

2021-09-27 Thread Leonard Xu
To unsubscribe from the user-zh@flink.apache.org mailing list, just send an email 
with any content to user-zh-unsubscr...@flink.apache.org.


> On Sep 27, 2021, at 14:43, rzy1107 wrote:
> 
> Unsubscribe



Re: GroupWindowAggregate doesn't support consuming update and delete changes which is produced by node Deduplicate

2021-09-27 Thread Leonard Xu
Hi, could you paste the full error details in the email?
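
In the meantime, a sketch of the pattern that typically triggers this error; the 
table and field names are hypothetical, not from your job. A ROW_NUMBER 
deduplication that keeps the latest row emits update and delete changes, which a 
group window aggregate cannot consume, while keeping the first row on processing 
time yields an append-only stream:

-- Emits updates (keeps the LATEST row); a downstream group window
-- aggregate will reject this changelog:
SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY order_id ORDER BY proctime DESC) AS rn
  FROM orders
) WHERE rn = 1;

-- Emits an append-only stream (keeps the FIRST row on processing time),
-- which a downstream group window aggregate can consume:
SELECT * FROM (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY order_id ORDER BY proctime ASC) AS rn
  FROM orders
) WHERE rn = 1;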


> On Sep 27, 2021, at 11:36, lzy139...@outlook.com wrote:
> 
> After filtering the data with ROW_NUMBER, the group window aggregation throws an error.



Re: mysql cdc into hudi lake error

2021-09-26 Thread Leonard Xu
Hi, Chen

Could you paste the full log? I can't tell what's wrong from this snippet.



> On Sep 24, 2021, at 18:23, casel.chen wrote:
> 
> SELECT `id`, `name`, `birthday`, `ts`, DATE_FORMAT(`birthday`, 'MMdd') AS 
> `partition` FROM mysql_users;



Re: Unsubscribe

2021-09-26 Thread Leonard Xu
To unsubscribe from the user-zh@flink.apache.org mailing list, just send an email 
with any content to user-zh-unsubscr...@flink.apache.org.

Best,
Leonard

> On Sep 26, 2021, at 14:25, luoye <13033709...@163.com> wrote:
> 
> Unsubscribe



Re: Can't access Debezium metadata fields in Kafka table

2021-09-26 Thread Leonard Xu
Hi, Harshvardhan

The debezium-avro-confluent format doesn’t support reading metadata yet [1]; the 
formats that do support it include debezium-json, canal-json and maxwell-json, so 
you can try one of those.
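
As a sketch, here is your table switched to the debezium-json format, which does 
expose these metadata keys; the topic and server address are placeholders:

CREATE TABLE testFlink (
  origin_ts TIMESTAMP(3) METADATA FROM 'value.ingestion-timestamp' VIRTUAL,
  origin_table STRING METADATA FROM 'value.source.table' VIRTUAL,
  id BIGINT,
  number BIGINT
) WITH (
  'connector' = 'kafka',
  'topic' = 'your-topic',
  'properties.bootstrap.servers' = 'localhost:9092',
  'properties.group.id' = 'testGroup',
  'scan.startup.mode' = 'earliest-offset',
  'value.format' = 'debezium-json'   -- this format supports reading metadata
);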

Best,
Leonard
[1] https://issues.apache.org/jira/browse/FLINK-20454 




> On Sep 24, 2021, at 14:00, Harshvardhan Shinde wrote:
> 
> Hi,
> Here's the complete error log:
>  
> [ERROR] Could not execute SQL statement. Reason:
> org.apache.flink.table.api.ValidationException: Invalid metadata key 
> 'value.ingestion-timestamp' in column 'origin_ts' of table 
> 'flink_hive.harsh_test.testflink'. The DynamicTableSource class 
> 'org.apache.flink.streaming.connectors.kafka.table.KafkaDynamicSource' 
> supports the following metadata keys for reading:
> topic
> partition
> headers
> leader-epoch
> offset
> timestamp
> timestamp-type
> 
> I'll need some more time to test it with debezium-json format.
> 
> On Fri, Sep 24, 2021 at 12:52 AM Roman Khachatryan  > wrote:
> Hi,
> could you please share the full error message?
> I think it should list the supported metadata columns.
> 
> Do you see the same error with 'debezium-json' format instead of
> 'debezium-avro-confluent' ?
> 
> Regards,
> Roman
> 
> 
> On Wed, Sep 22, 2021 at 5:12 PM Harshvardhan Shinde
> mailto:harshvardhan.shi...@oyorooms.com>> 
> wrote:
> >
> > Hi,
> > I'm trying to access the metadata columns from the debezium source 
> > connector as documented here.
> > However I'm getting the following error when I try to select the rows from 
> > the kafka table:
> >
> > flink.table.api.ValidationException: Invalid metadata key 
> > 'value.ingestion-timestamp' in column 'origin_ts'
> >
> > Getting the same issue for all the virtual columns. Please let me know what 
> > I'm doing wrong.
> >
> > Here's my table creation query:
> >
> > CREATE TABLE testFlink (
> >   origin_ts TIMESTAMP(3) METADATA FROM 'value.ingestion-timestamp' VIRTUAL,
> >   event_time TIMESTAMP(3) METADATA FROM 'value.source.timestamp' VIRTUAL,
> >   origin_database STRING METADATA FROM 'value.source.database' VIRTUAL,
> >   origin_schema STRING METADATA FROM 'value.source.schema' VIRTUAL,
> >   origin_table STRING METADATA FROM 'value.source.table' VIRTUAL,
> >   origin_properties MAP METADATA FROM 
> > 'value.source.properties' VIRTUAL,
> >   id BIGINT,
> >   number BIGINT,
> >   created_at BIGINT,
> >   updated_at BIGINT
> > ) WITH (
> >   'connector' = 'kafka',
> >   'topic' = 'source-staging-postgres-flink_test-82-2021-09-20.public.test',
> >   'properties.bootstrap.servers' = ':9092',
> >   'properties.group.id ' = 'testGroup',
> >   'scan.startup.mode' = 'earliest-offset',
> >   'value.format' = 'debezium-avro-confluent',
> >   'value.debezium-avro-confluent.schema-registry.url' = 
> > ':8081'
> > );
> >
> > Thanks.
> 
> 
> -- 
> Thanks and Regards,
> Harshvardhan
> Data Platform



Re: flink-1.12.0 DDL watermark error, but no error on 1.13.2

2021-09-25 Thread Leonard Xu
This is a known bug [1], fixed in both 1.13.0 and 1.12.3; you can use a patch 
release such as Flink 1.12.5 or 1.13.2.

[1] https://issues.apache.org/jira/browse/FLINK-22082

Best,

> On Sep 25, 2021, at 21:29, kcz <573693...@qq.com.INVALID> wrote:
> 
> The SQL definition is as follows; on 1.12.0, once the watermark statement is 
> removed, the error goes away.
> CREATE TABLE KafkaTable (
>  test ARRAY<ROW<signalValue STRING>>,
>  gatherTime STRING,
>  log_ts as TO_TIMESTAMP(FROM_UNIXTIME(CAST(gatherTime AS 
> bigint)/1000,'yyyy-MM-dd HH:mm:ss')),
>  WATERMARK FOR log_ts AS log_ts - INTERVAL '5' SECOND
> ) WITH (
>  'connector' = 'kafka',
>  'topic' = 'user_behavior',
>  'properties.bootstrap.servers' = 'localhost:9092',
>  'properties.group.id' = 'testGroup',
>  'scan.startup.mode' = 'earliest-offset',
>  'format' = 'json'
> );
> 
> SELECT test[1].signalValue from KafkaTable;
> 
> 
> 
> 
> Exception in thread "main" scala.MatchError: ITEM($0, 1) (of class 
> org.apache.calcite.rex.RexCall)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedSchemaExtractor.internalVisit$1(NestedProjectionUtil.scala:273)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedSchemaExtractor.visitFieldAccess(NestedProjectionUtil.scala:283)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedSchemaExtractor.visitFieldAccess(NestedProjectionUtil.scala:269)
>   at org.apache.calcite.rex.RexFieldAccess.accept(RexFieldAccess.java:92)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedProjectionUtil$$anonfun$build$1.apply(NestedProjectionUtil.scala:112)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedProjectionUtil$$anonfun$build$1.apply(NestedProjectionUtil.scala:111)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedProjectionUtil$.build(NestedProjectionUtil.scala:111)
>   at 
> org.apache.flink.table.planner.plan.utils.NestedProjectionUtil.build(NestedProjectionUtil.scala)
>   at 
> org.apache.flink.table.planner.plan.rules.logical.ProjectWatermarkAssignerTransposeRule.getUsedFieldsInTopLevelProjectAndWatermarkAssigner(ProjectWatermarkAssignerTransposeRule.java:127)
>   at 
> org.apache.flink.table.planner.plan.rules.logical.ProjectWatermarkAssignerTransposeRule.matches(ProjectWatermarkAssignerTransposeRule.java:62)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:284)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.matchRecurse(VolcanoRuleCall.java:411)
>   at 
> org.apache.calcite.plan.volcano.VolcanoRuleCall.match(VolcanoRuleCall.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.fireRules(VolcanoPlanner.java:985)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1245)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:84)
>   at 
> org.apache.calcite.rel.AbstractRelNode.onRegister(AbstractRelNode.java:268)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.registerImpl(VolcanoPlanner.java:1132)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.register(VolcanoPlanner.java:589)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.ensureRegistered(VolcanoPlanner.java:604)
>   at 
> org.apache.calcite.plan.volcano.VolcanoPlanner.changeTraits(VolcanoPlanner.java:486)
>   at 
> org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:309)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkVolcanoProgram.optimize(FlinkVolcanoProgram.scala:64)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:62)
>   at 
> org.apache.flink.table.planner.plan.optimize.program.FlinkChainedProgram$$anonfun$optimize$1.apply(FlinkChainedProgram.scala:58)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at 
> scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:891)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1334)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at 
> scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
>   at scala.collection.AbstractTraversable.foldLeft(Traversable.scala:104)
>   at 
> 

Re: flink cdc SQL2ES,GC overhead limit exceeded

2021-09-15 Thread Leonard Xu
This is probably unrelated to Flink CDC; CDC is only the source, and the exception 
stack is thrown from the join node. Could you paste your SQL and configuration so 
that others can analyze it?
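
Not from this thread, but since the stack trace shows unbounded join state being 
scanned in RocksDB, one common mitigation worth trying while you investigate is a 
state TTL for the SQL job; the value below is an arbitrary example, pick one that 
matches how long your join inputs stay relevant:

SET 'table.exec.state.ttl' = '12 h';

With a TTL set, idle join state is cleaned up instead of growing without bound, at 
the cost of possibly incomplete join results for records older than the TTL.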
Best,
Leonard


> On Sep 15, 2021, at 15:01, wukon...@foxmail.com wrote:
> 
> hi LIYUAN:
> Please describe how you are using Flink and the scenario that triggers this 
> error, so that others can help you locate the problem.
> 
> 
> 
> wukon...@foxmail.com
> 
> From: LI YUAN
> Sent: 2021-09-09 20:38
> To: user-zh
> Subject: flink cdc SQL2ES, GC overhead limit exceeded
> java.lang.OutOfMemoryError: GC overhead limit exceeded
> at org.rocksdb.RocksIterator.key0(Native Method)
> at org.rocksdb.RocksIterator.key(RocksIterator.java:37)
> at 
> org.apache.flink.contrib.streaming.state.RocksIteratorWrapper.key(RocksIteratorWrapper.java:99)
> at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState$RocksDBMapIterator.loadCache(RocksDBMapState.java:670)
> at 
> org.apache.flink.contrib.streaming.state.RocksDBMapState$RocksDBMapIterator.hasNext(RocksDBMapState.java:585)
> at 
> org.apache.flink.table.runtime.operators.join.stream.state.OuterJoinRecordStateViews$InputSideHasNoUniqueKey$1.hasNext(OuterJoinRecordStateViews.java:285)
> at 
> org.apache.flink.table.runtime.operators.join.stream.AbstractStreamingJoinOperator$AssociatedRecords.of(AbstractStreamingJoinOperator.java:199)
> at 
> org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator.processElement(StreamingJoinOperator.java:211)
> at 
> org.apache.flink.table.runtime.operators.join.stream.StreamingJoinOperator.processElement2(StreamingJoinOperator.java:129)
> at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.processRecord2(StreamTwoInputProcessorFactory.java:221)
> at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory.lambda$create$1(StreamTwoInputProcessorFactory.java:190)
> at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$$Lambda$363/366743523.accept(Unknown
>  Source)
> at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$StreamTaskNetworkOutput.emitRecord(StreamTwoInputProcessorFactory.java:291)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:134)
> at 
> org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:105)
> at 
> org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:66)
> at 
> org.apache.flink.streaming.runtime.io.StreamTwoInputProcessor.processInput(StreamTwoInputProcessor.java:98)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:423)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$$Lambda$208/999379751.runDefaultAction(Unknown
>  Source)
> at 
> org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:204)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:681)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.executeInvoke(StreamTask.java:636)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask$$Lambda$625/601779266.run(Unknown
>  Source)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
> at 
> org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:620)
> at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:779)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:566)
> at java.lang.Thread.run(Thread.java:748)
> 
> Environment :
> 
> Flink version : 1.13.1
> 
> Flink CDC version: 1.4.0
> 
> Database and version: Mysql 7.0



Re: Streaming SQL support for redis streaming connector

2021-09-15 Thread Leonard Xu
Hi, Osada

Just want to offer some material here. The flink-cdc-connectors project [1] may 
also help you; we recently added support for the document DB MongoDB [2].
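
To give a taste of it, a minimal sketch of a mongodb-cdc source table in SQL; the 
connection values, collection and columns are placeholders:

CREATE TABLE orders_cdc (
  _id STRING,
  customer STRING,
  amount DECIMAL(10, 2),
  PRIMARY KEY (_id) NOT ENFORCED
) WITH (
  'connector' = 'mongodb-cdc',
  'hosts' = 'localhost:27017',
  'username' = 'flinkuser',
  'password' = 'flinkpw',
  'database' = 'mydb',
  'collection' = 'orders'
);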

Best,
Leonard
 
[1] https://github.com/ververica/flink-cdc-connectors 

[2] 
https://ververica.github.io/flink-cdc-connectors/master/content/connectors/mongodb-cdc.html
 

 

> On Sep 14, 2021, at 20:56, Osada Paranaliyanage wrote:
> 
> Hi All, We are looking to use flink to build a materialized view of a 
> relational db and a document db using cdc streams. For this purpose we would 
> like to use redis for hosting the materialized view. Can we do this in 
> streaming SQL? We have worked through 
> https://github.com/ververica/flink-sql-CDC 
>  and can see how this will work 
> with ES as a sink. But can we use redis as the sink? Where do we find the 
> syntax for that?
>  
> Thanks,
> Osada.
>  
> 
> 
> 



Re: flink-connector-postgres-cdc: no changes will be captured (no data captured)

2021-09-06 Thread Leonard Xu
I don't see your attachment. You could also open an issue in the Flink CDC project 
and attach the images there; posting images on GitHub is easier than via email.
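
One thing worth double-checking in the meantime (an assumption on my side, since 
the attachment isn't visible): Debezium matches the table filter against 
schema-qualified names, so an unqualified list can filter out every table and 
produce exactly this warning. A sketch of the qualified form, assuming your tables 
live in the public schema:

table.whitelist = public.stud,public.data_input,public.data_output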

> On Sep 6, 2021, at 19:08, Fisher Xiang wrote:
> 
> Thank you Leonard.
> I've attached the runtime logs, please take a look.
> The user being used is: role 'repl' [superuser: true, replication: true, inherit: true, create 
> role: false, create db: false, can log in: true]
> 
> 
> BR
> Fisher
> 
> 
> On Mon, Sep 6, 2021 at 5:55 PM Leonard Xu wrote:
> Hello, Fisher
> 
> The images are broken; could you re-post them with an image-hosting tool?
> I can help take a look.
> 
> Best,
> Leonard
> 
> > On Sep 6, 2021, at 17:48, Fisher Xiang wrote:
> > 
> > hi,
> > 
> > When using flink-connector-postgres-cdc (versions 1.1.1 through 1.4.0 all 
> > tried), a warning appears:
> > WARN io.debezium.relational.RelationalDatabaseSchema - After applying the 
> > include/exclude list filters, no changes will be captured. Please check 
> > your configuration!
> > The startup configuration is: Starting PostgresConnectorTask with configuration :
> > 
> > 
> > 
> > Then, when inserting and deleting records into these tables ( table.whitelist = 
> > stud,data_input,data_output ), no change data is captured and there is only the 
> > following output. What could the problem be?
> > 
> > 
> > 
> > The user being used is: role 'repl' [superuser: true, replication: true, inherit: true, 
> > create role: false, create db: false, can log in: true]
> > 
> > BR
> > Fisher
> 



Re: flink-connector-postgres-cdc: no changes will be captured (no data captured)

2021-09-06 Thread Leonard Xu
Hello, Fisher

The images are broken; could you re-post them with an image-hosting tool?
I can help take a look.

Best,
Leonard

> On Sep 6, 2021, at 17:48, Fisher Xiang wrote:
> 
> hi,
> 
> When using flink-connector-postgres-cdc (versions 1.1.1 through 1.4.0 all 
> tried), a warning appears:
> WARN io.debezium.relational.RelationalDatabaseSchema - After applying the 
> include/exclude list filters, no changes will be captured. Please check your 
> configuration!
> The startup configuration is: Starting PostgresConnectorTask with configuration :
> 
> 
> 
> Then, when inserting and deleting records into these tables ( table.whitelist = 
> stud,data_input,data_output ), no change data is captured and there is only the 
> following output. What could the problem be?
> 
> 
> 
> The user being used is: role 'repl' [superuser: true, replication: true, inherit: true, create 
> role: false, create db: false, can log in: true]
> 
> BR
> Fisher



[ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread Leonard Xu
Hi, all

The mailing list archive service Nabble Archive was broken at the end of June, 
the Flink community has migrated the mailing lists archives[1] to Apache 
Archive service by commit[2], you can refer [3] to know more mailing lists 
archives of Flink.

The Apache Archive service is maintained by the ASF, so its stability is 
guaranteed. It is a web-based mail archive service that lets you browse, search, 
and interact with mailing lists (subscribe, unsubscribe, etc.).

The Apache Archive service shows mails of the last month by default; you can 
specify a date range to browse and search historical mails.


Hope it would be helpful.

Best,
Leonard

[1] The Flink mailing lists in Apache archive service
dev mailing list archives: 
https://lists.apache.org/list.html?d...@flink.apache.org 
user mailing list archives : 
https://lists.apache.org/list.html?u...@flink.apache.org  
user-zh mailing list archives : 
https://lists.apache.org/list.html?user-zh@flink.apache.org
[2] 
https://github.com/apache/flink-web/commit/9194dda862da00d93f627fd315056471657655d1
[3] https://flink.apache.org/community.html#mailing-lists


[ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread Leonard Xu
Hi, all

The mailing list archive service Nabble Archive was broken at the end of June, 
the Flink community has migrated the mailing lists archives[1] to Apache 
Archive service by commit[2], you can refer [3] to know more mailing lists 
archives of Flink.

The Apache Archive service is maintained by the ASF, so its stability is 
guaranteed. It is a web-based mail archive service that lets you browse, search, 
and interact with mailing lists (subscribe, unsubscribe, etc.).

The Apache Archive service shows mails of the last month by default; you can 
specify a date range to browse and search historical mails.


Hope it would be helpful.

Best,
Leonard

[1] The Flink mailing lists in Apache archive service
dev mailing list archives: 
https://lists.apache.org/list.html?d...@flink.apache.org 
user mailing list archives : 
https://lists.apache.org/list.html?user@flink.apache.org  
user-zh mailing list archives : 
https://lists.apache.org/list.html?user...@flink.apache.org
[2] 
https://github.com/apache/flink-web/commit/9194dda862da00d93f627fd315056471657655d1
[3] https://flink.apache.org/community.html#mailing-lists


Re: Unsubscribe

2021-08-31 Thread Leonard Xu
Hi,
 
Please send an email to dev-unsubscr...@flink.apache.org if you want to 
unsubscribe from d...@flink.apache.org.
Please send an email to user-unsubscr...@flink.apache.org if you want to 
unsubscribe from u...@flink.apache.org.
Please send an email to user-zh-unsubscr...@flink.apache.org if you want to 
unsubscribe from user-zh@flink.apache.org.

You can refer to [1] for more details.

[1] https://flink.apache.org/community.html#mailing-lists

Best,
Leonard

> On Aug 31, 2021, at 22:06, kindragos <6230...@163.com> wrote:
> 
> Unsubscribe



Re: Unsubscribe

2021-08-31 Thread Leonard Xu
Hi,
 
Please send an email to dev-unsubscr...@flink.apache.org if you want to 
unsubscribe from d...@flink.apache.org.
Please send an email to user-unsubscr...@flink.apache.org if you want to 
unsubscribe from user@flink.apache.org.
Please send an email to user-zh-unsubscr...@flink.apache.org if you want to 
unsubscribe from user...@flink.apache.org.

You can refer to [1] for more details.

[1] https://flink.apache.org/community.html#mailing-lists

Best,
Leonard

> On Aug 31, 2021, at 22:06, kindragos <6230...@163.com> wrote:
> 
> Unsubscribe



Re: 【Announce】Zeppelin 0.10.0 is released, Flink on Zeppelin Improved

2021-08-25 Thread Leonard Xu
Thanks Jeff for the great work !

Best,
Leonard 

> On Aug 25, 2021, at 22:48, Jeff Zhang wrote:
> 
> Hi Flink users,
> 
> We (the Zeppelin community) are very excited to announce that Zeppelin 0.10.0 is 
> officially released. In this version, we made several improvements to the Flink 
> interpreter. Here are the main features of Flink on Zeppelin:
> - Support for multiple versions of Flink
> - Support for multiple versions of Scala
> - Support for multiple languages
> - Support for multiple execution modes
> - Support for Hive
> - Interactive development
> - Enhancements to Flink SQL
> - Multi-tenancy
> - REST API support
> Take a look at this document for more details:  
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html 
> 
> The quickest way to try Flink on Zeppelin is via its docker image 
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html#play-flink-in-zeppelin-docker
>  
> 
> 
> Besides these, here’s one blog about how to run the Flink SQL cookbook on 
> Zeppelin, the easy way to learn Flink SQL: 
> https://medium.com/analytics-vidhya/learn-flink-sql-the-easy-way-d9d48a95ae57 
> 
> Hope it would be helpful for you and welcome to join our community to discuss 
> with others. http://zeppelin.apache.org/community.html 
> 
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang



Re: 【Announce】Zeppelin 0.10.0 is released, Flink on Zeppelin Improved

2021-08-25 Thread Leonard Xu
Thanks Jeff for the great work !

Best,
Leonard 

> On Aug 25, 2021, at 22:48, Jeff Zhang wrote:
> 
> Hi Flink users,
> 
> We (the Zeppelin community) are very excited to announce that Zeppelin 0.10.0 is 
> officially released. In this version, we made several improvements to the Flink 
> interpreter. Here are the main features of Flink on Zeppelin:
> - Support for multiple versions of Flink
> - Support for multiple versions of Scala
> - Support for multiple languages
> - Support for multiple execution modes
> - Support for Hive
> - Interactive development
> - Enhancements to Flink SQL
> - Multi-tenancy
> - REST API support
> Take a look at this document for more details:  
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html 
> 
> The quickest way to try Flink on Zeppelin is via its docker image 
> https://zeppelin.apache.org/docs/0.10.0/interpreter/flink.html#play-flink-in-zeppelin-docker
>  
> 
> 
> Besides these, here’s one blog about how to run the Flink SQL cookbook on 
> Zeppelin, the easy way to learn Flink SQL: 
> https://medium.com/analytics-vidhya/learn-flink-sql-the-easy-way-d9d48a95ae57 
> 
> Hope it would be helpful for you and welcome to join our community to discuss 
> with others. http://zeppelin.apache.org/community.html 
> 
> 
> 
> -- 
> Best Regards
> 
> Jeff Zhang



Re: mini-batch settings have no effect

2021-08-25 Thread Leonard Xu

> How do I unsubscribe from this mailing list?

To unsubscribe from the user-zh@flink.apache.org mailing list, just send an email 
with any content to user-zh-unsubscr...@flink.apache.org.

Best,
Leonard

Re: Flink SQL API does not support columns of type TIMESTAMP(p) WITH TIME ZONE

2021-08-19 Thread Leonard Xu
Hello,

Flink does not support the TIMESTAMP WITH TIME ZONE type yet.

The currently supported types are:
TIMESTAMP WITHOUT TIME ZONE, abbreviated as TIMESTAMP
TIMESTAMP WITH LOCAL TIME ZONE, abbreviated as TIMESTAMP_LTZ
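
A quick sketch of the difference between the two; the literal and the epoch value 
are arbitrary examples, and TO_TIMESTAMP_LTZ interprets the epoch in the session 
time zone:

SET 'table.local-time-zone' = 'UTC';

SELECT
  CAST('1970-01-01 00:00:04.001' AS TIMESTAMP(3)) AS ts,   -- no time zone attached
  TO_TIMESTAMP_LTZ(4001, 3) AS ltz;   -- epoch millis, rendered in the session time zone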

Best,
Leonard

> On Aug 19, 2021, at 20:51, changfeng wrote:
> 
> ` TIMESTAMP(6) WITH TIME ZONE



Re: A question about Flink timestamps

2021-08-15 Thread Leonard Xu
Hi,
The images you posted are all broken; if you need to share images, you can use an 
image-hosting tool, or just paste the code directly if it is short.
The 'T' shown in a TIMESTAMP value has no special meaning; it is just a separator 
used when formatting the timestamp. When you eventually write the TIMESTAMP to 
your sink, the sink itself (e.g. MySQL) applies its own format.
For the second question, I can't see your image; please check your Flink version, 
as TIMESTAMP_LTZ support was only completed in 1.13 and later.
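
For the first question, a sketch (the table and column names are placeholders): if 
you need string output without the 'T', format the timestamp yourself before 
writing it out:

SELECT DATE_FORMAT(ts_col, 'yyyy-MM-dd HH:mm:ss') AS ts_str FROM t;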

Best,
Leonard


> On Aug 16, 2021, at 10:27, Geoff nie wrote:
> 
> Question 1: why does a Flink timestamp show a 'T' in the middle, and how can I 
> get rid of it?
> 



Re: Is Flink SQL backward compatible?

2021-08-11 Thread Leonard Xu
Does the SQL here refer to DDL or DML? DML is generally compatible, and there are 
usually no incompatible upgrades. DDL syntax varies across SQL dialects and is 
fairly flexible; Flink SQL's DDL differs slightly between versions, but newer 
Flink SQL versions are always compatible with old DDL. It is only that if a newer 
version's DDL offers richer features, the old version's DDL cannot provide them.

So as I understand it, the compatibility problem you are worried about does not 
exist. But note that if your SQL job is stateful and you need to upgrade with 
state, that state is not compatible across versions.

Best,
Leonard

> On Aug 10, 2021, at 11:44, Jason Lee wrote:
> 
> Hi all,
> 
> A question: we are planning to upgrade our Flink version recently, and I'd like 
> to know whether the SQL of existing jobs will stay compatible with the new 
> version. For example, if I upgrade to 1.13, will my 1.10 SQL syntax still be 
> compatible?
> 
> Thanks
> 
> Chuang Li
> jasonlee1...@163.com
> 



Re: Unsubscribe

2021-08-11 Thread Leonard Xu
To unsubscribe from the user-zh@flink.apache.org mailing list, just send an email 
with any content to user-zh-unsubscr...@flink.apache.org.

Best,
Leonard

> On Aug 6, 2021, at 10:49, 汪嘉富 wrote:
> 
> Unsubscribe
> 



Re: Unsubscribe

2021-08-11 Thread Leonard Xu
To unsubscribe from the user-zh@flink.apache.org mailing list, just send an email 
with any content to user-zh-unsubscr...@flink.apache.org.

Best,
Leonard

> On Aug 11, 2021, at 08:16, Lee2097 wrote:
> 
> Unsubscribe


