Does Flink's SQL Gateway support custom UDFs?

2023-11-01 Post by RS
Hi,
Does Flink's SQL Gateway support custom UDFs, both Java and Python? Is there an example to refer to?
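
For reference, a minimal sketch of registering a custom UDF through a SQL Gateway session. The class name com.example.udf.MyUpper, the module my_module, and the jar location are placeholders, not from this thread; the sketch assumes the UDF jar is already on the gateway's classpath (e.g. under $FLINK_HOME/lib) and, for the Python case, that PyFlink is installed on the gateway host:

    -- Java UDF: a hypothetical class extending org.apache.flink.table.functions.ScalarFunction
    CREATE FUNCTION my_upper AS 'com.example.udf.MyUpper' LANGUAGE JAVA;
    SELECT my_upper('hello');

    -- Python UDF: a hypothetical @udf-decorated function in a module on the Python path
    CREATE FUNCTION py_upper AS 'my_module.py_upper' LANGUAGE PYTHON;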

Re: Does the SQL Gateway in Flink 1.18.0 support submitting jobs to run on K8s?

2023-11-01 Post by RS
Hi,
"Local" submission simply goes to the JobManager address configured in the Flink configuration file, so if that address points at a cluster running on K8s, jobs should land on K8s as well.
I'm not sure about the YARN case.
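
For illustration, a sketch of pointing the gateway at an existing Kubernetes session cluster via flink-conf.yaml; the Service name and namespace below are made up:

    # flink-conf.yaml on the host running sql-gateway.sh
    rest.address: flink-session-rest.flink-ns.svc.cluster.local   # hypothetical k8s Service fronting the JobManager REST endpoint
    rest.port: 8081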





On 2023-10-30 14:36:23, "casel.chen" wrote:
>Does the SQL Gateway in Flink 1.18.0 currently only support submitting jobs to run locally? Can it support submitting jobs to run on YARN or K8s?


[ANNOUNCE] Apache Flink Kubernetes Operator 1.6.1 released

2023-10-30 Post by Rui Fan
The Apache Flink community is very happy to announce the release of Apache
Flink Kubernetes Operator 1.6.1.

Please check out the release blog post for an overview of the release:
https://flink.apache.org/2023/10/27/apache-flink-kubernetes-operator-1.6.1-release-announcement/

The release is available for download at:
https://flink.apache.org/downloads.html

Maven artifacts for Flink Kubernetes Operator can be found at:
https://search.maven.org/artifact/org.apache.flink/flink-kubernetes-operator

Official Docker image for Flink Kubernetes Operator applications can be
found at:
https://hub.docker.com/r/apache/flink-kubernetes-operator

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12353784

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Rui Fan


Re: Re: Flink 1.18.0: connecting to the SQL Gateway with Hive Beeline + the JDBC driver fails

2023-10-30 Post by Benchao Li
The hiveserver2 endpoint makes the Flink gateway act as a HiveServer2: to the outside world it simply is a
HiveServer2, and it can be used directly together with existing tools that work against HiveServer2.

But what you are actually using is the Flink JDBC driver, which does not talk to HiveServer2; it talks to the Flink gateway's
native endpoint. So when you start the gateway in hiveserver2 mode, the driver no longer recognizes it.
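
(For completeness, a sketch of what connecting to the hiveserver2 endpoint would look like with Hive's own driver instead of the Flink JDBC driver; 10000 is only assumed here as the gateway's configured HiveServer2 thrift port:

    beeline> !connect jdbc:hive2://localhost:10000/default;auth=noSasl

whereas jdbc:flink://... requires the gateway to run its default REST endpoint, as described above.)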

casel.chen wrote on Mon, Oct 30, 2023 at 14:36:
>
> Indeed, after starting the gateway without the hiveserver2 endpoint type, the Hive beeline tool connected successfully. Thanks!
> Still, one question remains: the official docs say the hiveserver2 endpoint is provided for Hive-dialect
> compatibility, so beeline should arguably be able to connect to it too, since beeline natively supports HiveServer2.
> Here is the original text:
> HiveServer2 Endpoint is compatible with HiveServer2 wire protocol and allows
> users to interact (e.g. submit Hive SQL) with Flink SQL Gateway with existing
> Hive clients, such as Hive JDBC, Beeline, DBeaver, Apache Superset and so on.
> Beeline is mentioned there. So is beeline> !connect jdbc:flink://localhost:8083 no longer the way to connect?
>
> On 2023-10-30 11:27:15, "Benchao Li" wrote:
> >Hi casel,
> >
> >The Flink JDBC driver currently talks to the gateway through Flink's native gateway endpoint, so when starting the gateway
> >you should not set the endpoint type to hiveserver2; just use Flink's default gateway endpoint type.
> >
> >casel.chen wrote on Sun, Oct 29, 2023 at 17:24:
> >>
> >> 1. Start the Flink cluster
> >> bin/start-cluster.sh
> >>
> >>
> >> 2. Start the SQL Gateway
> >> bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
> >>
> >>
> >> 3. Copy flink-sql-jdbc-driver-bundle-1.18.0.jar into the apache-hive-3.1.2-bin/lib directory
> >>
> >>
> >> 4. In the apache-hive-3.1.2-bin directory, start beeline to connect to the SQL Gateway; at the username and password prompts, just press Enter
> >> $ bin/beeline
> >> SLF4J: Class path contains multiple SLF4J bindings.
> >> SLF4J: Found binding in 
> >> [jar:file:/Users/admin/dev/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >> SLF4J: Found binding in 
> >> [jar:file:/Users/admin/dev/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> >> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> >> explanation.
> >> SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
> >> Beeline version 3.1.2 by Apache Hive
> >> beeline> !connect jdbc:flink://localhost:8083
> >> Connecting to jdbc:flink://localhost:8083
> >> Enter username for jdbc:flink://localhost:8083:
> >> Enter password for jdbc:flink://localhost:8083:
> >> Failed to create the executor.
> >> 0: jdbc:flink://localhost:8083 (closed)> CREATE TABLE T(
> >> . . . . . . . . . . . . . . . . . . . .>   a INT,
> >> . . . . . . . . . . . . . . . . . . . .>   b VARCHAR(10)
> >> . . . . . . . . . . . . . . . . . . . .> ) WITH (
> >> . . . . . . . . . . . . . . . . . . . .>   'connector' = 'filesystem',
> >> . . . . . . . . . . . . . . . . . . . .>   'path' = 'file:///tmp/T.csv',
> >> . . . . . . . . . . . . . . . . . . . .>   'format' = 'csv'
> >> . . . . . . . . . . . . . . . . . . . .> );
> >> Failed to create the executor.
> >> Connection is already closed.
> >>
> >
> >
> >--
> >
> >Best,
> >Benchao Li



-- 

Best,
Benchao Li


Re:Re:Re: Reply: Doesn't Flink SQL support SHOW CREATE CATALOG?

2023-10-30 Post by casel.chen
Thanks for the explanation. I checked, and there are currently two CatalogStore implementations: one in-memory and one filesystem-based.
How do I configure the filesystem-based CatalogStore? Can the file live on object storage? And how does the Flink SQL client use this CatalogStore?
Thanks!
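
For reference, a sketch of what the file-based CatalogStore configuration looks like in flink-conf.yaml; the path is a placeholder, and an object-store path such as s3://... should only work if the matching Flink filesystem plugin is installed (treat that as an assumption to verify):

    table.catalog-store.kind: file
    table.catalog-store.file.path: file:///tmp/flink/catalog-store/

The SQL client reads this from the cluster configuration; catalogs created with CREATE CATALOG are then persisted as files under that path.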

On 2023-10-30 10:28:34, "Xuyang" wrote:
>Hi, as I understand it, CatalogStore was introduced so that catalogs can be better managed and registered and their metadata persisted; see FLIP-295 [1] for the motivation.
>My reading is not that "SHOW CREATE CATALOG could only be offered once CatalogStore exists", but rather that before it there was simply
>no place (and no ability) to store catalog configurations directly. With CatalogStore, storing catalog configurations comes naturally, so this feature could then be supported quickly.
>
>
>
>
>[1] 
>https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations
>
>
>
>
>--
>
>Best!
>Xuyang
>
>
>
>
>
On 2023-10-29 20:34:52, "casel.chen" wrote:
>>What problem is the CatalogStore introduced in Flink 1.18 meant to solve? And why could SHOW CREATE
>>CATALOG support only be provided after CatalogStore was introduced?
>>
>>On 2023-10-20 17:03:46, "李宇彬" wrote:
>>>Hi Feng
>>>
>>>
>>>I filed a JIRA for this earlier (https://issues.apache.org/jira/browse/FLINK-24939). With CatalogStore introduced, the time should now be ripe to implement this feature. Our business scenario uses a lot of catalogs and managing them is painful; this feature would help a lot.
>>>---- Original message ----
>>>| From | Feng Jin |
>>>| Date | Oct 20, 2023, 13:18 |
>>>| To | |
>>>| Subject | Re: Doesn't Flink SQL support SHOW CREATE CATALOG? |
>>>hi casel
>>>
>>>
>>>Starting from 1.18, CatalogStore has been introduced, which persists catalog configurations, so SHOW CREATE CATALOG can indeed be supported now.
>>>
>>>
>>>Best,
>>>Feng
>>>
>>>On Fri, Oct 20, 2023 at 11:55 AM casel.chen  wrote:
>>>
>>>I previously created a catalog in Flink SQL, and now I want to view the original CREATE statement so I can copy and tweak it into a new catalog, but it turns out Flink
>>>SQL does not support SHOW CREATE CATALOG.
>>>As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?


Does the SQL Gateway in Flink 1.18.0 support submitting jobs to run on K8s?

2023-10-30 Post by casel.chen
Does the SQL Gateway in Flink 1.18.0 currently only support submitting jobs to run locally? Can it support submitting jobs to run on YARN or K8s?

Re:Re: Flink 1.18.0: connecting to the SQL Gateway with Hive Beeline + the JDBC driver fails

2023-10-30 Post by casel.chen
Indeed, after starting the gateway without the hiveserver2 endpoint type, the Hive beeline tool connected successfully. Thanks!
Still, one question remains: the official docs say the hiveserver2 endpoint is provided for Hive-dialect
compatibility, so beeline should arguably be able to connect to it too, since beeline natively supports HiveServer2.
Here is the original text:
HiveServer2 Endpoint is compatible with HiveServer2 wire protocol and allows
users to interact (e.g. submit Hive SQL) with Flink SQL Gateway with existing
Hive clients, such as Hive JDBC, Beeline, DBeaver, Apache Superset and so on.
Beeline is mentioned there. So is beeline> !connect jdbc:flink://localhost:8083 no longer the way to connect?


On 2023-10-30 11:27:15, "Benchao Li" wrote:
>Hi casel,
>
>The Flink JDBC driver currently talks to the gateway through Flink's native gateway endpoint, so when starting the gateway
>you should not set the endpoint type to hiveserver2; just use Flink's default gateway endpoint type.
>
>casel.chen wrote on Sun, Oct 29, 2023 at 17:24:
>>
>> 1. Start the Flink cluster
>> bin/start-cluster.sh
>>
>>
>> 2. Start the SQL Gateway
>> bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
>>
>>
>> 3. Copy flink-sql-jdbc-driver-bundle-1.18.0.jar into the apache-hive-3.1.2-bin/lib directory
>>
>>
>> 4. In the apache-hive-3.1.2-bin directory, start beeline to connect to the SQL Gateway; at the username and password prompts, just press Enter
>> $ bin/beeline
>> SLF4J: Class path contains multiple SLF4J bindings.
>> SLF4J: Found binding in 
>> [jar:file:/Users/admin/dev/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: Found binding in 
>> [jar:file:/Users/admin/dev/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
>> explanation.
>> SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
>> Beeline version 3.1.2 by Apache Hive
>> beeline> !connect jdbc:flink://localhost:8083
>> Connecting to jdbc:flink://localhost:8083
>> Enter username for jdbc:flink://localhost:8083:
>> Enter password for jdbc:flink://localhost:8083:
>> Failed to create the executor.
>> 0: jdbc:flink://localhost:8083 (closed)> CREATE TABLE T(
>> . . . . . . . . . . . . . . . . . . . .>   a INT,
>> . . . . . . . . . . . . . . . . . . . .>   b VARCHAR(10)
>> . . . . . . . . . . . . . . . . . . . .> ) WITH (
>> . . . . . . . . . . . . . . . . . . . .>   'connector' = 'filesystem',
>> . . . . . . . . . . . . . . . . . . . .>   'path' = 'file:///tmp/T.csv',
>> . . . . . . . . . . . . . . . . . . . .>   'format' = 'csv'
>> . . . . . . . . . . . . . . . . . . . .> );
>> Failed to create the executor.
>> Connection is already closed.
>>
>
>
>-- 
>
>Best,
>Benchao Li


Re: How to handle dirty data in Flink SQL?

2023-10-29 Post by ying lin
Another approach is to use the DataStream API: DataStream supports side outputs, which Flink
SQL does not. A workaround is the detour Flink SQL -> DataStream -> Flink
SQL (see the sketch below); you can check the official docs, Flink SQL and DataStream can be converted into each other.
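
For illustration, a rough sketch of that detour; the table names and the validity rule below are made up, and it assumes the Table/DataStream bridge API of Flink 1.16+:

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.ProcessFunction;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;
    import org.apache.flink.util.Collector;
    import org.apache.flink.util.OutputTag;

    public class DirtyDataSideOutputSketch {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

            // First SQL stage ("src" is assumed to be defined elsewhere).
            Table source = tEnv.sqlQuery("SELECT id, name FROM src");
            DataStream<Row> rows = tEnv.toDataStream(source);

            // Side-output tag for rows that would break the downstream sink.
            final OutputTag<Row> dirtyTag = new OutputTag<Row>("dirty") {};

            SingleOutputStreamOperator<Row> clean =
                rows.process(new ProcessFunction<Row, Row>() {
                    @Override
                    public void processElement(Row row, Context ctx, Collector<Row> out) {
                        String name = (String) row.getField("name");
                        if (name != null && name.length() <= 10) {  // made-up validity rule
                            out.collect(row);
                        } else {
                            ctx.output(dirtyTag, row);  // dirty rows leave via the side output
                        }
                    }
                });

            // Dirty rows: printed here for brevity; in practice a Kafka or file sink.
            clean.getSideOutput(dirtyTag).print("dirty");

            // Back into SQL for the second stage ("doris_sink" also assumed defined).
            tEnv.createTemporaryView("cleaned", clean);
            tEnv.executeSql("INSERT INTO doris_sink SELECT * FROM cleaned");
        }
    }

(When mixing Table and DataStream sinks in one job like this, StreamStatementSet.attachAsDataStream plus env.execute() may be needed so that both branches actually run together.)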

Xuyang wrote on Mon, Oct 30, 2023 at 10:17:

> Flink SQL currently has no side-output-like mechanism for dirty data; this requirement should be achievable with a custom connector.
>
>
>
>
>
>
>
> --
>
> Best!
> Xuyang
>
>
>
>
>
> On 2023-10-29 10:23:38, "casel.chen" wrote:
> >Scenario: we use Flink
> SQL to write data into a downstream OLAP system such as Doris. Some abnormal cases, e.g. an over-long field value, or a partition-key value for a partition that does not yet exist in the Doris table (it has to be created manually first), make the job fail on write and eventually fail altogether. We would like to route these "dirty" records to a separate Kafka
> topic, or write them to a file, for later inspection. Is there any way to do this today?
>


Re: Flink 1.18.0: connecting to the SQL Gateway with Hive Beeline + the JDBC driver fails

2023-10-29 Post by Benchao Li
Hi casel,

The Flink JDBC driver currently talks to the gateway through Flink's native gateway endpoint, so when starting the gateway
you should not set the endpoint type to hiveserver2; just use Flink's default gateway endpoint type.

casel.chen wrote on Sun, Oct 29, 2023 at 17:24:
>
> 1. Start the Flink cluster
> bin/start-cluster.sh
>
>
> 2. Start the SQL Gateway
> bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2
>
>
> 3. Copy flink-sql-jdbc-driver-bundle-1.18.0.jar into the apache-hive-3.1.2-bin/lib directory
>
>
> 4. In the apache-hive-3.1.2-bin directory, start beeline to connect to the SQL Gateway; at the username and password prompts, just press Enter
> $ bin/beeline
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/Users/admin/dev/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/Users/admin/dev/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
> Beeline version 3.1.2 by Apache Hive
> beeline> !connect jdbc:flink://localhost:8083
> Connecting to jdbc:flink://localhost:8083
> Enter username for jdbc:flink://localhost:8083:
> Enter password for jdbc:flink://localhost:8083:
> Failed to create the executor.
> 0: jdbc:flink://localhost:8083 (closed)> CREATE TABLE T(
> . . . . . . . . . . . . . . . . . . . .>   a INT,
> . . . . . . . . . . . . . . . . . . . .>   b VARCHAR(10)
> . . . . . . . . . . . . . . . . . . . .> ) WITH (
> . . . . . . . . . . . . . . . . . . . .>   'connector' = 'filesystem',
> . . . . . . . . . . . . . . . . . . . .>   'path' = 'file:///tmp/T.csv',
> . . . . . . . . . . . . . . . . . . . .>   'format' = 'csv'
> . . . . . . . . . . . . . . . . . . . .> );
> Failed to create the executor.
> Connection is already closed.
>


-- 

Best,
Benchao Li


Re:Re: Reply: Doesn't Flink SQL support SHOW CREATE CATALOG?

2023-10-29 Post by Xuyang
Hi, as I understand it, CatalogStore was introduced so that catalogs can be better managed and registered and their metadata persisted; see FLIP-295 [1] for the motivation.
My reading is not that "SHOW CREATE CATALOG could only be offered once CatalogStore exists", but rather that before it there was simply
no place (and no ability) to store catalog configurations directly. With CatalogStore, storing catalog configurations comes naturally, so this feature could then be supported quickly.




[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-295%3A+Support+lazy+initialization+of+catalogs+and+persistence+of+catalog+configurations




--

Best!
Xuyang





On 2023-10-29 20:34:52, "casel.chen" wrote:
>What problem is the CatalogStore introduced in Flink 1.18 meant to solve? And why could SHOW CREATE
>CATALOG support only be provided after CatalogStore was introduced?
>
>On 2023-10-20 17:03:46, "李宇彬" wrote:
>>Hi Feng
>>
>>
>>I filed a JIRA for this earlier (https://issues.apache.org/jira/browse/FLINK-24939). With CatalogStore introduced, the time should now be ripe to implement this feature. Our business scenario uses a lot of catalogs and managing them is painful; this feature would help a lot.
>>---- Original message ----
>>| From | Feng Jin |
>>| Date | Oct 20, 2023, 13:18 |
>>| To | |
>>| Subject | Re: Doesn't Flink SQL support SHOW CREATE CATALOG? |
>>hi casel
>>
>>
>>Starting from 1.18, CatalogStore has been introduced, which persists catalog configurations, so SHOW CREATE CATALOG can indeed be supported now.
>>
>>
>>Best,
>>Feng
>>
>>On Fri, Oct 20, 2023 at 11:55 AM casel.chen  wrote:
>>
>>I previously created a catalog in Flink SQL, and now I want to view the original CREATE statement so I can copy and tweak it into a new catalog, but it turns out Flink
>>SQL does not support SHOW CREATE CATALOG.
>>As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?


Re: How to handle dirty data in Flink SQL?

2023-10-29 Post by Xuyang
Flink SQL currently has no side-output-like mechanism for dirty data; this requirement should be achievable with a custom connector.







--

Best!
Xuyang





On 2023-10-29 10:23:38, "casel.chen" wrote:
>Scenario: we use Flink
>SQL to write data into a downstream OLAP system such as Doris. Some abnormal cases, e.g. an over-long field value, or a partition-key value for a partition that does not yet exist in the Doris table (it has to be created manually first), make the job fail on write and eventually fail altogether. We would like to route these "dirty" records to a separate Kafka
> topic, or write them to a file, for later inspection. Is there any way to do this today?


Re: Reply: Doesn't Flink SQL support SHOW CREATE CATALOG?

2023-10-29 Post by casel.chen
What problem is the CatalogStore introduced in Flink 1.18 meant to solve? And why could SHOW CREATE
CATALOG support only be provided after CatalogStore was introduced?


On 2023-10-20 17:03:46, "李宇彬" wrote:
>Hi Feng
>
>
>I filed a JIRA for this earlier (https://issues.apache.org/jira/browse/FLINK-24939). With CatalogStore introduced, the time should now be ripe to implement this feature. Our business scenario uses a lot of catalogs and managing them is painful; this feature would help a lot.
>---- Original message ----
>| From | Feng Jin |
>| Date | Oct 20, 2023, 13:18 |
>| To | |
>| Subject | Re: Doesn't Flink SQL support SHOW CREATE CATALOG? |
>hi casel
>
>
>Starting from 1.18, CatalogStore has been introduced, which persists catalog configurations, so SHOW CREATE CATALOG can indeed be supported now.
>
>
>Best,
>Feng
>
>On Fri, Oct 20, 2023 at 11:55 AM casel.chen  wrote:
>
>I previously created a catalog in Flink SQL, and now I want to view the original CREATE statement so I can copy and tweak it into a new catalog, but it turns out Flink
>SQL does not support SHOW CREATE CATALOG.
>As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?


Flink 1.18.0: connecting to the SQL Gateway with Hive Beeline + the JDBC driver fails

2023-10-29 Post by casel.chen
1. Start the Flink cluster
bin/start-cluster.sh


2. Start the SQL Gateway
bin/sql-gateway.sh start -Dsql-gateway.endpoint.type=hiveserver2


3. Copy flink-sql-jdbc-driver-bundle-1.18.0.jar into the apache-hive-3.1.2-bin/lib directory


4. In the apache-hive-3.1.2-bin directory, start beeline to connect to the SQL Gateway; at the username and password prompts, just press Enter
$ bin/beeline 
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/Users/admin/dev/hadoop-3.3.4/share/hadoop/common/lib/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/Users/admin/dev/apache-hive-3.1.2-bin/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
Beeline version 3.1.2 by Apache Hive
beeline> !connect jdbc:flink://localhost:8083
Connecting to jdbc:flink://localhost:8083
Enter username for jdbc:flink://localhost:8083: 
Enter password for jdbc:flink://localhost:8083: 
Failed to create the executor.
0: jdbc:flink://localhost:8083 (closed)> CREATE TABLE T(
. . . . . . . . . . . . . . . . . . . .>   a INT,
. . . . . . . . . . . . . . . . . . . .>   b VARCHAR(10)
. . . . . . . . . . . . . . . . . . . .> ) WITH (
. . . . . . . . . . . . . . . . . . . .>   'connector' = 'filesystem',
. . . . . . . . . . . . . . . . . . . .>   'path' = 'file:///tmp/T.csv',
. . . . . . . . . . . . . . . . . . . .>   'format' = 'csv'
. . . . . . . . . . . . . . . . . . . .> );
Failed to create the executor.
Connection is already closed.



How to handle dirty data in Flink SQL?

2023-10-28 Post by casel.chen
Scenario: we use Flink
SQL to write data into a downstream OLAP system such as Doris. Some abnormal cases, e.g. an over-long field value, or a partition-key value for a partition that does not yet exist in the Doris table (it has to be created manually first), make the job fail on write and eventually fail altogether. We would like to route these "dirty" records to a separate Kafka
topic, or write them to a file, for later inspection. Is there any way to do this today?

FW: Unable to achieve Flink kafka connector exactly once delivery semantics.

2023-10-27 Post by Gopal Chennupati (gchennup)
Hi,
Can someone please help me resolve the issue below while running the Flink job,
or point me to any doc/example that describes the exactly-once delivery
guarantee semantics?

Thanks,
Gopal.

From: Gopal Chennupati (gchennup) 
Date: Friday, 27 October 2023 at 11:00 AM
To: commun...@flink.apache.org , 
u...@flink.apache.org 
Subject: Unable to achieve Flink kafka connector exactly once delivery 
semantics.
Hi Team,


I am trying to configure my Kafka sink connector with the "exactly-once" delivery
guarantee; however, it fails when I run the Flink job with this
configuration. Here is the full exception stack trace from the job logs.


[Source: SG-SGT-TransformerJob -> Map -> Sink: Writer -> Sink: Committer 
(5/10)#12] WARN org.apache.kafka.common.utils.AppInfoParser - Error registering 
AppInfo mbean

javax.management.InstanceAlreadyExistsException: 
kafka.producer:type=app-info,id=producer-sgt-4-1

  at 
java.management/com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:436)

  at 
java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1855)

  at 
java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:955)

  at 
java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:890)

  at 
java.management/com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:320)

  at 
java.management/com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522)

  at 
org.apache.kafka.common.utils.AppInfoParser.registerAppInfo(AppInfoParser.java:64)

  at 
org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:433)

  at 
org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:289)

  at 
org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:316)

  at 
org.apache.kafka.clients.producer.KafkaProducer.(KafkaProducer.java:301)

  at 
org.apache.flink.connector.kafka.sink.FlinkKafkaInternalProducer.(FlinkKafkaInternalProducer.java:55)

  at 
org.apache.flink.connector.kafka.sink.KafkaWriter.getOrCreateTransactionalProducer(KafkaWriter.java:332)

  at 
org.apache.flink.connector.kafka.sink.TransactionAborter.abortTransactionOfSubtask(TransactionAborter.java:104)

  at 
org.apache.flink.connector.kafka.sink.TransactionAborter.abortTransactionsWithPrefix(TransactionAborter.java:82)

  at 
org.apache.flink.connector.kafka.sink.TransactionAborter.abortLingeringTransactions(TransactionAborter.java:66)

  at 
org.apache.flink.connector.kafka.sink.KafkaWriter.abortLingeringTransactions(KafkaWriter.java:295)

  at 
org.apache.flink.connector.kafka.sink.KafkaWriter.(KafkaWriter.java:176)

  at 
org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:111)

  at 
org.apache.flink.connector.kafka.sink.KafkaSink.createWriter(KafkaSink.java:57)

  at 
org.apache.flink.streaming.runtime.operators.sink.StatefulSinkWriterStateHandler.createWriter(StatefulSinkWriterStateHandler.java:117)

  at 
org.apache.flink.streaming.runtime.operators.sink.SinkWriterOperator.initializeState(SinkWriterOperator.java:146)

  at 
org.apache.flink.streaming.api.operators.StreamOperatorStateHandler.initializeOperatorState(StreamOperatorStateHandler.java:122)

  at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:274)

  at 
org.apache.flink.streaming.runtime.tasks.RegularOperatorChain.initializeStateAndOpenOperators(RegularOperatorChain.java:106)

  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:734)

  at 
org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)

  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restoreInternal(StreamTask.java:709)

  at 
org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:675)

  at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)

  at 
org.apache.flink.runtime.taskmanager.Task.restoreAndInvoke(Task.java:921)

  at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:745)

  at org.apache.flink.runtime.taskmanager.Task.run(Task.java:562)

  at java.base/java.lang.Thread.run(Thread.java:834)


And here is the producer configuration,
KafkaSink sink = KafkaSink
.builder()

.setBootstrapServers(producerConfig.getProperty("bootstrap.servers"))
.setKafkaProducerConfig(producerConfig)
.setRecordSerializer(new 
GenericMessageSerialization<>(generic_key.class,
generic_value.class, 
producerConfig.getProperty("topic"),
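
(The configuration snippet above is cut off in the archive. For reference, a hedged sketch of the builder settings that matter for exactly-once on the Kafka sink; the topic, servers, and transactional-id prefix are placeholders, and each job/sink needs its own unique prefix so restarted jobs do not abort each other's transactions:

    KafkaSink<String> sink = KafkaSink.<String>builder()
            .setBootstrapServers("broker:9092")
            .setDeliveryGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
            .setTransactionalIdPrefix("my-job-sink")          // unique per job/sink
            .setProperty("transaction.timeout.ms", "900000")  // must not exceed the broker's transaction.max.timeout.ms
            .setRecordSerializer(
                    KafkaRecordSerializationSchema.builder()
                            .setTopic("out-topic")
                            .setValueSerializationSchema(new SimpleStringSchema())
                            .build())
            .build();

Checkpointing must be enabled as well, since transactions commit on checkpoint completion. The InstanceAlreadyExistsException about the JMX app-info mbean in the trace is normally just a warning; the abortLingeringTransactions frames suggest the failure happens while aborting old transactions, which is where a clashing transactional-id prefix would show up.)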

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Jark Wu
Congratulations and thanks release managers and everyone who has
contributed!

Best,
Jark

On Fri, 27 Oct 2023 at 12:25, Hang Ruan  wrote:

> Congratulations!
>
> Best,
> Hang
>
> Samrat Deb wrote on Fri, Oct 27, 2023 at 11:50:
>
> > Congratulations on the great release
> >
> > Bests,
> > Samrat
> >
> > On Fri, 27 Oct 2023 at 7:59 AM, Yangze Guo  wrote:
> >
> > > Great work! Congratulations to everyone involved!
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Oct 27, 2023 at 10:23 AM Qingsheng Ren 
> wrote:
> > > >
> > > > Congratulations and big THANK YOU to everyone helping with this
> > release!
> > > >
> > > > Best,
> > > > Qingsheng
> > > >
> > > > On Fri, Oct 27, 2023 at 10:18 AM Benchao Li 
> > > wrote:
> > > >>
> > > >> Great work, thanks everyone involved!
> > > >>
> > > >> Rui Fan <1996fan...@gmail.com> wrote on Fri, Oct 27, 2023 at 10:16:
> > > >> >
> > > >> > Thanks for the great work!
> > > >> >
> > > >> > Best,
> > > >> > Rui
> > > >> >
> > > >> > On Fri, Oct 27, 2023 at 10:03 AM Paul Lam 
> > > wrote:
> > > >> >
> > > >> > > Finally! Thanks to all!
> > > >> > >
> > > >> > > Best,
> > > >> > > Paul Lam
> > > >> > >
> > > >> > > > On Oct 27, 2023, at 03:58, Alexander Fedulov <
> > alexander.fedu...@gmail.com>
> > > wrote:
> > > >> > > >
> > > >> > > > Great work, thanks everyone!
> > > >> > > >
> > > >> > > > Best,
> > > >> > > > Alexander
> > > >> > > >
> > > >> > > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser <
> > > martijnvis...@apache.org>
> > > >> > > > wrote:
> > > >> > > >
> > > >> > > >> Thank you all who have contributed!
> > > >> > > >>
> > > >> > > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin <
> > > jinfeng1...@gmail.com> wrote:
> > > >> > > >>
> > > >> > > >>> Thanks for the great work! Congratulations
> > > >> > > >>>
> > > >> > > >>>
> > > >> > > >>> Best,
> > > >> > > >>> Feng Jin
> > > >> > > >>>
> > > >> > > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu <
> > xbjt...@gmail.com>
> > > wrote:
> > > >> > > >>>
> > > >> > >  Congratulations, Well done!
> > > >> > > 
> > > >> > >  Best,
> > > >> > >  Leonard
> > > >> > > 
> > > >> > >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee <
> > > lincoln.8...@gmail.com>
> > > >> > >  wrote:
> > > >> > > 
> > > >> > > > Thanks for the great work! Congrats all!
> > > >> > > >
> > > >> > > > Best,
> > > >> > > > Lincoln Lee
> > > >> > > >
> > > >> > > >
> > > >> > > > Jing Ge wrote on Fri, Oct 27, 2023
> > at 00:16:
> > > >> > > >
> > > >> > > >> The Apache Flink community is very happy to announce the
> > > release of
> > > >> > > > Apache
> > > >> > > >> Flink 1.18.0, which is the first release for the Apache
> > > Flink 1.18
> > > >> > > > series.
> > > >> > > >>
> > > >> > > >> Apache Flink® is an open-source unified stream and batch
> > data
> > > >> > >  processing
> > > >> > > >> framework for distributed, high-performing,
> > > always-available, and
> > > >> > > > accurate
> > > >> > > >> data applications.
> > > >> > > >>
> > > >> > > >> The release is available for download at:
> > > >> > > >> https://flink.apache.org/downloads.html
> > > >> > > >>
> > > >> > > >> Please check out the release blog post for an overview of
> > the
> > > >> > > > improvements
> > > >> > > >> for this release:
> > > >> > > >>
> > > >> > > >>
> > > >> > > >
> > > >> > > 
> > > >> > > >>>
> > > >> > > >>
> > > >> > >
> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > > >> > > >>
> > > >> > > >> The full release notes are available in Jira:
> > > >> > > >>
> > > >> > > >>
> > > >> > > >
> > > >> > > 
> > > >> > > >>>
> > > >> > > >>
> > > >> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > > >> > > >>
> > > >> > > >> We would like to thank all contributors of the Apache
> Flink
> > > >> > > >> community
> > > >> > >  who
> > > >> > > >> made this release possible!
> > > >> > > >>
> > > >> > > >> Best regards,
> > > >> > > >> Konstantin, Qingsheng, Sergey, and Jing
> > > >> > > >>
> > > >> > > >
> > > >> > > 
> > > >> > > >>>
> > > >> > > >>
> > > >> > >
> > > >> > >
> > > >>
> > > >>
> > > >>
> > > >> --
> > > >>
> > > >> Best,
> > > >> Benchao Li
> > >
> >
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Hang Ruan
Congratulations!

Best,
Hang

Samrat Deb wrote on Fri, Oct 27, 2023 at 11:50:

> Congratulations on the great release
>
> Bests,
> Samrat
>
> On Fri, 27 Oct 2023 at 7:59 AM, Yangze Guo  wrote:
>
> > Great work! Congratulations to everyone involved!
> >
> > Best,
> > Yangze Guo
> >
> > On Fri, Oct 27, 2023 at 10:23 AM Qingsheng Ren  wrote:
> > >
> > > Congratulations and big THANK YOU to everyone helping with this
> release!
> > >
> > > Best,
> > > Qingsheng
> > >
> > > On Fri, Oct 27, 2023 at 10:18 AM Benchao Li 
> > wrote:
> > >>
> > >> Great work, thanks everyone involved!
> > >>
> > >> Rui Fan <1996fan...@gmail.com> wrote on Fri, Oct 27, 2023 at 10:16:
> > >> >
> > >> > Thanks for the great work!
> > >> >
> > >> > Best,
> > >> > Rui
> > >> >
> > >> > On Fri, Oct 27, 2023 at 10:03 AM Paul Lam 
> > wrote:
> > >> >
> > >> > > Finally! Thanks to all!
> > >> > >
> > >> > > Best,
> > >> > > Paul Lam
> > >> > >
> > >> > > > On Oct 27, 2023, at 03:58, Alexander Fedulov <
> alexander.fedu...@gmail.com>
> > wrote:
> > >> > > >
> > >> > > > Great work, thanks everyone!
> > >> > > >
> > >> > > > Best,
> > >> > > > Alexander
> > >> > > >
> > >> > > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser <
> > martijnvis...@apache.org>
> > >> > > > wrote:
> > >> > > >
> > >> > > >> Thank you all who have contributed!
> > >> > > >>
> > >> > > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin <
> > jinfeng1...@gmail.com> wrote:
> > >> > > >>
> > >> > > >>> Thanks for the great work! Congratulations
> > >> > > >>>
> > >> > > >>>
> > >> > > >>> Best,
> > >> > > >>> Feng Jin
> > >> > > >>>
> > >> > > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu <
> xbjt...@gmail.com>
> > wrote:
> > >> > > >>>
> > >> > >  Congratulations, Well done!
> > >> > > 
> > >> > >  Best,
> > >> > >  Leonard
> > >> > > 
> > >> > >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee <
> > lincoln.8...@gmail.com>
> > >> > >  wrote:
> > >> > > 
> > >> > > > Thanks for the great work! Congrats all!
> > >> > > >
> > >> > > > Best,
> > >> > > > Lincoln Lee
> > >> > > >
> > >> > > >
> > >> > > > Jing Ge wrote on Fri, Oct 27, 2023
> at 00:16:
> > >> > > >
> > >> > > >> The Apache Flink community is very happy to announce the
> > release of
> > >> > > > Apache
> > >> > > >> Flink 1.18.0, which is the first release for the Apache
> > Flink 1.18
> > >> > > > series.
> > >> > > >>
> > >> > > >> Apache Flink® is an open-source unified stream and batch
> data
> > >> > >  processing
> > >> > > >> framework for distributed, high-performing,
> > always-available, and
> > >> > > > accurate
> > >> > > >> data applications.
> > >> > > >>
> > >> > > >> The release is available for download at:
> > >> > > >> https://flink.apache.org/downloads.html
> > >> > > >>
> > >> > > >> Please check out the release blog post for an overview of
> the
> > >> > > > improvements
> > >> > > >> for this release:
> > >> > > >>
> > >> > > >>
> > >> > > >
> > >> > > 
> > >> > > >>>
> > >> > > >>
> > >> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > >> > > >>
> > >> > > >> The full release notes are available in Jira:
> > >> > > >>
> > >> > > >>
> > >> > > >
> > >> > > 
> > >> > > >>>
> > >> > > >>
> > >> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > >> > > >>
> > >> > > >> We would like to thank all contributors of the Apache Flink
> > >> > > >> community
> > >> > >  who
> > >> > > >> made this release possible!
> > >> > > >>
> > >> > > >> Best regards,
> > >> > > >> Konstantin, Qingsheng, Sergey, and Jing
> > >> > > >>
> > >> > > >
> > >> > > 
> > >> > > >>>
> > >> > > >>
> > >> > >
> > >> > >
> > >>
> > >>
> > >>
> > >> --
> > >>
> > >> Best,
> > >> Benchao Li
> >
>


Re: Unsubscribe

2023-10-26 Post by Junrui Lee
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from this mailing list.

Best,
Junrui

13430298988 <13430298...@163.com> wrote on Fri, Oct 27, 2023 at 11:00:

> Unsubscribe


Re: Unsubscribe

2023-10-26 Post by Junrui Lee
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from this mailing list.

Best,
Junrui

chenyu_opensource wrote on Fri, Oct 27, 2023 at 10:20:

> Unsubscribe


Unsubscribe

2023-10-26 Post by 13430298988
Unsubscribe

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Yangze Guo
Great work! Congratulations to everyone involved!

Best,
Yangze Guo

On Fri, Oct 27, 2023 at 10:23 AM Qingsheng Ren  wrote:
>
> Congratulations and big THANK YOU to everyone helping with this release!
>
> Best,
> Qingsheng
>
> On Fri, Oct 27, 2023 at 10:18 AM Benchao Li  wrote:
>>
>> Great work, thanks everyone involved!
>>
>> Rui Fan <1996fan...@gmail.com> wrote on Fri, Oct 27, 2023 at 10:16:
>> >
>> > Thanks for the great work!
>> >
>> > Best,
>> > Rui
>> >
>> > On Fri, Oct 27, 2023 at 10:03 AM Paul Lam  wrote:
>> >
>> > > Finally! Thanks to all!
>> > >
>> > > Best,
>> > > Paul Lam
>> > >
>> > > > On Oct 27, 2023, at 03:58, Alexander Fedulov wrote:
>> > > >
>> > > > Great work, thanks everyone!
>> > > >
>> > > > Best,
>> > > > Alexander
>> > > >
>> > > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser 
>> > > > wrote:
>> > > >
>> > > >> Thank you all who have contributed!
>> > > >>
>> > > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
>> > > >>
>> > > >>> Thanks for the great work! Congratulations
>> > > >>>
>> > > >>>
>> > > >>> Best,
>> > > >>> Feng Jin
>> > > >>>
>> > > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  
>> > > >>> wrote:
>> > > >>>
>> > >  Congratulations, Well done!
>> > > 
>> > >  Best,
>> > >  Leonard
>> > > 
>> > >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
>> > >  
>> > >  wrote:
>> > > 
>> > > > Thanks for the great work! Congrats all!
>> > > >
>> > > > Best,
>> > > > Lincoln Lee
>> > > >
>> > > >
>> > > > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
>> > > >
>> > > >> The Apache Flink community is very happy to announce the release 
>> > > >> of
>> > > > Apache
>> > > >> Flink 1.18.0, which is the first release for the Apache Flink 1.18
>> > > > series.
>> > > >>
>> > > >> Apache Flink® is an open-source unified stream and batch data
>> > >  processing
>> > > >> framework for distributed, high-performing, always-available, and
>> > > > accurate
>> > > >> data applications.
>> > > >>
>> > > >> The release is available for download at:
>> > > >> https://flink.apache.org/downloads.html
>> > > >>
>> > > >> Please check out the release blog post for an overview of the
>> > > > improvements
>> > > >> for this release:
>> > > >>
>> > > >>
>> > > >
>> > > 
>> > > >>>
>> > > >>
>> > > https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
>> > > >>
>> > > >> The full release notes are available in Jira:
>> > > >>
>> > > >>
>> > > >
>> > > 
>> > > >>>
>> > > >>
>> > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
>> > > >>
>> > > >> We would like to thank all contributors of the Apache Flink
>> > > >> community
>> > >  who
>> > > >> made this release possible!
>> > > >>
>> > > >> Best regards,
>> > > >> Konstantin, Qingsheng, Sergey, and Jing
>> > > >>
>> > > >
>> > > 
>> > > >>>
>> > > >>
>> > >
>> > >
>>
>>
>>
>> --
>>
>> Best,
>> Benchao Li


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Qingsheng Ren
Congratulations and big THANK YOU to everyone helping with this release!

Best,
Qingsheng

On Fri, Oct 27, 2023 at 10:18 AM Benchao Li  wrote:

> Great work, thanks everyone involved!
>
> Rui Fan <1996fan...@gmail.com> wrote on Fri, Oct 27, 2023 at 10:16:
> >
> > Thanks for the great work!
> >
> > Best,
> > Rui
> >
> > On Fri, Oct 27, 2023 at 10:03 AM Paul Lam  wrote:
> >
> > > Finally! Thanks to all!
> > >
> > > Best,
> > > Paul Lam
> > >
> > > > On Oct 27, 2023, at 03:58, Alexander Fedulov
> wrote:
> > > >
> > > > Great work, thanks everyone!
> > > >
> > > > Best,
> > > > Alexander
> > > >
> > > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser <
> martijnvis...@apache.org>
> > > > wrote:
> > > >
> > > >> Thank you all who have contributed!
> > > >>
> > > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
> > > >>
> > > >>> Thanks for the great work! Congratulations
> > > >>>
> > > >>>
> > > >>> Best,
> > > >>> Feng Jin
> > > >>>
> > > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu 
> wrote:
> > > >>>
> > >  Congratulations, Well done!
> > > 
> > >  Best,
> > >  Leonard
> > > 
> > >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee <
> lincoln.8...@gmail.com>
> > >  wrote:
> > > 
> > > > Thanks for the great work! Congrats all!
> > > >
> > > > Best,
> > > > Lincoln Lee
> > > >
> > > >
> > > > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> > > >
> > > >> The Apache Flink community is very happy to announce the
> release of
> > > > Apache
> > > >> Flink 1.18.0, which is the first release for the Apache Flink
> 1.18
> > > > series.
> > > >>
> > > >> Apache Flink® is an open-source unified stream and batch data
> > >  processing
> > > >> framework for distributed, high-performing, always-available,
> and
> > > > accurate
> > > >> data applications.
> > > >>
> > > >> The release is available for download at:
> > > >> https://flink.apache.org/downloads.html
> > > >>
> > > >> Please check out the release blog post for an overview of the
> > > > improvements
> > > >> for this release:
> > > >>
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > > >>
> > > >> The full release notes are available in Jira:
> > > >>
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > > >>
> > > >> We would like to thank all contributors of the Apache Flink
> > > >> community
> > >  who
> > > >> made this release possible!
> > > >>
> > > >> Best regards,
> > > >> Konstantin, Qingsheng, Sergey, and Jing
> > > >>
> > > >
> > > 
> > > >>>
> > > >>
> > >
> > >
>
>
>
> --
>
> Best,
> Benchao Li
>


Unsubscribe

2023-10-26 Post by chenyu_opensource
Unsubscribe

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Benchao Li
Great work, thanks everyone involved!

Rui Fan <1996fan...@gmail.com> wrote on Fri, Oct 27, 2023 at 10:16:
>
> Thanks for the great work!
>
> Best,
> Rui
>
> On Fri, Oct 27, 2023 at 10:03 AM Paul Lam  wrote:
>
> > Finally! Thanks to all!
> >
> > Best,
> > Paul Lam
> >
> > > On Oct 27, 2023, at 03:58, Alexander Fedulov wrote:
> > >
> > > Great work, thanks everyone!
> > >
> > > Best,
> > > Alexander
> > >
> > > On Thu, 26 Oct 2023 at 21:15, Martijn Visser 
> > > wrote:
> > >
> > >> Thank you all who have contributed!
> > >>
> > >> On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
> > >>
> > >>> Thanks for the great work! Congratulations
> > >>>
> > >>>
> > >>> Best,
> > >>> Feng Jin
> > >>>
> > >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:
> > >>>
> >  Congratulations, Well done!
> > 
> >  Best,
> >  Leonard
> > 
> >  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
> >  wrote:
> > 
> > > Thanks for the great work! Congrats all!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> > >
> > >> The Apache Flink community is very happy to announce the release of
> > > Apache
> > >> Flink 1.18.0, which is the first release for the Apache Flink 1.18
> > > series.
> > >>
> > >> Apache Flink® is an open-source unified stream and batch data
> >  processing
> > >> framework for distributed, high-performing, always-available, and
> > > accurate
> > >> data applications.
> > >>
> > >> The release is available for download at:
> > >> https://flink.apache.org/downloads.html
> > >>
> > >> Please check out the release blog post for an overview of the
> > > improvements
> > >> for this release:
> > >>
> > >>
> > >
> > 
> > >>>
> > >>
> > https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > >>
> > >> The full release notes are available in Jira:
> > >>
> > >>
> > >
> > 
> > >>>
> > >>
> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > >>
> > >> We would like to thank all contributors of the Apache Flink
> > >> community
> >  who
> > >> made this release possible!
> > >>
> > >> Best regards,
> > >> Konstantin, Qingsheng, Sergey, and Jing
> > >>
> > >
> > 
> > >>>
> > >>
> >
> >



-- 

Best,
Benchao Li


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Rui Fan
Thanks for the great work!

Best,
Rui

On Fri, Oct 27, 2023 at 10:03 AM Paul Lam  wrote:

> Finally! Thanks to all!
>
> Best,
> Paul Lam
>
> > On Oct 27, 2023, at 03:58, Alexander Fedulov wrote:
> >
> > Great work, thanks everyone!
> >
> > Best,
> > Alexander
> >
> > On Thu, 26 Oct 2023 at 21:15, Martijn Visser 
> > wrote:
> >
> >> Thank you all who have contributed!
> >>
> >> On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
> >>
> >>> Thanks for the great work! Congratulations
> >>>
> >>>
> >>> Best,
> >>> Feng Jin
> >>>
> >>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:
> >>>
>  Congratulations, Well done!
> 
>  Best,
>  Leonard
> 
>  On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
>  wrote:
> 
> > Thanks for the great work! Congrats all!
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> >
> >> The Apache Flink community is very happy to announce the release of
> > Apache
> >> Flink 1.18.0, which is the first release for the Apache Flink 1.18
> > series.
> >>
> >> Apache Flink® is an open-source unified stream and batch data
>  processing
> >> framework for distributed, high-performing, always-available, and
> > accurate
> >> data applications.
> >>
> >> The release is available for download at:
> >> https://flink.apache.org/downloads.html
> >>
> >> Please check out the release blog post for an overview of the
> > improvements
> >> for this release:
> >>
> >>
> >
> 
> >>>
> >>
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> >>
> >> The full release notes are available in Jira:
> >>
> >>
> >
> 
> >>>
> >>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> >>
> >> We would like to thank all contributors of the Apache Flink
> >> community
>  who
> >> made this release possible!
> >>
> >> Best regards,
> >> Konstantin, Qingsheng, Sergey, and Jing
> >>
> >
> 
> >>>
> >>
>
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Paul Lam
Finally! Thanks to all!

Best,
Paul Lam

> On Oct 27, 2023, at 03:58, Alexander Fedulov wrote:
> 
> Great work, thanks everyone!
> 
> Best,
> Alexander
> 
> On Thu, 26 Oct 2023 at 21:15, Martijn Visser 
> wrote:
> 
>> Thank you all who have contributed!
>> 
>> On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
>> 
>>> Thanks for the great work! Congratulations
>>> 
>>> 
>>> Best,
>>> Feng Jin
>>> 
>>> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:
>>> 
 Congratulations, Well done!
 
 Best,
 Leonard
 
 On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
 wrote:
 
> Thanks for the great work! Congrats all!
> 
> Best,
> Lincoln Lee
> 
> 
> Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> 
>> The Apache Flink community is very happy to announce the release of
> Apache
>> Flink 1.18.0, which is the first release for the Apache Flink 1.18
> series.
>> 
>> Apache Flink® is an open-source unified stream and batch data
 processing
>> framework for distributed, high-performing, always-available, and
> accurate
>> data applications.
>> 
>> The release is available for download at:
>> https://flink.apache.org/downloads.html
>> 
>> Please check out the release blog post for an overview of the
> improvements
>> for this release:
>> 
>> 
> 
 
>>> 
>> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
>> 
>> The full release notes are available in Jira:
>> 
>> 
> 
 
>>> 
>> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
>> 
>> We would like to thank all contributors of the Apache Flink
>> community
 who
>> made this release possible!
>> 
>> Best regards,
>> Konstantin, Qingsheng, Sergey, and Jing
>> 
> 
 
>>> 
>> 



Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by liu ron
Great work, thanks everyone!

Best,
Ron

Alexander Fedulov wrote on Fri, Oct 27, 2023 at 04:00:

> Great work, thanks everyone!
>
> Best,
> Alexander
>
> On Thu, 26 Oct 2023 at 21:15, Martijn Visser 
> wrote:
>
> > Thank you all who have contributed!
> >
> > On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:
> >
> > > Thanks for the great work! Congratulations
> > >
> > >
> > > Best,
> > > Feng Jin
> > >
> > > On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:
> > >
> > > > Congratulations, Well done!
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee  >
> > > > wrote:
> > > >
> > > > > Thanks for the great work! Congrats all!
> > > > >
> > > > > Best,
> > > > > Lincoln Lee
> > > > >
> > > > >
> > > > > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> > > > >
> > > > > > The Apache Flink community is very happy to announce the release
> of
> > > > > Apache
> > > > > > Flink 1.18.0, which is the first release for the Apache Flink
> 1.18
> > > > > series.
> > > > > >
> > > > > > Apache Flink® is an open-source unified stream and batch data
> > > > processing
> > > > > > framework for distributed, high-performing, always-available, and
> > > > > accurate
> > > > > > data applications.
> > > > > >
> > > > > > The release is available for download at:
> > > > > > https://flink.apache.org/downloads.html
> > > > > >
> > > > > > Please check out the release blog post for an overview of the
> > > > > improvements
> > > > > > for this release:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > > > > >
> > > > > > The full release notes are available in Jira:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > > > > >
> > > > > > We would like to thank all contributors of the Apache Flink
> > community
> > > > who
> > > > > > made this release possible!
> > > > > >
> > > > > > Best regards,
> > > > > > Konstantin, Qingsheng, Sergey, and Jing
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Martijn Visser
Thank you all who have contributed!

On Thu, Oct 26, 2023 at 18:41, Feng Jin wrote:

> Thanks for the great work! Congratulations
>
>
> Best,
> Feng Jin
>
> On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:
>
> > Congratulations, Well done!
> >
> > Best,
> > Leonard
> >
> > On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
> > wrote:
> >
> > > Thanks for the great work! Congrats all!
> > >
> > > Best,
> > > Lincoln Lee
> > >
> > >
> > > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> > >
> > > > The Apache Flink community is very happy to announce the release of
> > > Apache
> > > > Flink 1.18.0, which is the first release for the Apache Flink 1.18
> > > series.
> > > >
> > > > Apache Flink® is an open-source unified stream and batch data
> > processing
> > > > framework for distributed, high-performing, always-available, and
> > > accurate
> > > > data applications.
> > > >
> > > > The release is available for download at:
> > > > https://flink.apache.org/downloads.html
> > > >
> > > > Please check out the release blog post for an overview of the
> > > improvements
> > > > for this release:
> > > >
> > > >
> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > > >
> > > > The full release notes are available in Jira:
> > > >
> > > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > > >
> > > > We would like to thank all contributors of the Apache Flink community
> > who
> > > > made this release possible!
> > > >
> > > > Best regards,
> > > > Konstantin, Qingsheng, Sergey, and Jing
> > > >
> > >
> >
>




Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Feng Jin
Thanks for the great work! Congratulations


Best,
Feng Jin

On Fri, Oct 27, 2023 at 12:36 AM Leonard Xu  wrote:

> Congratulations, Well done!
>
> Best,
> Leonard
>
> On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee 
> wrote:
>
> > Thanks for the great work! Congrats all!
> >
> > Best,
> > Lincoln Lee
> >
> >
> > Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
> >
> > > The Apache Flink community is very happy to announce the release of
> > Apache
> > > Flink 1.18.0, which is the first release for the Apache Flink 1.18
> > series.
> > >
> > > Apache Flink® is an open-source unified stream and batch data
> processing
> > > framework for distributed, high-performing, always-available, and
> > accurate
> > > data applications.
> > >
> > > The release is available for download at:
> > > https://flink.apache.org/downloads.html
> > >
> > > Please check out the release blog post for an overview of the
> > improvements
> > > for this release:
> > >
> > >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> > >
> > > The full release notes are available in Jira:
> > >
> > >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> > >
> > > We would like to thank all contributors of the Apache Flink community
> who
> > > made this release possible!
> > >
> > > Best regards,
> > > Konstantin, Qingsheng, Sergey, and Jing
> > >
> >
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Leonard Xu
Congratulations, Well done!

Best,
Leonard

On Fri, Oct 27, 2023 at 12:23 AM Lincoln Lee  wrote:

> Thanks for the great work! Congrats all!
>
> Best,
> Lincoln Lee
>
>
> Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:
>
> > The Apache Flink community is very happy to announce the release of
> Apache
> > Flink 1.18.0, which is the first release for the Apache Flink 1.18
> series.
> >
> > Apache Flink® is an open-source unified stream and batch data processing
> > framework for distributed, high-performing, always-available, and
> accurate
> > data applications.
> >
> > The release is available for download at:
> > https://flink.apache.org/downloads.html
> >
> > Please check out the release blog post for an overview of the
> improvements
> > for this release:
> >
> >
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
> >
> > The full release notes are available in Jira:
> >
> >
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
> >
> > We would like to thank all contributors of the Apache Flink community who
> > made this release possible!
> >
> > Best regards,
> > Konstantin, Qingsheng, Sergey, and Jing
> >
>


Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Lincoln Lee
Thanks for the great work! Congrats all!

Best,
Lincoln Lee


Jing Ge wrote on Fri, Oct 27, 2023 at 00:16:

> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.18.0, which is the first release for the Apache Flink 1.18 series.
>
> Apache Flink® is an open-source unified stream and batch data processing
> framework for distributed, high-performing, always-available, and accurate
> data applications.
>
> The release is available for download at:
> https://flink.apache.org/downloads.html
>
> Please check out the release blog post for an overview of the improvements
> for this release:
>
> https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/
>
> The full release notes are available in Jira:
>
> https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885
>
> We would like to thank all contributors of the Apache Flink community who
> made this release possible!
>
> Best regards,
> Konstantin, Qingsheng, Sergey, and Jing
>


[ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Post by Jing Ge
The Apache Flink community is very happy to announce the release of Apache
Flink 1.18.0, which is the first release for the Apache Flink 1.18 series.

Apache Flink® is an open-source unified stream and batch data processing
framework for distributed, high-performing, always-available, and accurate
data applications.

The release is available for download at:
https://flink.apache.org/downloads.html

Please check out the release blog post for an overview of the improvements
for this release:
https://flink.apache.org/2023/10/24/announcing-the-release-of-apache-flink-1.18/

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12352885

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Best regards,
Konstantin, Qingsheng, Sergey, and Jing


Re: Flink 1.17.1 YARN token expiration problem

2023-10-26 Post by Paul Lam
Hello,

Has this problem been solved? I'm hitting the same issue and haven't located the root cause yet.

Best,
Paul Lam

> On Jul 20, 2023, at 12:04, 王刚 wrote:
> 
> 异常栈信息
> ```
> 
> 2023-07-20 11:43:01,627 ERROR 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner  [] - Terminating 
> TaskManagerRunner with exit code 1.
> org.apache.flink.util.FlinkException: Failed to start the TaskManagerRunner.
>at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:488)
>  ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
>at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerProcessSecurely$5(TaskManagerRunner.java:530)
>  ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
>at java.security.AccessController.doPrivileged(Native Method) 
> ~[?:1.8.0_92]
>at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_92]
>at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
>  ~[flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
>at 
> org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:530)
>  [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
>at 
> org.apache.flink.yarn.YarnTaskExecutorRunner.runTaskManagerSecurely(YarnTaskExecutorRunner.java:94)
>  [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
>at 
> org.apache.flink.yarn.YarnTaskExecutorRunner.main(YarnTaskExecutorRunner.java:68)
>  [flink-dist-1.17-SNAPSHOT.jar:1.17-SNAPSHOT]
> Caused by: org.apache.hadoop.ipc.RemoteException: token (token for flink: 
> HDFS_DELEGATION_TOKEN 
> owner=flink/lf-client-flink-28-243-196.hadoop.local@HADOOP.LOCAL, renewer=, 
> realUser=, issueDate=1689734389821, maxDate=1690339189821, 
> sequenceNumber=266208479, masterKeyId=1131) can't be found in cache
>at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1557) 
> ~[hadoop-common-3.2.1.jar:?]
>at org.apache.hadoop.ipc.Client.call(Client.java:1494) 
> ~[hadoop-common-3.2.1.jar:?]
>at org.apache.hadoop.ipc.Client.call(Client.java:1391) 
> ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:118)
>  ~[hadoop-common-3.2.1.jar:?]
>at com.sun.proxy.$Proxy26.mkdirs(Unknown Source) ~[?:?]
>at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.mkdirs(ClientNamenodeProtocolTranslatorPB.java:660)
>  ~[hadoop-hdfs-client-3.2.1.jar:?]
>at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[?:1.8.0_92]
>at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[?:1.8.0_92]
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_92]
>at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_92]
>at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
>  ~[hadoop-common-3.2.1.jar:?]
>at com.sun.proxy.$Proxy27.mkdirs(Unknown Source) ~[?:?]
>at 
> org.apache.hadoop.hdfs.DFSClient.primitiveMkdir(DFSClient.java:2425) 
> ~[hadoop-hdfs-client-3.2.1.jar:?]
>at org.apache.hadoop.hdfs.DFSClient.mkdirs(DFSClient.java:2401) 
> ~[hadoop-hdfs-client-3.2.1.jar:?]
>at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1318)
>  ~[hadoop-hdfs-client-3.2.1.jar:?]
>at 
> org.apache.hadoop.hdfs.DistributedFileSystem$27.doCall(DistributedFileSystem.java:1315)
>  ~[hadoop-hdfs-client-3.2.1.jar:?]
>at 
> org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  ~[hadoop-common-3.2.1.jar:?]
>at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirsInternal(DistributedFileSystem.java:1332)
>  ~[hadoop-hdfs-client-3.2.1.jar:?]
>at 
> org.apache.hadoop.hdfs.DistributedFileSystem.mkdirs(DistributedFileSystem.java:1307)
>  ~[hadoop-hdfs-client-3.2.1.jar:?]
>at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:2275) 
> ~[hadoop-common-3.2.1.jar:?]
>at 
> 
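
(The quoted stack trace is cut off in the archive. For reference, the usual cause of "HDFS_DELEGATION_TOKEN ... can't be found in cache" on long-running YARN jobs is that the delegation token obtained at submission expired past its max lifetime and nothing could renew it. A hedged sketch of the keytab-based setup that lets Flink re-obtain tokens itself; the principal and path are placeholders, and whether this matches the failure here is an assumption that depends on how the job was submitted:

    # flink-conf.yaml
    security.kerberos.login.keytab: /path/to/flink.keytab
    security.kerberos.login.principal: flink/host@REALM
)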

FLINK-24035, how can this issue be reproduced?

2023-10-24 Post by rui chen
We encountered similar problems in production, and we want to integrate
FLINK-24035 to solve them, but we don't know how to reproduce the problem.


Re: How to clean up resources when a Flink connector source exits

2023-10-23 Post by Xuyang
Hi,
Take a look at your DynamicTableSource implementation class. If you are using the old InputFormat-based source (i.e. something like InputFormatProvider.of), you can use the close method of InputFormat;
if you are using the FLIP-27 source (i.e. something like SourceProvider.of), SplitReader also has a close method.
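
For illustration, a minimal sketch of the FLIP-27 variant; the class name and the "external client" are made up, and the other SplitReader methods are omitted by leaving the class abstract:

    import org.apache.flink.api.connector.source.SourceSplit;
    import org.apache.flink.connector.base.source.reader.splitreader.SplitReader;

    // Sketch: a custom SplitReader whose close() releases external resources
    // when the source reader shuts down.
    public abstract class MySplitReader<E, SplitT extends SourceSplit>
            implements SplitReader<E, SplitT> {

        private AutoCloseable externalClient;  // e.g. a connection pool (placeholder)

        @Override
        public void close() throws Exception {
            // Called when the reader exits: put cleanup code here.
            if (externalClient != null) {
                externalClient.close();
            }
        }
    }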










--

Best!
Xuyang





On 2023-10-24 11:54:36, "jinzhuguang" wrote:
>Version: Flink 1.16.0
>
>Requirement: clean up related resources when a certain source finishes and exits.
>
>Problem: I haven't found a hook function for source exit, so I don't know where to put the cleanup code.
>
>Any guidance would be appreciated.


How to clean up resources when a Flink connector source exits

2023-10-23 Post by jinzhuguang
Version: Flink 1.16.0

Requirement: clean up related resources when a certain source finishes and exits.

Problem: I haven't found a hook function for source exit, so I don't know where to put the cleanup code.

Any guidance would be appreciated.

Re: Problems with the state.backend.fs.memory-threshold parameter

2023-10-23 Post by rui chen
Hi,Zakelly
Thank you for your answer.

Best,
rui


Zakelly Lan wrote on Fri, Oct 13, 2023 at 19:12:

> Hi rui,
>
> The 'state.backend.fs.memory-threshold' configures the threshold below
> which state is stored as part of the metadata, rather than in separate
> files. So as a result the JM will use its memory to merge small
> checkpoint files and write them into one file. Currently the
> FLIP-306[1][2] is proposed to merge small checkpoint files without
> consuming JM memory. This feature is currently being worked on and is
> targeted for the next minor release (1.19).
>
>
> Best,
> Zakelly
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints
> [2] https://issues.apache.org/jira/browse/FLINK-32070
>
> On Fri, Oct 13, 2023 at 6:28 PM rui chen  wrote:
> >
> > We found that for some tasks, the JM memory continued to increase. I set
> > the parameter of state.backend.fs.memory-threshold to 0, and the JM
> memory
> > would no longer increase, but many small files might be written in this
> > way. Does the community have any optimization plan for this area?
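
(For reference, the knob under discussion as it would appear in flink-conf.yaml; 20kb is the default in recent 1.x releases as far as I can tell, so treat the exact value as an assumption:

    state.backend.fs.memory-threshold: 20kb   # state below this size is inlined into checkpoint metadata, which costs JM memory
    # setting it to 0 avoids the JM memory growth but produces many small checkpoint files

FLIP-306, linked above, is the planned fix for the small-files side of this trade-off.)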
>


Reply: Doesn't Flink SQL support SHOW CREATE CATALOG?

2023-10-20 Post by 李宇彬
Hi Feng


I filed a JIRA for this earlier (https://issues.apache.org/jira/browse/FLINK-24939). With CatalogStore introduced, the time should now be ripe to implement this feature. Our business scenario uses a lot of catalogs and managing them is painful; this feature would help a lot.
---- Original message ----
| From | Feng Jin |
| Date | Oct 20, 2023, 13:18 |
| To | |
| Subject | Re: Doesn't Flink SQL support SHOW CREATE CATALOG? |
hi casel


Starting from 1.18, CatalogStore has been introduced, which persists catalog configurations, so SHOW CREATE CATALOG can indeed be supported now.


Best,
Feng

On Fri, Oct 20, 2023 at 11:55 AM casel.chen  wrote:

I previously created a catalog in Flink SQL, and now I want to view the original CREATE statement so I can copy and tweak it into a new catalog, but it turns out Flink
SQL does not support SHOW CREATE CATALOG.
As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?


Duplicate JobManager instances with HA mode enabled

2023-10-20 Post by s
Flink on a YARN cluster, running in yarn-application mode.
After enabling ZooKeeper high
availability, when the JobManager exits abnormally a new JM is started on another node; that part is fine. But the previous instance's information is not deleted from ZK, so two jobs show up, and the one started later stays suspended indefinitely and never works properly.
Flink version is 1.15.3. The errors in the log are as follows:

[log screenshot not preserved in the archive]

What the web UI shows:

[web UI screenshot not preserved in the archive]

Judging from the state in ZK, the old instance's information is not deleted and is still in the locked state:

[ZooKeeper screenshot not preserved in the archive]

Could anyone advise: have you run into a similar situation, and can the stale entries in ZK be configured to be cleaned up?



Re: Too many connections when collecting Flink job logs with kafka_appender

2023-10-19 Post by Feng Jin
You could consider deploying a log-collection service on each YARN machine (it collects the local logs of that host and ships them to Kafka):
yarn container -> per-host log service -> kafka.
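
For illustration, the container side of that setup: keep Flink's stock file appender so each container only writes locally, and let the per-host agent (Filebeat, Vector, or similar; the choice is an assumption) tail those files into Kafka, so no JVM holds its own Kafka connection. This mirrors Flink's default log4j.properties (Log4j 2 properties syntax):

    appender.main.name = MainAppender
    appender.main.type = RollingFile
    appender.main.fileName = ${sys:log.file}
    appender.main.filePattern = ${sys:log.file}.%i
    appender.main.layout.type = PatternLayout
    appender.main.layout.pattern = %d{yyyy-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n
    appender.main.policies.type = Policies
    appender.main.policies.size.type = SizeBasedTriggeringPolicy
    appender.main.policies.size.size = 100MB
    rootLogger.level = INFO
    rootLogger.appenderRef.main.ref = MainAppender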



On Mon, Oct 16, 2023 at 3:58 PM 阿华田  wrote:

>
> The Flink cluster uses kafka_appender to collect the logs Flink produces, but we now have more than 3,000 streaming jobs running, with 200k+ YARN containers. This leads to too many connections to the Kafka cluster that stores the logs, and that Kafka cluster is under some pressure. Could anyone suggest a better approach to Flink log collection?
>
>
> | |
> 阿华田
> |
> |
> a15733178...@163.com
> |
> 签名由网易邮箱大师定制
>
>


Re: Does Flink SQL not support SHOW CREATE CATALOG?

2023-10-19 文章 Feng Jin
Hi casel,


Starting from 1.18, the CatalogStore was introduced to persist catalog configurations, so SHOW CREATE CATALOG can indeed be supported now.


Best,
Feng

On Fri, Oct 20, 2023 at 11:55 AM casel.chen  wrote:

> I previously created a catalog in Flink SQL, and now I want to look at the original CREATE CATALOG statement so I can copy it, tweak it, and save it as a new catalog, only to find that Flink SQL does not support SHOW CREATE CATALOG.
> As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?


Does Flink SQL not support SHOW CREATE CATALOG?

2023-10-19 文章 casel.chen
I previously created a catalog in Flink SQL, and now I want to look at the original CREATE CATALOG statement so I can copy it, tweak it, and save it as a new catalog, only to find that Flink SQL does not support SHOW CREATE CATALOG.
As far as I know, Doris supports the SHOW CREATE CATALOG statement. Could Flink SQL support it as well?

Re: State cleanup for Flink SQL

2023-10-17 文章 Jane Chan
Hi,

If you are using a standalone session cluster and want to see the parameter printed in the JM/TM logs, you need to configure table.exec.state.ttl: '${TTL}' in flink-conf.yaml before the cluster starts, and then start the cluster;
if you change it after the cluster is already up, it will not be printed in the logs, but you can check the parameters currently configured in the SQL CLI with the SET; command.
Also, you need to run SET 'table.exec.state.ttl' = '${TTL}' first and then submit the job; please double-check that the order of operations is correct.

Best,
Jane
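The same thing from the Table API side, as a minimal sketch (the commented-out pipeline is a placeholder):

```
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Sketch: set the state TTL *before* building and submitting the pipeline,
// matching the ordering requirement described above.
TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inStreamingMode());
tEnv.getConfig().set("table.exec.state.ttl", "86400 s"); // one day

// ... create source/sink tables here (placeholders), then submit, e.g.:
// tEnv.executeSql("INSERT INTO sink_t SELECT key, COUNT(*) FROM src_t GROUP BY key");
```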

>


Re: When will 1.18 be released? I'd like to try JDK 17

2023-10-16 文章 Jing Ge
Soon, voting has already started :-))

On Sun, Oct 15, 2023 at 5:55 AM kcz <573693...@qq.com.invalid> wrote:

>


kafka_appender: too many Kafka connections from collecting Flink job logs

2023-10-16 文章 阿华田
Our Flink cluster uses kafka_appender to collect the logs Flink produces, but we now run more than 3,000 real-time jobs with 200,000+ YARN containers. The Kafka cluster storing the logs ends up with far too many connections and is under heavy pressure. Is there a better approach to Flink log collection?


| |
阿华田
|
|
a15733178...@163.com
|
Signature customized by NetEase Mail Master



Flink startup failure

2023-10-16 文章 qwemssd
Flink version: (screenshot not preserved in the archive)


Ran in the bin directory:  ./start-cluster.sh


Error message: (screenshot not preserved in the archive)


It looks like a port conflict, but the port is actually not occupied. (screenshot not preserved)


I tried changing the port, but it still reports the same port error; whichever port I change it to, the error is the same as above.


The port is open as well. (screenshot not preserved)


Hoping someone can help figure out what the problem is!


| |
qwemssd
|
|
qwem...@163.com
|

When will 1.18 be released? I'd like to try JDK 17

2023-10-14 文章 kcz


Re: Problems with the state.backend.fs.memory-threshold parameter

2023-10-13 文章 Zakelly Lan
Hi rui,

The 'state.backend.fs.memory-threshold' option configures the threshold below
which state is stored as part of the checkpoint metadata rather than in
separate files. As a result, the JM uses its own memory to merge small
checkpoint files and write them into one file. FLIP-306[1][2] proposes
merging small checkpoint files without consuming JM memory; that feature is
being worked on and is targeted for the next minor release (1.19).


Best,
Zakelly

[1] 
https://cwiki.apache.org/confluence/display/FLINK/FLIP-306%3A+Unified+File+Merging+Mechanism+for+Checkpoints
[2] https://issues.apache.org/jira/browse/FLINK-32070

On Fri, Oct 13, 2023 at 6:28 PM rui chen  wrote:
>
> We found that for some tasks, the JM memory continued to increase. I set
> the parameter of state.backend.fs.memory-threshold to 0, and the JM memory
> would no longer increase, but many small files might be written in this
> way. Does the community have any optimization plan for this area?
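For reference, a minimal sketch of setting the threshold programmatically (the checkpoint directory below is a placeholder; in practice this usually lives in flink-conf.yaml):

```
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

// Sketch: force all checkpoint state into separate files (threshold = 0),
// trading flat JM memory for many small files on the checkpoint storage.
public class MemoryThresholdSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setString("state.backend.fs.memory-threshold", "0");
        // Checkpoint directory is a placeholder path:
        conf.setString("state.checkpoints.dir", "hdfs:///flink/checkpoints");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment(conf);
        // build the job on env, then env.execute(...);
    }
}
```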


Real-time task blocking problem

2023-10-13 文章 rui chen
After a restart of our Flink 1.13 job, Kafka consumption dropped to zero. Has
anyone encountered this?


Problems with the state.backend.fs.memory-threshold parameter

2023-10-13 文章 rui chen
We found that for some tasks, the JM memory continued to increase. I set
the parameter of state.backend.fs.memory-threshold to 0, and the JM memory
would no longer increase, but many small files might be written in this
way. Does the community have any optimization plan for this area?


Re: Cannot find metata file metadats in directory

2023-10-13 文章 rui chen
After the task was restarted several times, we found that the checkpoint it
depends on had been deleted. Looking at the HDFS audit log, the deletion
request came from the JM.

Hangxiang Yu wrote on Sat, Sep 30, 2023 at 17:10:

> Hi,
> How did you specify the checkpoint path you restored from?
>
> Seems that you are trying to restore from a not completed or failed
> checkpoint.
>
> On Thu, Sep 28, 2023 at 6:09 PM rui chen  wrote:
>
> > When we use 1.13.2,we have the following error:
> > FileNotFoundException: Cannot find metata file metadats in directory
> > 'hdfs://xx/f408dbe327f9e5053e76d7b5323d6e81/chk-173'.
> >
>
>
> --
> Best,
> Hangxiang.
>


Re: Unable to specify explicit column names in INSERT INTO for tables with metadata columns in Flink SQL

2023-10-12 文章 jinzhuguang
Thanks a lot!!!

> tanjialiang wrote on 2023-10-13 at 10:44:
> 
> Hi, 
> This issue was fixed in 1.17.0; see https://issues.apache.org/jira/browse/FLINK-30922 for details.
> 
> 
> best wishes,
> tanjialiang.



Re: Unable to specify explicit column names in INSERT INTO for tables with metadata columns in Flink SQL

2023-10-12 文章 tanjialiang
Hi,
This issue was fixed in 1.17.0; see https://issues.apache.org/jira/browse/FLINK-30922 for details.


best wishes,
tanjialiang.


 Original Message 
| From | jinzhuguang |
| Date | 2023-10-13 10:39 |
| To | user-zh |
| Subject | Unable to specify explicit column names in INSERT INTO for tables with metadata columns in Flink SQL |

Unable to specify explicit column names in INSERT INTO for tables with metadata columns in Flink SQL

2023-10-12 文章 jinzhuguang
First, my Flink version is 1.16.0.
For ease of understanding, I'll describe it with Kafka as the example.
I have the following two tables:
CREATE TABLE orders(
user_id BIGINT,
name STRING,
timestamp TIMESTAMP(3) METADATA VIRTUAL
)WITH(
'connector'='kafka',
'topic'='orders',
'properties.group.id' = 'test_join_tempral',
'scan.startup.mode'='earliest-offset',
'properties.bootstrap.servers'='localhost:9092',
'format'='json',
'json.ignore-parse-errors' = 'true'
);
CREATE TABLE kafka_sink(
user_id BIGINT,
name STRING,
timestamp TIMESTAMP(3) METADATA FROM 'timestamp'
)WITH(
'connector'='kafka',
'topic'='kafka_sink',
'properties.group.id' = 'test_join_tempral',
'scan.startup.mode'='earliest-offset',
'properties.bootstrap.servers'='localhost:9092',
'format'='json',
'json.ignore-parse-errors' = 'true'
);

Normal case: 
Flink SQL> insert into kafka_sink select user_id,name,`timestamp` from orders;
[INFO] Submitting SQL update statement to the cluster...
[INFO] SQL update statement has been successfully submitted to the cluster:
Job ID: e419ae9d2cad4c3c2a2c1150c1a86653


Failing case:
Flink SQL> insert into kafka_sink(user_id,name,`timestamp`) select user_id,name,`timestamp` from orders;
[ERROR] Could not execute SQL statement. Reason:
org.apache.calcite.sql.validate.SqlValidatorException: Unknown target column 'timestamp'
Strangely, why does it fail as soon as the column names are specified? It cannot even resolve the 'timestamp' column. The kafka_sink schema is as follows:
Flink SQL> describe kafka_sink;
+-----------+--------------+------+-----+---------------------------+-----------+
|      name |         type | null | key |                    extras | watermark |
+-----------+--------------+------+-----+---------------------------+-----------+
|   user_id |       BIGINT | TRUE |     |                           |           |
|      name |       STRING | TRUE |     |                           |           |
| timestamp | TIMESTAMP(3) | TRUE |     | METADATA FROM 'timestamp' |           |
+-----------+--------------+------+-----+---------------------------+-----------+



Any explanation would be much appreciated!

Re: State cleanup for Flink SQL

2023-10-09 文章 小昌同学
Hi, that is exactly how I set it. I am using the Flink SQL client, but in the Flink web UI I do not see this configuration taking effect.


| |
小昌同学
|
|
ccc0606fight...@163.com
|
 Original Message 
| From | Jane Chan |
| Date | 2023-09-25 11:24 |
| To |  |
| Subject | Re: State cleanup for Flink SQL |


TooLongFrameException

2023-10-08 文章 Jiacheng Jiang
Hi all,

   I set up a single-node standalone Flink to test MySQL CDC. I noticed that every night at 4:00 and 4:30 both my TaskManager and JobManager logs contain an Akka TooLongFrameException, reported by BlobServerConnection. Around that time scheduled jobs may be performing heavy operations on the MySQL database. My questions:

  1.  Could the TooLongFrameException be caused by large MySQL operations? If so, is there a fix?
  2.  What is the BlobServer for? I always thought the BlobServer stores job jars and the like, so it should be unrelated to the CDC job's data, right?
  3.  Under what circumstances can a TooLongFrameException occur?

Thanks, everyone


Re: Two-phase commit in Flink

2023-10-08 文章 海风
Thanks a lot!



 Original Message 
| From | Feng Jin |
| Date | 2023-10-08 13:17 |
| To | user-zh@flink.apache.org |
| Cc | |
| Subject | Re: Two-phase commit in Flink |


Re: Two-phase commit in Flink

2023-10-07 文章 Feng Jin
Hi,

You can refer to this blog post, which describes the process very clearly:
https://flink.apache.org/2018/02/28/an-overview-of-end-to-end-exactly-once-processing-in-apache-flink-with-apache-kafka-too/


Best,
Feng

On Sun, Sep 24, 2023 at 9:54 PM 海风 <18751805...@163.com> wrote:

> A question: for sink operators in Flink's two-phase commit, at which stage of checkpointing is the pre-commit triggered, and what exactly does the pre-commit do?
>
>
>
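To make the stages concrete, a minimal sketch of the TwoPhaseCommitSinkFunction hooks and when the runtime calls them (the transaction is a placeholder String and no real external system is involved; this only illustrates the contract described in the blog post):

```
import org.apache.flink.api.common.typeutils.base.StringSerializer;
import org.apache.flink.api.common.typeutils.base.VoidSerializer;
import org.apache.flink.streaming.api.functions.sink.TwoPhaseCommitSinkFunction;

// Sketch: where the two-phase-commit hooks fire relative to checkpointing.
public class SketchTpcSink extends TwoPhaseCommitSinkFunction<String, String, Void> {

    public SketchTpcSink() {
        super(StringSerializer.INSTANCE, VoidSerializer.INSTANCE);
    }

    @Override
    protected String beginTransaction() {
        // Called at startup and right after each pre-commit, opening the
        // transaction that covers the next checkpoint interval.
        return "txn-" + System.nanoTime();
    }

    @Override
    protected void invoke(String txn, String value, Context context) {
        // Regular records are written into the current transaction.
    }

    @Override
    protected void preCommit(String txn) {
        // Invoked from snapshotState(), i.e. when the checkpoint barrier
        // reaches this sink: flush buffered data so it survives failure,
        // but do not make it visible yet.
    }

    @Override
    protected void commit(String txn) {
        // Invoked from notifyCheckpointComplete(), after the JobManager has
        // confirmed that all operators finished the checkpoint.
    }

    @Override
    protected void abort(String txn) {
        // Called on failure/restore for transactions that never committed.
    }
}
```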


Re: Consuming an Apache Paimon table in Flink CDC style

2023-10-07 文章 Feng Jin
Hi casel,

When Flink consumes Paimon in streaming mode, the default is already full snapshot + incremental.

For details, see the scan.mode option in https://paimon.apache.org/docs/master/maintenance/configurations/


best,
Feng

On Fri, Sep 29, 2023 at 5:50 PM casel.chen  wrote:

> At the moment, consuming an Apache Paimon table with Flink in full + incremental fashion requires starting two separate jobs, one batch and one incremental, which is cumbersome and cannot hand over seamlessly. Can an Apache Paimon table be consumed the way Flink CDC consumes a MySQL table?


Re: Unsubscribe

2023-10-06 文章 Yunfeng Zhou
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe.

Best,
Yunfeng

On Wed, Oct 4, 2023 at 10:07 AM 1  wrote:
>
>


Unsubscribe

2023-10-03 文章 1



Re: Cannot find metata file metadats in directory

2023-09-30 文章 Hangxiang Yu
Hi,
How did you specify the checkpoint path you restored from?

Seems that you are trying to restore from a not completed or failed
checkpoint.

On Thu, Sep 28, 2023 at 6:09 PM rui chen  wrote:

> When we use 1.13.2,we have the following error:
> FileNotFoundException: Cannot find metata file metadats in directory
> 'hdfs://xx/f408dbe327f9e5053e76d7b5323d6e81/chk-173'.
>


-- 
Best,
Hangxiang.


Consuming an Apache Paimon table in Flink CDC style

2023-09-29 文章 casel.chen
At the moment, consuming an Apache Paimon table with Flink in full + incremental fashion requires starting two separate jobs, one batch and one incremental, which is cumbersome and cannot hand over seamlessly. Can an Apache Paimon table be consumed the way Flink CDC consumes a MySQL table?

Cannot find metata file metadats in directory

2023-09-28 文章 rui chen
When we use 1.13.2, we have the following error:
FileNotFoundException: Cannot find metata file metadats in directory
'hdfs://xx/f408dbe327f9e5053e76d7b5323d6e81/chk-173'.


Re: Unsubscribe

2023-09-26 文章 Chen Zhanghao
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe.

Best,
Zhanghao Chen

From: guangyu05 
Date: 2023-09-26 18:45
To: user-zh@flink.apache.org 
Subject: Unsubscribe

Unsubscribe


Unsubscribe

2023-09-26 文章 guangyu05
Unsubscribe

Re: Unsubscribe

2023-09-26 文章 Xuannan Su
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Xuannan

[1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

> On Sep 26, 2023, at 16:44, 王唯  wrote:
> 
> Unsubscribe



Re: Unsubscribe

2023-09-26 文章 Xuannan Su
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Xuannan

[1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

> On Sep 26, 2023, at 14:43, 陈建华  wrote:
> 
> Unsubscribe



Re: Unsubscribe

2023-09-26 文章 Xuannan Su
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Xuannan

[1] https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

> On Sep 26, 2023, at 14:37, 悠舒 刘  wrote:
> 
> Unsubscribe



Unsubscribe

2023-09-26 文章 王唯
Unsubscribe

Unsubscribe

2023-09-26 文章 陈建华
Unsubscribe

Unsubscribe

2023-09-26 文章 悠舒 刘
Unsubscribe


Unsubscribe

2023-09-25 文章 chenyu_opensource
Unsubscribe

Re: Default Flink S3 FileSource timeout due to large file listing

2023-09-25 文章 王国成
Unsubscribe

 Replied Message 
| From | Eleanore Jin |
| Date | 09/26/2023 01:50 |
| To | user-zh  |
| Subject | Default Flink S3 FileSource timeout due to large file listing |


Default Flink S3 FileSource timeout due to large file listing

2023-09-25 文章 Eleanore Jin
Hello Flink Community,
Flink Version: 1.16.1, Zookeeper for HA.
My Flink Applications reads raw parquet files hosted in S3, applies
transformations and re-writes them to S3, under a different location.
Below is my code to read from parquets from S3:
```
final Configuration configuration = new Configuration();
configuration.set("fs.s3.impl",
"org.apache.hadoop.fs.s3a.S3AFileSystem");
final ParquetColumnarRowInputFormat format =
  new ParquetColumnarRowInputFormat<>(
configuration,
,
InternalTypeInfo.of(),
100,
true,
true
  );
final FileSource source = FileSource
  .forBulkFileFormat(format, new Path("s3/"))
  .build();
 stream = env.fromSource(source, WatermarkStrategy.noWatermarks(),
"parquet-source");
```
I noticed the following:
1. My S3 directory, "s3//", can have more than 1M+ files. The
parquets in this directory are partitioned by date and time. This makes the
folder structure of this directory deterministic. e.g
"s3//partiton_column_a/partition_columb_b/2023-09-25--13/{1,2...N}.parquet".
I believe the Flink Default FileSource is doing a list on this large
directory and gets stuck waiting for the operation to complete. The Akka
connect timeout error messages in the Task Manager logs support this.
Additionally, the job runs successfully when I restrict the input to a
subfolder, looking at only an hour's data, based on the mentioned
partitioning scheme. In my local machine, I also tried using S3 CLI to
recursively list this directory and the operation did not complete in 1
hour.

*Is this behavior expected based on Flink's S3 source implementation? *Looking
at the docs
,
one way to solve this is to implement the Split Enumerator by incrementally
processing the subfolders in "s3//", based on the mentioned
partitioning scheme.

*Are there any other approaches available?*
2. Following the code above, when I deserialize records from S3 I get
records of type BinaryRowData
.
However, when I use the same code in Unit Testing, with
MiniClusterWithClientResource
,
to read from a local parquet file (not S3), I get records of type
GenericRowData

.

*What is the reason for this discrepancy and is it possible to force
deserialization to output type GenericRowData? *Currently, I have written
code to convert BinaryRowData to GenericRowData as our downstream
ProcessFunctions expect this type.
I*s there a better solution to transform BinaryRowData to GenericRowData?*

Thanks!
Eleanore
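On question 2 above, one possible workaround is to copy each incoming row into a GenericRowData via the public per-field getters. A minimal sketch, assuming rowType is the same RowType the parquet format was created with:

```
import org.apache.flink.table.data.GenericRowData;
import org.apache.flink.table.data.RowData;
import org.apache.flink.table.types.logical.RowType;

public final class RowDataCopies {
    // Sketch: copy any RowData implementation (e.g. BinaryRowData) into a
    // GenericRowData. For per-record use, precompute the getters once
    // (e.g. in open()) instead of on every call.
    public static GenericRowData toGenericRow(RowData row, RowType rowType) {
        int arity = rowType.getFieldCount();
        GenericRowData copy = new GenericRowData(row.getRowKind(), arity);
        for (int i = 0; i < arity; i++) {
            RowData.FieldGetter getter =
                    RowData.createFieldGetter(rowType.getTypeAt(i), i);
            copy.setField(i, getter.getFieldOrNull(row));
        }
        return copy;
    }
}
```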


Re: 1.17.1 - NPE during interval join

2023-09-24 文章 Phoes Huang
Hi Hangxiang,

Thanks for the response.
Below is the key code for this issue. The main_stream table is the streaming source; events arrive roughly every 500 ms to 1 s per record. Applying assignTimestampsAndWatermarks(WatermarkStrategy.noWatermarks()) to t1minStream and t5minStream does stop this problem from failing the job, but then the output loses data.
If you have other ideas, I'd be grateful.

String t1minSql = "SELECT rowtime, key, id, AVG(num) OVER w_t1min AS avg_t1min FROM main_stream WINDOW w_t1min AS (PARTITION BY key ORDER BY rowtime RANGE BETWEEN INTERVAL '1' MINUTES PRECEDING AND CURRENT ROW)";

Table t1minTable = tableEnv.sqlQuery(t1minSql);

String t5minSql = "SELECT rowtime, key, id, AVG(num) OVER w_t5min AS avg_t5min FROM main_stream WINDOW w_t5min AS (PARTITION BY key ORDER BY rowtime RANGE BETWEEN INTERVAL '5' MINUTES PRECEDING AND CURRENT ROW)";

Table t5minTable = tableEnv.sqlQuery(t5minSql);

DataStream<Row> t1minStream = tableEnv.toChangelogStream(t1minTable);

DataStream<Row> t5minStream = tableEnv.toChangelogStream(t5minTable);

DataStream<Row> joinedStream = t1minStream.keyBy(new TupleKeySelector("key", "id"))
        .intervalJoin(t5minStream.keyBy(new TupleKeySelector("key", "id")))
        .inEventTime()
        .between(Time.milliseconds(-1000L), Time.milliseconds(1000L))
        .process(new ProcessJoinFunction<Row, Row, Row>() {
            @Override
            public void processElement(Row left, Row right, Context ctx, Collector<Row> collector)
                    throws Exception {
                collector.collect(Row.join(left, right));
            }
        });



> Hangxiang Yu wrote on 2023-09-25 at 10:54 AM:
> 
> Hi, is this a SQL job or a DataStream job? Could you provide some key SQL or code that reproduces it?
> 
> On Sat, Sep 23, 2023 at 3:59 PM Phoes Huang  wrote:
> 
>> Hi,
>> 
>> Running locally in single-machine development mode, I hit this problem. Has anyone seen and solved it?
>> 

Re: State cleanup for Flink SQL

2023-09-24 文章 Jane Chan
Hi,

You can control the state TTL of stateful operators by setting table.exec.state.ttl. See [1] for more information.

[1]
https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/table/concepts/overview/#%e7%8a%b6%e6%80%81%e7%ae%a1%e7%90%86

Best,
Jane



Re: Unsubscribe

2023-09-24 文章 Yunfeng Zhou
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Yunfeng

On Mon, Sep 25, 2023 at 10:43 AM 星海 <2278179...@qq.com.invalid> wrote:
>
> Unsubscribe


Re: 1.17.1 - NPE during interval join

2023-09-24 文章 Hangxiang Yu
Hi, is this a SQL job or a DataStream job? Could you provide some key SQL or code that reproduces it?

On Sat, Sep 23, 2023 at 3:59 PM Phoes Huang  wrote:

> Hi,
>
> Running locally in single-machine development mode, I hit this problem. Has anyone seen and solved it?
>

Checkpoint timeouts and stuck tasks after using the jemalloc memory allocator for a while

2023-09-24 文章 rui chen
After using the jemalloc memory allocator for a while, checkpoints time out and tasks get stuck. Has anyone run into this? Flink version: flink-1.13.2, jemalloc version: 5.3.0


After using the jemalloc memory allocator for a period of time, checkpoint timeout occurs and tasks are stuck

2023-09-24 文章 rui chen
After using the jemalloc memory allocator for a period of time, checkpoints
time out and tasks get stuck. Has anyone encountered this? Flink
version: 1.13.2, jemalloc version: 5.3.0


Two-phase commit in Flink

2023-09-24 文章 海风
A question: for sink operators in Flink's two-phase commit, at which stage of checkpointing is the pre-commit triggered, and what exactly does the pre-commit do?




1.17.1 - NPE during interval join

2023-09-23 文章 Phoes Huang
Hi,

Running locally in single-machine development mode, I hit this problem. Has anyone seen and solved it?

2023-09-23 13:52:03.989 INFO 
[flink-akka.actor.default-dispatcher-9][Execution.java:1445] - Interval Join 
(19/20) 
(ff8e25fb94208d3c27f549a1e24757ea_e8388ada9c03cfdb1446bb3ccfbd461b_18_0) 
switched from RUNNING to FAILED on d569c5db-6882-496b-9e92-8a40bb631784 @ 
localhost (dataPort=-1).
java.lang.NullPointerException: null
at 
org.apache.flink.streaming.api.operators.TimerSerializer.serialize(TimerSerializer.java:149)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.TimerSerializer.serialize(TimerSerializer.java:39)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.ChangelogKeyGroupedPriorityQueue.lambda$logRemoval$1(ChangelogKeyGroupedPriorityQueue.java:153)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.AbstractStateChangeLogger.lambda$serialize$4(AbstractStateChangeLogger.java:184)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.AbstractStateChangeLogger.serializeRaw(AbstractStateChangeLogger.java:193)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.AbstractStateChangeLogger.serialize(AbstractStateChangeLogger.java:178)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.AbstractStateChangeLogger.log(AbstractStateChangeLogger.java:151)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.AbstractStateChangeLogger.valueElementRemoved(AbstractStateChangeLogger.java:125)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.ChangelogKeyGroupedPriorityQueue.logRemoval(ChangelogKeyGroupedPriorityQueue.java:153)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.state.changelog.ChangelogKeyGroupedPriorityQueue.poll(ChangelogKeyGroupedPriorityQueue.java:69)
 ~[flink-statebackend-changelog-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.InternalTimerServiceImpl.advanceWatermark(InternalTimerServiceImpl.java:301)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.InternalTimeServiceManagerImpl.advanceWatermark(InternalTimeServiceManagerImpl.java:180)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:602)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark(AbstractStreamOperator.java:609)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.api.operators.AbstractStreamOperator.processWatermark2(AbstractStreamOperator.java:618)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.io.StreamTwoInputProcessorFactory$StreamTaskNetworkOutput.emitWatermark(StreamTwoInputProcessorFactory.java:268)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve.findAndOutputNewMinWatermarkAcrossAlignedChannels(StatusWatermarkValve.java:200)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.watermarkstatus.StatusWatermarkValve.inputWatermark(StatusWatermarkValve.java:115)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.processElement(AbstractStreamTaskNetworkInput.java:148)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.io.AbstractStreamTaskNetworkInput.emitNext(AbstractStreamTaskNetworkInput.java:110)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:65)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.io.StreamMultipleInputProcessor.processInput(StreamMultipleInputProcessor.java:85)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:550)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.tasks.mailbox.MailboxProcessor.runMailboxLoop(MailboxProcessor.java:231)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.runMailboxLoop(StreamTask.java:839)
 ~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:788) 
~[flink-streaming-java-1.17.1.jar:1.17.1]
at 
org.apache.flink.runtime.taskmanager.Task.runWithSystemExitMonitoring(Task.java:952)
 ~[flink-runtime-1.17.1.jar:1.17.1]

How to obtain the IPs of the middleware that a Flink job's sources and sinks connect to

2023-09-22 文章 阿华田
I am currently building a real-time computing platform and want to audit the middleware connection information of every Flink job running on it. For example, if job1 is a Flink job writing from Kafka to HBase, I would like to automatically audit the Kafka cluster IP addresses and topic it connects to, as well as the HBase cluster IP addresses and table information.
| |
阿华田
|
|
a15733178...@163.com
|
Signature customized by NetEase Mail Master



Re: State cleanup for Flink SQL

2023-09-21 文章 faronzz
Try this: t_env.get_config().set("table.exec.state.ttl", "86400 s")




| |
faronzz
|
|
faro...@163.com
|


 Original Message 
| From | 小昌同学 |
| Date | 2023-09-21 17:06 |
| To | user-zh |
| Subject | State cleanup for Flink SQL |


Hi everyone, a question about state cleanup in Flink SQL: searching around I only found the related miniBatch settings. Is there really no state TTL setting for SQL?
| |
小昌同学
|
|
ccc0606fight...@163.com
|

State cleanup for Flink SQL

2023-09-21 文章 小昌同学


Hi everyone, a question about state cleanup in Flink SQL: searching around I only found the related miniBatch settings. Is there really no state TTL setting for SQL?
| |
小昌同学
|
|
ccc0606fight...@163.com
|

Re: Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge

2023-09-20 文章 jinzhuguang
Hi, besides these operational measures, does Flink CDC itself have a solution? For example, not having the incremental phase read the binlog from the very beginning, since much of that data has in fact already been read.

> Jiabao Sun wrote on 2023-09-20 at 21:00:
> 
> Hi,
> In production it is still recommended to retain at least 7 days of binlog, which raises the tolerance for failure-recovery time.
> Also, you can try increasing the parallelism and resources of the snapshot phase to speed it up; after the snapshot completes you can restore from a savepoint and scale the resources back down.
> Best,
> Jiabao
> --
> From:jinzhuguang 
> Send Time:2023-09-20 (Wednesday) 20:56
> To:user-zh 
> Subject:Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge
> Taking mysql-cdc as an example, the overall flow is to first sync the full data and then start incremental sync. From the code, the initial offset for the incremental phase is the smallest high watermark across all snapshot splits. If the full data is huge (TB-scale), the full sync can take very long, yet the binlog must not be deleted in the meantime, so the backlog takes a lot of space. Is there a common solution to this?



Re: Does Flink 1.17 no longer support Hive 2.1?

2023-09-20 文章 yuxia
Revert this PR https://github.com/apache/flink/pull/19352 and rebuild the flink hive connector; that should do it.

Best regards,
Yuxia

- Original Message -
From: "迎风浪子" <576637...@qq.com.INVALID>
To: "user-zh" 
Date: Tuesday, 2023-09-19 17:20:58
Subject: Re: Does Flink 1.17 no longer support Hive 2.1?

We are still using Hive 1.1.0; what should we do?



---Original Message---
From: "18579099920"<18579099...@163.com
Date: Monday, 2023-09-18 11:26 AM
To: "user-zh"

Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge

2023-09-20 文章 jinzhuguang
Taking mysql-cdc as an example, the overall flow is to first sync the full data and then start incremental sync. From the code, the initial offset for the incremental phase is the smallest high watermark across all snapshot splits. If the full data is huge (TB-scale), the full sync can take very long, yet the binlog must not be deleted in the meantime, so the backlog takes a lot of space. Is there a common solution to this?

Re: Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge

2023-09-20 文章 Jiabao Sun
Hi,
In production it is still recommended to retain at least 7 days of binlog, which raises the tolerance for failure-recovery time.
Also, you can try increasing the parallelism and resources of the snapshot phase to speed it up; after the snapshot completes you can restore from a savepoint and scale the resources back down.
Best,
Jiabao
--
From:jinzhuguang 
Send Time:2023-09-20 (Wednesday) 20:56
To:user-zh 
Subject:Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge
Taking mysql-cdc as an example, the overall flow is to first sync the full data and then start incremental sync. From the code, the initial offset for the incremental phase is the smallest high watermark across all snapshot splits. If the full data is huge (TB-scale), the full sync can take very long, yet the binlog must not be deleted in the meantime, so the backlog takes a lot of space. Is there a common solution to this?

Flink CDC 2.0: how to deal with binlog backlog when the historical data is huge

2023-09-20 文章 jinzhuguang
Taking mysql-cdc as an example, the overall flow is to first sync the full data and then start incremental sync. From the code, the initial offset for the incremental phase is the smallest high watermark across all snapshot splits. If the full data is huge (TB-scale), the full sync can take very long, yet the binlog must not be deleted in the meantime, so the backlog takes a lot of space. Is there a common solution to this?
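A rough sketch of the mitigation suggested in the replies above (connector option names as documented for mysql-cdc; host, credentials, and schema below are placeholders, and should be verified against your flink-cdc version): run the snapshot phase with high parallelism, then restore from a savepoint with fewer resources once it finishes.

```
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

// Sketch: a mysql-cdc source table. The incremental snapshot framework reads
// snapshot splits in parallel, so higher parallelism shortens the full phase
// and with it the binlog retention window that must cover it.
public class CdcSnapshotSketch {
    public static void main(String[] args) {
        TableEnvironment tEnv =
                TableEnvironment.create(EnvironmentSettings.inStreamingMode());
        tEnv.getConfig().set("parallelism.default", "8");
        tEnv.executeSql(
                "CREATE TABLE orders_cdc ("
                        + "  id BIGINT,"
                        + "  amount DECIMAL(10, 2),"
                        + "  PRIMARY KEY (id) NOT ENFORCED"
                        + ") WITH ("
                        + "  'connector' = 'mysql-cdc',"
                        + "  'hostname' = 'mysql.example.com'," // placeholder
                        + "  'port' = '3306',"
                        + "  'username' = 'flink',"             // placeholder
                        + "  'password' = '******',"            // placeholder
                        + "  'database-name' = 'shop',"
                        + "  'table-name' = 'orders',"
                        + "  'scan.incremental.snapshot.enabled' = 'true'"
                        + ")");
        // Run the INSERT INTO ... job, take a savepoint after the snapshot
        // phase completes, then restore it with reduced parallelism.
    }
}
```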

[CVE-2023-41834] Apache Flink Stateful Functions allowed HTTP header injection due to Improper Neutralization of CRLF Sequences

2023-09-19 文章 Martijn Visser
CVE-2023-41834: Apache Flink Stateful Functions allowed HTTP header
injection due to Improper Neutralization of CRLF Sequences

Severity: moderate

Vendor:
The Apache Software Foundation

Versions Affected:
Stateful Functions 3.1.0 to 3.2.0

Description:
Improper Neutralization of CRLF Sequences in HTTP Headers in Apache
Flink Stateful Functions 3.1.0, 3.1.1 and 3.2.0 allows remote
attackers to inject arbitrary HTTP headers and conduct HTTP response
splitting attacks via crafted HTTP requests. Attackers could
potentially inject malicious content into the HTTP response that is
sent to the user. This could include injecting a fake login form or
other phishing content, or injecting malicious JavaScript code that
can steal user credentials or perform other malicious actions on the
user's behalf.

Mitigation:
Users should upgrade to 3.3.0

Credit:
This issue was discovered by Andrea Cosentino from Apache Software Foundation

References:
https://flink.apache.org/security/


[ANNOUNCE] Apache Flink Stateful Functions Release 3.3.0 released

2023-09-19 文章 Martijn Visser
The Apache Flink community is excited to announce the release of
Stateful Functions 3.3.0!

Stateful Functions is a cross-platform stack for building Stateful
Serverless applications, making it radically simpler to develop
scalable, consistent, and elastic distributed applications. This new
release upgrades the Flink runtime to 1.16.2.

Release highlight:
- Upgrade underlying Flink dependency to 1.16.2

Release blogpost:
https://flink.apache.org/2023/09/19/stateful-functions-3.3.0-release-announcement/

The release is available for download at:
https://flink.apache.org/downloads/

Java SDK can be found at:
https://search.maven.org/artifact/org.apache.flink/statefun-sdk-java/3.3.0/jar

Python SDK can be found at:
https://pypi.org/project/apache-flink-statefun/

GoLang SDK can be found at:
https://github.com/apache/flink-statefun/tree/statefun-sdk-go/v3.3.0

JavaScript SDK can be found at:
https://www.npmjs.com/package/apache-flink-statefun

Official Docker image for Flink Stateful Functions can be
found at: https://hub.docker.com/r/apache/flink-statefun

The full release notes are available in Jira:
https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522=12351276

We would like to thank all contributors of the Apache Flink community who
made this release possible!

Regards,
Martijn Visser


Re: Unsubscribe

2023-09-15 文章 Chen Zhanghao
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list.

Best,
Zhanghao Chen

From: Lynn Chen 
Date: 2023-09-15 16:56
To: user-zh@flink.apache.org 
Subject: Unsubscribe

Unsubscribe


Re: How can flink-metrics obtain the application id?

2023-09-15 文章 Chen Zhanghao
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list.

Best,
Zhanghao Chen

From: im huzi 
Date: 2023-09-15 18:14
To: user-zh@flink.apache.org 
Subject: Re: How can flink-metrics obtain the application id?

Unsubscribe
On Wed, Aug 30, 2023 at 19:14 allanqinjy  wrote:

> hi,
>    A question for everyone: when reporting metrics to Prometheus, the job name gets a random suffix appended; from the source code it is new AbstractID(). Is there a way to obtain the application id of the job being reported here?


Re: How can flink-metrics obtain the application id?

2023-09-15 文章 im huzi
Unsubscribe
On Wed, Aug 30, 2023 at 19:14 allanqinjy  wrote:

> hi,
>    A question for everyone: when reporting metrics to Prometheus, the job name gets a random suffix appended; from the source code it is new AbstractID(). Is there a way to obtain the application id of the job being reported here?


Re: Unsubscribe

2023-09-13 文章 Biao Geng
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Biao Geng

[1]
https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

wangchuan wrote on Mon, Sep 11, 2023 at 10:20:

> Unsubscribe
>


Re: How can flink-metrics obtain the application id?

2023-09-11 文章 吴先生
Does it actually work? How did you use it?


| |
吴先生
|
|
15951914...@163.com
|
 Original Message 
| From | allanqinjy |
| Date | 2023-08-30 20:02 |
| To | user-zh@flink.apache.org |
| Subject | Re: How can flink-metrics obtain the application id? |
Thanks a lot, I'll change the code tomorrow and give it a try.
 Original Message 
| From | Feng Jin |
| Date | 2023-08-30 19:42 |
| To | user-zh |
| Subject | Re: How can flink-metrics obtain the application id? |
Hi,

You can try reading the _APP_ID JVM environment variable:
System.getenv(YarnConfigKeys.ENV_APP_ID);

https://github.com/apache/flink/blob/6c9bb3716a3a92f3b5326558c6238432c669556d/flink-yarn/src/main/java/org/apache/flink/yarn/YarnConfigKeys.java#L28


Best,
Feng

On Wed, Aug 30, 2023 at 7:14 PM allanqinjy  wrote:

hi,
A question for everyone: when reporting metrics to Prometheus, the job name gets a random suffix appended; from the source code it is new AbstractID(). Is there a way to obtain the application id of the job being reported here?
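A small sketch of using that variable, e.g. to tag a user metric with the application id (this assumes a flink-yarn dependency on the classpath and the variable being set only when running on YARN; the metric names are made up):

```
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;
import org.apache.flink.yarn.YarnConfigKeys;

// Sketch: read the YARN application id from the _APP_ID environment
// variable and attach it as a metric group label.
public class AppIdTaggedMap extends RichMapFunction<String, String> {

    private transient Counter records;

    @Override
    public void open(Configuration parameters) {
        String appId = System.getenv(YarnConfigKeys.ENV_APP_ID);
        records = getRuntimeContext()
                .getMetricGroup()
                .addGroup("applicationId", appId == null ? "unknown" : appId)
                .counter("records");
    }

    @Override
    public String map(String value) {
        records.inc();
        return value;
    }
}
```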


Re: Unsubscribe

2023-09-11 文章 Hang Ruan
Hi,

Please send an email with any content to user-zh-unsubscr...@flink.apache.org to unsubscribe from the user-zh@flink.apache.org mailing list; you can refer to [1][2] to manage your subscriptions.

Best,
Hang

[1]
https://flink.apache.org/zh/community/#%e9%82%ae%e4%bb%b6%e5%88%97%e8%a1%a8
[2] https://flink.apache.org/community.html#mailing-lists

刘海 wrote on Mon, Sep 11, 2023 at 10:23:

> Unsubscribe
>
>
>

