Re: Jobmanager stopped because uncaught exception

2021-02-07 Thread Yang Wang
Maybe it is a known issue[1] and has already been resolved in 1.12.2(will release soon). BTW, I think it is unrelated with the aliyun oss info logs. [1]. https://issues.apache.org/jira/browse/FLINK-20992 Best, Yang Lei Wang 于2021年2月8日周一 下午2:22写道: > Flink standalone HA. Flink version 1.12.1

Re: procesElement中每天数据都put 进入map,但下一条数据来时map都是空的咨询

2021-02-07 Thread Yun Tang
Hi 祖安, state抽象的数据结构,无论是value state,list state还是map state,其都是对应流计算处理中的当前key对应的数据结构。以map state具体来说对于每个正在处理的current key (由key selector选择出来 [1]),都有一个对应 的map存储相关的数据,如果你每次都发现对应的map为空,很有可能是因为你的key selector选择出来的key每次都不相同,很大概率是当前处理的record不同导致。 另外,map.isEmpty() 的调用是需要额外开销的(尤其对于RocksDB state

Re: Flink SQL temporal table join with Hive 报错

2021-02-07 Thread macia kk
图就是哪个报错 建表语句如下,表示公共表,我也没有改的权限. CREATE EXTERNAL TABLE `exchange_rate`(`grass_region` string COMMENT 'country', `currency` string COMMENT 'currency', `exchange_rate` decimal(25,10) COMMENT 'exchange rate') PARTITIONED BY (`grass_date` date COMMENT 'partition key, -MM-dd') ROW FORMAT SERDE

Jobmanager stopped because uncaught exception

2021-02-07 Thread Lei Wang
Flink standalone HA. Flink version 1.12.1 2021-02-08 13:57:50,550 ERROR org.apache.flink.runtime.util.FatalExitExceptionHandler [] - FATAL: Thread 'cluster-io-thread-30' produced an uncaught exception. Stopping the process... java.util.concurrent.RejectedExecutionException: Task

Re: Flink SQL temporal table join with Hive 报错

2021-02-07 Thread Rui Li
你好,图挂了,可以贴一下hive建表的DDL和join的语句是怎么写的么? On Mon, Feb 8, 2021 at 10:33 AM macia kk wrote: > Currently the join key in Temporal Table Join can not be empty. > > 我的 Hive 表 join DDL 没有设置 is not null ,但是都是有值的,还是会报这个错 > > [image: image.png] > -- Best regards! Rui Li

Re: Re: flink kryo exception

2021-02-07 Thread 赵一旦
yes, but I use stop not cancel, which also stop and cancel the job together. Yun Gao 于2021年2月8日周一 上午11:59写道: > Hi yidan, > > One more thing to confirm: are you create the savepoint and stop the job > all together with > > bin/flink cancel -s [:targetDirectory] :jobId > > command ? > > Best, >

Re: Sliding Window Count: Tricky Edge Case / Count Zero Problem

2021-02-07 Thread Yun Gao
Hi Jan, From my view, I think in Flink Window should be as a "high-level" operation for some kind of aggregation operation and if it could not satisfy the requirements, we could at least turn to using the "low-level" api by using KeyedProcessFunction[1]. In this case, we could use a ValueState

Re: flink on yarn 多TaskManager 拒绝连接问题

2021-02-07 Thread Yang Wang
那你可能需要把你的JobManager和TaskManager的日志发一下,才能进一步分析 主要需要确认的是连的端口是正确的,如果网络层面没有问题,那就有可能是哪个配置项使用了某个特定端口导致的 Best, Yang Junpb 于2021年2月8日周一 上午9:30写道: > 你好, > 我的测试环境yarn有三个节点,当TM启动只有一个时,JM和Tm随机启动在任何节点上都很正常,只有TM变为两个时,会出现报错。 > 每次启动JM和TM端口都是随机的,以上配置是确保2个TM启动,我现在怀疑是我其他配置导致的错误,谢谢 > > Best, > Bi > > > > -- >

Re: Re: flink kryo exception

2021-02-07 Thread Yun Gao
Hi yidan, One more thing to confirm: are you create the savepoint and stop the job all together with bin/flink cancel -s [:targetDirectory] :jobId command ? Best, Yun --Original Mail -- Sender:赵一旦 Send Date:Sun Feb 7 16:13:57 2021 Recipients:Till

Re: UUID in part files

2021-02-07 Thread Yun Gao
Hi Dan The SQL add the uuid by default is for the case that users want execute multiple bounded sql and append to the same directory (hive table), thus a uuid is attached to avoid overriding the previous output. The datastream could be viewed as providing the low-level api and thus it does not

Re: hybrid state backends

2021-02-07 Thread Yun Gao
Hi Marco, Sorry that current statebackend is a global configuration and could not be configured differently for different operators. One possible alternative option to this requirements might be set rocksdb as the default statebackend, and for those operators that want to put state in

"upsert-kafka" connector not working with Avro confluent schema registry

2021-02-07 Thread Shamit
Hello Team, As we have two kafka connectors "upsert-kafka" and "kafka". I am facing issue with "upsert-kafka" while reading avro message serialized using "io.confluent.kafka.serializers.KafkaAvroDeserializer". Please note "kafka" connector is working while reading avro message serialized

Table Cache Problem

2021-02-07 Thread Yongsong He
Hi experts, I want to cache a temporary table for reuse it Flink version 1.10.1 the table is consumer from kafka, struct like: create table a ( field1 string, field2 string, field3 string, field4 string ) the sample code looks like: val settings =

Flink SQL temporal table join with Hive 报错

2021-02-07 Thread macia kk
Currently the join key in Temporal Table Join can not be empty. 我的 Hive 表 join DDL 没有设置 is not null ,但是都是有值的,还是会报这个错 [image: image.png]

pyFlink UDF Caused by: java.lang.RuntimeException: Failed to create stage bundle factory! INFO:root:Initializing python harness:

2021-02-07 Thread 陈康
请教大佬们: 一个最简单pyflink UDF跑起来,报 Failed to create stage bundle factory! INFO:root:Initializing python harness: 在IdeaIJ上可以运行、大家有遇到过吗?谢谢~ /opt/module/flink-1.11.1/bin/flink run -m localhost:8081 -pyexec /opt/python36/bin/python3 -py udf.py [hadoop@hadoop01 pyflink]$ /opt/python36/bin/python3 -V

Re: 关于flink窗口state

2021-02-07 Thread HunterXHunter
你这代码贴的乱七八糟。。。 你需要再richjoinfunction里面设置valuestate的生命周期,他不随着窗口而销毁,窗口只会销毁自己设定的state,有空你可以看看window的源码,里面有清理state的逻辑 -- Sent from: http://apache-flink.147419.n8.nabble.com/

Re: question on ValueState

2021-02-07 Thread Yun Tang
Hi, MemoryStateBackend and FsStateBackend both hold keyed state in HeapKeyedStateBackend [1], and the main structure to store data is StateTable [2] which holds POJO format objects. That is to say, the object would not be serialized when calling update(). On the other hand, RocksDB

Re: procesElement中每天数据都put 进入map,但下一条数据来时map都是空的咨询

2021-02-07 Thread 赵一旦
keyedStream? key不同可能是。 谌祖安 于2021年2月7日周日 下午6:00写道: > 您好! > > > 重载procesElement方法,每条stream数据处理时,put数据进入map,后面每条数据处理时先判断是否在map中有相同的key值,无则新增后put进map,有则加工后put进map。 > 在实际代码中发现,每次写入put后,map有数据,但处理下一条数据时,先从map读取数据,发现map已经为空。 > 请问是哪里写错了吗? 和 flink官网中 state.update(current);有什么不同吗? > > 以下为代码: >

Re: flink on yarn 多TaskManager 拒绝连接问题

2021-02-07 Thread Junpb
你好, 我的测试环境yarn有三个节点,当TM启动只有一个时,JM和Tm随机启动在任何节点上都很正常,只有TM变为两个时,会出现报错。 每次启动JM和TM端口都是随机的,以上配置是确保2个TM启动,我现在怀疑是我其他配置导致的错误,谢谢 Best, Bi -- Sent from: http://apache-flink.147419.n8.nabble.com/

UUID in part files

2021-02-07 Thread Dan Hill
Hi. *Context* I'm migrating my Flink SQL job to DataStream. When switching to StreamingFileSink, I noticed that the part files now do not have a uuid in them. "part-0-0" vs "part-{uuid string}-0-0". This is easy to add with OutputFileConfig. *Question* Is there a reason why the base

Re: Cannot connect to queryable state proxy

2021-02-07 Thread 陳昌倬
On Thu, Feb 04, 2021 at 04:26:42PM +0800, ChangZhuo Chen (陳昌倬) wrote: > Hi, > > We have problem connecting to queryable state client proxy as described > in [0]. Any help is appreciated. > > * The port 6125 is opened in taskmanager pod. > > ``` > root@-654b94754d-2vknh:/tmp# ss -tlp >

question on ValueState

2021-02-07 Thread Colletta, Edward
Using FsStateBackend. I was under the impression that ValueState.value will serialize an object which is stored in the local state backend, copy the serialized object and deserializes it. Likewise update() would do the same steps copying the object back to local state backend.And as a

Re: PyFlink Could not read the user code wrapper:org.apache.flink.connector.jdbc.table.JdbcRowDataInputFormat

2021-02-07 Thread 陈康
感谢回复、jar添加到lib下没重启服务、 自己S 13了; 不过又在PyFlink 应用UDF(在SQL应用udf函数)过程中遇到如下问题;把udf函数去掉,pyflink 又可以执行.. 请问有遇到过嘛?谢谢~ --- Caused by: java.lang.RuntimeException: Failed to create stage bundle factory! INFO:root:Initializing python harness:

procesElement中每天数据都put 进入map,但下一条数据来时map都是空的咨询

2021-02-07 Thread 谌祖安
您好! 重载procesElement方法,每条stream数据处理时,put数据进入map,后面每条数据处理时先判断是否在map中有相同的key值,无则新增后put进map,有则加工后put进map。 在实际代码中发现,每次写入put后,map有数据,但处理下一条数据时,先从map读取数据,发现map已经为空。 请问是哪里写错了吗? 和 flink官网中 state.update(current);有什么不同吗? 以下为代码: private MapState map; //定义map @Override public void

pyflink Failed to create stage bundle factory! INFO:root:Initializing python harness:

2021-02-07 Thread 陈康
执行pyflink提交任务的报错、又遇到过的大佬嘛?谢谢! /opt/module/flink-1.11.1/bin/flink run -m localhost:8081 -pyexec /opt/python36/bin/python3 -py NtPyFlink.py Caused by: java.lang.RuntimeException: Failed to create stage bundle factory! INFO:root:Initializing python harness:

Re: PyFlink Could not read the user code wrapper:org.apache.flink.connector.jdbc.table.JdbcRowDataInputFormat

2021-02-07 Thread Dian Fu
Hi, flink-connector-jdbc_2.11-1.11.1.jar 有添加在flink/lib下,只能保证在作业执行的时候,可以找到对应的class,在客户端提交的时候,会编译作业,从报错看,是客户端编译作业的时候找不到对应的class。 可以试试这里的方法:https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/python/table-api-users-guide/dependency_management.html#java-dependency > 在

Re: flink升级hadoop3

2021-02-07 Thread Yun Tang
Hi, Flink自1.11 版本之后就已经支持了hadoop3 [1][2],具体来讲就是将 HADOOP_CLASSPATH 配置成运行机器上的hadoop3 相关jar包即可。 你也可以参照 [3] 的测试步骤 [1] https://issues.apache.org/jira/browse/FLINK-11086 [2] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/yarn.html#supported-hadoop-versions [3]

flink 1.12.0 k8s session部署异常

2021-02-07 Thread casel.chen
在k8s上部署sesson模式的flink集群遇到jobmanager报如下错误,请问这是什么原因造成的?要如何fix? 2021-02-07 08:21:41,873 INFO org.apache.flink.runtime.rpc.akka.AkkaRpcService [] - Starting RPC endpoint for org.apache.flink.runtime.dispatcher.StandaloneDispatcher at akka://flink/user/rpc/dispatcher_1 . 2021-02-07

Re: flink kryo exception

2021-02-07 Thread 赵一旦
It also maybe have something to do with my job's first tasks. The second task have two input, one is the kafka source stream(A), another is self-defined mysql source as broadcast stream.(B) In A: I have a 'WatermarkReAssigner', a self-defined operator which add an offset to its input watermark and

Re: 关于1.12新增的initialize阶段时间较长问题

2021-02-07 Thread 赵一旦
It also maybe have something to do with my job's first tasks. The second task have two input, one is the kafka source stream(A), another is self-defined mysql source as broadcast stream.(B) In A: I have a 'WatermarkReAssigner', a self-defined operator which add an offset to its input watermark and

Re: flink kryo exception

2021-02-07 Thread 赵一旦
The first problem is critical, since the savepoint do not work. The second problem, in which I changed the solution, removed the 'Map' based implementation before the data are transformed to the second task, and this case savepoint works. The only problem is that, I should stop the job and

Re: 关于1.12新增的initialize阶段时间较长问题

2021-02-07 Thread 赵一旦
截图也没办法反应动态变化的过程。 目前是10机器的Standalone集群,状态在5G左右。通过flink-client端提交任务,然后web-ui刷新就一直转圈,过一会(几十秒大概)就OK啦,然后刚刚OK一瞬间会有很多个处于Initialize状态的任务,然后慢慢(10s内吧)没掉。 flink-client端的话,有时候正常提交完成,有时候出现报错(类似说是重复任务的)。 zilong xiao 于2021年2月7日周日 下午3:25写道: > 有截图吗? > > 赵一旦 于2021年2月7日周日 下午3:13写道: > > > 这个问题现在还有个现象,我提交任务,web