Container is running beyond physical memory limits. Current usage: 5.0 GB of 5 GB physical memory used; 7.0 GB of 25 GB virtual memory used. Killing container.

2021-03-30 Thread admin
java.lang.Exception: Container [pid=17248,containerID=container_1597847003686_12235_01_001336] is running beyond physical memory limits. Current usage: 5.0 GB of 5 GB physical memory used; 7.0 GB of 25 GB virtual memory used. Killing container. Dump of the process-tree for

退订

2021-03-30 Thread Y Luo
退订

JDBC connector support for JSON

2021-03-30 Thread Fanbin Bu
Hi, For a streaming job that uses Kafka connector, this doc https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connectors/formats/json.html#format-options shows that we can parse json data format. However, it does not seem like Flink JDBC connector support json data type, at least

Re: flink-提交jar 隔断时间自己重启问题

2021-03-30 Thread yidan zhao
没看懂问题。任务自动重启?失败了自然就重启了,restart策略设置的吧。 valve <903689...@qq.com> 于2021年3月31日周三 上午11:31写道: > 我也遇到这个问题 不知道为啥 > > > > -- > Sent from: http://apache-flink.147419.n8.nabble.com/ >

Re: Re:回复:flink 从mysql读取数据异常

2021-03-30 Thread Robin Zhang
Hi,air23 JDBCTableSource就是batch模式的,不走实时。Flink解析执行计划时内部会去判断。 Best air23 wrote > 这边是想离线读取。不是走实时的 > 看到异常是 Only insert statement is supported now > > > > > > > > > > > > > > > > > > 在 2021-03-30 10:31:51,"guoyb" < > 861277329@ >> 写道: >>可以读取的,还有内置flink cdc

Re: flink-提交jar 隔断时间自己重启问题

2021-03-30 Thread valve
我也遇到这个问题 不知道为啥 -- Sent from: http://apache-flink.147419.n8.nabble.com/

退订

2021-03-30 Thread 张保淇
退订

Restoring from Flink Savepoint in Kubernetes not working

2021-03-30 Thread Claude M
Hello, I have Flink setup as an Application Cluster in Kubernetes, using Flink version 1.12. I created a savepoint using the curl command and the status indicated it was completed. I then tried to relaunch the job from that save point using the following arguments as indicated in the doc found

Checkpoint Aligned问题

2021-03-30 Thread 张韩

Re: [External] : Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-03-30 Thread Yang Wang
Hi Fuyao, Thanks for sharing the progress. 1. The flink client is able to list/cancel jobs, based on logs shared > above, I should be able to ping 144.25.13.78, why I still can NOT ping such > address? I think this is a environment problem. Actually, not every IP address could be tested with

Organizing Flink Applications: Mono repo or polyrepo

2021-03-30 Thread Xinbin Huang
Hi community I am curious about people's experience in structuring Flink applications. Do you use a mono repo structure (multiple applications in one single repo) or broken down each application into its own repo? If possible, can you share some of your thoughts on the pros/cons of each

Re: Checkpoint fail due to timeout

2021-03-30 Thread Alexey Trenikhun
Hi Piotrek, I can't reproduce problem anymore, before the problem happened 2-3 times in row, I've turned off unaligned checkpoints, now returned unaligned checkpoints back, but the problem seems gone for now. When problem happened there was no progress on source operators, I thought maybe it

Re: Checkpoint fail due to timeout

2021-03-30 Thread Alexey Trenikhun
I also expected improve of checkpointing at the cost of throughput, but in in reality I didn't notice difference neither in checkpointing or throughput. Backlog was purged by Kafka, so can't post thread dump right now, but I doubt that the problem is gone, so will have next chance during next

Re: Source Operators Stuck in the requestBufferBuilderBlocking

2021-03-30 Thread Sihan You
Awesome. Let me know if you need any other information. Our application has a heavy usage on event timer and keyed state. The load is vey heavy. If that matters. On Mar 29, 2021, 05:50 -0700, Piotr Nowojski , wrote: > Hi Sihan, > > Thanks for the information. Previously I was not able to

Re: [External] : Re: Need help with executing Flink CLI for native Kubernetes deployment

2021-03-30 Thread Fuyao Li
Hello Yang, Thank you so much for providing me the flink-client.yaml. I was able to make some progress. I didn’t realize I should create an new pod flink-client to list/cancel jobs. I was trying to do such a thing from my local laptop. Maybe that is the reason why it doesn’t work. However, I

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Vishal Santoshi
Great, thanks! On Tue, Mar 30, 2021 at 11:00 AM Till Rohrmann wrote: > This is a good idea. I will add it to the section here [1]. > > [1] > https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#terminating-a-job > > Cheers, > Till > > On Tue, Mar 30, 2021 at 2:46 PM Vishal

IO benchmarking

2021-03-30 Thread deepthi Sridharan
Hi, I am trying to set up some benchmarking with a couple of IO options for saving checkpoints and have a couple of questions : 1. Does flink come with any IO benchmarking tools? I couldn't find any. I was hoping to use those to derive some insights about the storage performance and extrapolate

Proper way to get DataStream

2021-03-30 Thread Maminspapin
Hi, I'm trying to solve a task with getting data from topic. This topic keeps avro format data. I wrote next code: public static void main(String[] args) throws Exception { StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); Schema schema

Re: Support for sending generic class

2021-03-30 Thread Le Xu
Hi Gordon and Till: Thanks for pointing me to the new version! The code I'm using is for a research project so it's not on any production deadline. However I do like to know any upcoming updates so there won't be any duplicated works. Couple of questions I have now: 1. Does 3.0 support

Re: Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Lehuede sebastien
Hi Till, That solved my issue ! Many many thanks for the solution and for the useful StackOverflow link ! ☺️ Cheers, Sébastien > Le 30 mars 2021 à 18:16, Till Rohrmann a écrit : > > Hi Sebastien, > > I think the Scala compiler infers the most specific type for deepCopy() which > is

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
Hi Till, >From the version I am using (1.12.0), getFieldNames is not available in Row ... See https://github.com/apache/flink/blob/release-1.12/flink-core/src/main/java/org/apache/flink/types/Row.java . Is there any workaround for this in version 1.12.0? Thanks. Best, Yik San On Wed, Mar 31,

Re: Failure detection in Flink

2021-03-30 Thread Sonam Mandal
Hi Till, This is really helpful, thanks for the detailed explanation about what happens. I'll reach out again if Ihave any further questions. For now I'm just trying to understand the various failure scenarios and how they are handled by Flink. Thanks, Sonam

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Till Rohrmann
There is a method Row.getFieldNames. Cheers, Till On Tue, Mar 30, 2021 at 6:06 PM Yik San Chan wrote: > Hi Till, > > I look inside the Row class, it does contain a member `private final > Object[] fields;` though I wonder how to get column names out of the > member? > > Thanks! > > Best, > Yik

Re: Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Till Rohrmann
Hi Sebastien, I think the Scala compiler infers the most specific type for deepCopy() which is Nothing (Nothing is the subtype of every type) [1] because you haven't specified a type here. In order to make it work you have to specify the concrete type: event.get("value").deepCopy[ObjectNode]()

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
Hi Till, I look inside the Row class, it does contain a member `private final Object[] fields;` though I wonder how to get column names out of the member? Thanks! Best, Yik San On Tue, Mar 30, 2021 at 11:45 PM Till Rohrmann wrote: > Hi Yik San, > > by converting the rows to a Tuple3 you

Re: Flink Table to DataStream: how to access column name?

2021-03-30 Thread Till Rohrmann
Hi Yik San, by converting the rows to a Tuple3 you effectively lose the information about the column names. You could also call `toRetractStream[Row]` which will give you a `DataStream[Row]` where you keep the column names. Cheers, Till On Tue, Mar 30, 2021 at 3:52 PM Yik San Chan wrote: >

Re: Failure detection in Flink

2021-03-30 Thread Till Rohrmann
Well, the FLIP-6 documentation is probably the best resource albeit being a bit outdated. The components react a bit differently: JobMaster loses heartbeat with a TaskExecutor: If this happens, then the JobMaster will invalidate all slots from this TaskExecutor. This will then fail the tasks

Re: StateFun examples in scala

2021-03-30 Thread jose farfan
Hi Many thx for your quick answer. I will review the links. BR Jose On Tue, 30 Mar 2021 at 15:22, Tzu-Li (Gordon) Tai wrote: > Hi Jose! > > For Scala, we would suggest to wait until StateFun 3.0.0 is released, > which is actually happening very soon (likely within 1-2 weeks) as there is > an

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Till Rohrmann
This is a good idea. I will add it to the section here [1]. [1] https://ci.apache.org/projects/flink/flink-docs-stable/deployment/cli.html#terminating-a-job Cheers, Till On Tue, Mar 30, 2021 at 2:46 PM Vishal Santoshi wrote: > Got it. Is it possible to add this very important note to the >

Re: Failure detection in Flink

2021-03-30 Thread Sonam Mandal
Hi Till, Thanks, this helps! Yes, removing the AKKA related configs will definitely help to reduce confusion. One more question, I was going through FLIP-6 and it does talk about the behavior of various components when failures are detected via heartbeat timeouts etc. is this the best

Flink Table to DataStream: how to access column name?

2021-03-30 Thread Yik San Chan
The question is cross-posted on Stack Overflow https://stackoverflow.com/questions/66872184/flink-table-to-datastream-how-to-access-column-name . I want to consume a Kafka topic into a table using Flink SQL, then convert it back to a DataStream. Here is the `SOURCE_DDL`: ``` CREATE TABLE

Scala : ClassCastException with Kafka Connector and ObjectNode

2021-03-30 Thread Lehuede sebastien
Hi all, I’m currently trying to use Scala to setup a simple Kafka consumer that receive JSON formatted events and then just send them to Elasticsearch. This is the first step and after I want to add some processing logic. My code works well but interesting fields form my JSON formatted events

Re: Support for sending generic class

2021-03-30 Thread Tzu-Li (Gordon) Tai
Hi Le, Thanks for reaching out with this question! It's actually a good segue to allow me to introduce you to StateFun 3.0.0 :) StateFun 3.0+ comes with a new type system that would eliminate this hassle. You can take a sneak peek here [1]. This is part 1 of a series of tutorials on fundamentals

Re: StateFun examples in scala

2021-03-30 Thread Tzu-Li (Gordon) Tai
Hi Jose! For Scala, we would suggest to wait until StateFun 3.0.0 is released, which is actually happening very soon (likely within 1-2 weeks) as there is an ongoing release candidate vote [1]. The reason for this is that version 3.0 adds a remote SDK for Java, which you should be able to use

Re: Evenly distribute task slots across task-manager

2021-03-30 Thread Till Rohrmann
Hi Vignesh, if I understand you correctly, then you have a job like: KafkaSources(parallelism = 64) => Mapper(parallelism = 16) => something else Moreover, you probably have slot sharing enabled which means that a KafkaSource and a Mapper can be deployed into the same slot. So what happens

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Vishal Santoshi
Got it. Is it possible to add this very important note to the documentation. Our case is the former as in this is an infinite pipeline and we were establishing the CiCD release process when non breaking changes ( DAG compatible changes are made ) on a running pipe. Regards On Tue, Mar 30, 2021

Re: StateFun examples in scala

2021-03-30 Thread Till Rohrmann
Hi Jose, I am pulling in Gordon who will be able to help you with your question. Personally, I am not aware of any limitations which prohibit the usage of Scala. Cheers, Till On Tue, Mar 30, 2021 at 11:55 AM jose farfan wrote: > Hi > > I am trying to find some examples written in scala of

Re: Support for sending generic class

2021-03-30 Thread Till Rohrmann
Hi Le, I am pulling in Gordon who might be able to help you with your question. Looking at the interface Context, it looks that you cannot easily specify a TypeHint for the message you want to send. Hence, I guess that you explicitly need to register these types. Cheers, Till On Tue, Mar 30,

Re: SP with Drain and Cancel hangs after take a SP

2021-03-30 Thread Till Rohrmann
Hi Vishal, The difference between stop-with-savepoint and stop-with-savepoint-with-drain is that the latter emits a max watermark before taking the snapshot. The idea is to trigger all pending timers and flush the content of some buffering operations like windowing. Semantically, you should use

Re: Failure detection in Flink

2021-03-30 Thread Till Rohrmann
Hi Sonam, Flink uses its own heartbeat implementation to detect failures of components. This mechanism is independent of the used deployment model. The relevant configuration options can be found here [1]. The akka.transport.* options are only for configuring the underlying Akka system. Since we

Re: Flink State Query Server threads stuck in infinite loop with high GC activity on CopyOnWriteStateMap get

2021-03-30 Thread Till Rohrmann
Hi Aashutosh, The queryable state feature is no longer actively maintained by the community. What I would recommend is to output the aggregate counts via a sink to some key value store which you query to obtain the results. Looking at the implementation of CopyOnWriteStateMap, it does not look

flinkSQL + pythonUDF问题

2021-03-30 Thread guaishushu1103
任务运行一段时间出现Apache beam问题 有哪位大佬能帮忙看看: Caused by: java.lang.RuntimeException: Error received from SDK harness for instruction 3134: Traceback (most recent call last): File "/home/yarn/software/python/lib/python3.6/site-packages/apache_beam/runners/worker/data_plane.py", line 421, in input_elements

StateFun examples in scala

2021-03-30 Thread jose farfan
Hi I am trying to find some examples written in scala of StateFun. But, I cannot find nothing. My questions is: 1. is there any problem to use statefun with Scala 2. is there any place with examples written in scala. BR Jose

Flink 写ORC失败

2021-03-30 Thread Jacob
使用Flink API消费kafka消息,写orc文件,报错如下 Caused by: org.apache.flink.util.SerializedThrowable at java.lang.System.arraycopy(Native Method) ~[?:1.8.0_191-ojdkbuild] at org.apache.hadoop.io.Text.set(Text.java:225) ~[test456.jar:?] at

pyflink1.12 报错:org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Process died with exit code 0

2021-03-30 Thread xiaoyue
在执行 pyflink UDAF 脚本时报错:org.apache.beam.vendor.guava.v26_0_jre.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalStateException: Process died with exit code 0。 目前udaf计算的结果,无法sink, 不知路过的大佬,是否也遇到过这个问题? 异常信息如下: Traceback (most recent call last): File

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Chesnay Schepler
This looks related to HDFS-12920; where Hadoop 2.X tries to read a duration from hdfs-default.xml expecting plain numbers, but in 3.x they also contain time units. On 3/30/2021 9:37 AM, Matthias Seiler wrote: Thank you all for the replies! I did as @Maminspapin suggested and indeed the

Re: With the checkpoint interval of the same size, the Flink 1.12 version of the job checkpoint time-consuming increase and production failure, the Flink1.9 job is running normally

2021-03-30 Thread Yingjie Cao
Hi Haihang, After scanning the user mailing list, I found some users have reported checkpoint timeout when using unaligned checkpoint, can you share which checkpoint mode do you use? (The information can be found in log or the checkpoint -> configuration tab in webui) Best, Yingjie Yingjie Cao

Re: DataStream from kafka topic

2021-03-30 Thread Maminspapin
I tried this: 1. Schema (found in stackoverflow) class GenericRecordSchema implements KafkaDeserializationSchema { private String registryUrl; private transient KafkaAvroDeserializer deserializer; public GenericRecordSchema(String registryUrl) { this.registryUrl =

Re: 相同的作业配置 ,Flink1.12 版本的作业checkpoint耗时增加以及制作失败,Flink1.9的作业运行正常

2021-03-30 Thread Yingjie Cao
这个应该不是FLINK-16404 的影响,那个对checkpoint时间的影响比较小,是已经有一个benchmark测试的,1s的checkpoint interval也没什么大问题,我建议可以看一下失败的task的stack,看一下在干什么,可能排查问题更快一些。 Haihang Jing 于2021年3月24日周三 下午12:06写道: > 【现象】相同配置的作业(checkpoint interval :3分钟,作业逻辑:regular >

Re: With the checkpoint interval of the same size, the Flink 1.12 version of the job checkpoint time-consuming increase and production failure, the Flink1.9 job is running normally

2021-03-30 Thread Yingjie Cao
Hi Haihang, I think your issue is not related to FLINK-16404 , because that change should have small impact on checkpoint time, we already have a micro benchmark for that change (1s checkpoint interval) and no regression is seen. Could you share

Re: flink sql count distonct 优化

2021-03-30 Thread Robin Zhang
Hi,guomuhua `The number of inputs accumulated by local aggregation every time is based on mini-batch interval. It means local-global aggregation depends on mini-batch optimization is enabled ` ,关于本地聚合,官网有这么一段话,也就是说,需要先开启批次聚合,然后才能使用本地聚合,加起来有三个参数.

Re: flink sql count distonct 优化

2021-03-30 Thread Robin Zhang
Hi,Jark 我理解疑问中的sql是一个普通的agg操作,只不过分组的键是时间字段,不知道您说的 `我看你的作业里面是window agg` ,这个怎么理解 Best, Robin Jark wrote >> 如果不是window agg,开启参数后flink会自动打散是吧 > 是的 > >> 那关于window agg, 不能自动打散,这部分的介绍,在文档中可以找到吗? > 文档中没有说明。 这个文档[1] 里说地都是针对 unbounded agg 的优化。 > > Best, > Jark > > [1]: >

Re: flink 从mysql读取数据异常

2021-03-30 Thread 张锴
报错 信息明确说了只支持insert air23 于2021年3月30日周二 上午10:32写道: > 你好 参考官网 > https://ci.apache.org/projects/flink/flink-docs-release-1.12/zh/dev/table/connectors/jdbc.html > 这边读取mysql jdbc数据报错Exception in thread "main" > org.apache.flink.table.api.TableException: Only insert statement is > supported now. > >

Re: Evenly distribute task slots across task-manager

2021-03-30 Thread yidan zhao
I think currently flink doesn't support your case, and another idea is that you can set the parallelism of all operators to 64, then it will be evenly distributed to the two taskmanagers. Vignesh Ramesh 于2021年3月25日周四 上午1:05写道: > Hi Matthias, > > Thanks for your reply. In my case, yes the

(无主题)

2021-03-30 Thread 高耀军
退订 | | 高耀军 | | 邮箱:18221112...@163.com | 签名由 网易邮箱大师 定制

Re: Hadoop is not in the classpath/dependencies

2021-03-30 Thread Matthias Seiler
Thank you all for the replies! I did as @Maminspapin suggested and indeed the previous error disappeared, but now the exception is ``` java.io.IOException: Cannot instantiate file system for URI: hdfs://node-1:9000/flink //... Caused by: java.lang.NumberFormatException: For input string: "30s"

退订

2021-03-30 Thread 徐永健
退订

Re: PyFlink Table API: Interpret datetime field from Kafka as event time

2021-03-30 Thread Dawid Wysakowicz
Hey, I am not sure which format you use, but if you work with JSON maybe this option[1] could help you. Best, Dawid [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/connectors/formats/json.html#json-timestamp-format-standard On 30/03/2021 06:45, Sumeet Malhotra

Support for sending generic class

2021-03-30 Thread Le Xu
Hello! I'm trying to figure out whether Flink Statefun supports sending object with class that has generic parameter types (and potentially nested types). For example, I send a message that looks like this: context.send(SINK_EVENT, idString, new Tuple3<>(someLongObject, listOfLongObject, Long));