Jobgraph not getting deleted from Zookeeper

2020-05-19 Thread Fritz Budiyanto
Hi All, I have been seeing this issue several time where JobGraph are not cleaned up properly. As a result, when Flink cluster is restarted, it will attempt to do HA restore on a checkpoint which doesn't exist anymore and the new restarted cluster eventually go give up and stay down. The wor

Re: Jobgraph not getting deleted from Zookeeper

2020-05-19 Thread Fritz Budiyanto
Forgot to mentioned, Flink version is 1.9.2 On May 19, 2020 at 6:22 PM, Fritz Budiyanto wrote: Hi All, I have been seeing this issue several time where JobGraph are not cleaned up properly. As a result, when Flink cluster is restarted, it will attempt to do HA restore on a checkpoint

Task manager got stuck not restarting due to OOM: Metaspace

2021-06-24 Thread Fritz Budiyanto
Hi, I have a job which kept on restarting due to a bug, and it brought down the task manager with it due to OOM Metaspace. Please ignore the memory leak for a moment, the problem here is the task manager does not restart and hung which reduce the overall slots capacity. We are running Flink in

Re: Task manager got stuck not restarting due to OOM: Metaspace

2021-06-24 Thread Fritz Budiyanto
I found this configuration which control jvm exit. Let me try it out: taskmanager.jvm-exit-on-oom: true > On Jun 24, 2021, at 8:07 AM, Fritz Budiyanto wrote: > > Hi, > > > I have a job which kept on restarting due to a bug, and it brought down the > task manag

ProcessFunction's Event Timer not firing

2018-11-08 Thread Fritz Budiyanto
Hi All, I noticed if one of the slot's watermark not progressing, its impacting all slots processFunction timer and no timer are not firing. In my example, I have Source parallelism set to 8 and Kafka partition is 4. The next operator is processFunction with parallelism of 8 + event timer. I

Re: ProcessFunction's Event Timer not firing

2018-11-10 Thread Fritz Budiyanto
imestamps_watermarks.html#timestamps-per-kafka-partition> > > > On Fri, Nov 9, 2018 at 1:56 AM Fritz Budiyanto <mailto:fbudi...@icloud.com>> wrote: > Hi All, > > I noticed if one of the slot's watermark not progressing, its impacting all > slots processFunct

Help debugging Kafka connection leaks after job failure/cancelation

2019-03-26 Thread Fritz Budiyanto
Hi All, We're using Flink-1.4.2 and noticed many dangling connections to Kafka after job deletion/recreation. The trigger here is Job cancelation/failure due to network down event followed by Job recreation. Our flink job has checkpointing disabled, and upon job failure (due to network failure

Re: Help debugging Kafka connection leaks after job failure/cancelation

2019-03-27 Thread Fritz Budiyanto
Thank you ! > On Mar 26, 2019, at 6:51 PM, Steven Wu wrote: > > it might be related to this issue > https://issues.apache.org/jira/browse/FLINK-10774 > <https://issues.apache.org/jira/browse/FLINK-10774> > > On Tue, Mar 26, 2019 at 4:35 PM Fritz Budiyanto <mail

ElasticSearch 6

2017-11-15 Thread Fritz Budiyanto
Hi All, ES6 is GA today, and I wonder if Flink-ES5 connector fully support ES6 ? Any caveat we need to know ? Thanks, Fritz

Re: ElasticSearch 6

2017-11-17 Thread Fritz Budiyanto
0_131] > On Nov 15, 2017, at 12:07 PM, Fritz Budiyanto wrote: > > Hi All, > > ES6 is GA today, and I wonder if Flink-ES5 connector fully support ES6 ? Any > caveat we need to know ? > > Thanks, > Fritz

No configuration found for key 'akka.version' after upgrade to Flink 1.4

2017-12-16 Thread Fritz Budiyanto
Hi All, After 1.4 upgrade, our unit tests failed to start flink in local mode. How can we fix this failure ? 07:31:34,388 INFO org.apache.flink.streaming.api.environment.LocalStreamEnvironment - Running job on local embedded Flink mini cluster 07:31:34,537 INFO org.apache.flink.runtime.min

Securing metrics reporters (Flink <-> Prometheus)

2018-03-02 Thread Fritz Budiyanto
Hi All, How can I configure encryption for metrics.reports ? Particularly Prometheus ? I do not see any mention of encryption in the metrics.reports traffic in flink documentation. Is encryption supported? If yes, could shed a light not how to do this ? metrics.reporters: prom metrics.reporte

Secure TLS/SSL ElasticSearch connector for current and future connector

2018-03-26 Thread Fritz Budiyanto
Hi All, Anyone know if Flink has TLS/SSL support for the current ES connector ? If yes, any sample configuration/code ? If not, would TLS/SSL be support in the upcoming ES connector using Java High Level client ? Thanks, Fritz

Re: Secure TLS/SSL ElasticSearch connector for current and future connector

2018-03-26 Thread Fritz Budiyanto
t; working on 6.1+. But at least there would be something "correct" for the > future. > > -- > Christophe > > On Mon, Mar 26, 2018 at 11:38 PM, Fritz Budiyanto <mailto:fbudi...@icloud.com>> wrote: > Hi All, > > Anyone know if Flink has TLS/SSL sup

Evicting elements in EventTimeSessionWindow

2017-02-27 Thread Fritz Budiyanto
Hi All, How do I evict elements from EventTimeSessionWindow ? My use case as follow: I have a long duration session window, and I’d like to do some processing on every minute and perform regular sink. I use ContinuousEventTimeTrigger to do the job, as the session could last for hours (or even

Re: Job ID

2017-05-09 Thread Fritz Budiyanto
Hi Gordon, Please add job’s computed SHA-1 as well if possible, so I can have the client code to automate uploading / replacing job if the computed SHA-1 is different — Fritz > On May 8, 2017, at 10:18 PM, Tzu-Li (Gordon) Tai wrote: > > Hi Joe, > > AFAIK, this currently isn’t possible throu

Excessive stdout is causing java heap out of mem

2017-05-19 Thread Fritz Budiyanto
Hi, I notice that when I enabled DataStreamSink’s print() for debugging, (kinda excessive printing), its causing java Heap out of memory. Possibly the Task Manager is buffering all stdout for the WebInterface? I haven’t spent time debugging it, but I wonder if this is expected where massive pri

Re: Excessive stdout is causing java heap out of mem

2017-05-22 Thread Fritz Budiyanto
n you can do stuff like: > LOG.info("My log statement"); > Also, using a logging Framework will allow you to redirect the log contents > of your job to a separate file. > > But I'm not sure if the logging is really causing the TaskManager JVMs to die > ... > > &

Need help debugging back pressure job

2017-05-22 Thread Fritz Budiyanto
Hi All, Any tips on debugging back pressure ? I have a workload where it get stuck after it ran for a couple of hours. I assume the cause of the back pressure is the block next to the one showing as having the back pressure, is this right ? Any idea on how to get the backtrace ? (I’m using stan

Re: Excessive stdout is causing java heap out of mem

2017-05-24 Thread Fritz Budiyanto
you using? > > > > On Tue, May 23, 2017 at 7:02 AM, Fritz Budiyanto <mailto:fbudi...@icloud.com>> wrote: > Hi Robert, > > Thanks Robert, I’ll start using the logger. > > I didn’t pay attention whether the error occur when I accessed the log from > job

FileSystem vs RocksDb backend

2017-06-02 Thread Fritz Budiyanto
Hi All, If my states fit in the JVM heap, should I be using filesystem backend with HDFS instead of RocksDB ? Could someone comment on the FS vs RocksDb backend performance ? I’ve also heard that RocksDb backend is required to support delta checkpointing. Is this true ? Is delta checkpointing

Parallelism, registerEventTimeTimer and watermark problem

2017-10-17 Thread Fritz Budiyanto
Hi All, If I have high parallelism and use processFunction to registerEventTimeTimer, the timer never gets fired. After debugging, I found out the watermark isn't updated because I have keyBy right after assignTimestampsAndWatermarks. And if I set assignTimestampsAndWatermarks right after the ke

Re: Parallelism, registerEventTimeTimer and watermark problem

2017-10-17 Thread Fritz Budiyanto
treamTask.java:69) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:263) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:702) at java.lang.Thread.run(Thread.java:745) -- Fritz > On Oct 17, 2017, at 7:55 PM, Fritz Budiyanto wrote: > > Hi All, > > If I have