FlinkRuntimeException: Unexpected list element deserialization failure

2020-04-09 Thread anaray
Hi flink team, I see below exception . What could be the reason of the failure ? Please share your thoughts? Caused by: org.apache.flink.util.FlinkRuntimeException: Unexpected list element deserialization failure at

Custom label for Prometheus Exporter

2020-01-22 Thread anaray
not of big help if I deploy the service in k8s or swarm. I would like associate a jobmanager with a jobname atleast in a JOB mode deployment. Please let me know if there is any way to add a custom label by cinfiguration? Thanks, anaray -- Sent from: http://apache-flink-user-mailing-list-archive.

Savepoint status check fails with error Operation not found under key

2019-06-11 Thread anaray
che.org/projects/flink/flink-docs-release-1.7/monitoring/rest_api.html#jobs-jobid-savepoints-triggerid Please let me know, if you have seen this issue before? Thanks, anaray -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk- directories

2019-06-07 Thread anaray
Hi Fabian, Thank you. Your observation is correct. The stale directories belong to the failed checkpoints. So it is related to FLINK-10855. I will closely follow FLINK-10855 and test when fix is available Thank You, anaray -- Sent from: http://apache-flink-user-mailing-list-archive.2336050

Flink 1.7.1 flink-s3-fs-hadoop-1.7.1 doesn't delete older chk- directories

2019-06-06 Thread anaray
Hi, I am using 1.7.1 and we store checkpoints in Ceph and we use flink-s3-fs-hadoop-1.7.1 to connect to Ceph. I have only 1 checkpoint retained. Issue I see is that previous/old chk- directories are still around. I verified that those older doesn't contain any checkpoint data. But the directories

Re: flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-15 Thread anaray
Thank You Andrey. Arity of the job has not changed. Here issue is that job will run for sometime (with checkpoint enabled) and then after some time will get into above exception. The job keeps restarting afterwards. One thing that I want point out here is that we have a custom *serialization

flink 1.4.2. java.lang.IllegalStateException: Could not initialize operator state backend

2019-05-14 Thread anaray
Hi, We have flink 1.4.2 in production, and we have started seeing below exception consistently. Could some help me understand the real issue happening here? I see that https://issues.apache.org/jira/browse/FLINK-8836 has fixed it, but since it needs an upgrade, we exploring workarounds or other

Flink 1.7.1 JobManager (docker) exits with status 0 - completed

2019-04-05 Thread anaray
Hi, We are using flink 1.7.1 and running as docker container. State backend is Ceph. Problem is that JobManager on startup exits with docker exit 0 (ie Completed). The only error/exception that I see is given below. Please share your thoughts. 2019-04-05 12:14:04,314 INFO

Details on Checkpointing if there are multiple/different sources grouped together.

2019-03-27 Thread anaray
Hi, Please help me to understand checkpointing if there multiple/different sources and which is grouped together. | || KAFKA TOPIC 1 | SOURCE1(DataStream1) |

Flink 1.7.1 uses Kryo version 2.24.0

2019-03-15 Thread anaray
Hi , Flink 1.7 still uses kryo-2.24.0. Is there any specific reason for not upgrading kryo? Thanks, -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Is taskmanager.heap.mb a valid configuration parameter in 1.7?

2019-03-07 Thread anaray
Hi, I see a reference about *taskmanager.heap.mb* in 1.7.1 config docs (https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html). I thought taskmanager.heap.mb got deprecated and new config is taskmanager.heap.size. Please correct me if I am wrong here. Thank you. -- Sent

Version "Unknown" - Flink 1.7.0

2019-02-01 Thread anaray
Though not a major issue. I see that Flink UI and REST api gives flink version as "UNKNOWN" I am using flink 1.7.0, with and running the cluster in JOB mode. REST endpoint /overview output

Re: Get the savepointPath of a particular savepoint

2019-01-15 Thread anaray
Dawid , Gary, Got it . Thanks -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Get the savepointPath of a particular savepoint

2019-01-13 Thread anaray
As per the 1.7.0 documentation here To start a job from a savepoint, savepointPath is required. But it not clear from where to get this savepointPath? In earlier versions we could get it from

Re: FLIP-6 Docker / Kubernetes

2018-05-07 Thread anaray
Thank You Fabian. We are looking forward for this feature, especially a way to bundle the job jar along with the container (even in taskmanagers). In production we deploy flink (1 job=1 cluster) in Docker Swarm. One of the main issue we face is related to blob download when the taskmanager fails