Can checkpoints be used to migrate jobs between Flink versions?

2019-01-09 Thread Edward Rojas
Hello, For upgrading jobs between Flink versions I follow the guide in the doc here: https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/upgrading.html#upgrading-the-flink-framework-version It states that we should always use savepoints for this procedure; I followed it and it works
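The savepoint-based upgrade procedure the guide describes can be sketched with the CLI (the job ID, savepoint path, and jar name below are placeholders, not values from the thread):

```shell
# Trigger a savepoint for the running job (find the job ID with `flink list`)
flink savepoint <jobId> hdfs:///flink/savepoints

# Cancel the job, upgrade the Flink framework, then resume from the savepoint
flink cancel <jobId>
flink run -s hdfs:///flink/savepoints/savepoint-<id> -d my-job.jar
```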

Re: How to migrate Kafka Producer?

2019-01-07 Thread Edward Rojas
Hi Piotr, Thank you for looking into this. Do you have an idea when the next version (1.7.2) will be available? Also, could you validate / invalidate the approach I proposed in the previous comment? Edward Rojas wrote > Regarding the kafka producer I am just updating the

How to migrate Kafka Producer?

2018-12-18 Thread Edward Rojas
Hi, I'm planning to migrate from the kafka connector 0.11 to the new universal kafka connector (1.0.0+) but I'm having some trouble. The kafka consumer seems to be compatible, but when trying to migrate the kafka producer I get an incompatibility error for the state migration. It looks like the
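For reference, a minimal sketch of a producer built against the universal connector; the topic, bootstrap servers, and `stream` variable are placeholders, and this shows only the API shape, not a fix for the state-migration error itself:

```java
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "kafka:9092");

// Universal connector: FlinkKafkaProducer without a version suffix,
// unlike the 0.11-specific FlinkKafkaProducer011
FlinkKafkaProducer<String> producer =
    new FlinkKafkaProducer<>("output-topic", new SimpleStringSchema(), props);
stream.addSink(producer);
```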

BucketingSink vs StreamingFileSink

2018-11-16 Thread Edward Rojas
Hello, We are currently using Flink 1.5 and we use the BucketingSink to save the result of job processing to HDFS. The data is in JSON format and we store one object per line in the resulting files. We are planning to upgrade to Flink 1.6 and we see that there is this new StreamingFileSink,
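A minimal row-format StreamingFileSink for line-delimited JSON strings might look like this (the output path and `jsonStream` variable are placeholders):

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Row format writes one encoded record per line, matching the
// one-JSON-object-per-line layout used with BucketingSink
StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path("hdfs:///data/output"),
                  new SimpleStringEncoder<String>("UTF-8"))
    .build();
jsonStream.addSink(sink);
```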

After OutOfMemoryError state cannot be read

2018-09-06 Thread Edward Rojas
Hello all, We are running Flink 1.5.3 on Kubernetes with RocksDB as statebackend. When performing some load testing we got an "OutOfMemoryError: native memory exhausted", causing the job to fail and be restarted. After the Taskmanager is restarted, the job is recovered from a Checkpoint, but it
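A RocksDB state backend setup like the one described is typically configured along these lines in flink-conf.yaml (the checkpoint path is a placeholder):

```yaml
state.backend: rocksdb
state.checkpoints.dir: hdfs:///flink/checkpoints
# Incremental checkpoints reduce checkpoint size for large RocksDB state
state.backend.incremental: true
```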

Re: flink list -r shows CANCELED jobs - Flink 1.5

2018-05-17 Thread Edward Rojas
I forgot to add an example of the execution:
$ ./bin/flink list -r
Waiting for response...
-- Running/Restarting Jobs ---
17.05.2018 19:34:31 : edec969d6f9609455f9c42443b26d688 : FlinkAvgJob (CANCELED)
17.05.2018 19:36:01 : bd87ffc35e1521806928d6251990d715 :

flink list -r shows CANCELED jobs - Flink 1.5

2018-05-17 Thread Edward Rojas
Hello all, On Flink 1.5, the CLI returns CANCELED jobs when requesting only the running jobs by using the -r flag... is this intended behavior? On 1.4, CANCELED jobs do not appear when running this command.

Dynamically deleting kafka topics does not remove partitions from kafkaConsumer

2018-05-04 Thread Edward Rojas
Hello all, We have a kafka consumer listening to a topic pattern "topic-*" with a partition discovery interval. We eventually add new topics and this is working perfectly: the consumer discovers the new topics (and partitions) and listens to them. But we also remove topics eventually and in this
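The setup described, a topic-pattern subscription with partition discovery enabled, can be sketched as follows (the bootstrap servers and discovery interval are placeholders):

```java
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer011;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "kafka:9092");
// Enables periodic discovery of new topics/partitions matching the pattern
props.setProperty("flink.partition-discovery.interval-millis", "30000");

FlinkKafkaConsumer011<String> consumer = new FlinkKafkaConsumer011<>(
    Pattern.compile("topic-.*"), new SimpleStringSchema(), props);
```

Note that discovery only adds newly matching partitions; as the thread reports, partitions of deleted topics are not removed from the consumer.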

Regression: On Flink 1.5 CLI -m,--jobmanager option not working

2018-04-27 Thread Edward Rojas
I'm preparing to migrate my environment from Flink 1.4 to 1.5 and I found this issue. Every time I try to use the flink CLI with the -m option to specify the jobmanager address, the CLI gets stuck on "Waiting for response..." and I get the following error on the Jobmanager: WARN

Re: Migration to Flip6 Kubernetes

2018-03-21 Thread Edward Rojas
Hi Till, Thanks for the information. We are using the session cluster and it is working quite well, but we would like to benefit from the new per-job mode approach in order to have better control over the jobs running on the cluster. I took a look at the YarnJobClusterEntrypoint and

Migration to Flip6 Kubernetes

2018-03-15 Thread Edward Rojas
Hello, Currently I have a Flink 1.4 cluster running on kubernetes based on the configuration described in https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/kubernetes.html with additional config for HA with Zookeeper. With this I have several Taskmanagers, a single

Secured communication with Zookeeper without Kerberos

2018-02-20 Thread Edward Rojas
Hi, I'm setting up a Flink cluster on kubernetes with High Availability using Zookeeper. It's working well so far without the security configuration for zookeeper. I need secured communication between Flink and zookeeper, but I'd like to avoid the need to set up a Kerberos server only for
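For context, a typical Zookeeper HA configuration in flink-conf.yaml looks like this (quorum hosts and paths are placeholders); by default this traffic is not authenticated, which is the gap the question is about:

```yaml
high-availability: zookeeper
high-availability.zookeeper.quorum: zk-0:2181,zk-1:2181,zk-2:2181
high-availability.storageDir: hdfs:///flink/ha
high-availability.zookeeper.path.root: /flink
```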

RE: S3 for state backend in Flink 1.4.0

2018-02-01 Thread Edward Rojas
Hi Hayden, It seems like a good alternative, but I see it's intended to work with Spark; did you manage to get it working with Flink? I ran some tests but I get several errors when trying to create a file, either for checkpointing or for saving data. Thanks in advance, Regards, Edward

Re: S3 for state backend in Flink 1.4.0

2018-01-31 Thread Edward Rojas
Hi Aljoscha, Thinking a little bit more about this: although IBM Object Storage is compatible with Amazon's S3, it's not an eventually consistent file system, but rather immediately consistent. So we won't need the support for eventually consistent FS for our use case to work, but we would only

Re: S3 for state backend in Flink 1.4.0

2018-01-31 Thread Edward Rojas
Thanks Aljoscha. That makes sense. Do you have a more specific date for the changes on BucketingSink and/or the PR to be released? -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

RE: S3 for state backend in Flink 1.4.0

2018-01-31 Thread Edward Rojas
Hi, We are having a similar problem when trying to use Flink 1.4.0 with IBM Object Storage for reading and writing data. We followed https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/aws.html and the suggestion in https://issues.apache.org/jira/browse/FLINK-851. We put
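For an S3-compatible store accessed through Hadoop's s3a filesystem, the endpoint and credentials are usually set in core-site.xml along these lines (the endpoint and keys below are placeholders, not actual IBM Object Storage values):

```xml
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>s3.example-objectstorage.net</value>
  </property>
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```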