Re: NoClassDefFoundError for jersey-core on YARN

2018-03-29 Thread Juho Autio
Never mind, I'll post this new problem as a new thread. On Wed, Mar 28, 2018 at 6:35 PM, Juho Autio wrote: > Thank you. The YARN job was started now, but the Flink job itself is in > some bad state. > > Flink UI keeps showing status CREATED for all sub-tasks and nothing

All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Juho Autio
I built a new Flink distribution from the release-1.5 branch yesterday. The first time I tried to run a job with it, the job ended up in a stalled state: it didn't manage to (re)start, and, what makes it worse, it didn't exit as failed either. The next time I tried running the same job (but

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Nico Kruber
That BlobServerConnection error is caused by a TaskManager which requested a BLOB (a jar file) but then closed the connection. I guess that may happen when the job is cancelled and the TaskManager processes are terminated. If this is not happening during that scenario, then your TaskManager

RE: How does setMaxParallelism work

2018-03-29 Thread NEKRASSOV, ALEXEI
Is there an auto-scaling feature in Flink where I start with a parallelism of (for example) 1, but Flink notices I have a high volume of data to process and automatically increases the parallelism of a running job? Thanks, Alex -Original Message- From: Nico Kruber

Job restart hook

2018-03-29 Thread Navneeth Krishnan
Hi, Is there a way for a script to be called whenever a job gets restarted? My scenario: let's say there are 20 slots and the job runs on all 20 slots. After a while a task manager goes down, leaving only 14 slots, and I need to readjust the parallelism of my job to ensure the job runs

Anyway to read Cassandra as DataStream/DataSet in Flink?

2018-03-29 Thread James Yu
Hi, I tried to treat Cassandra as the source of data in Flink with the information provided in the following links: - https://stackoverflow.com/questions/43067681/read-data-from-cassandra-for-processing-in-flink -
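For the DataSet API there is a Cassandra input format in the flink-connector-cassandra module; the following is only a rough sketch of how it is typically wired up, with a placeholder contact point, query and column types:

import com.datastax.driver.core.Cluster;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.batch.connectors.cassandra.CassandraInputFormat;
import org.apache.flink.streaming.connectors.cassandra.ClusterBuilder;

public class CassandraBatchReadSketch {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Connection details are placeholders.
        ClusterBuilder clusterBuilder = new ClusterBuilder() {
            @Override
            protected Cluster buildCluster(Cluster.Builder builder) {
                return builder.addContactPoint("127.0.0.1").build();
            }
        };

        // CassandraInputFormat emits one Flink tuple per CQL row.
        DataSet<Tuple2<String, Long>> rows = env.createInput(
                new CassandraInputFormat<Tuple2<String, Long>>(
                        "SELECT id, value FROM my_keyspace.my_table;", clusterBuilder),
                TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {}));

        rows.print();
    }
}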

Re: Plain text SSL passwords in Log file

2018-03-29 Thread Szymon Szczypiński
Hi, I have the same problem with Flink 1.3.1 and I created JIRA https://issues.apache.org/jira/browse/FLINK-9100 (I saw that you wrote about my JIRA in your JIRA :)). For now, to avoid printing the password in the log, in logback.xml I configure that the class GlobalConfiguration is logged only to

How can I set configuration of process function from job's main?

2018-03-29 Thread Main Frame
Hi guys! I am a newbie in Flink and I probably have a silly question about the streaming API. So for instance: I am trying to apply SomeProcessFunction to stream1 … DataStream stream2 = stream1.process(new MyProcessFunction()).name("Ingest data"); … I have created a package-private class with

bad data output

2018-03-29 Thread Darshan Singh
Hi, I have a dataset which has almost 99% correct data. As of now, if some data is bad I just ignore it, log it, and return only the correct data. I do this inside a map function. The part which decides whether data is correct or not is the expensive one. Now I want to store the bad data
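(One common pattern for this, assuming the DataStream API, is a ProcessFunction with a side output for the records that fail validation; see also the split() suggestion further down this digest. The sketch below is illustrative only, with a stand-in validation check and output path:)

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class BadDataSideOutputSketch {

    // Tag for the records that fail validation; the anonymous subclass keeps the type info.
    private static final OutputTag<String> BAD_DATA = new OutputTag<String>("bad-data") {};

    public static SingleOutputStreamOperator<String> validate(DataStream<String> input) {
        return input.process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out) {
                if (isValid(value)) {            // the expensive check runs exactly once per record
                    out.collect(value);          // good records go to the main output
                } else {
                    ctx.output(BAD_DATA, value); // bad records go to the side output
                }
            }
        });
    }

    // Placeholder for the expensive validation described in the thread.
    private static boolean isValid(String value) {
        return !value.isEmpty();
    }

    public static void demoSinks(DataStream<String> input) {
        SingleOutputStreamOperator<String> good = validate(input);
        DataStream<String> bad = good.getSideOutput(BAD_DATA);
        good.print();                     // or any sink for the clean data
        bad.writeAsText("/tmp/bad-data"); // illustrative path for persisting rejects
    }
}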

Re: How can I set configuration of process function from job's main?

2018-03-29 Thread Timo Walther
Hi, the configuration parameter is just legacy API. You can simply pass any serializable object to the constructor of your process function. Regards, Timo On 29.03.18 at 20:38, Main Frame wrote: Hi guys! I am a newbie in Flink and I probably have a silly question about the streaming API. So for
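A minimal sketch of what Timo describes, with made-up settings fields: the configuration object is serialized together with the function instance in the job's main() and shipped to the workers.

import java.io.Serializable;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class ConstructorConfigSketch {

    // Any serializable settings holder works; the field here is made up.
    public static class MySettings implements Serializable {
        private static final long serialVersionUID = 1L;
        public final int threshold;
        public MySettings(int threshold) {
            this.threshold = threshold;
        }
    }

    public static class MyProcessFunction extends ProcessFunction<String, String> {
        private final MySettings settings;

        public MyProcessFunction(MySettings settings) {
            this.settings = settings; // shipped to the workers with the serialized function
        }

        @Override
        public void processElement(String value, Context ctx, Collector<String> out) {
            if (value.length() > settings.threshold) {
                out.collect(value);
            }
        }
    }

    public static DataStream<String> wire(DataStream<String> stream1) {
        // Configure from the job's main(): the object travels with the function instance.
        return stream1.process(new MyProcessFunction(new MySettings(10))).name("Ingest data");
    }
}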

Re: Out off memory when catching up

2018-03-29 Thread Lasse Nedergaard
Hi, For sure I can share more info. We run Flink 1.4.2 (but have the same problems on 1.3.2) on an AWS EMR cluster, with 6 taskmanagers on each m4.xlarge slave. Taskmanager heap is set to 1850. We use the RocksDBStateBackend. We have set akka.ask.timeout to 60 s in case GC should prevent heartbeat,

Re: cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Juho Autio
Thanks Gary. And what if I want to match the old behaviour, i.e. have the job cancelled after the savepoint has been created? Maybe I saw some optional field for that purpose that could be put into the JSON payload of the POST. But this documentation doesn't cover it:

Re: Incremental checkpointing performance

2018-03-29 Thread Stephan Ewen
I think what happens is the following: - For full checkpoints, Flink iterates asynchronously over the data. That means the whole checkpoint is a compact asynchronous operation. - For incremental checkpoints, RocksDB has to flush the write buffer and create a new SSTable. That flush is
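For context, incremental checkpoints are enabled via the RocksDB state backend constructor; a minimal sketch (checkpoint URI and interval are placeholders):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalCheckpointSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Second constructor argument enables incremental checkpoints,
        // so each checkpoint only uploads newly created SSTables.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));
        env.enableCheckpointing(60_000); // checkpoint interval in ms, illustrative

        // ... build and execute the job here ...
    }
}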

Subject: Last chance to register for Flink Forward SF (April 10). Get 25% discount

2018-03-29 Thread Stephan Ewen
Hi all! There are still some spots left to attend Flink Forward San Francisco, so sign up soon before registration closes. Use this promo code to get 25% off: MailingListFFSF The 1-day conference takes place on Tuesday, April 10 in downtown SF. We have a great lineup of speakers from companies

Re: Unable to load AWS credentials: Flink 1.2.1 + S3 + Kubernetes

2018-03-29 Thread Stephan Ewen
Using AWS credentials with Kubernetes is not trivial. Have you looked at AWS / Kubernetes docs and projects like https://github.com/jtblin/kube2iam which bridge between containers and AWS credentials? Also, Flink 1.2.1 is quite old, you may want to try a newer version. 1.4.x has a bit of an

cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Juho Autio
With a fresh build from release-1.5 branch, calling /cancel-with-savepoint fails with 404 Not Found. The snapshot docs still mention /cancel-with-savepoint: https://ci.apache.org/projects/flink/flink-docs-master/monitoring/rest_api.html#cancel-job-with-savepoint 1. How can I achieve the same

Re: cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Gary Yao
Hi Juho, Thank you for testing the Apache Flink 1.5 release. For FLIP-6 [1], the "cancel with savepoint" API was reworked. Unfortunately the FLIP-6 REST API documentation still needs to be re-generated [2][3]. Under the new API, you first issue a POST request against /jobs/:jobid/savepoints, and

Re: cancel-with-savepoint: 404 Not Found

2018-03-29 Thread Gary Yao
Hi Juho, Sorry, I should have included an example. To cancel the job: curl -XPOST host:port/jobs/:jobid/savepoints -d '{"cancel-job": true}' Let me know if it works for you. Best, Gary On Thu, Mar 29, 2018 at 10:39 AM, Juho Autio wrote: > Thanks Gary. And what if I

Re: bad data output

2018-03-29 Thread 杨力
You can use a split operator, generating 2 streams. Darshan Singh wrote on Fri, Mar 30, 2018 at 2:53 AM: > Hi > > I have a dataset which has almost 99% correct data. As of now, if some data is bad I just ignore it, log it, and return only the correct data. > I do this inside a
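A compact sketch of the suggested split()/select() approach as it existed in the DataStream API around Flink 1.4/1.5, with a stand-in validation predicate:

import java.util.Collections;
import org.apache.flink.streaming.api.collector.selector.OutputSelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SplitStream;

public class SplitBadDataSketch {

    public static void wire(DataStream<String> records) {
        // Route every record to either the "good" or the "bad" virtual output.
        SplitStream<String> split = records.split(new OutputSelector<String>() {
            @Override
            public Iterable<String> select(String value) {
                return Collections.singletonList(value.isEmpty() ? "bad" : "good");
            }
        });

        DataStream<String> good = split.select("good");
        DataStream<String> bad = split.select("bad");

        good.print();                     // continue the normal pipeline
        bad.writeAsText("/tmp/bad-data"); // persist rejects somewhere, path illustrative
    }
}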

Re: Record timestamp from kafka

2018-03-29 Thread Ben Yan
Hi, Is that what you mean? See: https://issues.apache.org/jira/browse/FLINK-8500?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel=16377145#comment-16377145

Record timestamp from kafka

2018-03-29 Thread Navneeth Krishnan
Hi, Is there a way to get the Kafka timestamp in the deserialization schema? All records are written to Kafka with a timestamp and I would like to set that timestamp on every record that is ingested. Thanks.
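The FLINK-8500 issue linked in the reply above is about exposing the timestamp inside the deserialization schema itself. As a hedged workaround sketch: with the Kafka 0.10+ consumer and event time enabled, Flink attaches the Kafka record timestamp to each element, so it can be read downstream from a ProcessFunction context (topic, servers and group id below are placeholders):

import java.util.Properties;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;
import org.apache.flink.util.Collector;

public class KafkaTimestampSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "my-group");                // placeholder

        DataStream<String> records = env.addSource(
                new FlinkKafkaConsumer010<>("my-topic", new SimpleStringSchema(), props));

        // The Kafka record timestamp is attached as the element's event timestamp,
        // so it is visible via ctx.timestamp() in a downstream ProcessFunction.
        records.process(new ProcessFunction<String, String>() {
            @Override
            public void processElement(String value, Context ctx, Collector<String> out) {
                Long kafkaTimestamp = ctx.timestamp();
                out.collect(kafkaTimestamp + ": " + value);
            }
        }).print();

        env.execute("kafka-timestamp-sketch");
    }
}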

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Juho Autio
Thanks again, Gary. It's true that I only let the job remain in the stuck state for something between 10-15 minutes. Then I shut down the cluster. But: if the restart strategy is being applied, shouldn't I have seen those messages in the job manager log? In my case it kept all quiet since ~2018-03-28

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Gary Yao
Hi Juho, The log message "Could not allocate all requires slots within timeout of 30 ms. Slots required: 20, slots allocated: 8" indicates that you do not have enough resources left in your cluster. Can you verify that after you started the job submission the YARN cluster does not reach its

Re: Plain text SSL passwords in Log file

2018-03-29 Thread Vinay Patil
Hi, If this is not part of Flink 1.5 or not handled in the latest 1.4.2 release, I can open a JIRA. Should be a small change. What do you think? Regards, Vinay Patil On Wed, Mar 28, 2018 at 4:11 PM, Vinay Patil wrote: > Hi Greg, > > I am not concerned with

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Gary Yao
Hi Juho, Thanks for the follow up. Regarding the BlobServerConnection error, Nico (cc'ed) might have an idea. Best, Gary On Thu, Mar 29, 2018 at 4:08 PM, Juho Autio wrote: > Sorry, my bad. I checked the persisted jobmanager logs and can see that > job was still being

Re: All sub-tasks stuck in CREATED state after "BlobServerConnection: GET operation failed"

2018-03-29 Thread Juho Autio
Sorry, my bad. I checked the persisted jobmanager logs and can see that the job was still being restarted at 15:31 and then at 15:36. If I hadn't terminated the cluster, I believe the Flink job / YARN app would've eventually exited as failed. On Thu, Mar 29, 2018 at 4:49 PM, Juho Autio

Re: Plain text SSL passwords in Log file

2018-03-29 Thread Vinay Patil
I have created FLINK-9111 as this is not handled in the latest code of GlobalConfiguration. Regards, Vinay Patil On Thu, Mar 29, 2018 at 8:33 AM, Vinay Patil wrote: > Hi, > > If this is not part of Flink 1.5 or not

Re: SSL config on Kubernetes - Dynamic IP

2018-03-29 Thread Edward Alexander Rojas Clavijo
Hi all, I did some tests based on the PR Christophe mentioned above, and by making a change in the NettyClient to use CanonicalHostName instead of HostAddress to identify the server, the SSL validation works!! I created a PR with this change: https://github.com/apache/flink/pull/5789
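For illustration only, a sketch of the Java-level difference the change hinges on (the address and resolved name below are hypothetical): getHostAddress() yields the bare pod IP, while getCanonicalHostName() does a reverse lookup to a stable DNS name that can be matched against the certificate even if pod IPs change.

import java.net.InetAddress;

public class HostIdentitySketch {
    public static void main(String[] args) throws Exception {
        InetAddress peer = InetAddress.getByName("10.1.2.3"); // pod IP, placeholder

        // What the client previously used to identify the server: the bare IP.
        String byAddress = peer.getHostAddress();              // "10.1.2.3"

        // What the PR switches to: the reverse-resolved canonical name.
        String byCanonicalName = peer.getCanonicalHostName();  // e.g. "taskmanager-0.flink.svc.cluster.local"

        System.out.println(byAddress + " vs " + byCanonicalName);
    }
}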