Flink job failing due to "Container is running beyond physical memory limits" error.

2018-11-23 Thread Gagan Agrawal
Hi, I am running a Flink job on YARN. It ran fine so far (4-5 days) but has now started failing with the following errors. 2018-11-24 03:46:21,029 INFO org.apache.flink.yarn.YarnResourceManager - Closing TaskExecutor connection container_1542008917197_0038_01_06 because:
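When YARN kills a container for exceeding its physical memory limit, the usual first step is to leave more headroom for off-heap and native allocations inside the container. A hedged flink-conf.yaml sketch (values are placeholders, not recommendations from this thread):

```yaml
# flink-conf.yaml — placeholder values, tune to your YARN container size
taskmanager.heap.size: 4096m          # JVM heap carved out of the container
containerized.heap-cutoff-ratio: 0.25 # fraction reserved for off-heap/native memory
```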

Re: how to override s3 key config in flink job

2018-11-23 Thread Tony Wei
Hi, is there anyone who can answer me? Thanks, Tony Wei. Tony Wei wrote on Tue, Nov 20, 2018 at 7:39 PM: > Hi, > > Is there any way to provide s3.access-key and s3.secret-key in the flink > application, instead of setting > them in flink-conf.yaml? > > In our use case, my team provides a flink standalone cluster

Re: OutOfMemoryError while doing join operation in flink

2018-11-23 Thread Ken Krugler
Hi Akshay, I don’t know much about the Beam/Flink integration, but I’m curious why you have a single record that would contain all 8M records with the same key. E.g. the code for your simple “group by” test would be interesting. — Ken > On Nov 22, 2018, at 10:54 AM, Akshay Mendole wrote: >

CEP Dynamic Patterns

2018-11-23 Thread Steve Bistline
Have dynamic patterns been introduced yet? Steve

Re: Multiple env.execute() into one Flink batch job

2018-11-23 Thread Flavio Pompermaier
We solved this issue (reading the value of an accumulator) by calling a REST endpoint after the job ends, in order to store the value associated with the accumulator in some database. This is quite awful but I didn't find any better solution. This is the code that runs the job (of course it's not
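For context, when a job is submitted attached, accumulator values are available directly from the JobExecutionResult on the client; the REST workaround above targets deployments where that result never reaches the caller. A hedged sketch (the accumulator name is hypothetical):

```java
import org.apache.flink.api.common.JobExecutionResult;
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// ... define the DataSet topology, registering an accumulator in an operator ...

// Attached execution blocks until the job finishes and returns its result,
// from which accumulator values can be read directly on the client.
JobExecutionResult result = env.execute("my-job");
Long recordCount = result.getAccumulatorResult("record-count"); // hypothetical name
```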

Re: Multiple env.execute() into one Flink batch job

2018-11-23 Thread bastien dine
Oh god, so if we have some code using an Accumulator after the env.execute(), it will not be executed on the JobManager either? Thanks, I would be interested indeed! -- Bastien DINE Data Architect / Software Engineer / Sysadmin bastiendine.io Le ven. 23 nov. 2018 à 16:37, Flavio

Re: Multiple env.execute() into one Flink batch job

2018-11-23 Thread Flavio Pompermaier
The problem is that the REST API blocks on env.execute(). If you want to run your Flink job you have to submit it using the CLI client. As a workaround we wrote a Spring REST API that, to run a job, opens an SSH connection to the job manager and executes the bin/flink run command. If you're interested
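The command that workaround executes over SSH looks roughly like this (jar path and entry class are placeholders; `-d` submits detached, so the SSH call returns without waiting for the job to finish):

```shell
# Submit detached so the caller does not block on job completion.
bin/flink run -d -c com.example.MyJob /path/to/my-job.jar
```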

Multiple env.execute() into one Flink batch job

2018-11-23 Thread bastien dine
Hello, I need to chain processing in the DataSet API, so I am launching several jobs with multiple env.execute() calls: topology1.define(); env.execute(); topology2.define(); env.execute(); This works fine when I run it within IntelliJ, but when I deploy it to my cluster, it only launches
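A sketch of the pattern described, annotated with the behavior reported later in the thread (topology1/topology2 are the poster's own names):

```java
import org.apache.flink.api.java.ExecutionEnvironment;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

topology1.define();   // builds the first DataSet topology on env
env.execute("job-1"); // blocks until job 1 finishes when run attached (e.g. in the IDE)

topology2.define();   // builds the second topology
env.execute("job-2"); // when the program is submitted detached to a cluster,
                      // client code after the first execute() may never run
```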

Re: [flink-cep] Flink CEP support for the group By operator

2018-11-23 Thread Piotr Nowojski
Hi, Yes, sure. Just use CEP on top of KeyedStream. Take a look at `keyBy`: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/ Piotrek > On 23 Nov 2018, at 16:04, Spico Florin wrote: > > Hello! > > I'm using Flink 1.4.2 and I would like to use a group by
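A minimal sketch of Piotrek's suggestion — key the stream first, then apply the pattern per key — assuming a hypothetical Event POJO with getKey()/getTemperature() (Flink 1.4-era CEP API):

```java
import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.datastream.DataStream;

DataStream<Event> events = ...; // your source

// The pattern itself is unchanged; keyBy() makes CEP evaluate it
// independently for each key, giving the "group by" semantics.
Pattern<Event, ?> pattern = Pattern.<Event>begin("start")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) {
                return e.getTemperature() > 50; // hypothetical condition
            }
        });

PatternStream<Event> matches = CEP.pattern(events.keyBy(e -> e.getKey()), pattern);
```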

[flink-cep] Flink CEP support for the group By operator

2018-11-23 Thread Spico Florin
Hello! I'm using Flink 1.4.2 and I would like to use a group by operator based on a value of my event stream. The functionality I would like to achieve is similar to the following Esper EPL (excerpt

Re: Call batch job in streaming context?

2018-11-23 Thread bastien dine
Hi Eric, you can run a job from another one using the REST API. This is the only way we have found to launch a batch job from a streaming job. -- Bastien DINE Data Architect / Software Engineer / Sysadmin bastiendine.io Le ven. 23 nov. 2018 à 11:52, Piotr Nowojski a écrit : >
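A hedged shell sketch of triggering a job through Flink's REST API (host, port, jar path, jar id, and entry class are all placeholders):

```shell
# Upload the batch job's jar once; the JSON response contains the jar id.
curl -X POST -F "jarfile=@/path/to/batch-job.jar" \
    http://jobmanager:8081/jars/upload

# Later, trigger a run of the uploaded jar (e.g. from the streaming job).
curl -X POST \
    "http://jobmanager:8081/jars/<jar-id>/run?entry-class=com.example.BatchJob"
```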

Re: error while joining two datastream

2018-11-23 Thread Piotr Nowojski
Hi, I assume that withTimestampsAndWatermarks1.print(); and withTimestampsAndWatermarks2.print(); actually print what you expected? If so, the problem might be that: a) time/watermarks are not progressing (watermarks are triggering the output of your
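If stalled watermarks turn out to be the issue, a typical assigner for bounded out-of-orderness looks like this (Event and its timestamp getter are hypothetical):

```java
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

// Watermarks trail the max seen event time by 5 seconds; they only
// advance when new elements arrive, so an idle source stalls them.
DataStream<Event> withTimestampsAndWatermarks1 = stream1
        .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(Event e) {
                        return e.getEventTimeMillis(); // hypothetical getter
                    }
                });
```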

Re: Call batch job in streaming context?

2018-11-23 Thread Piotr Nowojski
Hi, I’m not sure if I understand your problem and your context, but spawning a batch job every 45 seconds doesn’t sound like that bad an idea (as long as the job is short). Another idea would be to incorporate this batch job inside your streaming job, for example by reading from Cassandra using
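One way to fold the per-event Cassandra reads into the streaming job (rather than spawning a batch job) is Flink's async I/O. A hedged sketch, where CassandraLookupFunction is a hypothetical AsyncFunction issuing the queries through the Cassandra driver's async API:

```java
import java.util.concurrent.TimeUnit;
import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;

// For each incoming event, CassandraLookupFunction fetches the matching
// rows asynchronously and emits the enriched result downstream.
DataStream<Enriched> enriched = AsyncDataStream.unorderedWait(
        events,
        new CassandraLookupFunction(), // hypothetical AsyncFunction<Event, Enriched>
        30, TimeUnit.SECONDS,          // per-request timeout
        100);                          // max in-flight requests
```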

Call batch job in streaming context?

2018-11-23 Thread eric hoffmann
Hi, is it possible to call a batch job from a streaming context? What I want to do is: for a given input event, fetch Cassandra elements based on event data, apply a transformation on them, and apply a ranking when all elements fetched from Cassandra are processed. If I do this in batch mode I would have to