Finding things not seen in the last window

2017-09-29 Thread Ron Crocker
Hi - I have a colleague who is trying to write a flink job that will determine deltas from period to period. Let’s say the periods are 1 minutes. What he would like to do is report in minute 2 those things that are new since from minute 1, then in minute 3 report those things that are new also

Enriching data from external source with cache

2017-09-29 Thread Derek VerLee
My basic problem will sound familiar I think, I need to enrich incoming data using a REST call to an external system for slowly evolving metadata. and some cache based lag is acceptable, so to reduce load on the external system and to process more efficiently, I would

Re: Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
Why of course... Thank you for your time. I'll figure out where to go with Beam. From: Chesnay Schepler Sent: 29 September 2017 16:41:23 To: user@flink.apache.org Subject: Re: Repeated exceptions during system metrics registration You

Sink buffering

2017-09-29 Thread nragon
Hi, Just like mentioned at Berlin FF17, Pravega talk, can we simulate, somehow, sink buffering(pravega transactions) and coordinate them with checkpoints? My intension is to buffer records before sending them to hbase. Any opinions or tips? Thanks -- Sent from:

Re: starting query server when running flink embedded

2017-09-29 Thread Piotr Nowojski
Hi, You can take a look at how is it done in the exercises here . There are example solutions that run on a local environment. I Hope that helps :) Piotrek > On Sep 28, 2017, at 11:22 PM, Henri Heiskanen

Re: state of parallel jobs when one task fails

2017-09-29 Thread r. r.
Thanks a lot - wasn't aware of FailoverStrategy Best regards Robert > Оригинално писмо >От: Piotr Nowojski pi...@data-artisans.com >Относно: Re: state of parallel jobs when one task fails >До: "r. r." >Изпратено на: 29.09.2017 18:21 > >

Re: how many 'run -c' commands to start?

2017-09-29 Thread r. r.
Sure, here is the cmdline output: flink-1.3.2/bin/flink run -c com.corp.flink.KafkaJob quickstart.jar --topic InputQ --bootstrap.servers localhost:9092 --zookeeper.connect localhost:2181 --group.id Consumers -p 5 --detached Cluster configuration: Standalone cluster with JobManager at

Re: state of parallel jobs when one task fails

2017-09-29 Thread Piotr Nowojski
Hi, Yes, by default Flink will restart all of the tasks. I think that since Flink 1.3, you can configure a FailoverStrategy to change this behavior.

Re: How flink monitor source stream task(Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
I am still not sure what do you mean by “thread crash without throw”. If SourceFunction.run methods returns without an exception Flink assumes that it has cleanly shutdown and that there were simply no more elements to collect/create by this task. If it continue working, without throwing an

state of parallel jobs when one task fails

2017-09-29 Thread r. r.
Hello I have a simple job with a single map() processing which I want to run with many documents in parallel in Flink. What will happen if one of the 'instances' of the job fails?   This statement in Flink docs confuses me: "In case of failures, a job switches first to failing where it cancels

Re: how many 'run -c' commands to start?

2017-09-29 Thread Chesnay Schepler
The only nodes that matter are those on which the Flink processes, i.e Job- and TaskManagers , are being run. To prevent a JobManager node failure from causing the job to

Repeated exceptions during system metrics registration

2017-09-29 Thread Reinier Kip
Hi all, I'm running a Beam pipeline on Flink and sending metrics via the Graphite reporter. I get repeated exceptions on the slaves, which try to register the same metric multiple times. Jobmanager and taskmanager data is fine: I can see JVM stuff, but only one datapoint here and there for

Re: ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Tzu-Li (Gordon) Tai
Hi, I’m looking into this. Could you let us know the Flink version in which the exceptions occurred? Cheers, Gordon On 29 September 2017 at 3:11:30 PM, Federico D'Ambrosio (federico.dambro...@smartlab.ws) wrote: Hi, I'm coming across these Exceptions while running a pretty simple flink job.

ArrayIndexOutOfBoundExceptions while processing valve output watermark and while applying ReduceFunction in reducing state

2017-09-29 Thread Federico D'Ambrosio
Hi, I'm coming across these Exceptions while running a pretty simple flink job. First one: java.lang.RuntimeException: Exception occurred while processing valve output watermark: at

Re: How flink monitor source stream task(Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
Any exception thrown by your SourceFunction will be caught by Flink and that will mark a task (that was executing this SourceFuntion) as failed. If you started some custom threads in your SourceFunction, you have to manually propagate their exceptions to the SourceFunction. Piotrek > On Sep

Re: EASY Friday afternoon question: order of chained sink operator execution in a streaming task

2017-09-29 Thread Chesnay Schepler
Yes, i believe that is correct. On 29.09.2017 14:01, Martin Eden wrote: Hi all, Just a quick one. I have a task that looks like this (as printed in the logs): 17-09-29 0703510695 INFO TaskManager.info: Received task Co-Flat Map -> Process -> (Sink: sink1, Sink: sink2, Sink: sink3) (2/2)

Re: How flink monitor source stream task(Time Trigger) is running?

2017-09-29 Thread yunfan123
My source stream means the funciton implement the org.apache.flink.streaming.api.functions.source.SourceFunction. My question is how flink know all working thread is alive? If one working thread that execute the SourceFunction crash, how flink know this happenned? -- Sent from:

EASY Friday afternoon question: order of chained sink operator execution in a streaming task

2017-09-29 Thread Martin Eden
Hi all, Just a quick one. I have a task that looks like this (as printed in the logs): 17-09-29 0703510695 INFO TaskManager.info: Received task Co-Flat Map -> Process -> (Sink: sink1, Sink: sink2, Sink: sink3) (2/2) After looking a bit at the code of the streaming task I suppose the sink

Re: Clean GlobalWidnow state

2017-09-29 Thread Fabian Hueske
Thanks for creating the JIRA issue! Best, Fabian 2017-09-20 12:26 GMT+02:00 gerardg : > I have prepared a repo that reproduces the issue: > https://github.com/GerardGarcia/flink-global-window-growing-state > > Maybe this way it is easier to spot the error or we can determine

Re: How flink monitor source stream task(Time Trigger) is running?

2017-09-29 Thread Piotr Nowojski
We use Akka's DeathWatch mechanism to detect dead components. TaskManager failure shouldn’t prevent recovering from state (as long as there are enough task slots). I’m not sure if I understand what you mean by "source stream thread" crash. If is was some error during performing a checkpoint so

Re: Job Manager minimum memory hard coded to 768

2017-09-29 Thread Till Rohrmann
Hi Dan, I think Aljoscha is right and the 768 MB minimum JM memory is more of a legacy artifact which was never properly refactored. If I remember correctly, then we had problems when starting Flink in a container with a lower memory limit. Therefore this limit was introduced. But I'm actually