Flink Application on YARN failed on losing Job Manager | No recovery | Need help debug the cause from logs

2016-11-02 Thread Anchit Jatana
Hi All, I started my flink application on YARN using flink run -m yarn-cluster, after running smoothly for 20 hrs it failed. Ideally the application should have recover on losing the Job Manger (which runs in the same container as the application master) pertaining to the fault tolerant nature of

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-02 Thread Anchit Jatana
Hi Jamie, Thanks for sharing your thoughts. I'll try and integrate with Graphite to see if this gets resolved. Regards, Anchit -- View this message in context:

Re: TimelyFlatMapFunction and DataStream

2016-11-02 Thread Stephan Ewen
Hi Ken! It may not be obvious, so here is a bit of background: The timers that are used in the FlatMapFunction are scoped by key. We thought that this is how they are mainly useful - that's why you need to define keys to use them. I think the docs are in error, thanks for pointing that out. In

Re: TimelyFlatMapFunction and DataStream

2016-11-02 Thread Aljoscha Krettek
There is already an open PR for fixing those Javadoc issues (along with some other issues): https://github.com/apache/flink/pull/2715 On Wed, 2 Nov 2016 at 11:04 Stephan Ewen wrote: > Hi Ken! > > It may not be obvious, so here is a bit of background: > > The timers that are

Re: Question about the checkpoint mechanism in Flink.

2016-11-02 Thread Renjie Liu
Thanks for the reply. On Wed, Nov 2, 2016 at 5:19 PM Till Rohrmann wrote: > Yes you're right. Whenever you have multiple input channels which could > also be the case if you do a repartitioning between two mappers. > > On Tue, Nov 1, 2016 at 11:48 PM, Renjie Liu

Questions about FoldFunction and WindowFunction

2016-11-02 Thread Yassine MARZOUGUI
Hi all, I have a couple questions about FoldFunction and WindowFunction: 1. When using a RichFoldFunction after a window as in keyedStream.window().fold(new RichFoldFunction()), is the close() method called after each window or after all the windows for that key are fired? 2. When applying a

Re: Questions about FoldFunction and WindowFunction

2016-11-02 Thread Aljoscha Krettek
Hi Yassine, regarding 1. The close() method of the RichFoldFunction will only be called at the very end of your streaming job, so in practise it will never be called. This is there because of batch jobs, where you have an actual end in your processing. regarding 2. I'm afraid you came across a

Why are externalized checkpoints deleted on Job Manager exit?

2016-11-02 Thread Clifford Resnick
Testing externalized checkpoints in a YARN-based cluster, configured with: env.getCheckpointConfig.enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); I can confirm that checkpoint is retained between cancelled jobs, however it’s deleted when the Job Manager

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-11-02 Thread Manu Zhang
Thanks, that will be great. I'd like to test against my particular use cases once your PR is available. Manu On Wed, Nov 2, 2016 at 11:09 PM Ventura Del Monte wrote: > Hello, > > I have just opened the JIRA issue >

Reg. custom sink for Flink streaming

2016-11-02 Thread Sandeep Vakacharla
Hi there, I have the following use case- I have data coming from Kafka which I need to stream and write each message to a database. I’m using kafka-flink connector for streaming data from Kafka. I don’t want to use flink sinks to write date from stream. I’m doing the following which doesn’t

Re: Looping over a DataSet and accesing another DataSet

2016-11-02 Thread otherwise777
I did mean the iteratino yes, I currently solved the problem by rewriting the algorithm in gelly's GathersumApply model, thnx for the tips I had another question regarding the original message, about appending items to a list, how would I do that? Because afaik it's not possible to add a list or

Re: Kinesis Connector Dependency Problems

2016-11-02 Thread Justin Yan
Sorry it took me a little while, but I'm happy to report back that it seems to be working properly with EMR 4.8. It seems so obvious in retrospect... thanks again for the assistance! Cheers, Justin On Tue, Nov 1, 2016 at 11:44 AM, Robert Metzger wrote: > Hi Justin, > >

Link read avro from Kafka Connect Issue

2016-11-02 Thread Will Du
Hi folks, I am trying to consume avro data from Kafka in Flink. The data is produced by Kafka connect using AvroConverter. I have created a AvroDeserializationSchema.java used by Flink consumer. Then, I use following code to

Re: Accessing StateBackend snapshots outside of Flink

2016-11-02 Thread bwong247
We're currently investigating Flink, and one of the features that we'd like to have is a TTL feature to time out older values in state. I saw this thread and it sounds like the functionality was being considered. Is there any update? -- View this message in context:

Re: Questions about FoldFunction and WindowFunction

2016-11-02 Thread Aljoscha Krettek
Would you be interested in contributing a fix for that? Otherwise I'll probably fix work on that in the coming weeks. On Wed, 2 Nov 2016 at 13:38 Yassine MARZOUGUI wrote: > Thank you Aljoscha for your quick response. > > Best, > Yassine > > 2016-11-02 12:30 GMT+01:00

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-11-02 Thread Aljoscha Krettek
Hi, a contributor (Bonaventure Del Monte) has started working on this. He should open a Jira this week. Cheer, Aljoscha On Tue, 1 Nov 2016 at 23:57 aj heller wrote: Hi Manu, Aljoscha, I had been interested in implementing FLIP-2, but I haven't been able to make time for it.

Re: Questions about FoldFunction and WindowFunction

2016-11-02 Thread Yassine MARZOUGUI
Yes, with please. Could you please assign it temporarily to me? (I am not very familiar with the internal components of Flink and migh take some time before contributing the code, if by the time you are ready to work on it I am not yet done, you can reassign it to yourself) 2016-11-02 14:07

Re: Questions about FoldFunction and WindowFunction

2016-11-02 Thread Yassine MARZOUGUI
Thank you Aljoscha for your quick response. Best, Yassine 2016-11-02 12:30 GMT+01:00 Aljoscha Krettek : > Hi Yassine, > > regarding 1. The close() method of the RichFoldFunction will only be > called at the very end of your streaming job, so in practise it will never > be

Re: watermark trigger doesn't check whether element's timestamp is passed

2016-11-02 Thread Ventura Del Monte
Hello, I have just opened the JIRA issue and I have almost completed the implementation of this feature. I will keep you posted :) Cheers, Ventura This message, for the D. Lgs n. 196/2003 (Privacy Code), may contain confidential and/or

Re: Flink Metrics - InfluxDB + Grafana | Help with query influxDB query for Grafana to plot 'numRecordsIn' & 'numRecordsOut' for each operator/operation

2016-11-02 Thread Jamie Grier
Hi Anchit, That last bit is very interesting - the fact that it works fine with subtasks <= 30. It could be that either Influx or Grafana are not able to keep up with the data being produced. I would guess that the culprit is Grafana if looking at any particular subtask index works fine and

Re: TimelyFlatMapFunction and DataStream

2016-11-02 Thread Ken Krugler
Hi Stephan, > On Nov 2, 2016, at 3:04am, Stephan Ewen wrote: > > Hi Ken! > > It may not be obvious, so here is a bit of background: > > The timers that are used in the FlatMapFunction are scoped by key. We thought > that this is how they are mainly useful - that's why you

Re: Questions about FoldFunction and WindowFunction

2016-11-02 Thread Aljoscha Krettek
Hi Yassine, I made you a contributor in the Flink Jira so you will be able to assign issues to yourself in the future. I also assigned this issue to you. I think you only need to do changes in WindwedStream and AllWindowedStream. Let me know if you need anything. :-) Cheers, Aljoscha On Wed, 2

Re: NotSerializableException: jdk.nashorn.api.scripting.NashornScriptEngine

2016-11-02 Thread PedroMrChaves
Hello, I'm having the exact same problem. I'm using a filter function on a datastream. My flink version is 1.1.3. What could be the problem? Regards, Pedro Chaves. -- View this message in context:

Testing DataStreams

2016-11-02 Thread Juan Rodríguez Hortalá
Hi, I'm new to Flink, and I'm trying to write my first unit test for a simple DataStreams job. In https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/util/package-summary.html I see several promising classes, but for example I cannot import

Re: Question about the checkpoint mechanism in Flink.

2016-11-02 Thread Till Rohrmann
Yes you're right. Whenever you have multiple input channels which could also be the case if you do a repartitioning between two mappers. On Tue, Nov 1, 2016 at 11:48 PM, Renjie Liu wrote: > Hi, Till: > I think the multiple input should include the more general case

Best Practices/Advice - Execution of jobs

2016-11-02 Thread PedroMrChaves
Hello, I'm trying to build a stream event correlation engine with Flink and I have some questions regarding the for the execution of jobs. In my architecture I need to have different sources of data, lets say for instance: /firewallStream= environment.addSource([FirewalLogsSource]); proxyStream