Re: Periodic actions

2016-03-03 Thread Chesnay Schepler
Could the problem be as simple as var active never being true? On 04.03.2016 03:08, shikhar wrote: I am trying to have my job also run a periodic action by using a custom source that emits a dummy element periodically and a sink that executes the callback, as shown in the code below. However as

Re: Flink packaging makes life hard for SBT fat jar's

2016-03-03 Thread shikhar
Thanks Till. I can confirm that things are looking good with RC5. sbt-assembly works well with the flink-kafka connector dependency not marked as "provided".

Re: Windows, watermarks, and late data

2016-03-03 Thread shikhar
In case this helps, this is a Scala helper I am using to filter out late data on a KeyedStream. The last timestamp state is maintained at the key-level. ``` implicit class StrictlyAscendingByTime[T, K](stream: KeyedStream[T, K]) { def filterStrictlyAscendingTime(timestampExtractor: T =>

Periodic actions

2016-03-03 Thread shikhar
I am trying to have my job also run a periodic action by using a custom source that emits a dummy element periodically and a sink that executes the callback, as shown in the code below. However as soon as I start the job and check the state in the JobManager UI this particular Sink->Source combo

Re: Windows, watermarks, and late data

2016-03-03 Thread Michael Radford
Ha, never mind, I realized I can just put the unique key into the aggregate object maintained by the FoldFunction. I'm still curious why RichWindowFunction (and RichFoldFunction) aren't supported for Scala WindowedStream.apply. Mike On Thu, Mar 3, 2016 at 4:50 PM, Michael Radford

Re: Windows, watermarks, and late data

2016-03-03 Thread Michael Radford
Thank you, that was helpful. I didn't appreciate that a Trigger is fully in control of when to fire / purge regardless of the watermark. Now I am wondering the best way to distinguish different instances of the same time window with completely different data, vs. repeated fires that include data

Re: Setting taskmanager.network.numberOfBuffers and getting errors...

2016-03-03 Thread Fabian Hueske
Hi Sourigna, you are using the formula correctly: #cores should be translated into slots per taskmanager (TM), and #machines into number of TMs. So 36 ^ 2 * 10 * 4 = 51840 appears to be right. The constant 4 refers to the total number of concurrently active full network shuffles (partitioning
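
The arithmetic in the message above can be checked with a few lines of Java. This is only a sketch of the rule of thumb as stated in the thread (slots per TM squared, times the number of TMs, times 4); the class and method names are hypothetical and are not part of Flink.

```java
/**
 * Rule-of-thumb estimate for taskmanager.network.numberOfBuffers,
 * following the formula quoted in the thread:
 *   slotsPerTaskManager^2 * taskManagers * 4
 * Class and method names are illustrative only, not a Flink API.
 */
final class NetworkBufferEstimate {

    // Constant from the thread: concurrently active full network shuffles.
    static final int ACTIVE_SHUFFLES = 4;

    static long requiredBuffers(int slotsPerTaskManager, int taskManagers) {
        if (slotsPerTaskManager <= 0 || taskManagers <= 0) {
            throw new IllegalArgumentException("both arguments must be positive");
        }
        // Widen to long before multiplying so large clusters do not overflow int.
        return (long) slotsPerTaskManager * slotsPerTaskManager
                * taskManagers * ACTIVE_SHUFFLES;
    }

    public static void main(String[] args) {
        // Values from the thread: -ys 36 (slots per TM), -yn 10 (TMs).
        System.out.println(requiredBuffers(36, 10)); // 36 * 36 * 10 * 4 = 51840
    }
}
```

With the values from the submission command (36 slots per TM, 10 TMs) this reproduces the 51840 figure discussed in the thread.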

Re: java.lang.NoClassDefFoundError for Keys Class

2016-03-03 Thread Stephan Ewen
Hi! You are using some outdated dependencies. Now, all Scala-dependent Flink artifacts include the Scala version suffix (following common conventions). Please substitute: - flink-streaming-java --> flink-streaming-java_2.10 - flink-clients --> flink-clients_2.10 - flink-connector-kafka -->

Setting taskmanager.network.numberOfBuffers and getting errors...

2016-03-03 Thread Sourigna Phetsarath
All: I'm running a Flink 0.10.2 App by submitting to YARN as an application. I'm using an AWS EMR cluster of 1 Master and 10 d2.8xlarge. When I submit the job using: bin/flink run \ -m yarn-cluster \ -yjm 20480 \ -yn 10 \ -ytm 80960 \ -ys 36 \ -yD

Re: Compilation error with Scala case class with private constructor

2016-03-03 Thread Aljoscha Krettek
Yes Stephan, you’re spot on. We generate code for serialization/deserialization of the types and in there it creates code like this: new YourCaseClass(field1, field2, …) to create the value that is given to the user function. If the constructor is private compilation of that generated code

java.lang.NoClassDefFoundError for Keys Class

2016-03-03 Thread Madhire, Naveen
Hey All, I am getting the below error while executing a simple Kafka-Flink Application. java.lang.NoClassDefFoundError: org/apache/flink/api/java/operators/Keys Below are the maven dependencies which I included in my application. org.apache.kafka kafka_2.9.1 0.8.2.0

Re: Compilation error with Scala case class with private constructor

2016-03-03 Thread Stephan Ewen
Hi! My guess is that this error is indirectly reporting that no TypeInformation could be generated for the case class. That TypeInformation is generated using macros during program compilation. The generated TypeInformation will contain a partially code-generated serializer including the code to

Compilation error with Scala case class with private constructor

2016-03-03 Thread Andrew Whitaker
Hi, I've run up against a compilation error involving a case class with a private constructor: [error] /Users/anwhitaker/code/flink-fold-issue/src/main/scala/TestApp.scala:18: could not find implicit value for evidence parameter of type

Re: Flink CEP Pattern Matching

2016-03-03 Thread Vitor Vieira
I believe that most of the functionality regarding transformation and projection of windowed events will only be implemented in the next releases. I'm looking forward to contributing! 2016-03-03 15:29 GMT-03:00 Jerry Lam : > Hi Till, > > The idea of having CEP functionalities

Attempting to refactor Kafka Example using Flink 0.10.2 with Scala 2.11

2016-03-03 Thread Prez Cannady
I’ve forked and am now experimenting with Robert Metzler’s kafka-example. https://github.com/OCExercise/kafka-example Works fine from the vanilla fork (on the master branch). I performed my changes on branch enerscore-2.11, which includes: 1. Going

Re: Flink CEP Pattern Matching

2016-03-03 Thread Jerry Lam
Hi Till, The idea of having CEP functionalities in Flink is very exciting. I really appreciate your work on this. Will you consider in the future adding the similar functionalities described in this standard (

Re: Flink Streaming ContinuousTimeTriggers

2016-03-03 Thread Aljoscha Krettek
Hi, sorry for the long wait, we are currently very busy with finalizing the 1.0 release. I think I don’t yet completely get your use case. Could you maybe give some example input data and what the expected behavior would be? And maybe some code? If you already have coded something. Cheers,

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-03 Thread Maximilian Bode
Hi Aljoscha, thank you for the fast answer. The files in HDFS change as follows: -before task manager is killed: [user@host user]$ hdfs dfs -ls -R /hdfs/dir/outbound -rw-r--r-- 2 user hadoop 2435461 2016-03-03 14:51 /hdfs/dir/outbound/_part-0-0.in-progress -rw-r--r-- 2 user hadoop

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-03 Thread Aljoscha Krettek
Hi, did you check whether there are any files at your specified HDFS output location? If yes, which files are there? Cheers, Aljoscha > On 03 Mar 2016, at 14:29, Maximilian Bode wrote: > > Just for the sake of completeness: this also happens when killing a task >

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Till Rohrmann
I've created an issue [1] and opened a PR [2] to fix the issue. [1] https://issues.apache.org/jira/browse/FLINK-3570 [2] https://github.com/apache/flink/pull/1758 Cheers, Till On Thu, Mar 3, 2016 at 12:33 PM, Maximilian Bode < maximilian.b...@tngtech.com> wrote: > Hi Ufuk, Till and Stephan, >

Re: Jobmanager HA with Rolling Sink in HDFS

2016-03-03 Thread Maximilian Bode
Just for the sake of completeness: this also happens when killing a task manager and is therefore probably unrelated to job manager HA. > On 03.03.2016 at 14:17, Maximilian Bode wrote: > > Hi everyone, > > unfortunately, I am running into another problem trying

Jobmanager HA with Rolling Sink in HDFS

2016-03-03 Thread Maximilian Bode
Hi everyone, unfortunately, I am running into another problem trying to establish exactly once guarantees (Kafka -> Flink 1.0.0-rc3 -> HDFS). When using RollingSink> sink = new

Re: Flink CEP Pattern Matching

2016-03-03 Thread Vitor Vieira
Hi Till, I don't know if the windowing package should provide functions to operate on the internal elements. What is the easiest way, or is it possible to get, for example, the last event of a window, let's say a 5 second window? Rgds, Vitor Vieira @notvitor 2016-03-03 7:29 GMT-03:00 Till Rohrmann

Re: Kafka issue

2016-03-03 Thread Márton Balassi
I have inspected mvn dependency:tree in the meantime; the Maven build fortunately looks healthy. It seems my IntelliJ is very keen on the freshly acquired dependencies it has gathered recently for Scala 2.11. On Thu, Mar 3, 2016 at 1:04 PM, Márton Balassi

Re: Kafka issue

2016-03-03 Thread Márton Balassi
Hey guys, I have run into the same issue when developing against the master. Now, after Max's commit supposedly fixing the issue, reimporting the project gives me all the dependencies for 2.10, except for scala-compiler and scala-reflect, which come in version 2.11. It seems very weird. Do you

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Maximilian Bode
Hi Ufuk, Till and Stephan, Yes, that is what we observed. The primary hostname, i.e. the one returned by the unix hostname command, is in fact bound to the eth0 interface, whereas Flink uses the eth1 interface (pertaining to another hostname). Changing akka.lookup.timeout to 100 s seems to fix

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Till Rohrmann
No I don't think this behaviour has been introduced by HA. That is the default behaviour we used for a long time. If you think we should still change it, then I can open an issue for it. On Thu, Mar 3, 2016 at 12:20 PM, Stephan Ewen wrote: > Okay, that is a change from the

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Stephan Ewen
Okay, that is a change from the original behavior, introduced in HA. Originally, if the connection attempts failed, it always returned the InetAddress.getLocalHost() interface. I think we should change it back to that, because that interface is by far the best possible heuristic. On Thu, Mar 3,

Re: Flink CEP Pattern Matching

2016-03-03 Thread Till Rohrmann
Hi Jerry, at the moment it is not yet possible to access previous elements in the filter function of an individual element. Therefore, you have to check for the condition “B is 5 days after A” in the final select statement. Giving this context to the where clause would be indeed a nice addition

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Stephan Ewen
If the TaskManager cannot connect to the JobManager, it will use the interface that is bound to the machine's host name ("InetAddress.getLocalHost()"). So, the best way to fix this would be to make sure that all machines have a proper network configuration. Then Flink would either use an address

Re: YARN JobManager HA using wrong network interface

2016-03-03 Thread Ufuk Celebi
Hey Max! For the first WARN in org.apache.flink.runtime.webmonitor.JobManagerRetriever: this is expected if the new leader has not updated ZooKeeper yet. The important thing is that the new leading job manager is eventually retrieved. This did happen, right? Regarding eth1 vs. eth0: After the

Re: Kafka issue

2016-03-03 Thread Gyula Fóra
This problem is kind of the other way around, as our 2.10 build has 2.11 dependencies pulled in by Kafka. But let's see what happens. :) Gyula Till Rohrmann wrote (on Thu, Mar 3, 2016, 9:46): > Hi Gyula, > > we discovered yesterday that our build process for

Re: Kafka issue

2016-03-03 Thread Till Rohrmann
Hi Gyula, we discovered yesterday that our build process for Scala 2.11 is broken for the Kafka connector. The reason is that a property value is not properly resolved and thus pulls in the 2.10 Kafka dependencies. Max already opened a PR to fix this problem. I hope this will also solve your

Re: Kafka issue

2016-03-03 Thread Gyula Fóra
Hey, do we have any idea why this is happening in the snapshot repo? We have run into the same issue again... Cheers, Gyula Gyula Fóra wrote (on Fri, Feb 26, 2016, 11:17): > Thanks Robert, so apparently the snapshot version was screwed up somehow > and included

YARN JobManager HA using wrong network interface

2016-03-03 Thread Maximilian Bode
Hi everyone, we are trying to get JobManager HA to work in the context of a per-job YARN session using the 1.0.0-rc3 from a few days ago and are having a problem concerning task managers with several network interfaces. After manually killing the job manager process, the jobmanager.log on the