Re: flink shaded jar in yarn

2016-08-10 Thread Janardhan Reddy
sorry my bad, i was using some other version. On Thu, Aug 11, 2016 at 4:47 AM, Janardhan Reddy < janardhan.re...@olacabs.com> wrote: > Hi, > > the flink-dist_2.11-1.0.0.jar jar present in lib folder has unshaded > version of guava. > http://www.apache.org/dyn/closer.lua/flink/flink-1.1.0/ >

flink shaded jar in yarn

2016-08-10 Thread Janardhan Reddy
Hi, the flink-dist_2.11-1.0.0.jar jar present in lib folder has unshaded version of guava. http://www.apache.org/dyn/closer.lua/flink/flink-1.1.0/flink-1.1.0-bin-hadoop27-scala_2.11.tgz Can we get a shaded version of the above jar.

Re: Multiple Partitions (Source Functions) -> Event Time -> Watermarks -> Trigger

2016-08-10 Thread Sameer W
Sorry for replying to my own messages but this is super confusing and logical at the same time to me :-). If I have Kafka Topic with 10 partitions. If I partition by device id when I write to the Topic, and use Event Time, my pipeline freezes (if fewer than 10 devices are active initially).

Re: Flink : CEP processing

2016-08-10 Thread Sameer W
Mans, I think at this time we need someone who knows the internal implementation to answer definitively- My understanding is- 1. Internally CEP is like a map operator with session-like semantics operating in a pipeline. You could do what it does but you would have to implement all that. If you

Within interval for CEP - Wall Clock based or Event Timestamp based?

2016-08-10 Thread Sameer W
Hi, I am using EventTime but when the records get into the CEP PatternStream does the WITHIN interval refer to the wall clock time or the timestamps embedded in the event stream? If I provide WITHIN(Time.Seconds(5)) and in processing time I am getting events with timestamps in the range of 10

Get minimum or maximum value from a Dataset

2016-08-10 Thread Punit Naik
Hi I have a dataset like this: val x : Dataset[Long]… I wanted to get the minimum or the maximum Long value. How do I do it?

Re: flink no class found error

2016-08-10 Thread Janardhan Reddy
We don't use guava directly, we use another library which uses guava internally? How do we use shade plugin in this case. On Thu, Aug 11, 2016 at 1:37 AM, Janardhan Reddy < janardhan.re...@olacabs.com> wrote: > I have cross checked that all our yarn nodes have 1.8 java installed but > still we

Re: Firing windows multiple times

2016-08-10 Thread Shannon Carey
Hi Aljoscha, Yes, I am using an Evictor, and I think I have seen the problem you are referring to. However, that's not what I'm talking about. If you re-read my first email, the main point is the following: if users desire updates more frequently than window watermarks are reached, then window

Re: flink no class found error

2016-08-10 Thread Janardhan Reddy
I have cross checked that all our yarn nodes have 1.8 java installed but still we are getting the error : Unsupported major.minor version 52.0 On Thu, Aug 11, 2016 at 1:35 AM, Janardhan Reddy < janardhan.re...@olacabs.com> wrote: > can you please explain a bit more about last option. We are

Re: flink no class found error

2016-08-10 Thread Janardhan Reddy
can you please explain a bit more about last option. We are using yarn so guava might be in some classpath. On Thu, Aug 11, 2016 at 1:29 AM, Robert Metzger wrote: > Can you check if the jar you are submitting to the cluster contains a > different Guava than you use at

flink - Working with State example

2016-08-10 Thread Ramanan, Buvana (Nokia - US)
Hello, I am utilizing the code snippet in: https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state.html and particularly ‘open’ function in my code: @Override public void open(Configuration config) { ValueStateDescriptor> descriptor =

Re: flink no class found error

2016-08-10 Thread Robert Metzger
Can you check if the jar you are submitting to the cluster contains a different Guava than you use at compile time? Also, it might happen that Guava is in your classpath, for example one some YARN setups. The last resort to resolve these issues is to use the maven-shade-plugin and relocated the

Re: flink no class found error

2016-08-10 Thread Janardhan Reddy
#1 is thrown from user code. We use hadoop 2.7 which uses gauva 11.2 but our application uses 18.0. I think the hadoop's gauva is getting picked up instead of ours On Thu, Aug 11, 2016 at 1:24 AM, Robert Metzger wrote: > Hi Janardhan, > > #1 Is the exception thrown from

Re: flink no class found error

2016-08-10 Thread Robert Metzger
Hi Janardhan, #1 Is the exception thrown from your user code, or from Flink? #2 is most likely caused due to a compiler / runtime version mismatch: http://stackoverflow.com/questions/10382929/how-to-fix-java-lang-unsupportedclassversionerror-unsupported-major-minor-versi You compiled the code

flink no class found error

2016-08-10 Thread Janardhan Reddy
Hi, We are getting the following error on submitting the flink jobs to the cluster. 1. Caused by: java.lang.NoSuchMethodError: com.google.common.io.Resources.asCharSource 2. This is for entirely different job Caused by: java.lang.UnsupportedClassVersionError: com/olacabs/fabric/common/Metadata

Re: Multiple Partitions (Source Functions) -> Event Time -> Watermarks -> Trigger

2016-08-10 Thread Sameer W
And this is happening in my local environment. As soon as I set the parallelism to 1 it all works fine. Sameer On Wed, Aug 10, 2016 at 3:11 PM, Sameer W wrote: > Hi, > > I am noticing this behavior with Event Time processing- > > I have a Kafka topic with 10 partitions.

Multiple Partitions (Source Functions) -> Event Time -> Watermarks -> Trigger

2016-08-10 Thread Sameer W
Hi, I am noticing this behavior with Event Time processing- I have a Kafka topic with 10 partitions. Each Event Source sends data to any one of the partitions. Say I have only 1 event source active at this moment, which means only one partition is receiving data. None of my windows will fire

Re: Flink : CEP processing

2016-08-10 Thread M Singh
Thanks for the pointers Sameer. The reason I wanted to find out about snapshotting with CEP is because I thought that CEP state might also be snapshotted for recovery. If that is the case, then there are events in the CEP might be in two snapshots. Mans On Tuesday, August 9, 2016 1:15 PM,

Re: Firing windows multiple times

2016-08-10 Thread Vishnu Viswanath
Hi Aljoscha, This looks like the bug that we discussed, as part of Enhance window evictor JIRA Thanks, Vishnu On Wed, Aug 10, 2016 at 1:18 PM, Aljoscha Krettek wrote: > Hi, > from your mail I'm gathering that you are in fact using an Evictor, is > that correct? If not,

Re: Firing windows multiple times

2016-08-10 Thread Aljoscha Krettek
Hi, from your mail I'm gathering that you are in fact using an Evictor, is that correct? If not, then the window operator should not keep all the elements ever received for a window but only the aggregated result. Side note, there seems to be a bug in EvictingWindowOperator that causes evicted

Re: Firing windows multiple times

2016-08-10 Thread Shannon Carey
One unfortunate aspect of using a fold() instead of a window is that the fold function has no knowledge of the watermarks. As a result, it is difficult to ensure that only items before the current watermark are included in the aggregation, and that old items are evicted correctly. This fact

Firing windows multiple times

2016-08-10 Thread Shannon Carey
I recently noticed something about windows: they retain (in state) every element that they receive regardless of whether the user provides a fold/reduce function. I can tell that such an approach is necessary in order for evictors to work, but I'm not sure if there are other reasons. I'll

Re: Flink 1.1.0 : Hadoop 1/2 compatibility mismatch

2016-08-10 Thread Shannon Carey
Works for me, thanks! -Shannon

Re: Release notes 1.1.0?

2016-08-10 Thread Stephan Ewen
Hi! In the above example the keySelector would run once before and once inside the window operator. In that sense, the version below is a better way to do it. You can also create windows of 50 or max 100 ms by writing your own trigger. Have a look at the count trigger. You can augment it by

Re: Release notes 1.1.0?

2016-08-10 Thread Andrew Ge Wu
Hi Stephan Thanks for the explanation! We will stick to 1.0.3 to keep our code clean. In the workaround case, how does key selector instantiated? One instance per window operator? By the way is there a way to create a hybrid window of count and time, like 50 items or max process time 100ms?

Re: Release notes 1.1.0?

2016-08-10 Thread Stephan Ewen
Hi Andrew! Here is the reason for what is happening with your job: You have used some sort of undocumented and unofficial corner case behavior of Flink 1.0.0, namely, using parallel windowAll(). Initially, windowAll() was supposed to not be parallel, but the system did not prevent to set a

Re: Release notes 1.1.0?

2016-08-10 Thread Andrew Ge Wu
Hi Aljoscha We are not using state backend explicitly, recovery and state backend are pointed to file path. See attached json file Thanks for the help. Best regards Andrew > On 10 Aug 2016, at 11:38, Aljoscha Krettek wrote: > > Oh, are you by any chance specifying a

Re: Release notes 1.1.0?

2016-08-10 Thread Aljoscha Krettek
Oh, are you by any chance specifying a custom state backend for your job? For example, RocksDBStateBackend. Cheers, Aljoscha On Wed, 10 Aug 2016 at 11:17 Aljoscha Krettek wrote: > Hi, > could you maybe send us the output of "env.getExecutionPlan()". This would > help us

Re: Connected Streams - Controlling Order of arrival on the two streams

2016-08-10 Thread Aljoscha Krettek
Hi, I'm afraid you guessed correctly that it is not possible to ensure that rules arrive before events. I think the way you solved it (with buffering) is the correct way to go about this. Cheers, Aljoscha On Wed, 10 Aug 2016 at 01:31 Sameer W wrote: > Hi, > > I am using

Re: Release notes 1.1.0?

2016-08-10 Thread Aljoscha Krettek
Hi, could you maybe send us the output of "env.getExecutionPlan()". This would help us better understand which operators are used exactly. (You can of course remove any security sensitive stuff.) Cheers, Aljoscha On Tue, 9 Aug 2016 at 15:30 Andrew Ge Wu wrote: > Oh

Re: Window function - iterator data

2016-08-10 Thread Aljoscha Krettek
Hi, Kostas is right in that the elements are never explicitly sorted by timestamp. In some cases they might not even be iterated in the order that they were added so I would normally assume the order of the elements to be completely arbitrary. Cheers, Aljoscha On Wed, 10 Aug 2016 at 09:44 Kostas

RE: Flink 1.1.0 : Hadoop 1/2 compatibility mismatch

2016-08-10 Thread LINZ, Arnaud
Hi, Good for me ; my unit tests all passed with this rc version. Thanks, Arnaud -Message d'origine- De : Ufuk Celebi [mailto:u...@apache.org] Envoyé : mardi 9 août 2016 18:33 À : Ufuk Celebi Cc : user@flink.apache.org; d...@flink.apache.org Objet : Re: Flink 1.1.0 :

Re: Window function - iterator data

2016-08-10 Thread Kostas Kloudas
Hi Paul, Elements are returned in the order they were added in the window. No sorting on timestamp is performed. Hope this helps, Kostas > On Aug 9, 2016, at 10:22 PM, Paul Joireman wrote: > > When you are using a window function the docs: > >