InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-17 Thread Biplob Biswas
c class DegreeMapper implements MapFunction Long>, Tuple2> { private static final long serialVersionUID = 1L; public Tuple2 map(Tuple2 input) throws Exception { return new Tuple2(input.f1, 1); } } Now I am lost as to what I did wrong and why I am getting that error, any help would be appreciated. Thanks a lot. Thanks & Regards Biplob Biswas
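The exception usually means that the generic types declared on the function do not match what the upstream operator actually produces. A minimal sketch of such a mapper with its generic types spelled out; the concrete Long/Integer field types here are assumptions inferred from the error message, not taken from the original mail:

import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.java.tuple.Tuple2;

// Hypothetical types: input Tuple2<Long, Long>, output Tuple2<Long, Integer>.
public class DegreeMapper implements MapFunction<Tuple2<Long, Long>, Tuple2<Long, Integer>> {
    private static final long serialVersionUID = 1L;

    @Override
    public Tuple2<Long, Integer> map(Tuple2<Long, Long> input) throws Exception {
        // The literal 1 is an Integer; declaring the output as Tuple2<Long, Long>
        // elsewhere in the job while returning an Integer here is the kind of
        // mismatch that produces "Basic type 'Integer' expected but was 'Long'".
        return new Tuple2<>(input.f1, 1);
    }
}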

Re: InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-18 Thread Biplob Biswas
Hi Till, I am using Flink 0.10.1 and if I am not wrong it corresponds to the 1.0-Snapshot you mentioned. [image: Inline image 1] If wrong, please suggest what I should do to fix it. Thanks & Regards Biplob Biswas On Mon, Jan 18, 2016 at 11:23 AM, Till Rohrmann wrote: > Hi Biplob,

Re: InvalidTypesException - Input mismatch: Basic type 'Integer' expected but was 'Long'

2016-01-20 Thread Biplob Biswas
Hello everyone, I am still stuck with this issue, can anyone point me in the right direction? Thanks & Regards Biplob Biswas On Mon, Jan 18, 2016 at 2:24 PM, Biplob Biswas wrote: > Hi Till, > > I am using flink 0.10.1 and if i am not wrong it corresponds to the > 1.0-Snapsh

Regarding Concurrent Modification Exception

2016-02-13 Thread Biplob Biswas
ite(CollectionSerializer.java:22) > at com.esotericsoftware.kryo.Kryo.writeObject(Kryo.java:523) > at > com.esotericsoftware.kryo.serializers.ObjectField.write(ObjectField.java:61) > ... 73 more Can anyone enlighten us as to why this happens or how to fix this issue? We did a bit of Google searching, but all we found was some problem with serializing a broadcast variable. We use Flink bulk iterations and this variable is broadcast to both the map and the reduce in one dataflow! Thanks & Regards Biplob Biswas
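For context, the broadcast-inside-a-bulk-iteration pattern being discussed looks roughly like the sketch below, modelled on Flink's k-means example; Point, Centroid, the two user functions, and the tuple positions are hypothetical placeholders:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.operators.IterativeDataSet;

// centroids: DataSet<Centroid>, points: DataSet<Point> (both assumed to exist)
IterativeDataSet<Centroid> loop = centroids.iterate(20);

DataSet<Centroid> newCentroids = points
    // the iterated centroids are broadcast to the map ...
    .map(new SelectNearestCentroid()).withBroadcastSet(loop, "centroids")
    .groupBy(0)
    // ... and to the reduce of the same dataflow, as described in the mail;
    // inside the rich functions they are read via
    // getRuntimeContext().getBroadcastVariable("centroids")
    .reduceGroup(new RecomputeCentroids()).withBroadcastSet(loop, "centroids");

DataSet<Centroid> result = loop.closeWith(newCentroids);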

Re: Regarding Concurrent Modification Exception

2016-02-16 Thread Biplob Biswas
& Regards Biplob Biswas On Mon, Feb 15, 2016 at 12:25 PM, Fabian Hueske wrote: > Hi, > > This stacktrace looks really suspicious. > It includes classes from the submission client (CLIClient), optimizer > (JobGraphGenerator), and runtime (KryoSerializer). > > Is it possible th

Flink first() operator

2016-04-23 Thread Biplob Biswas
hrough the entire file, is there a better way to just get the top m lines using readCsvFile function? Thanks & Regards Biplob Biswas
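For reference, DataSet does offer a first(n) operator that can be chained directly onto readCsvFile, although the source itself may still scan the whole file, which is what this thread is about. A small sketch with a hypothetical file path and column types:

import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

DataSet<Tuple2<String, Double>> firstTen = env
    .readCsvFile("file:///path/to/input.csv")   // hypothetical path
    .types(String.class, Double.class)          // hypothetical column types
    .first(10);                                 // keeps (an arbitrary) 10 records

firstTen.print();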

Return unique counter using groupReduceFunction

2016-04-26 Thread Biplob Biswas
Hi, I am using a groupReduce function to aggregate the content of the objects, but at the same time I need to return a unique counter from the function. My attempts are failing and the identifiers are somehow very random and getting duplicated. Following is the part of my code which is supposed
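One common way to hand out identifiers that never collide across parallel instances is to start each subtask at its own index and stride by the total parallelism. A hedged sketch; the input/output types and the aggregation itself are made up for illustration:

import org.apache.flink.api.common.functions.RichGroupReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.api.java.tuple.Tuple3;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

public class AggregateWithUniqueId
        extends RichGroupReduceFunction<Tuple2<String, Double>, Tuple3<Long, String, Double>> {

    private long nextId;
    private long stride;

    @Override
    public void open(Configuration parameters) {
        nextId = getRuntimeContext().getIndexOfThisSubtask();
        stride = getRuntimeContext().getNumberOfParallelSubtasks();
    }

    @Override
    public void reduce(Iterable<Tuple2<String, Double>> values,
                       Collector<Tuple3<Long, String, Double>> out) {
        String key = null;
        double sum = 0;
        for (Tuple2<String, Double> v : values) {
            key = v.f0;
            sum += v.f1;
        }
        out.collect(new Tuple3<>(nextId, key, sum));
        nextId += stride;   // ids from different subtasks can never overlap
    }
}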

Re: Flink first() operator

2016-04-26 Thread Biplob Biswas
Thanks, I was looking into the TextInputFormat you suggested, and would get back to it once I start working with huge files. I would assume there's no workaround or additional parameter to the readCsvFile function so as to restrict the number of lines read in one go, as reading a big file would be a

Regarding Broadcast of datasets in streaming context

2016-04-26 Thread Biplob Biswas
Hi, I have yet another question, this time about maintaining a global list of centroids. I am trying to implement the CluStream algorithm and for that purpose I have the initial set of centres in a Flink dataset. Now I need to update the set of centres for every data tuple that comes from the stream. F

Re: Regarding Broadcast of datasets in streaming context

2016-04-28 Thread Biplob Biswas
next operation. If the output elements are broadcasted, then how are they retrieved? Or maybe I am looking at this method in a completely wrong way? Thanks Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-Broadcast-of

Re: Regarding Broadcast of datasets in streaming context

2016-04-28 Thread Biplob Biswas
oids. > I'll directly include him in the email so that he will notice and can send > you the example. > > Cheers, > Aljoscha > > On Thu, 28 Apr 2016 at 13:57 Biplob Biswas < > revolutionisme@ > > wrote: > >> I am pretty new to flink syste

Re: Regarding Broadcast of datasets in streaming context

2016-04-30 Thread Biplob Biswas
ceive events > and update its local centroids (and periodically output the centroids) and > on the other input would send centroids of other flatmaps and would merge > them to the local. > > This might be a lot to take in at first, so you might want to read up on > streaming iterati

Re: Regarding Broadcast of datasets in streaming context

2016-05-02 Thread Biplob Biswas
as if streams are sent in parallel, how would I ensure a correct update of the centroids when multiple points can try to update the same centroid in parallel. I hope I made myself clear with this. Thanks and Regards Biplob Biplob Biswas wrote > Hi Gyula, > > I read your workaround and starte

Re: Regarding Broadcast of datasets in streaming context

2016-05-02 Thread Biplob Biswas
re is, I am assuming that this operation is not done in >> parallel as if streams are sent in parallel how would I ensure correct >> update of the centroids as multiple points can try to update the same >> centroid in parallel . >> >> I hope I made myself clear with th

Re: Regarding Broadcast of datasets in streaming context

2016-05-05 Thread Biplob Biswas
for your help throughout. Regards Biplob Biswas Gyula Fóra wrote > Hi, > > Iterating after every incoming point/centroid update means that you > basically defeat the purpose of having parallelism in your Flink job. > > If you only "sync" the centroids periodically by

Regarding source Parallelism and usage of coflatmap transformation

2016-05-05 Thread Biplob Biswas
can't see the centroids already broadcasted by centroidStream.broadcast() in the map functions. Any kind of help is hugely appreciated. Thanks and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Regarding-source-Parallelism-and-usage

Re: Regarding source Parallelism and usage of coflatmap transformation

2016-05-09 Thread Biplob Biswas
e Regards Biplob Biswas Aljoscha Krettek wrote > Hi, > regarding 1) the source needs to implement the ParallelSourceFunction or > RichParallelSourceFunction interface to allow it to have a higher > parallelism than 1. > > regarding 2) I wrote a small example that showca

Re: Regarding Broadcast of datasets in streaming context

2016-05-11 Thread Biplob Biswas
Hi Gyula, I tried doing something like the following in the 2 flatmaps, but I am not getting the desired results and am still confused about how the concept you put forward would work: public static final class MyCoFlatmap implements CoFlatMapFunction{ Centroid[] centroids;
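A rough completion of that CoFlatMap, only to illustrate the shape being discussed; Point and Centroid are placeholder types and the actual nearest-centroid update is elided:

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public static final class MyCoFlatmap implements CoFlatMapFunction<Point, Centroid[], Centroid[]> {

    private Centroid[] centroids;   // local copy, one per parallel instance

    @Override
    public void flatMap1(Point point, Collector<Centroid[]> out) {
        if (centroids == null) {
            return;                 // nothing to update against yet
        }
        // update the nearest centroid with this point (details elided)
        out.collect(centroids);     // emit the updated centroids, e.g. into the feedback stream
    }

    @Override
    public void flatMap2(Centroid[] incoming, Collector<Centroid[]> out) {
        centroids = incoming;       // (re)initialise the local centroids from the other input
    }
}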

Unexpected behaviour in datastream.broadcast()

2016-05-12 Thread Biplob Biswas
Hi, I am running this following sample code to understand how iteration and broadcast works in streaming context. final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.setParallelism(4); long i = 5; Data

Re: Unexpected behaviour in datastream.broadcast()

2016-05-14 Thread Biplob Biswas
Can anyone help me understand how the out.collect() and the corresponding broadcast() is working? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Unexpected-behaviour-in-datastream-broadcast-tp6848p6925.html Sent from the Apache Flink User Mai

Re: Regarding Broadcast of datasets in streaming context

2016-05-15 Thread Biplob Biswas
848.html> Because it's not working the way I am expecting it to, and the input stream is completely consumed before anything is sent back and iterated. Could you please point me in the right direction and help me understand things properly? Thanks and Regards Biplob Biswas -- View this

Re: Regarding Broadcast of datasets in streaming context

2016-05-15 Thread Biplob Biswas
Hi, I read that article already but it is very simplistic, and thus based on that article and other examples, I was trying to understand how my centroids can be sent to all the partitions and updated accordingly. I also understood that the order of the input and the feedback stream can't be determin

Re: closewith(...) not working in DataStream error, but works in DataSet

2016-05-17 Thread Biplob Biswas
I am also a newbie, but from what I experienced during my experiments, the same implementation doesn't work for the streaming context because 1) In the streaming context the stream is assumed to be infinite, so the process of iteration is also infinite, and the part with which you close your ite

Connect 2 datastreams and iterate

2016-05-19 Thread Biplob Biswas
Hi, Is there a way to connect 2 datastreams and iterate and then get the feedback as a third stream? I tried doing mergedDS = datastream1.connect(datastream2) iterateDS = mergedDS.iterate().withFeedback(datastream3type.class) but this didn't work and it was showing me errors. Is there any oth
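For reference, the streaming API models this with ConnectedIterativeStreams: the iteration is opened on one stream and given a feedback type, which implicitly connects it with the feedback stream. A hedged sketch with made-up element types and a made-up co-flatmap:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.IterativeStream.ConnectedIterativeStreams;

// stream1 carries Point elements, the feedback carries Centroid[] (both hypothetical)
ConnectedIterativeStreams<Point, Centroid[]> iteration =
        stream1.iterate().withFeedbackType(Centroid[].class);

DataStream<Centroid[]> updatedCentroids =
        iteration.flatMap(new CentroidUpdater());   // a CoFlatMapFunction<Point, Centroid[], Centroid[]>

// feed the Centroid[] stream back in; initial centroids (e.g. from a second stream)
// can be union-ed into this feedback stream before closing the loop
iteration.closeWith(updatedCentroids);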

Re: Connect 2 datastreams and iterate

2016-05-24 Thread Biplob Biswas
uments for type IterativeStream.ConnectedIterativeStreams; it cannot be parameterized with arguments " What am I doing wrong here? Thanks and Regards Biplob Biswas Ufuk Celebi wrote > Please provide the error message and stack trace in order to help > investigating this further. > > On Thu, May 19

Reading Parameter values sent to partition

2016-05-28 Thread Biplob Biswas
Hi, I am trying to send some static integer values down to each map function, using the following code public static void main(String[] args) throws Exception { ParameterTool params = ParameterTool.fromArgs(args); String fi
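The usual way to make such values visible inside every map function is to register the ParameterTool as global job parameters and read it back from the runtime context; the same mechanism works for DataSet and DataStream jobs. A sketch; the parameter name "k", the default, and the surrounding stream are made up:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.utils.ParameterTool;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

ParameterTool params = ParameterTool.fromArgs(args);
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.getConfig().setGlobalJobParameters(params);   // visible to every rich function

DataStream<String> result = input.map(new RichMapFunction<String, String>() {
    private int k;

    @Override
    public void open(Configuration conf) {
        ParameterTool p = (ParameterTool)
                getRuntimeContext().getExecutionConfig().getGlobalJobParameters();
        k = p.getInt("k", 10);   // hypothetical parameter name and default
    }

    @Override
    public String map(String value) {
        return value + " (k=" + k + ")";
    }
});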

Re: Reading Parameter values sent to partition

2016-05-29 Thread Biplob Biswas
Aah, thanks a lot for that insight. Pretty new to the Flink systems and learning on my own so prone to making mistakes. Thanks a lot for helping. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Reading-Parameter-values-sent-to-partition-tp7

Extracting Timestamp in MapFunction

2016-05-29 Thread Biplob Biswas
e as well. Any help is much appreciated. Thanks a lot. Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Extracting-Timestamp-in-MapFunction-tp7240.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: Extracting Timestamp in MapFunction

2016-06-01 Thread Biplob Biswas
Hi, Before giving the method you described above a try, I tried adding the timestamp with my data directly at the stream source. Following is my stream source: http://pastebin.com/AsXiStMC and I am using the stream source as follows: DataStream tuples = env.addSource(new DataStreamGenerator(file
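Attaching the timestamp directly at the source is typically done through SourceContext.collectWithTimestamp (and emitWatermark). A minimal sketch of that pattern with a made-up record type; it is not the pastebin source itself, only the general shape such a source presumably follows:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.streaming.api.watermark.Watermark;

public class DataStreamGenerator implements SourceFunction<Tuple2<String, Long>> {

    private volatile boolean running = true;

    @Override
    public void run(SourceContext<Tuple2<String, Long>> ctx) throws Exception {
        while (running) {
            long ts = System.currentTimeMillis();   // or a timestamp parsed from the input file
            ctx.collectWithTimestamp(new Tuple2<>("record", ts), ts);
            ctx.emitWatermark(new Watermark(ts - 1));   // only meaningful with event time enabled
            Thread.sleep(100);
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}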

Data Source Generator emits 4 instances of the same tuple

2016-06-06 Thread Biplob Biswas
s and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392.html Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com.

Re: Data Source Generator emits 4 instances of the same tuple

2016-06-06 Thread Biplob Biswas
://pastebin.com/NenvXShH I hope to get a solution out of it. Thanks and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-Source-Generator-emits-4-instances-of-the-same-tuple-tp7392p7405.html Sent from the Apache

Re: Extracting Timestamp in MapFunction

2016-06-09 Thread Biplob Biswas
Hi Aljoscha, I went to the Flink hackathon by Buzzwords yesterday where Fabian and Robert helped me with this issue. Apparently I was assuming that the file would be handled in a single thread, but I was using a ParallelSourceFunction and it was creating 4 different threads and thus reading the same

Re: Data Source Generator emits 4 instances of the same tuple

2016-06-09 Thread Biplob Biswas
Yes, thanks a lot, also the fact that I was using ParallelSourceFunction was problematic. So as suggested by Fabian and Robert, I used SourceFunction and then in the Flink job, I set the output of the map with a parallelism of 4 to get the desired result. Thanks again. -- View this message in conte

Keeping latest data point in a data stream variable

2016-06-21 Thread Biplob Biswas
Hi, I want to keep the latest data point which is processed in a datastream variable. So technically I need just one value in the variable and discard all the older ones. Can this be done somehow? I was thinking about using filters but I don't think I can use them for this scenario. Any ideas as to
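One way to keep only the most recent element is a small piece of keyed state that is simply overwritten on every record. A sketch with a hypothetical Event POJO keyed by an "id" field:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.util.Collector;

DataStream<Event> latestPerKey = events
    .keyBy("id")   // hypothetical key field
    .flatMap(new RichFlatMapFunction<Event, Event>() {

        private transient ValueState<Event> latest;

        @Override
        public void open(Configuration conf) {
            latest = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("latest", Event.class));
        }

        @Override
        public void flatMap(Event e, Collector<Event> out) throws Exception {
            latest.update(e);   // older values are discarded automatically
            out.collect(e);
        }
    });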

Way to hold execution of one of the map operator in Co-FlatMaps

2016-06-26 Thread Biplob Biswas
Hi, I was wondering whether it is possible to stop streaming data in from one of the map operators until some data arrives in the second map operator. For example, if I have ds1.connect(ds2).map(new coflatmapper()) then I need data to stop flowing from ds1 until some data arrives in ds2. Is that
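There is no built-in way to pause one input of a connected stream, but a common workaround is to buffer the first input inside the CoFlatMap until the second input has delivered something. A sketch; the element types are placeholders, and the buffer lives in operator memory, so this is not a real back-pressure mechanism:

import java.util.ArrayList;
import java.util.List;

import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class HoldUntilSecondInput implements CoFlatMapFunction<String, String, String> {

    private boolean ready = false;
    private final List<String> buffer = new ArrayList<>();

    @Override
    public void flatMap1(String value, Collector<String> out) {
        if (ready) {
            out.collect(value);
        } else {
            buffer.add(value);   // park ds1 elements until ds2 has produced something
        }
    }

    @Override
    public void flatMap2(String control, Collector<String> out) {
        ready = true;
        for (String v : buffer) {
            out.collect(v);      // flush everything that was held back
        }
        buffer.clear();
    }
}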

Re: Way to hold execution of one of the map operator in Co-FlatMaps

2016-06-27 Thread Biplob Biswas
e and not as fast as the first one. Thanks for replying though, will try that out. Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Way-to-hold-execution-of-one-of-the-map-operator-in-Co-FlatMaps-tp7689p7699.html Sent fro

Data point goes missing within iteration

2016-07-03 Thread Biplob Biswas
eally know what is happening here exactly, why would the number of data points reduce like that suddenly? Thanks and Regards Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776.html

Re: Data point goes missing within iteration

2016-07-04 Thread Biplob Biswas
Can anyone check this once, and help me out with this? I would be really obliged. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776p7795.html Sent from the Apache Flink User Mailing List archiv

Re: Data point goes missing within iteration

2016-07-04 Thread Biplob Biswas
I have sent you my code in a separate email, I hope you can solve my issue. Thanks a lot Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776p7798.html Sent from the Apache Flink User Mailing

Re: Data point goes missing within iteration

2016-07-06 Thread Biplob Biswas
Thanks a lot, I would really appreciate it. Also, please let me know if you don't understand it well; the documentation in the code is not really great at the moment. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-wit

Re: Data point goes missing within iteration

2016-07-13 Thread Biplob Biswas
Hi, Sorry for the late reply, I was trying different stuff with my code. And from what I observed, it's very weird to me. So after experimentation, I found out that when I increase the number of centroids, the number of data points forwarded decreases; when I lower the number of centroids, the datapo

Re: Data point goes missing within iteration

2016-07-19 Thread Biplob Biswas
Hi Ufuk, Did you get time to go through my issue, just wanted to follow up to see whether I can get a solution or not. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776p8010.html Sent from the Ap

Re: Data point goes missing within iteration

2016-07-19 Thread Biplob Biswas
Hi Ufuk, Thanks for the update, is there any known way to fix this issue? Any workaround that you know of, which I can try? -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Data-point-goes-missing-within-iteration-tp7776p8015.html Sent from t

Can't access Flink Dashboard at 8081, running Flink program using Eclipse

2016-07-19 Thread Biplob Biswas
Hi, I am running my Flink program using Eclipse and I can't access the dashboard at http://localhost:8081, can someone help me with this? I read that I need to check my flink-conf.yaml, but it's a Maven project and I don't have a flink-conf. Any help would be really appreciated. Thanks a lot Bip
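When running from the IDE, the dashboard is only started if the local environment is created with the web UI explicitly enabled and flink-runtime-web is on the classpath. A sketch of one way to do that, assuming a Flink version that provides createLocalEnvironmentWithWebUI; this is not necessarily the exact fix suggested later in the thread:

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

Configuration conf = new Configuration();
StreamExecutionEnvironment env =
        StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
// while the job is running, the dashboard should be reachable at http://localhost:8081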

Re: Can't access Flink Dashboard at 8081, running Flink program using Eclipse

2016-07-19 Thread Biplob Biswas
to access the jobs statuses on the dashboard when you finish > running your job. > > Sameer > > On Tue, Jul 19, 2016 at 6:35 AM, Biplob Biswas < > revolutionisme@ > > > wrote: > >> Hi, >> >> I am running my flink program using Eclipse and I

Re: Can't access Flink Dashboard at 8081, running Flink program using Eclipse

2016-07-19 Thread Biplob Biswas
e >> and connect to it using the getRemoteExecutionEnvironment(). That will >> allow >> you to access the jobs statuses on the dashboard when you finish running >> your job. >> >> Sameer >> >> On Tue, Jul 19, 2016 at 6:35 AM, Biplob Biswas <

Re: Can't access Flink Dashboard at 8081, running Flink program using Eclipse

2016-07-19 Thread Biplob Biswas
Thanks a ton, Till. That worked. Thank you so much. -Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Can-t-access-Flink-Dashboard-at-8081-running-Flink-program-using-Eclipse-tp8016p8035.html Sent from the Apache Flink User Mailing Li

Re: Data point goes missing within iteration

2016-07-20 Thread Biplob Biswas
Hi Max, Yeah I tried that and its definitely better. Only a few points go missing compared to a huge amount in the beginning. For now, its good for me and my work. Thanks a lot for the workaround. -Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.233605

Running multiple Flink Streaming Jobs, one by one

2016-07-20 Thread Biplob Biswas
Hi, I want to test my Flink streaming code, and thus I want to run Flink streaming jobs with different parameters one by one. So, when one job finishes after it doesn't receive new data points for some time, the next job with a different set of parameters should start. For this, I am already

Re: state size effects latency

2017-10-30 Thread Biplob Biswas
is this? specific operator latency? because the end to end latency is around 50ms and 370 ms. Was just curious how latency is seen from a different perspective, would really help me in my understanding. Thanks a lot, Biplob Thanks & Regards Biplob Biswas On Mon, Oct 30, 2017 at 8:53 AM, S

No Alerts with FlinkCEP

2017-05-26 Thread Biplob Biswas
Hi, I just started exploring Flink CEP a day back and I thought I can use it to make a simple event processor. For that I looked into the CEP examples by Till and some other articles. Now I have 2 questions which i would like to ask: *Part 1:* I came up with the following piece of code, but th

Re: No Alerts with FlinkCEP

2017-05-26 Thread Biplob Biswas
Hello Kostas, Thanks for the suggestions. I checked and I am getting my events in the partitionedInput stream when I am printing it, but still nothing on the alert side. I checked the Flink UI for backpressure and all seems to be normal (I am having at most 1000 events per second on the Kafka topic so

Re: No Alerts with FlinkCEP

2017-05-29 Thread Biplob Biswas
Hello Kostas, I made the necessary changes and adapted the code to reflect the changes with 1.4-Snapshot. I still have similar behaviour; I can see that the data is there after the partitionedInput stream but no alerts are being raised. I see some info log on my console as follows: INFO o.a.f.a.java

Re: No Alerts with FlinkCEP

2017-05-31 Thread Biplob Biswas
Hi Dawid, Thanks for the response. Timeout patterns work like a charm; I saw them previously but didn't understand what they do, thanks for explaining that. Also, my problem with no alerts is solved now. The problem was that I was using "Event Time" for processing whereas my events didn't have any

Re: No Alerts with FlinkCEP

2017-05-31 Thread Biplob Biswas
Hi Kostas, My application didn't have any timestamp extractor, nor did my events have any timestamp. Still, I was using event time for processing, and that is probably why it was blocked. Now I removed the part where I set the time characteristic to Event Time and it works now. For example: Previously:
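A rough sketch of the change being described, with the event-time line taken from a later message in the thread; the explicit ProcessingTime replacement is an assumption (leaving the default has the same effect):

// previously: event time was enabled, but no timestamps/watermarks were ever assigned,
// so event time never advanced and the CEP pattern never fired
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

// now: fall back to the default, or set it explicitly
env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime);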

Re: No Alerts with FlinkCEP

2017-05-31 Thread Biplob Biswas
Hi Kostas, I am okay with processing time at the moment but as my events already have a creation timestamp added to them and also to explore further the event time aspect with FlinkCEP, I proceeded further with evaluating with event time. For this I tried both 1. AscendingTimestampExtractor: usi
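For reference, the out-of-orderness variant mentioned here is usually wired in like the sketch below; the event type, the getter, and the 5-second bound are assumptions:

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

DataStream<MyEvent> withTimestamps = input.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(MyEvent event) {
                return event.getCreationTimestamp();   // hypothetical getter
            }
        });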

Re: No Alerts with FlinkCEP

2017-05-31 Thread Biplob Biswas
Hi, I am sorry, it worked with the BoundedOutOfOrdernessTimestampExtractor; somehow I had replayed my events from Kafka and the older events were also on the bus, and they didn't correlate with my new events. Now I cleaned up my code and restarted it from the beginning and it works. Thanks a lot for

Queries regarding FlinkCEP

2017-06-02 Thread Biplob Biswas
Hi, Thanks a lot for the help last time, I have a few more questions and I chose to create a new topic as the problem in the previous topic was solved, thanks to useful inputs from the Flink Community. The questions are as follows: *1.* Does the "within" operator work on "Event Time" or "P
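As background for question 1, a pattern using within() typically looks like the sketch below; whether the 10 seconds are measured in event time or processing time follows the stream's time characteristic. The event type, conditions, and key are made up, and the Flink 1.3-style CEP API is assumed:

import org.apache.flink.cep.CEP;
import org.apache.flink.cep.PatternStream;
import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

Pattern<MyEvent, ?> pattern = Pattern.<MyEvent>begin("start")
        .where(new SimpleCondition<MyEvent>() {
            @Override
            public boolean filter(MyEvent e) {
                return "START".equals(e.getType());
            }
        })
        .followedBy("end")
        .where(new SimpleCondition<MyEvent>() {
            @Override
            public boolean filter(MyEvent e) {
                return "END".equals(e.getType());
            }
        })
        .within(Time.seconds(10));

PatternStream<MyEvent> patternStream = CEP.pattern(input.keyBy("id"), pattern);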

Re: Queries regarding FlinkCEP

2017-06-06 Thread Biplob Biswas
Thanks a lot, Till and Dawid, for such detailed replies. I tried what both of you suggested and waited, and I still have no events. Thus, as pointed out by Till, I created a self-contained example to reproduce the issue, and the behaviour is the same as in my original case. Please find the s

Re: Queries regarding FlinkCEP

2017-06-06 Thread Biplob Biswas
Also, my test environment was Flink 1.4-Snapshot with Kafka 0.10.0 on HDP 2.5. And I sent my test messages via the console producer. Thanks, Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Queries-regarding-FlinkCEP-tp13454p13512.htm

Re: Queries regarding FlinkCEP

2017-06-06 Thread Biplob Biswas
Sorry to bombard you with so many messages, but one last thing: the example would produce an alert if the line specifying Event Time is commented out. More specifically, this one: env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); Only with event time is there no alert. Thanks, Biplob

Re: Queries regarding FlinkCEP

2017-06-06 Thread Biplob Biswas
Hi Dawid, What you wrote is exactly correct, it wouldn't generate a new watermark (and subsequently discard events) unless the maxOutOfOrderness time has elapsed. Thus, I was expecting alerts to be raised, as the stream was out of order but not beyond maxOutOfOrderness. Nevertheless I tried your ex

Re: Queries regarding FlinkCEP

2017-06-07 Thread Biplob Biswas
Hi Dawid, Yes, now I understood what you meant. Although I added exactly the input you asked me to, I still get no alerts. I also observed that I am not getting alerts even with normal ordering of timestamps and with AscendingTimestampExtractor. I am adding an image where I entered the data fr

Re: Queries regarding FlinkCEP

2017-06-08 Thread Biplob Biswas
Hi, Can anyone check whether they can reproduce this issue on their end? There's no log yet as to what is happening. Any idea to debug this issue is well appreciated. Regards, Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Queries-r

Error running Flink job in Yarn-cluster mode

2017-06-08 Thread Biplob Biswas
I am trying to run a Flink job on our cluster with 3 dn and 1 nn. I am using the following command line arguments to run this job, but I get an exception saying "Could not connect to the leading JobManager. Please check that the JobManager is running" ... what could I be doing wrong? Surprisingly, o

Re: Error running Flink job in Yarn-cluster mode

2017-06-09 Thread Biplob Biswas
Hi Nico, I tried running my job with 3 and even 2 yarn containers and the result is the same. Then I tried running the example word count (both streaming and batch) and they seem to find the task and job managers and run successfully. ./bin/flink run -m yarn-cluster -yn 3 -yt ./lib ./examples/stream

Re: Error running Flink job in Yarn-cluster mode

2017-06-09 Thread Biplob Biswas
One more thing I noticed is that the streaming wordcount from the Flink package works when I run it, but when I used the same GitHub code, packaged it and uploaded the fat jar with the word count example to the cluster, I get the same error. I am wondering how making my uber jar can produce such e

Re: Error running Flink job in Yarn-cluster mode

2017-06-13 Thread Biplob Biswas
Thanks a lot Aljoscha (again) I created my project from scratch and used the flink-maven-archetype and now it works on the yarn-cluster mode. I was creating a fat jar initially as well with my old project setup so not really sure what went wrong there as it was working on my local test environment

Flink CEP not emitting timed out events properly

2017-06-16 Thread Biplob Biswas
updated when a new event comes into the system. So, I added the setAutoWatermarkInterval(1000) to the code, but to no avail. Thanks & Regards, Biplob Biswas -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-CEP-not-emitting-timed-out-even

Re: Flink CEP not emitting timed out events properly

2017-06-16 Thread Biplob Biswas
Hi Kostas, Thanks for the reply, makes things a bit more clear. Also, I went through this link and it is something similar I am trying to observe. http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Listening-to-timed-out-patterns-in-Flink-CEP-td9371.html I am checking for timed
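With the Flink 1.3-style API, partial matches that run past within() can be collected through the two-argument select; a timed-out match is only emitted once the watermark passes the timeout, which is exactly why a stalled watermark delays it. A hedged sketch with placeholder types:

import java.util.List;
import java.util.Map;

import org.apache.flink.cep.PatternSelectFunction;
import org.apache.flink.cep.PatternTimeoutFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.types.Either;

DataStream<Either<TimeoutAlert, Alert>> result = patternStream.select(
        new PatternTimeoutFunction<MyEvent, TimeoutAlert>() {
            @Override
            public TimeoutAlert timeout(Map<String, List<MyEvent>> partial, long timeoutTimestamp) {
                return new TimeoutAlert(partial.get("start").get(0), timeoutTimestamp);
            }
        },
        new PatternSelectFunction<MyEvent, Alert>() {
            @Override
            public Alert select(Map<String, List<MyEvent>> match) {
                return new Alert(match.get("start").get(0), match.get("end").get(0));
            }
        });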

Re: How choose between YARN/Mesos/StandAlone Flink

2017-06-16 Thread Biplob Biswas
Hi Andrea, If you are using Flink for research and/or testing purpose, standalone Flink is more or less sufficient. Although if you have a huge amount of data, it may take forever to process data with only one node/machine and that's where a cluster would be needed. A yarn and mesos cluster could

Re: Flink CEP not emitting timed out events properly

2017-06-16 Thread Biplob Biswas
Hi Kostas, Thanks for that suggestion, I would try that next, I have out of order events on one of my Kafka topics and that's why I am using BoundedOutOfOrdernessTimestampExtractor(), now that this doesn't work as expected I would try to work with the Base class as you suggested. Although this b

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
Hi Kostas, Implementing my custom timestamp assigner made me realise a problem which, you may say, we have in our architecture. Any inputs would be really appreciated. So, for now, we are reading from 4 different Kafka topics, and we have a flow similar to something like this: Event 1(on topic t

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
Hi Dawid, First of all congratulations on becoming a Flink committer, I saw your tweet in the morning. Now regarding that link, it talks about multiple partitions for a single topic; here I am talking about multiple topics, each having a different number of partitions. I tried adding timestampExtract

Re: Queries regarding FlinkCEP

2017-06-20 Thread Biplob Biswas
Hi Dawid, Yes, I am reading from multiple topics and yes, a few topics have multiple partitions, not all of them. But I didn't understand the concept of a stalled partition. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Queries-regarding-Fli

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
But if that's the case, I don't understand why some of my events are just lost. If the watermark which is used is the smallest ... then either I expect a match or I expect a timed-out event. The only way I can imagine my events getting lost is a watermark higher than the incoming event, and thus

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
Hi Kostas, I have out-of-orderness of around 5 seconds from what I have observed but that too from events coming from a different topic. The initial topic doesn't have out-of-order events still I have added a generous time bound of 20 seconds. Still, I will try for a higher number just in order to

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
Hi Kostas, Yes, I have a flag in my timestamp extractor. As you can see from the code below, I am checking whether currentTime - systemTimeSinceLastModification > 10 sec. As long as new events come in, the watermark isn't incremented. But as soon as I have a difference of more than 10 seconds,
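A hedged sketch of what such an assigner can look like: while events keep arriving the watermark trails the maximum timestamp by the allowed lateness, and after 10 idle seconds it catches up so pending CEP timeouts can fire. The event type, the getter, and both constants are assumptions taken loosely from the thread; setAutoWatermarkInterval still controls how often getCurrentWatermark is polled:

import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class IdleAwareTimestampAssigner implements AssignerWithPeriodicWatermarks<MyEvent> {

    private static final long MAX_OUT_OF_ORDERNESS = 20_000L;   // 20 s, as mentioned in the thread
    private static final long IDLE_TIMEOUT = 10_000L;           // advance anyway after 10 quiet seconds

    private long currentMaxTimestamp = Long.MIN_VALUE + MAX_OUT_OF_ORDERNESS;
    private long lastEventWallClock = System.currentTimeMillis();

    @Override
    public long extractTimestamp(MyEvent event, long previousElementTimestamp) {
        lastEventWallClock = System.currentTimeMillis();
        currentMaxTimestamp = Math.max(event.getCreationTimestamp(), currentMaxTimestamp);
        return event.getCreationTimestamp();
    }

    @Override
    public Watermark getCurrentWatermark() {
        boolean idle = System.currentTimeMillis() - lastEventWallClock > IDLE_TIMEOUT;
        long wm = idle ? currentMaxTimestamp : currentMaxTimestamp - MAX_OUT_OF_ORDERNESS;
        return new Watermark(wm);
    }
}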

Re: Flink CEP not emitting timed out events properly

2017-06-20 Thread Biplob Biswas
I know that there wouldn't be a scenario where the first event type(coming from topic t1) would be coming with a timestamp higher than the current watermark. Although I am still investigating whether the other events from other topics (specifically t3 and t4) are arriving after the watermark update

Re: Flink CEP not emitting timed out events properly

2017-06-26 Thread Biplob Biswas
Hi Kostas, I ended up setting my currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp); to currentMaxTimestamp = Math.min(timestamp, currentMaxTimestamp); and changing this : if(firstEventFlag && (currentTime - systemTimeSinceLastModification > 1)){ systemTimeSinceLastModi

Flink QueryableState with Sliding Window on RocksDB

2017-07-28 Thread Biplob Biswas
Hi, We recently moved from Spark Streaming to Flink for our stream processing requirements in our organization and we are in the process of reducing the number of external calls as much as possible. Earlier we were using HBase to store the incoming data, but we now want to try out stateful operati

Re: Flink QueryableState with Sliding Window on RocksDB

2017-07-31 Thread Biplob Biswas
Thanks Fabian for the reply. I was reconsidering my design and the requirement, and what I mentioned already is partially confusing. I realized that using a session window is better in this scenario, where I want a value to be updated per key and the session resets to wait for the gap period with ev

Re: Flink QueryableState with Sliding Window on RocksDB

2017-07-31 Thread Biplob Biswas
Hi Fabian, Thanks a lot for pointing that out would read about it and give it a try. Regards, Biplob -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Flink-QueryableState-with-Sliding-Window-on-RocksDB-tp14514p14551.html Sent from the Apach

Re: Flink QueryableState with Sliding Window on RocksDB

2017-07-31 Thread Biplob Biswas
Hi Fabian, I read about the process function and it seems a perfect fit for my requirement. Although while reading more about queryable state, I found that it's not meant to do lookups within a job (your comment in the following link). http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.c

Re: Flink QueryableState with Sliding Window on RocksDB

2017-07-31 Thread Biplob Biswas
Hi Fabian, Thanks for the insight, I am currently exploring QueryableStateClient and would attempt to get the value for a corresponding key using the getKvState() function. I was confused about the jobId, but I am expecting this would provide me with the jobId of the current job - ExecutionEnviro

Re: Flink QueryableState with Sliding Window on RocksDB

2017-08-01 Thread Biplob Biswas
Hi Fabian, I am not really sure using CoProcessFunction would be useful for my use case. My use case, in short, can be explained as follows: 1) Create 2 different local state stores, where both have a 1-N relationship. For eg. 1 -> [A,B,C] and A -> [1,2,3] 2) Based on the key A, get the list of element

Storing POJO's to RocksDB state backend

2017-08-02 Thread Biplob Biswas
Hi, I had a simple query as to how POJOs are stored in a state backend like RocksDB. Are they serialized internally (with a default serde, or do we have to specify something)? And if yes, is Kryo the default serde? Thanks, Biplob -- View this message in context: http://apache-flink-user-mailing-

Can't find correct JobManager address, job fails with Queryable state

2017-08-02 Thread Biplob Biswas
When I start my flink job I get the following warning, if I am not wrong this is because it can't find the jobmanager at the given address(localhost), I tried changing: config.setString(JobManagerOptions.ADDRESS, "localhost"); to LAN IP, 127.0.0.1 and localhost but none of it seems to work. I am

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-03 Thread Biplob Biswas
I managed to get the Web UI up and running but I am still getting the error with "Actor not found" Before the job failed I got the output for the Flink config from the WebUI and it seems okay to me, this corresponds to the config I have already set.

Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Hi, Is there a way to fetch the JobManager address and port from a running Flink job? I was expecting the address and port to be constant, but they change every time I run a job. And somehow it's not honoring the jobmanager.rpc.address and jobmanager.rpc.port set in the flink-conf.yaml file. I

Re: Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Also, is it possible to get the JobID from a running Flink instance for a streaming job? I know I can get it for a batch job with ExecutionEnvironment.getExecutionEnvironment().getId(), but apparently it generates a new execution environment and returns the job id of that environment for a batch env

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-03 Thread Biplob Biswas
Hi Nico, I had actually tried doing that but I still get the same error as before with the actor not found. I then ran on my mock cluster and I was getting the same error, although I could observe the jobmanager on the yarn cluster mode with a defined port. The address and port combination was rand

Re: Getting JobManager address and port within a running job

2017-08-03 Thread Biplob Biswas
Hi Nico, This behaviour was on my cluster and not in local mode, as I wanted to check whether it's an issue with my job or whether the behaviour with the jobmanager is consistent everywhere. When I run my job in yarn-cluster mode, it's not honouring the IP and port I specified and it's randomly assigning

Re: Getting JobManager address and port within a running job

2017-08-09 Thread Biplob Biswas
Hi Aljoscha, I was expecting that I could set the jobmanager address and port by setting it up in the configuration and passing it to the execution environment, but learnt later that it was a wrong approach. My motivation of accessing the jobmanager coordinates was to setup a queryablestateclien

Re: Can't find correct JobManager address, job fails with Queryable state

2017-08-09 Thread Biplob Biswas
Thanks Aljoscha, this clarification probably ends my search of accessing local states from within the same job. Thanks for the help :) -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Can-t-find-correct-JobManager-address-job-fails-with-Que

Re: Getting JobManager address and port within a running job

2017-08-09 Thread Biplob Biswas
Hi Aljoscha, Thanks for the link. I read through it but I still can't imagine implementing something similar for my usecase. I explained my usecase to Fabian in a previous post, I would try to be again as clear as possible. So, my use case, in short, is something like below: 1) From input str

Evolving serializers and impact on flink managed states

2017-08-09 Thread Biplob Biswas
Hi, We have a set of XSDs which define our schema for the data and we are using the corresponding POJO's for serialization with Flink Serialization stack. Now, I was concerned about any evolution of our XSD schema which will subsequently change the generated POJO's which in turn are used for crea

Re: Getting JobManager address and port within a running job

2017-08-09 Thread Biplob Biswas
Hi Aljoscha, So basically, I am reading events from a Kafka topic. These events have corresponding eventIds and a list of modes. *Job1* 1. When I first read from the Kafka topic, I key by the eventIds and use a ProcessFunction to create a state named "events". Also, the list of modes is used to
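Step 1 usually boils down to marking the keyed state descriptor as queryable; a sketch with a placeholder event type, keeping in mind (per the related thread above) that the queryable name is meant for an external QueryableStateClient, not for lookups from inside the same job:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;

public class EventStateFunction extends ProcessFunction<MyEvent, MyEvent> {

    private transient ValueState<MyEvent> events;

    @Override
    public void open(Configuration conf) {
        ValueStateDescriptor<MyEvent> descriptor =
                new ValueStateDescriptor<>("events", MyEvent.class);
        descriptor.setQueryable("events");   // expose this keyed state under the name "events"
        events = getRuntimeContext().getState(descriptor);
    }

    @Override
    public void processElement(MyEvent value, Context ctx, Collector<MyEvent> out) throws Exception {
        events.update(value);
        out.collect(value);
    }
}

// usage: stream.keyBy("eventId").process(new EventStateFunction());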

Re: Evolving serializers and impact on flink managed states

2017-08-11 Thread Biplob Biswas
Hi Stefan, Thanks a lot for such a helpful response. That really made things a lot clearer for me. Although at this point I have one more, and probably last, question. According to the Flink documentation: [Attention] Currently, as of Flink 1.3, if the result of the compatibility check acknowledge

Re: Evolving serializers and impact on flink managed states

2017-08-11 Thread Biplob Biswas
Thanks a ton Stefan, that was really helpful. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Evolving-serializers-and-impact-on-flink-managed-states-tp14777p14837.html Sent from the Apache Flink User Mailing List archive. mailing list arch
