Re: Regarding implementation of aggregate function using a ProcessFunction

2018-10-02 Thread Gaurav Luthra
Hi Fabian, Thanks for explaining in detail. But we know and you also mentioned the issues in 1) and 2). So, I am continuing with point 3). Thanks & Regards Gaurav Luthra Mob:- +91-9901945206 On Mon, Oct 1, 2018 at 3:11 PM Fabian Hueske wrote: > Hi, > > There are basically three options: > 1)

Re: Scala case class state evolution

2018-10-02 Thread Hequn Cheng
Hi Elias, >From my understanding, you can't do this since the state will no longer be compatible. Best, Hequn On Wed, Oct 3, 2018 at 6:32 AM Elias Levy wrote: > Currently it is impossible to evolve a Scala case class by adding a new > field to it that is stored as managed state using the

Re: Flink Checkpoint Barrier in case of Join

2018-10-02 Thread Hequn Cheng
Hi Anil, The join operator behaviors same as other operators. When a non-source task receives a barrier from one of its inputs, it blocks that input until it receives a barrier from all inputs. When barriers have been received from all the inputs, the task takes a snapshot of its current state

Re: Flink Python streaming

2018-10-02 Thread Hequn Cheng
Hi Bing, I'm not familiar with python programming. I guess we can simply import libraries in the python script. A example can be found here[1]. Hope this helps. Best, Hequn [1] https://github.com/wdm0006/flink-python-examples/blob/master/mandelbrot/mandelbrot_set.py On Wed, Oct 3, 2018 at 1:49

Re: Deserialization of serializer errored

2018-10-02 Thread Elias Levy
I am wondering if the issue here is the createTuple2TypeInformation implicit is creating an anonymous class which results in a non-stable

Re: Deserialization of serializer errored

2018-10-02 Thread Elias Levy
To add to the mystery, I extracted the class file mentioned in the exceptions (TestJob$$anon$13$$anon$3) from the job jar that created the savepoint and disassembled it to determine what serializer it is. The serializer actually has nothing to do with the case class that was initially modified

Using FlinkKinesisConsumer through a proxy

2018-10-02 Thread Vijay Balakrishnan
HI, How do I use FlinkKinesisConsumer using the Properties through a proxy ? Getting a Connection issue through the proxy. Works outside the proxy. Properties kinesisConsumerConfig = new Properties(); kinesisConsumerConfig.setProperty(AWSConfigConstants.AWS_REGION, region); if

Scala case class state evolution

2018-10-02 Thread Elias Levy
Currently it is impossible to evolve a Scala case class by adding a new field to it that is stored as managed state using the default Flink serializer and restore a the job from a savepoint created using the previous version of the class, correct?

Flink Checkpoint Barrier in case of Join

2018-10-02 Thread Anil
I'm trying to understand when will Flink's Stream Barrier (for checkpoint) be emitted by the join operator. Consider a query like - select * from stream_1 a1 INNER JOIN stream_2 a2 on a2.orderId = a1.orderId group by HOP(a1.proctime, INTERVAL '1' HOUR, INTERVAL '1' DAY), a1.restaurantId

Re: Cancelled job not showing its details

2018-10-02 Thread Julio Biason
Oh, another piece of information: Because the job was failing and restarting, I did a cancel via the CLI tool during one of the restarts. On Tue, Oct 2, 2018 at 4:03 PM, Julio Biason wrote: > Hello, > > I had a job that was failing -- a bug on our code -- so I decided to > cancel it and deploy

Cancelled job not showing its details

2018-10-02 Thread Julio Biason
Hello, I had a job that was failing -- a bug on our code -- so I decided to cancel it and deploy the fix. Because I couldn't create a savepoint due the job restarting, I decided to kill it anyway and use the web interface to get the last successful checkpoint. The problem is: the interface is

Flink Python streaming

2018-10-02 Thread Bing Lin
Hi, I'm wondering how I can add dependencies for third party and custom libraries to be executed in Flink for python streaming? Thank you, Bing

Re: Event timers - metrics

2018-10-02 Thread Alexey Trenikhun
Thank you for looking Alexey Get Outlook for iOS From: Andrey Zagrebin Sent: Tuesday, October 2, 2018 8:27 AM To: Alexey Trenikhun Cc: user@flink.apache.org Subject: Re: Event timers - metrics Hi Alexey, Looking into the source, I think

Re: Event timers - metrics

2018-10-02 Thread Andrey Zagrebin
Hi Alexey, Looking into the source, I think there are no such metrics at the moment. Best, Andrey > On 2 Oct 2018, at 03:44, Alexey Trenikhun wrote: > > Hello, > Are built-in timer metrics? For example number of registered timers, number > of triggered timers etc > > Thanks, > Alexey

Re: flink custom stream source

2018-10-02 Thread Andrey Zagrebin
Hi, why is it a problem that snapshotState is called and does nothing? If there is nothing to snapshot, nothing will be stored, just formal routine. I would assume that in general Flink cannot assume anything about a subtask of a custom source. Flink is not aware that it does nothing and should

BucketingSink to S3: Missing class com/amazonaws/AmazonClientException

2018-10-02 Thread Julio Biason
Hey guys, I've setup a BucketingSink as a dead letter queue into our Ceph cluster using S3, but when I start the job, I get this error: java.lang.NoClassDefFoundError: com/amazonaws/AmazonClientException at java.lang.Class.forName0(Native Method) at

hadoopInputFormat and elasticsearch

2018-10-02 Thread aviad
Hi, I want to write batch job which reads data from *elasticsearch* using *elasticsearch-hadoop* (https://github.com/elastic/elasticsearch-hadoop/) and *hadoopInputFormat* example code (from https://github.com/genged/flink-playground/blob/master/src/main/java/com/mic/flink/FlinkMain.java):

flink custom stream source

2018-10-02 Thread Darshan Singh
Hi , I am creating a new custom source for reading some streaming data which has different streams. So I assign streams to each task slots and then read it. This works fine but in some cases I have less streams than task slots and in that case some of workers are not assigned any streams and

Re: [DISCUSS] Dropping flink-storm?

2018-10-02 Thread Aljoscha Krettek
+1 for dropping it > On 1. Oct 2018, at 10:55, Fabian Hueske wrote: > > +1 to drop it. > > Thanks, Fabian > > Am Sa., 29. Sep. 2018 um 12:05 Uhr schrieb Niels Basjes : > >> I would drop it. >> >> Niels Basjes >> >> On Sat, 29 Sep 2018, 10:38 Kostas Kloudas, >> wrote: >> >>> +1 to drop it

Re: Running job in detached mode via ClusterClient.

2018-10-02 Thread Till Rohrmann
Hi Piotr, the reason why you cannot submit multiple jobs to a job cluster is that a job cluster is only responsible for a single job. If you want to submit multiple jobs, then you need to start a session cluster. In attached mode, this is currently still possible, because under the hood, we

Running job in detached mode via ClusterClient.

2018-10-02 Thread Piotr Szczepanek
Hey, I've a question regarding submission job in detached mode. We're trying to submit job via YarnClusterClient in detached mode and the first one goes correctly, but when we submit second job we get exception: Multiple environments cannot be created in detached mode at

Re: Deserialization of serializer errored

2018-10-02 Thread Fabian Hueske
Hi Elias, I am not familiar with the recovery code, but Flink might read (some of ) the savepoint data even though it is not needed and loaded into operators. That would explain why you see an exception when the case class is modified or completely removed. Maybe Stefan or Gordon can help here.