Flink multiple windows

2018-06-08 Thread antonio saldivar
Hello. Has anyone worked with this approach? I am asking because I have to get aggregations (sum and count) for multiple window sizes (10 mins, 20 mins, 30 mins). Please let me know if this works properly or if there is another good solution. DataStream data = ... // append a Long 1 to each record to count
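A rough sketch of the idea (not Flink API code; function and variable names are mine): each record is assigned to one tumbling window per configured size, and a (sum, count) pair is kept per window.

```python
from collections import defaultdict

def window_start(ts_ms, size_ms):
    """Align a timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % size_ms)

def aggregate_multi_windows(events, sizes_ms):
    """Return {window_size: {window_start: (sum, count)}} for each size.

    events: iterable of (timestamp_ms, value) pairs.
    """
    result = {size: defaultdict(lambda: (0, 0)) for size in sizes_ms}
    for ts_ms, value in events:
        for size in sizes_ms:
            start = window_start(ts_ms, size)
            s, c = result[size][start]
            result[size][start] = (s + value, c + 1)
    return result
```

In Flink itself this corresponds to applying several window operators (one per size) to the same stream, or to cascading smaller windows into larger ones.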

Re: Conceptual question

2018-06-08 Thread TechnoMage
Thank you all. This discussion is very helpful. It sounds like I can wait for 1.6 though given our development status. Michael > On Jun 8, 2018, at 1:08 PM, David Anderson wrote: > > Hi all, > > I think I see a way to eagerly do full state migration without writing your > own Operator,

Re: [DISCUSS] Flink 1.6 features

2018-06-08 Thread Elias Levy
Since wishes are free: - Standalone cluster job isolation: https://issues.apache.org/jira/browse/FLINK-8886 - Proper sliding window joins (not overlapping hopping window joins): https://issues.apache.org/jira/browse/FLINK-6243 - Sharing state across operators:

Re: [DISCUSS] Flink 1.6 features

2018-06-08 Thread Stephan Ewen
Hi all! Thanks for the discussion and good input. Many suggestions fit well with the proposal above. Please bear in mind that with a time-based release model, we would release whatever is mature by end of July. The good thing is we could schedule the next release not too far after that, so that

Take elements from window

2018-06-08 Thread Antonio Saldivar Lezama
Hello. I am wondering whether it is possible to handle the following scenario: store all events by event time in a general window and process elements from a smaller time frame. 1.- Store elements in a general SlidingWindow (60 mins, 10 mins) - Rule 1 -> gets 10 mins elements from the
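For a (60 min, 10 min) sliding window, every event belongs to size/slide = 6 overlapping windows; a rule that only needs "the last 10 minutes" can then look at the most recent pane. A small sketch of the window-membership arithmetic (my own names, not Flink API code):

```python
def sliding_windows(ts_ms, size_ms, slide_ms):
    """Return the start timestamps of every sliding window containing ts_ms,
    newest first. Earlier windows may start at negative timestamps."""
    start = ts_ms - (ts_ms % slide_ms)  # most recent window start
    starts = []
    while start > ts_ms - size_ms:
        starts.append(start)
        start -= slide_ms
    return starts
```

For an event at minute 25 with size=60 min and slide=10 min, this yields six window starts: minutes 20, 10, 0, -10, -20, -30.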

Re: Conceptual question

2018-06-08 Thread David Anderson
Hi all, I think I see a way to eagerly do full state migration without writing your own Operator, but it's kind of hacky and may have flaws I'm not aware of. In Flink 1.5 we now have the possibility to connect BroadcastStreams to KeyedStreams and apply a KeyedBroadcastProcessFunction. This is
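The core of the eager approach, stripped of the Flink plumbing, is: a broadcast signal triggers one pass over all keyed state, rewriting each entry with the new format. A minimal sketch (plain Python, hypothetical names; in Flink 1.5 this would live inside a KeyedBroadcastProcessFunction with applyToKeyedState):

```python
def eager_migrate(keyed_state, migration):
    """On receiving a broadcast 'migrate' signal, rewrite every key's state
    at once instead of converting lazily on access."""
    for key in list(keyed_state):
        keyed_state[key] = migration(keyed_state[key])
    return keyed_state
```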

Heap Problem with Checkpoints

2018-06-08 Thread Fabian Wollert
Hi, in this email thread here, I tried to set up S3 as a filesystem backend for checkpoints. Now everything is working (Flink v1.5.0), but

State life-cycle for different state-backend implementations

2018-06-08 Thread Rinat
Hi mates, got a question about different state backends. As far as I understand, on every checkpoint Flink flushes its current state to the backend. In the case of FsStateBackend we'll have a separate file for each checkpoint, and over the job's lifecycle we run the risk of a huge amount of
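Flink bounds this by retaining only a configured number of completed checkpoints and subsuming (deleting) older ones. The cleanup logic, conceptually (my own names, not Flink internals):

```python
def retain_checkpoints(checkpoint_ids, max_retained):
    """Keep only the newest max_retained checkpoints; older ones are
    subsumed and their files can be deleted. Returns (keep, discard)."""
    ordered = sorted(checkpoint_ids)
    keep = ordered[-max_retained:]
    discard = ordered[: len(ordered) - len(keep)]
    return keep, discard
```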

[BucketingSink] notify on moving into pending/ final state

2018-06-08 Thread Rinat
Hi mates, I have a proposal regarding BucketingSink functionality. While implementing one of our tasks we ran into the following need - create a meta-file, with the path and additional information about the file created by BucketingSink, when it has been moved into its final place. Unfortunately such
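The proposal amounts to a notification hook fired when a file transitions to its final state. A minimal sketch of the idea (plain Python, hypothetical names; BucketingSink itself has no such hook):

```python
import json
import os

def move_to_final(pending_path, final_path, on_final=None):
    """Move a pending file to its final location and fire a notification hook."""
    os.replace(pending_path, final_path)
    if on_final:
        on_final(final_path)

def write_meta_file(final_path):
    """Hypothetical hook: write a .meta file next to the finalized file,
    holding its path and additional information."""
    meta = {"path": final_path, "size": os.path.getsize(final_path)}
    with open(final_path + ".meta", "w") as f:
        json.dump(meta, f)
```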

Re: [flink-connector-filesystem] OutOfMemory in checkpointless environment

2018-06-08 Thread Rinat
Chesnay, thx for your reply, I’ve created one https://issues.apache.org/jira/browse/FLINK-9558 > On 8 Jun 2018, at 12:58, Chesnay Schepler wrote: > > I agree, if the sink doesn't properly work without checkpointing we should > make sure

Can not submit flink job via IP or VIP of jobmanager

2018-06-08 Thread xie wei
Hello Flink team, We use Flink on DCOS and have problems submitting a Flink job from within a container to the Flink cluster. Both the container and the Flink cluster are running inside DCOS, on different nodes. We have the following setup: Flink was installed on DCOS using the package from

Implementation of ElasticsearchSinkFunction, how to handle class level variables

2018-06-08 Thread Jayant Ameta
Hi, I'm trying to integrate ElasticsearchSink into my pipeline. The example shows an anonymous class implementing ElasticsearchSinkFunction. This is passed as a constructor argument to another
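Since the sink function is serialized and shipped to the task managers, class-level fields must either be serializable or be created lazily on the worker rather than in the constructor. A sketch of that pattern (Python stand-in, my own names; a real implementation would use the Java API):

```python
class ElasticsearchSinkFunctionSketch:
    """Sketch: keep non-serializable resources out of the shipped object
    by creating them lazily on first use instead of in the constructor."""

    def __init__(self, hosts):
        self.hosts = hosts      # plain, serializable configuration
        self._client = None     # heavy resource, created per task instance

    def _get_client(self):
        if self._client is None:
            # stand-in for building a real client from the configuration
            self._client = {"connected_to": self.hosts}
        return self._client

    def process(self, element):
        """Build an index action for one element using the lazy client."""
        client = self._get_client()
        return {"index": "my-index", "source": element,
                "via": client["connected_to"]}
```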

Re: [flink-connector-filesystem] OutOfMemory in checkpointless environment

2018-06-08 Thread Chesnay Schepler
I agree, if the sink doesn't properly work without checkpointing we should make sure that it fails early if it is used that way. It would be great if you could open a JIRA. On 08.06.2018 10:08, Rinat wrote: Piotr, thx for your reply, for now everything is pretty clear. But from my point of view,

Re: Late data before window end is even close

2018-06-08 Thread Fabian Hueske
Thanks for reporting back and the debugging advice! Best, Fabian 2018-06-08 9:00 GMT+02:00 Juho Autio : > Flink was NOT at fault. Turns out our Kafka producer had OS level clock > sync problems :( > > Because of that, our Kafka occasionally had some messages in between with > an incorrect

Re: [flink-connector-filesystem] OutOfMemory in checkpointless environment

2018-06-08 Thread Rinat
Piotr, thx for your reply, for now everything is pretty clear. But from my point of view, it would be better to add some information about leaks when checkpointing is disabled to the BucketingSink documentation > On 8 Jun 2018, at 10:35, Piotr Nowojski wrote: > > Hi, > > BucketingSink is designed

Flink kafka consumer stopped committing offsets

2018-06-08 Thread Juho Autio
Hi, We have a Flink streaming job that uses the Flink Kafka consumer. Normally it commits consumer offsets to Kafka. However, this stream ended up in a state where it is otherwise working just fine, but it is no longer committing offsets to Kafka. The job keeps writing correct aggregation results to
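In commit-on-checkpoint mode, the Flink Kafka consumer only commits offsets back to Kafka when a checkpoint completes, so stalled or failing checkpoints show up as exactly this symptom: correct output, stale committed offsets. The mechanism, sketched in plain Python (my own names):

```python
class OffsetCommitSketch:
    """Sketch: offsets are committed to Kafka only when a checkpoint
    completes, mirroring the consumer's commit-on-checkpoint mode."""

    def __init__(self):
        self.current_offsets = {}  # partition -> latest processed offset
        self.pending = {}          # checkpoint_id -> offsets snapshot
        self.committed = {}        # what Kafka would see

    def record(self, partition, offset):
        self.current_offsets[partition] = offset

    def snapshot(self, checkpoint_id):
        """Called when a checkpoint is triggered."""
        self.pending[checkpoint_id] = dict(self.current_offsets)

    def notify_checkpoint_complete(self, checkpoint_id):
        """Only now do the snapshotted offsets reach Kafka."""
        if checkpoint_id in self.pending:
            self.committed.update(self.pending.pop(checkpoint_id))
```

If `notify_checkpoint_complete` never fires, `committed` stays frozen while processing continues.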

Re: Datastream[Row] convert to table exception

2018-06-08 Thread Timo Walther
Yes, that's a workaround. I found the cause of the problem. It is a Scala API specific problem. See: https://issues.apache.org/jira/browse/FLINK-9556 Thanks for reporting it! Regards, Timo On 08.06.18 at 09:43, 孙森 wrote: Yes, I did override the method, but it did not work. Finally, I

Re: [flink-connector-filesystem] OutOfMemory in checkpointless environment

2018-06-08 Thread Piotr Nowojski
Hi, BucketingSink is designed to provide exactly-once writes to file system, which is inherently tied to checkpointing. As you just saw, without checkpointing, BucketingSink is never notified that it can commit pending files. If you do not want to use checkpointing for some reasons, you could
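The lifecycle Piotr describes can be sketched in a few lines (plain Python, my own names): rolled files land in a pending list, and only a completed checkpoint moves them to final. Without checkpoints, the pending list grows forever, which is the leak discussed in this thread.

```python
class BucketingSinkSketch:
    """Sketch of why BucketingSink needs checkpointing: files move
    in-progress -> pending when rolled, but pending -> final only
    happens when a checkpoint completes."""

    def __init__(self):
        self.pending = []
        self.final = []

    def roll_file(self, name):
        self.pending.append(name)  # closed, waiting for a checkpoint

    def notify_checkpoint_complete(self):
        self.final.extend(self.pending)  # commit everything pending
        self.pending.clear()
```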

Re: Conceptual question

2018-06-08 Thread Piotr Nowojski
Hi, Yes it should be feasible. As I said before, with Flink 1.6 there will be a better way to migrate state, but for now you either need to lazily convert the state or iterate over the keys and do the job manually. Piotrek > On 7 Jun 2018, at 15:52, Tony Wei wrote: > > Hi Piotrek, > >
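The lazy-conversion option mentioned here means upgrading each state entry on access: store a version alongside the value and apply any pending migrations when the key is next touched. A minimal sketch (plain Python, hypothetical names):

```python
def migrate_lazily(state_value, current_version, migrations):
    """Upgrade a (value, version) state entry on access by applying each
    pending migration step in order until it reaches current_version."""
    value, version = state_value
    while version < current_version:
        value = migrations[version](value)
        version += 1
    return value, version
```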

Re: Late data before window end is even close

2018-06-08 Thread Juho Autio
Flink was NOT at fault. Turns out our Kafka producer had OS-level clock sync problems :( Because of that, our Kafka occasionally had some messages in between with an incorrect timestamp. In practice they were about 7 days older than they should have been. I'm really sorry for wasting your time on this.
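This explains the "late before the window end is even close" symptom: an element stamped ~7 days in the past falls into a window that closed long ago relative to the current watermark. The drop condition can be sketched as (assuming event-time windows and the default zero allowed lateness):

```python
DAY_MS = 24 * 60 * 60 * 1000

def is_dropped_as_late(window_end_ms, watermark_ms, allowed_lateness_ms=0):
    """An element is dropped as late when the end of its window plus any
    allowed lateness is at or before the current watermark."""
    return window_end_ms + allowed_lateness_ms <= watermark_ms
```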