Re: applying where after group by in dataset

2016-08-17 Thread Jark Wu
Hi, You can only apply aggregate after groupBy. But you can apply filter before groupBy. So you can do like this: distancePoints.filter(filterFunction).groupBy(1) - Jark Wu > 在 2016年8月17日,下午5:29,subash basnet 写道: > > here

Re: Cannot use WindowedStream.fold with EventTimeSessionWindows

2016-08-17 Thread Jack Huang
Hi Till, The session I am dealing with does not have a reliable "end-of-session" event. It could stop sending events all of sudden or it could keep sending events forever. I need to be able to determine when a session expire due to inactivity or to kill off a session if it lives longer than it

off heap memory deallocation

2016-08-17 Thread Janardhan Reddy
Hi, When does off heap memory gets deallocated ? Does it get deallocated only when gc is triggered ? When does the gc gets triggered other than when the direct memory reached -XX::MaxDirectMemory limit passed in jvm flag. Thanks

Re: Sorting in datastream

2016-08-17 Thread subash basnet
Hello Stephan, Okey, then it's the same reason why there is no *count()* function in Data streams as well I suppose. Regards, Subash On Wed, Aug 17, 2016 at 6:26 PM, Stephan Ewen wrote: > Hi! > > Data streams are inifnite. It's quite hard to sort something infinite ;-) >

Re: Sorting in datastream

2016-08-17 Thread Stephan Ewen
Hi! Data streams are inifnite. It's quite hard to sort something infinite ;-) That's why the operation does not exist on DataStream. Stephan On Wed, Aug 17, 2016 at 6:22 PM, subash basnet wrote: > Hello all, > > I found the *sortPartition()* function in dataset for

Sorting in datastream

2016-08-17 Thread subash basnet
Hello all, I found the *sortPartition()* function in dataset for ordering the dataset elements as below: DataSet> data; DataSet> partitionedData = data.sortPartition(0, Order.DESCENDING); But I couldn't find any methods to sort the elements in

Re: Programmatically Creating a Flink Cluster On YARN

2016-08-17 Thread Maximilian Michels
Hi Benjamin, Please apologize the late reply. In the latest code base and also Flink 1.1.1, the Flink configuration doesn't have to be loaded via a file location read from an environment variable and it doesn't throw an exception if it can't find the config upfront (phew). Instead, you can also

Re: partial savepoints/combining savepoints

2016-08-17 Thread Till Rohrmann
Hi Claudia, On Wed, Aug 17, 2016 at 5:06 PM, Claudia Wegmann wrote: > Hi again, > > > > I was thinking this over and tried some stuff. Some questions remain, > though: > > > > To 2) > > a) I only have used standalone mode yet. What would be the upsides of > using Yarn? >

AW: partial savepoints/combining savepoints

2016-08-17 Thread Claudia Wegmann
Hi again, I was thinking this over and tried some stuff. Some questions remain, though: To 2) a) I only have used standalone mode yet. What would be the upsides of using Yarn? b) Does an easy way to start Flink (JobManager+TaskManagers) programmatically already exist? c) I tried to build my

counting elements in datastream

2016-08-17 Thread subash basnet
Hello all, There is *count()* function in database to count the number of elements in the dataset. But it's not there in case of datastream. What could be the way to count the number of elements in case of datastream. Best Regards, Subash Basnet

Re: TumblingEventTimeWindows with time characteristic set to 'ProcessingTime'

2016-08-17 Thread Stephan Ewen
You can still mix processing time and event time. You only need to set the base environment to event time in order to activate the timestamp/watermark transport. On Wed, Aug 17, 2016 at 1:40 PM, Al-Isawi Rami wrote: > Hi Till, > > Yes, I understand that. I am just

Re: TumblingEventTimeWindows with time characteristic set to 'ProcessingTime'

2016-08-17 Thread Al-Isawi Rami
Hi Till, Yes, I understand that. I am just asking why. If I am assigning the timestamps. why can’t flink deal with the window based on the event time? i.e TumblingEventTimeWindows. Why should I set the whole env to be ticking on event time? It is just that I have some windows that needs to

Re: Error joining with Python API

2016-08-17 Thread Chesnay Schepler
Found the issue, there was a missing tab in the chaining method... On 16.08.2016 12:12, Chesnay Schepler wrote: looks like a bug, will look into it. :) On 16.08.2016 10:29, Ufuk Celebi wrote: I think that this is actually a bug in Flink. I'm cc'ing Chesnay who originally contributed the

applying where after group by in dataset

2016-08-17 Thread subash basnet
Hello all, In the following dataset: DataSet> distancePoints; I wanted to count the number of *distancePoints* where boolean value is either true of false. distancePoints.groupBy(1). didn't find how to apply there 'where' clause here. Best Regards, Subash Basnet

Re: Cannot use WindowedStream.fold with EventTimeSessionWindows

2016-08-17 Thread Till Rohrmann
Hi Jack, the problem with session windows and a fold operation, which is an incremental operation, is that you don't have a way to combine partial folds when merigng windows. As a workaround you have to specify a window function where you get an iterator over all your window elements and then

Re: Azure Blob Storage Connector

2016-08-17 Thread Lau Sennels
I have successfully connected Azure blob storage to Flink-1.1. Below are the steps necessary: - Add hadoop-azure-2.7.2.jar (assuming you are using a Hadoop 2.7 Flink binary) and azure-storage-4.3.0.jar to /lib, and set file permissions / ownership accordingly. - Add the following to a file

Re: TumblingEventTimeWindows with time characteristic set to 'ProcessingTime'

2016-08-17 Thread Till Rohrmann
Hi Rami, have you set the event time characteristic to event time with env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);? Otherwise the event time windows won’t work. Cheers, Till ​ On Tue, Aug 16, 2016 at 2:42 PM, Al-Isawi Rami wrote: > Hi, > > Why

Re: Enriching events with data from external http resources

2016-08-17 Thread Maciek Próchniak
Hi Ufuk, thanks for info - this is good news :) maciek On 16/08/2016 12:16, Ufuk Celebi wrote: On Mon, Aug 15, 2016 at 8:52 PM, Maciek Próchniak wrote: I know it's not really desired way of using flink and that it would be better to keep data as state inside stream and have