Re: ValueState is missing

2016-08-11 Thread Dong-iL, Kim
in my code, is the config of ExecutionEnv alright? > On Aug 11, 2016, at 8:47 PM, Dong-iL, Kim wrote: > > > my code and log is as below. > > >val getExecuteEnv: StreamExecutionEnvironment = { >val env = >

Re: Do Flink DataStreams use combiners?

2016-08-11 Thread Sameer W
Sorry, I meant that streaming cannot use combiners (repeated below) --- Streaming cannot use combiners. The aggregations happen on the trigger. The elements being aggregated are only known after the trigger delivers the elements to the evaluation function. Since windows can overlap and even

RE: flink - Working with State example

2016-08-11 Thread Ramanan, Buvana (Nokia - US)
Kostas, Good catch! That makes it work! Thank you so much for the help. Regards, Buvana -----Original Message----- From: Kostas Kloudas [mailto:k.klou...@data-artisans.com] Sent: Thursday, August 11, 2016 11:22 AM To: user@flink.apache.org Subject: Re: flink - Working with State example Hi

Do Flink DataStreams use combiners?

2016-08-11 Thread Elias Levy
I am wondering if Flink makes use of combiners to pre-reduce a keyed and windowed stream before shuffling the data among workers. I.e. will it use a combiner in something like: stream.flatMap {...} .assignTimestampsAndWatermarks(...) .keyBy(...) .timeWindow(...)
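For readers unfamiliar with the term, the idea behind a combiner can be sketched in plain Java. This is only an illustration of the general pre-reduce concept the question refers to, not Flink code and not a statement about what Flink does; the class and method names (`CombinerSketch`, `combineLocally`, `mergePartials`) are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a combiner: each worker pre-reduces its
// local records per key, so only one partial result per key is shuffled.
final class CombinerSketch {

    /** Counts the occurrences of each key among one worker's local records. */
    static Map<String, Long> combineLocally(List<String> keys) {
        Map<String, Long> partial = new HashMap<>();
        for (String key : keys) {
            partial.merge(key, 1L, Long::sum);
        }
        return partial;
    }

    /** Merges the per-worker partial counts into the final count per key. */
    static Map<String, Long> mergePartials(List<Map<String, Long>> partials) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> partial : partials) {
            partial.forEach((key, count) -> total.merge(key, count, Long::sum));
        }
        return total;
    }
}
```

The pre-reduce step is only valid when the aggregation is associative and commutative, as a count or sum is.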

Re: Is java.sql.Timestamp fully supported in Flink SQL?

2016-08-11 Thread Timo Walther
Hi Davran, unfortunately, you found a bug. I created an issue for it ( https://issues.apache.org/jira/browse/FLINK-4385). You could convert the timestamp to a long value as a workaround. Table table1 = tableEnv.fromDataSet(dataSet1); Table table2 = tableEnv.fromDataSet(dataSet2); Table table
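The suggested workaround (carrying the timestamp as a long) might look roughly like the following plain-Java sketch. The helper class `TimestampAsLong` and its method names are hypothetical; only `Timestamp.getTime()` and the `Timestamp(long)` constructor are standard JDK calls.

```java
import java.sql.Timestamp;

// Hypothetical helper: represent a java.sql.Timestamp as epoch milliseconds
// before handing the data set to the Table API, and convert back afterwards.
final class TimestampAsLong {

    /** Milliseconds since 1970-01-01T00:00:00Z; sub-millisecond nanos are dropped. */
    static long toEpochMillis(Timestamp ts) {
        return ts.getTime();
    }

    /** Rebuilds a Timestamp from epoch milliseconds. */
    static Timestamp fromEpochMillis(long millis) {
        return new Timestamp(millis);
    }
}
```

Note that the round trip keeps millisecond precision only, so any finer fractional seconds in the original value would be lost.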

Re: Firing windows multiple times

2016-08-11 Thread Shannon Carey
"If Window B is a Folding Window and does not have an evictor then it should not keep the list of all received elements." Agreed! Upon closer inspection, the behavior I'm describing is only present when using EvictingWindowOperator, not when using WindowOperator. I misread line 382 of

Is java.sql.Timestamp fully supported in Flink SQL?

2016-08-11 Thread Davran Muzafarov
I have two tables created from data sets: List infos0 = . List infos1 = . DataSet dataSet0 = env.fromCollection( infos0 ); DataSet dataSet1 = env.fromCollection( infos1 ); tableEnv.registerDataSet( "table0", dataSet0 ); tableEnv.registerDataSet( "table1",

Unit tests failing, losing stream contents

2016-08-11 Thread Ciar, David B.
Hi everyone, I've been trying to write unit tests for my data stream bolts (map, flatMap, apply etc.), however the results I've been getting are strange. The code for testing is here (running with scalatest and sbt): https://gist.github.com/dbciar/7469adfea9e6442cdc9568aed07095ff It runs

Re: flink - Working with State example

2016-08-11 Thread Kostas Kloudas
Hi Buvana, At first glance, your snapshotState() should return a Double. Kostas > On Aug 11, 2016, at 4:57 PM, Ramanan, Buvana (Nokia - US) > wrote: > > Thank you Kostas & Ufuk. I get into the following compilation error when I > use checkpointed

Re: Firing windows multiple times

2016-08-11 Thread Aljoscha Krettek
Hi Shannon, thanks for the clarification. If Window B is a Folding Window and does not have an evictor then it should not keep the list of all received elements. Could you maybe post the section of the log that shows what window operator is used for Window B? I'm looking for something like this:

RE: flink - Working with State example

2016-08-11 Thread Ramanan, Buvana (Nokia - US)
Thank you Kostas & Ufuk. I run into the following compilation error when I use the Checkpointed interface. Pasting the code & message as follows: Is the Serializable definition supposed to be from java.io.Serializable or somewhere else? Thanks again, Buvana

Re: flink - Working with State example

2016-08-11 Thread Kostas Kloudas
Exactly as Ufuk suggested, if you are not grouping your stream by key, you should use the Checkpointed interface. The reason I asked before if you are using keyBy() is that it is the one that implicitly sets the keySerializer and scopes your (keyed) state to a specific key. If there

Re: flink - Working with State example

2016-08-11 Thread Ufuk Celebi
This only works for keyed streams, you have to use keyBy(). You can use the Checkpointed interface instead (https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state.html#checkpointing-instance-fields). On Thu, Aug 11, 2016 at 3:35 PM, Ramanan, Buvana (Nokia - US)

Re: specify user name when connecting to hdfs

2016-08-11 Thread Ufuk Celebi
Do you also set fs.hdfs.hadoopconf in flink-conf.yaml (https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#common-options)? On Thu, Aug 11, 2016 at 2:47 PM, Dong-iL, Kim wrote: > Hi. > In this case , I used standalone cluster(aws EC2) and I wanna connect

RE: flink - Working with State example

2016-08-11 Thread Ramanan, Buvana (Nokia - US)
Hi Kostas, Here is my code. All I am trying to compute is (x[t] – x[t-1]), where x[t] is the current value of the incoming sample and x[t-1] is the previous value of the incoming sample. I store the current value in state store (‘prev_tuple’) so that I can use it for computation in next cycle.
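The computation described here, x[t] - x[t-1] with the previous sample kept as state, can be sketched in plain Java as follows. This is not Buvana's code and does not use Flink: `DeltaFunction` is a hypothetical stand-in, and its `snapshotState()` / `restoreState()` methods only mirror the names of the Checkpointed interface discussed in this thread, with simplified signatures that are assumptions.

```java
import java.io.Serializable;

// Hypothetical stand-in for a stateful map function that emits the
// difference between the current sample and the previous one.
final class DeltaFunction implements Serializable {
    private static final long serialVersionUID = 1L;

    // Previous sample x[t-1]; null until the first sample has been seen.
    private Double prev;

    /** Returns x[t] - x[t-1], or null for the very first sample. */
    Double map(double current) {
        Double delta = (prev == null) ? null : current - prev;
        prev = current;
        return delta;
    }

    /** State handed to a checkpoint: a Double, which is java.io.Serializable. */
    Double snapshotState() {
        return prev;
    }

    /** Restores the previous sample after a failure. */
    void restoreState(Double state) {
        prev = state;
    }
}
```

Keeping the state as a boxed `Double` makes the "no previous sample yet" case explicit and gives the snapshot a serializable type.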

Re: Strange behaviour of the flatMap Collector

2016-08-11 Thread Yassin Marzouki
Indeed, using the same parallelism corrected the output. Thank you! On Thu, Aug 11, 2016 at 2:34 PM, Stephan Ewen wrote: > Hi! > > The source runs parallel (n tasks), but the sink has a parallelism of 1. > The sink hence has to merge the parallel streams from the source, which

Re: specify user name when connecting to hdfs

2016-08-11 Thread Dong-iL, Kim
Hi. In this case, I used a standalone cluster (AWS EC2) and I want to connect to a remote HDFS machine (AWS EMR). I registered the location of core-site.xml as below. Does it need other properties? fs.defaultFS hdfs://…:8020 hadoop.security.authentication

Re: Strange behaviour of the flatMap Collector

2016-08-11 Thread Stephan Ewen
Hi! The source runs in parallel (n tasks), but the sink has a parallelism of 1. The sink hence has to merge the parallel streams from the source, which happens based on the arrival speed of the streams, i.e., it's not deterministic. That's why you see the lines being mixed. Try running source and sink
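The effect described here can be imitated with plain Java threads. The sketch below is an analogy, not Flink code: `MergeOrderSketch` is a hypothetical class in which several producer threads stand in for the parallel source tasks and one shared queue stands in for the single sink.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical analogy: n producer threads (parallel source tasks) write
// into one shared queue (a sink with parallelism 1).
final class MergeOrderSketch {

    /** Each of `parallelism` tasks emits `perTask` records labelled "task:index". */
    static List<String> run(int parallelism, int perTask) {
        ConcurrentLinkedQueue<String> sink = new ConcurrentLinkedQueue<>();
        List<Thread> tasks = new ArrayList<>();
        for (int t = 0; t < parallelism; t++) {
            final int task = t;
            Thread thread = new Thread(() -> {
                for (int i = 0; i < perTask; i++) {
                    sink.add(task + ":" + i);
                }
            });
            tasks.add(thread);
            thread.start();
        }
        for (Thread thread : tasks) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new ArrayList<>(sink);
    }
}
```

With one producer the records arrive in emission order. With several, each task's own records stay in order relative to each other, but how the tasks interleave depends on thread scheduling and can differ between runs.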

Re: specify user name when connecting to hdfs

2016-08-11 Thread Stephan Ewen
Hi! Do you register the Hadoop Config at the Flink Configuration? Also, do you use Flink standalone or on Yarn? Stephan On Tue, Aug 9, 2016 at 11:00 AM, Dong-iL, Kim wrote: > Hi. > I’m trying to set external hdfs as state backend. > my os user name is ec2-user. hdfs user

Strange behaviour of the flatMap Collector

2016-08-11 Thread Yassin Marzouki
Hi all, When I use out.collect() twice inside a flatMap, the output is sometimes randomly skewed. Take this example: final StreamExecutionEnvironment env = StreamExecutionEnvironment.createLocalEnvironment(); env.generateSequence(1, 10) .flatMap((Long t, Collector out) -> {

Re: Maintaining global variables. Best practices.

2016-08-11 Thread Stephan Ewen
Hi! A global shared variable is not something that is offered by Flink right now. It is not part of the system, because it is not really part of the stream or state derived from individual streams. It is also quite hard to do efficiently and in a general-purpose way. I see that it is a useful tool in

Re: Firing windows multiple times

2016-08-11 Thread Kostas Kloudas
Just to add a drawback of solution 2): you may have some issues because window boundaries may not be aligned. For example, the elements of a day window may be split between the last day of a month and the first of the next month. Kostas > On Aug 11, 2016, at 2:21 PM, Kostas Kloudas

Re: Firing windows multiple times

2016-08-11 Thread Kostas Kloudas
Hi Shannon, From what I understand, you want to have your results windowed by different durations, e.g. by minute, by day, by month, and you use the evictor to decide which elements should go into each window. If I am correct, then I do not think that you need the evictor which bounds

Re: Flink : CEP processing

2016-08-11 Thread Aljoscha Krettek
Hi, Sameer is right about the snapshotting. The CEP operator behaves more or less like a FlatMap operator that keeps some more complex state internally. Snapshotting works the same as with any other operator. Cheers, Aljoscha On Thu, 11 Aug 2016 at 00:54 Sameer W wrote: >

Re: ValueState is missing

2016-08-11 Thread Dong-iL, Kim
My code and log are as below. val getExecuteEnv: StreamExecutionEnvironment = { val env = StreamExecutionEnvironment.getExecutionEnvironment.enableCheckpointing(1) env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime)

Re: ValueState is missing

2016-08-11 Thread Ufuk Celebi
What exactly do you mean by lost? You call value() and it returns a value (!= null/defaultValue) and you call it again and it returns null/defaultValue for the same key with no update in between? On Thu, Aug 11, 2016 at 11:59 AM, Kostas Kloudas wrote: > Hello, > >

Re: ValueState is missing

2016-08-11 Thread Kostas Kloudas
Hello, Could you share the code of the job you are running? With only this information I am afraid we cannot help much. Thanks, Kostas > On Aug 11, 2016, at 11:55 AM, Dong-iL, Kim wrote: > > Hi. > I’m using flink 1.0.3 on aws EMR. > sporadically value of ValueState is

ValueState is missing

2016-08-11 Thread Dong-iL, Kim
Hi. I’m using Flink 1.0.3 on AWS EMR. Sporadically, the value of a ValueState is lost. What is a good starting point for solving this problem? Thank you.

[ANNOUNCE] Flink 1.1.1 Released

2016-08-11 Thread Ufuk Celebi
The Flink PMC is pleased to announce the availability of Flink 1.1.1. The Maven artifacts published on Maven central for the previous 1.1.0 version had a Hadoop dependency issue. No Hadoop 1 specific version (with version 1.1.0-hadoop1) was deployed and the 1.1.0 artifacts have a dependency on