Re: Lot of data generated in out file
Thanks Gordon for your reply. The out file mistry got resolved. Someone accidently, modified the POJO code on server that I work on to, and had put in println. Thank you for the information. I am experimenting with windowing to understand better and fit in my use case. Thanks -Ashish On Sun, Apr 15, 2018 at 10:09 PM, Tzu-Li (Gordon) Tai wrote: > Hi Ashish, > > I don't really see why there are outputs in the out file for the program > you > provided. Perhaps others could chime in here .. > > As for your second question regarding window outputs: > Yes, subsequent window operators should definitely be doable in Flink. > This is just a matter of multiple transformations in your pipeline. > The only restriction right now, is that after a window operation, the > stream > is no longer a KeyedStream, so you would need to "re-key" the stream before > applying the second windowed transformation. > > Cheers, > Gordon > > > > -- > Sent from: http://apache-flink-user-mailing-list-archive.2336050. > n4.nabble.com/ > -- Thanks -Ashish Attarde
Re: Lot of data generated in out file
Hi Ashish, I don't really see why there are outputs in the out file for the program you provided. Perhaps others could chime in here .. As for your second question regarding window outputs: Yes, subsequent window operators should definitely be doable in Flink. This is just a matter of multiple transformations in your pipeline. The only restriction right now, is that after a window operation, the stream is no longer a KeyedStream, so you would need to "re-key" the stream before applying the second windowed transformation. Cheers, Gordon -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Lot of data generated in out file
Hi Flink Team, I am seeing one of the out file for on my task manager is dumping lot of data. Not sure, why this is happening. All the data that is getting dumped in out file is ideally what *parsedInput *stream should be getting. Here is the flink program that is executing: env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); env.setParallelism(1); DataStream rawInput = env.addSource(new FlinkKafkaConsumer010<>( "event-ft", new SimpleStringSchema(), kafkaProps).setStartFromLatest()); DataStream input2 = rawInput .map(new KafkaMsgReads()); DataStream parsedInput = input2 .flatMap(new Splitter()) .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor(Time.seconds(2)) { @Override public long extractTimestamp(EventRec record) { return record.getmTimeStamp()/TT_SCALE_FACTOR; } }).rebalance().map(new RawInputCounter()); parsedInput .keyBy("mflowHashLSB","mflowHashMSB") .window(SlidingEventTimeWindows.of(Time.milliseconds(1000),Time.milliseconds(950))) .allowedLateness(Time.seconds(1)) .apply(new CRWindow()); parsedInput.writeUsingOutputFormat(new DiscardingOutputFormat<>()); env.execute(); Here is the definition of *CRWindow* class: public static class CRWindow implements WindowFunction { @Override public void apply(Tuple key, TimeWindow window, Iterable ftRecords, Collector collector) { return; } } Also, is there any elaborate documentation of windowing mechanism available? I am intereseted in using windowing with ability to push the events from one window to future window. Similar funcationality exist in storm for pushing an event to subsequent window. Thanks -Ashish