Re: Lot of data generated in out file

2018-04-15 Thread Ashish Attarde
Thanks Gordon for your reply.

The out file mystery got resolved. Someone accidentally modified the POJO
code on the server that I work on and had put in a println.
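
For anyone hitting the same symptom: anything an operator or POJO writes to
stdout ends up in the TaskManager's .out file. A minimal sketch of the kind
of accidental println that can cause this (the method body here is
hypothetical, not the actual code from our server):

    // Hypothetical version of the POJO: every record passing through this
    // getter echoes a line to stdout, which Flink redirects to the
    // TaskManager's .out file.
    public class EventRec {
        private long mTimeStamp;

        public long getmTimeStamp() {
            // Debug line left in by mistake -- this is what floods .out.
            System.out.println("EventRec ts: " + mTimeStamp);
            return mTimeStamp;
        }
    }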

Thank you for the information. I am experimenting with windowing to
understand it better and fit it to my use case.

Thanks
-Ashish

On Sun, Apr 15, 2018 at 10:09 PM, Tzu-Li (Gordon) Tai wrote:

> Hi Ashish,
>
> I don't really see why there are outputs in the out file for the program
> you provided. Perhaps others could chime in here.
>
> As for your second question regarding window outputs:
> Yes, subsequent window operators should definitely be doable in Flink.
> This is just a matter of multiple transformations in your pipeline.
> The only restriction right now is that after a window operation, the
> stream is no longer a KeyedStream, so you would need to "re-key" the
> stream before applying the second windowed transformation.
>
> Cheers,
> Gordon
>



-- 

Thanks
-Ashish Attarde


Re: Lot of data generated in out file

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi Ashish,

I don't really see why there are outputs in the out file for the program you
provided. Perhaps others could chime in here.

As for your second question regarding window outputs:
Yes, subsequent window operators should definitely be doable in Flink.
This is just a matter of multiple transformations in your pipeline.
The only restriction right now is that after a window operation, the stream
is no longer a KeyedStream, so you would need to "re-key" the stream before
applying the second windowed transformation.
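
To make that concrete, here is a minimal sketch of chaining two windowed
transformations with a "re-key" in between, reusing the types from Ashish's
program (the second window's size and reuse of CRWindow are illustrative
only, not part of the original pipeline):

    // After the first window's apply(), the result is a plain DataStream,
    // so it has to be keyed again before the next window can be applied.
    DataStream<EventRec> firstPass = parsedInput
            .keyBy("mflowHashLSB", "mflowHashMSB")
            .window(SlidingEventTimeWindows.of(Time.milliseconds(1000), Time.milliseconds(950)))
            .apply(new CRWindow());               // no longer a KeyedStream here

    DataStream<EventRec> secondPass = firstPass
            .keyBy("mflowHashLSB", "mflowHashMSB")                // "re-key"
            .window(TumblingEventTimeWindows.of(Time.seconds(5))) // illustrative size
            .apply(new CRWindow());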

Cheers,
Gordon





Lot of data generated in out file

2018-04-06 Thread Ashish Attarde
Hi Flink Team,

I am seeing that one of the out files on my task manager is dumping a lot of
data. I am not sure why this is happening. All the data that is getting
dumped in the out file is ideally what the *parsedInput* stream should be
getting.



Here is the Flink program that is executing:

env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
env.setParallelism(1);

// Note: the generic type parameters below were stripped by the mailing-list
// archive; they are reconstructed here from context (SimpleStringSchema
// yields String; the timestamp extractor and window function use EventRec).
DataStream<String> rawInput = env.addSource(new FlinkKafkaConsumer010<>(
        "event-ft",
        new SimpleStringSchema(),
        kafkaProps).setStartFromLatest());

DataStream<EventRec> input2 = rawInput
        .map(new KafkaMsgReads());

DataStream<EventRec> parsedInput = input2
        .flatMap(new Splitter())
        .assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<EventRec>(Time.seconds(2)) {
                    @Override
                    public long extractTimestamp(EventRec record) {
                        return record.getmTimeStamp() / TT_SCALE_FACTOR;
                    }
                })
        .rebalance()
        .map(new RawInputCounter());

parsedInput
        .keyBy("mflowHashLSB", "mflowHashMSB")
        .window(SlidingEventTimeWindows.of(Time.milliseconds(1000), Time.milliseconds(950)))
        .allowedLateness(Time.seconds(1))
        .apply(new CRWindow());

parsedInput.writeUsingOutputFormat(new DiscardingOutputFormat<>());

env.execute();


Here is the definition of the *CRWindow* class:


public static class CRWindow implements WindowFunction<EventRec, EventRec, Tuple, TimeWindow> {

    // The type parameters were stripped by the archive; the output type is
    // assumed to be EventRec. The function intentionally emits nothing.
    @Override
    public void apply(Tuple key, TimeWindow window, Iterable<EventRec> ftRecords,
                      Collector<EventRec> collector) {
        return;
    }
}


Also, is there any elaborate documentation of the windowing mechanism
available? I am interested in using windowing with the ability to push
events from one window to a future window. Similar functionality exists
in Storm for pushing an event to a subsequent window.


Thanks

-Ashish