Could someone clarify how exactly event time/watermarks and allow lateness
work. I have created the program below and I have an input file such as...

  device_id,trigger_id,event_time,messageId
    1,START,1520433909396,1
    1,TRACKING,1520433914398,2
    1,TRACKING,1520433919398,3
    1,STOP,1520433924398,4
    1,START,1520433929398,5
    1,TRACKING,1520433934399,6
    1,TRACKING,1520433939399,7
    1,TRACKING,1520433944399,8
    1,STOP,1520433949399,9

Where trigger_id can be an indicator such as: start,tracking,stop. What I
would like to do is based on device_id group all incoming events and define
a window based on the trigger_id. I.e group all events from start until stop
and then do some calculations such as: average,max etc.

I call  env.readTextFile("events.csv"); and set StreamTimeCharacteristic to
EventTime. Parellism is set to 4. This means my messages from the source
file are read but are not in order... (I have added messageId as an counter
only for dev purposes)

I have defined a custom trigger to find stop events and fire in order to
evict all collected events. My main problem is that if parellism is
increased from 1 the input source reads these out of order.

Shouldn't event time and watermarks resolve this issue? How do i handle
possible out of order events?

public class SimpleEventJob {

        public static void main(String[] args) throws Exception {
                // set up the streaming execution environment
                final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
                env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
                env.setParallelism(4);
        //env.setParallelism(1);
                DataStream<String> input = env.readTextFile("events.csv");
                // create event stream
                DataStream<Event> events = input.map(new LineToEvent());
                DataStream<Event> waterMarkedStreams =
events.assignTimestampsAndWatermarks(new EventWAssigner());
                DataStream<TripEv> tripStream = 
waterMarkedStreams.keyBy("deviceId")
                                .window(GlobalWindows.create())
                                .trigger(new TripTrigger())
                                .evictor(new TripEvictor())
                                .apply(new CreateTrip());
                tripStream.print();
        // execute program
                env.execute("Flink Streaming Java API Skeleton");
        }
        
    public static class TripTrigger extends Trigger<Event, GlobalWindow> {
        @Override
        public TriggerResult onElement(Event event, long timestamp,
GlobalWindow window, TriggerContext context) throws Exception {
            // if this is a stop event, set a timer
            if (event.getTrigger() == 53) {
                                return TriggerResult.FIRE;
            }
            return TriggerResult.CONTINUE;
        }

                @Override
                public TriggerResult onEventTime(long time, GlobalWindow window,
TriggerContext ctx) {
            return TriggerResult.FIRE;
                }

                @Override
                public TriggerResult onProcessingTime(long time, GlobalWindow 
window,
TriggerContext ctx) {
                        return TriggerResult.CONTINUE;
                }

                @Override
                public void clear(GlobalWindow window, TriggerContext ctx) {
                }
    }

    private static class TripEvictor implements Evictor<Event, GlobalWindow>
{
                @Override
                public void evictBefore(Iterable<TimestampedValue&lt;Event>> 
events,
                int size, GlobalWindow window, EvictorContext ctx) {
                }

                @Override
                public void evictAfter(Iterable<TimestampedValue&lt;Event>> 
elements, int
size, GlobalWindow window, EvictorContext
                ctx) {
                        System.out.println(elements);
                        long firstStop = Event.earliestStopElement(elements);
                        // remove all events up to (and including) the first 
stop event (which is
the event that triggered the window)
                        for (Iterator<TimestampedValue&lt;Event>> iterator = 
elements.iterator();
iterator.hasNext(); ) {
                                TimestampedValue<Event> element = 
iterator.next();
                                if (element.getTimestamp() >= firstStop ) {
                                        iterator.remove();
                                }
                        }
                }
        }

        public static class CreateTrip implements WindowFunction<Event, TripEv,
Tuple, GlobalWindow> {
                @Override
                public void apply(Tuple key, GlobalWindow window, 
Iterable<Event> events,
Collector<TripEv> out) {
                        TripEv trp = new TripEv(events);
                        if (trp.length > 0) {
                                out.collect(trp);
                        }
                }
        }

        private static class LineToEvent implements MapFunction<String, Event> {
                @Override
                public Event map(String line) throws Exception {
                        return Event.fromString(line);
                }
        }


        private static class EventWAssigner implements
AssignerWithPunctuatedWatermarks<Event> {
                @Override
                public long extractTimestamp(Event event, long 
previousElementTimestamp) {
                        return event.getTimestamp();
                }

                @Override
                public Watermark checkAndGetNextWatermark(Event event, long
extractedTimestamp) {
                        // simply emit a watermark with every event
                        return new Watermark(extractedTimestamp - 1000L);
                }
        }
}







--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Reply via email to