Most likely this is late data: https://beam.apache.org/documentation/programming-guide/#watermarks-and-late-data . Try configuring a trigger with late-data behavior that is appropriate for your particular use case.
On Thu, Dec 7, 2017 at 3:03 PM Nishu <nishuta...@gmail.com> wrote:
> Hi,
>
> I am running a streaming pipeline with the Flink runner. The *operator
> sequence* is: read the JSON data, parse the JSON string into an object,
> and group the objects by a common key. I noticed that the GroupByKey
> operator throws away some data in between, so I don't get all the keys
> as output.
>
> In the screenshot below, 1001 records are read from the Kafka topic,
> each with a unique ID. After grouping, only 857 unique IDs are returned.
> Ideally, the GroupByKey operator should return 1001 records.
>
> [image: Inline image 3]
>
> Any idea what the issue could be? Thanks in advance!
>
> --
> Thanks & Regards,
> Nishu Tayal