Hi Dominik,

Was the job running with processing time or event time? If event time, how are 
you producing the watermarks?
Normally, to understand how windows fire in Flink, these two factors are the 
place to look.
I can try to explain this further once you provide this information. Also, are 
you using Kafka 0.10?
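
In the meantime, here is a rough plain-Java sketch (not Flink API code) of why 
the time characteristic matters. It assumes event time, one-minute tumbling 
windows, and a watermark that simply equals the highest timestamp seen so far. 
The class and method names are made up for illustration. A window fires as 
soon as the watermark reaches the window's last millisecond, so when a job 
replays a backlog of old records, windows can fire as fast as the records are 
read, regardless of how much wall-clock time has passed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

public class TumblingWindowSketch {

    // Start of the tumbling window that contains the given timestamp.
    // Assumes non-negative timestamps and windows aligned to time zero.
    static long windowStart(long timestampMs, long sizeMs) {
        return timestampMs - (timestampMs % sizeMs);
    }

    // Replays event timestamps in arrival order and returns the start times
    // of the windows that fire, in firing order.
    // Assumption: watermark = highest timestamp seen so far (no allowance
    // for out-of-order records); records for already fired windows are dropped.
    static List<Long> firedWindows(long[] eventTimestampsMs, long sizeMs) {
        List<Long> fired = new ArrayList<>();
        TreeSet<Long> openWindows = new TreeSet<>();
        long watermark = Long.MIN_VALUE;

        for (long ts : eventTimestampsMs) {
            long start = windowStart(ts, sizeMs);
            long lastMs = start + sizeMs - 1; // last millisecond of the window
            if (lastMs > watermark) {
                openWindows.add(start);
            }
            watermark = Math.max(watermark, ts);

            // Fire every open window whose last millisecond has been passed.
            while (!openWindows.isEmpty()
                    && openWindows.first() + sizeMs - 1 <= watermark) {
                fired.add(openWindows.pollFirst());
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        long[] timestamps = {0L, 30_000L, 59_999L, 60_000L, 125_000L};
        System.out.println(firedWindows(timestamps, 60_000L));
    }
}
```

Note that in this sketch nothing depends on the wall clock: all five records 
could arrive within the same second and two windows would still fire. With 
processing time the behaviour would be different, which is why I am asking.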

Cheers,
Gordon

On March 27, 2017 at 11:25:49 PM, Dominik Safaric (dominiksafa...@gmail.com) 
wrote:

Hi all,  

Lately I’ve been investigating the performance characteristics of Flink as 
part of our internal benchmark. As part of this, we’ve developed and deployed 
an application that polls data from Kafka and groups the data by key over a 
fixed time window of one minute.  

In total, the topic that the KafkaConsumer polled from consists of 100 million 
messages, each 100 bytes in size. We expected that no records would be read 
from or produced back to Kafka during the first minute of the window 
operation; however, this is unfortunately not the case. Below you may find a 
plot showing the number of records produced per second.  

Could anyone explain the behaviour shown in the graph below? Why are messages 
consumed from and produced to Kafka while the window has not yet expired?  
