Re: Aggregated windowed counts

2017-01-05 Thread Matthias J. Sax
On a clean restart on the same machine, the local RocksDB store is simply reused, as it contains the complete state. Thus there is no need to read the changelog topic at all. The changelog topic is only read when a state is moved from one node to another, or when the state got corrupted due to a failure (
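
For context, where that local state lives and how much changelog replay a failover needs are both configurable. A minimal sketch, assuming placeholder application id, broker address, and state directory (both config keys are real StreamsConfig settings):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class StateConfigSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-counts");   // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
            // Local RocksDB state lives under this directory; on a clean restart
            // on the same machine it is reused and the changelog is not re-read.
            props.put(StreamsConfig.STATE_DIR_CONFIG, "/var/lib/kafka-streams");
            // A standby replica keeps a warm copy of the state on another
            // instance, so a failover need not replay the full changelog.
            props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
        }
    }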

Re: Aggregated windowed counts

2017-01-05 Thread Benjamin Black
I understand now. The commit triggers the output of the window data, whether or not the window is complete. For example, if I use .print() as you suggest:

    [KSTREAM-AGGREGATE-03]: [kafka@148363192], (9<-null)
    [KSTREAM-AGGREGATE-03]: [kafka@1483631925000], (5<-null)
    [KSTREAM-AG
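
The cadence of those emissions is governed by the commit interval and, since 0.10.1, the record cache. A minimal sketch of the two knobs, with values chosen only for illustration:

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class EmissionConfigSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            // Commit (and thus forward the current, possibly partial, window
            // counts) every 10 seconds instead of the default 30 seconds.
            props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, 10_000);
            // Set the record cache to zero to make every single update
            // visible downstream immediately.
            props.put(StreamsConfig.CACHE_MAX_BYTES_BUFFERING_CONFIG, 0L);
        }
    }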

Re: Aggregated windowed counts

2017-01-04 Thread Matthias J. Sax
There is no such thing as a final window aggregate, and you might see intermediate results -- thus the counts do not add up. Please have a look here: http://stackoverflow.com/questions/38935904/how-to-send-final-kafka-streams-aggregation-result-of-a-time-windowed-ktable/38945277#38945277 and here:
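
The point of the linked answer is that a windowed KTable's output is a changelog: each record replaces the previous count for the same word and window rather than adding to it. A sketch of a downstream consumer acting on that (the class and method names here are hypothetical, not part of any Kafka API):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical downstream view that treats each emitted record as an
    // upsert: a later (intermediate or final) count for the same word+window
    // overwrites the earlier one, so per-window totals stay consistent.
    public class LatestWindowCounts {
        private final Map<String, Long> latest = new HashMap<>();

        public void onRecord(String word, long windowStartMs, long count) {
            latest.put(word + "@" + windowStartMs, count);  // overwrite, never add
        }

        public Long get(String word, long windowStartMs) {
            return latest.get(word + "@" + windowStartMs);
        }
    }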

Re: Aggregated windowed counts

2017-01-04 Thread Benjamin Black
I'm hoping the DSL will do what I want :) Currently the example is continuously adding instead of bucketing, so if I modify it by adding a window to the count function:

    .groupBy((key, word) -> word)
    .count(TimeWindows.of(5000L), "Counts")
    .toStream((k, v) -> k.key());

Then I do see bucketing happening
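
Put together, a self-contained version of that modification might look like the following, against the 0.10.1-era API used in the thread (topic names and connection settings are placeholders; keeping the window start in the output key, via the real Windowed#window() and Window#start() methods, makes the buckets visible):

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.TimeWindows;

    public class WindowedWordCount {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "windowed-wordcount");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> lines = builder.stream("TextLinesTopic");
            lines.flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
                 .groupBy((key, word) -> word)
                 // 5-second tumbling windows: counts are bucketed per window
                 .count(TimeWindows.of(5000L), "Counts")
                 // keep the window start in the key so buckets stay distinguishable
                 .toStream((windowedKey, count) ->
                         windowedKey.key() + "@" + windowedKey.window().start())
                 .print();

            new KafkaStreams(builder, props).start();
        }
    }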

Re: Aggregated windowed counts

2017-01-04 Thread Matthias J. Sax
Do you know about Kafka Streams? Its DSL gives you exactly what you want. Check out the documentation and the WordCount example: http://docs.confluent.io/current/streams/index.html https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/Wor
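
For readers without the link handy, the heart of that WordCount example is roughly the following sketch (not the confluentinc code verbatim; topic names are placeholders). Note the count is unwindowed, so it grows forever -- exactly the "continuously adding" behavior the reply above modifies:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    public class WordCountSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> lines = builder.stream("TextLinesTopic");
            KTable<String, Long> counts = lines
                    .flatMapValues(line -> Arrays.asList(line.toLowerCase().split("\\W+")))
                    .groupBy((key, word) -> word)
                    .count("Counts");  // unwindowed: one ever-growing count per word
            counts.to(Serdes.String(), Serdes.Long(), "WordCountsTopic");

            new KafkaStreams(builder, props).start();
        }
    }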