Re: KafkaStream: puncutuate() never called even when data is received by process()

2016-11-28 Thread Guozhang Wang
Yes we are considering to differentiate "process" and "punctuate" function to be "data-driven" and "time-driven" computations. That is, triggering of punctuate should NOT be depending on the arrival of messages, or the message's associated timestamps. As for now I think periodically inserting the

Re: KafkaStream: puncutuate() never called even when data is received by process()

2016-11-26 Thread David Garcia
I know that the Kafka team is working on a new way to reason about time. My team's solution was to not use punctuate...but this only works if you have guarantees that all of the tasks will receive messages..if not all the partitions. Another solution is to periodically send canaries to all

Re: KafkaStream: puncutuate() never called even when data is received by process()

2016-11-23 Thread shahab
Thanks for the comments. @David: yes, I have a source which is reading data from two topics and one of them were empty while the second one was loaded with plenty of data. So what do you suggest to solve this ? Here is snippet of my code: StreamsConfig config = new

Re: KafkaStream: puncutuate() never called even when data is received by process()

2016-11-23 Thread David Garcia
If you are consuming from more than one topic/partition, punctuate is triggered when the “smallest” time-value changes. So, if there is a partition that doesn’t have any more messages on it, it will always have the smallest time-value and that time value won’t change…hence punctuate never gets

Re: KafkaStream: puncutuate() never called even when data is received by process()

2016-11-23 Thread Matthias J. Sax
Your understanding is correct: Punctuate is not triggered base on wall-clock time, but based in internally tracked "stream time" that is derived from TimestampExtractor. Even if you use WallclockTimestampExtractor, "stream time" is only advance if there are input records. Not sure why

KafkaStream: puncutuate() never called even when data is received by process()

2016-11-23 Thread shahab
Hello, I am using low level processor and I set the context.schedule(1), assuming that punctuate() method is invoked every 10 sec . I have set configProperties.put(StreamsConfig.TIMESTAMP_EXTRACTOR_CLASS_CONFIG, WallclockTimestampExtractor.class.getCanonicalName()) ) Although data is keep

Re: puncutuate() never called

2016-10-11 Thread Michael Noll
Thanks for the follow-up and the bug report, David. We're taking a look at that. On Mon, Oct 10, 2016 at 4:36 PM, David Garcia wrote: > Thx for the responses. I was able to identify a bug in how the times are > obtained (offsets resolved as unknown cause the issue): >

Re: puncutuate() never called

2016-10-10 Thread David Garcia
Thx for the responses. I was able to identify a bug in how the times are obtained (offsets resolved as unknown cause the issue): “Actually, I think the bug is more subtle. What happens when a consumed topic stops receiving messages? The smallest timestamp will always be the static timestamp

Re: puncutuate() never called

2016-10-10 Thread Michael Noll
> We have run the application (and have confirmed data is being received) for over 30 mins…with a 60-second timer. Ok, so your app does receive data but punctuate() still isn't being called. :-( > So, do we need to just rebuild our cluster with bigger machines? That's worth trying out. See

Re: puncutuate() never called

2016-10-07 Thread David Garcia
Yeah, this is possible. We have run the application (and have confirmed data is being received) for over 30 mins…with a 60-second timer. So, do we need to just rebuild our cluster with bigger machines? -David On 10/7/16, 11:18 AM, "Michael Noll" wrote: David,

Re: puncutuate() never called

2016-10-07 Thread Michael Noll
David, punctuate() is still data-driven at this point, even when you're using the WallClock timestamp extractor. To use an example: Imagine you have configured punctuate() to be run every 5 seconds. If there's no data being received for a minute, then punctuate won't be called -- even though

puncutuate() never called

2016-10-07 Thread David Garcia
Hello, I’m sure this question has been asked many times. We have a test-cluster (confluent 3.0.0 release) of 3 aws m4.xlarges. We have an application that needs to use the punctuate() function to do some work on a regular interval. We are using the WallClock extractor. Unfortunately, the