Re: KStream Close Processor

2016-04-11 Thread Guozhang Wang
Yeah, we can definitely do better in documentation. Regarding the API changes, I would prefer to hold off and think through whether such use cases are a common pattern, and whether we can even re-order the closing process to get around the issue I mentioned above, if that is required. Guozhang On Mon,

Re: KStream Close Processor

2016-04-11 Thread Matthias J. Sax
What about extending the API with a method beforeClose() that enables the user to flush buffered data? Maybe we can also rename close() to afterClose(), to make the difference clear. At least, we should document when close() is called -- from a user's point of view, I would expect that close()

Re: KStream Close Processor

2016-04-10 Thread Guozhang Wang
Re 1), Kafka Streams intentionally closes all underlying clients before closing processors, since closing some of the processors requires shutting down their processor state managers; for example, we need to make sure the producer's message sends have all been acked before the state manager records

Re: KStream Close Processor

2016-04-09 Thread Guozhang Wang
Mike, It is not clear what you mean by "buffering up the contents". The producer itself already does some buffering and batching when sending to Kafka. Did you actually "merge" multiple small messages into one large message before giving it to the producer in the app code? In either case, I am not sure

Re: KStream Close Processor

2016-04-09 Thread Michael D. Coon
Guozhang, In my processor, I'm buffering up the contents of the final messages in order to make them larger. This is to optimize throughput and keep tiny messages from being injected downstream. So nothing is being pushed to the producer until my configured thresholds are met in the buffering
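
The buffering pattern described in this message can be sketched roughly as follows. This is a minimal illustration in plain Java, not the poster's actual code and not a Kafka Streams API: the class name ThresholdBuffer, the character-count threshold, and the Consumer<String> sink standing in for "forward downstream" are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper (not part of Kafka Streams): merges small messages into larger ones. */
class ThresholdBuffer {
    private final int maxChars;            // assumed threshold: flush once this many chars are buffered
    private final Consumer<String> sink;   // stands in for handing the merged message downstream
    private final StringBuilder pending = new StringBuilder();

    ThresholdBuffer(int maxChars, Consumer<String> sink) {
        this.maxChars = maxChars;
        this.sink = sink;
    }

    /** Buffers one small message and emits a merged message once the threshold is met. */
    void add(String message) {
        pending.append(message);
        if (pending.length() >= maxChars) {
            flush();
        }
    }

    /** Emits whatever is buffered; this is the step that must run before the sink is closed. */
    void flush() {
        if (pending.length() > 0) {
            sink.accept(pending.toString());
            pending.setLength(0);
        }
    }

    int pendingChars() {
        return pending.length();
    }
}

class ThresholdBufferDemo {
    public static void main(String[] args) {
        List<String> downstream = new ArrayList<>();
        ThresholdBuffer buffer = new ThresholdBuffer(5, downstream::add);
        buffer.add("ab");
        buffer.add("cd");   // still below the threshold, nothing emitted yet
        buffer.add("e");    // threshold met, one merged message is emitted
        buffer.add("x");    // left in the buffer until an explicit flush
        buffer.flush();
        System.out.println(downstream);
    }
}
```

The point of the sketch is the trailing flush(): anything still below the threshold at shutdown only reaches the sink if flush() runs while the sink is still usable.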

Re: KStream Close Processor

2016-04-08 Thread Guozhang Wang
Hi Michael, When you call KafkaStreams.close(), it will first trigger a commitAll() function, which will 1) flush the local state store if necessary; 2) flush messages buffered in the producer; 3) commit offsets on the consumer. Then it will close the producer / consumer clients and shut down the tasks. So
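
As a toy model of the ordering described in this message (commit, then close the clients, then shut down the tasks), the sketch below shows why a processor that tries to send buffered records from its own close step finds the producer already closed. It is based only on the sequence stated above; ToyProducer, ShutdownOrderSketch, and the log strings are invented for illustration and do not correspond to Kafka Streams internals.

```java
import java.util.ArrayList;
import java.util.List;

class ShutdownOrderSketch {
    /** Toy stand-in for a producer client (assumption): rejects sends once closed. */
    static class ToyProducer {
        final List<String> sent = new ArrayList<>();
        private boolean closed = false;

        void send(String record) {
            if (closed) {
                throw new IllegalStateException("producer already closed");
            }
            sent.add(record);
        }

        void close() {
            closed = true;
        }
    }

    /** Runs the steps in the order described in the thread and returns a log of what happened. */
    static List<String> shutdown(ToyProducer producer, List<String> processorBuffer) {
        List<String> log = new ArrayList<>();
        log.add("commitAll: flush state stores, flush producer, commit offsets");
        producer.close();
        log.add("clients closed");
        try {
            // The processor's close step runs last and tries to send what it buffered.
            for (String record : processorBuffer) {
                producer.send(record);
            }
            log.add("processor close: buffer flushed");
        } catch (IllegalStateException e) {
            log.add("processor close: flush failed (" + e.getMessage() + ")");
        }
        return log;
    }

    public static void main(String[] args) {
        List<String> log = shutdown(new ToyProducer(), List.of("late-record"));
        log.forEach(System.out::println);
    }
}
```

In this model, records held only in the processor's own buffer never reach the producer, which matches the behavior reported in the original question.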

KStream Close Processor

2016-04-08 Thread Michael D. Coon
All, I'm seeing my processor's "close" method being called AFTER my downstream producer has been closed. I had assumed that on close I would be able to flush whatever I had been buffering up to send to a Kafka topic. In other words, we've seen significant performance differences in building