Re: Kafka streams in Kubernetes

2019-06-10 Thread Matthias J. Sax
What I'm trying to say is that compaction is not perfect. Assuming you have 100,000 unique keys and a message size of 1 KB, your data set, if perfectly compacted, would be roughly 100 MB. The default segment size is 1 GB, and the active segment is not compacted. Hence, if the active
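
The sizing argument above can be sketched as a quick back-of-the-envelope calculation (the ~100 MB figure and the 1 GiB default segment size come from the thread; the helper name is mine, and this is an estimate, not broker logic):

```python
# Estimate a compacted topic's size: perfect compaction keeps exactly one
# record per key, but the active segment is never compacted, so the on-disk
# size can exceed the ideal by up to one full segment.
def perfectly_compacted_size(unique_keys: int, message_size_bytes: int) -> int:
    """Size if every key were reduced to its single latest record."""
    return unique_keys * message_size_bytes

SEGMENT_BYTES = 1 << 30  # broker default segment.bytes (1 GiB)

compacted = perfectly_compacted_size(100_000, 1024)
print(compacted)  # 102400000 bytes, i.e. roughly 100 MB

# Worst case on disk: the ideal size plus one uncompacted active segment.
worst_case = compacted + SEGMENT_BYTES
```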

Re: Kafka streams in Kubernetes

2019-06-10 Thread Scott Reynolds
We have been giving this a bunch of thought lately. We attempted to replace PARTITION_ASSIGNMENT_STRATEGY_CONFIG with our own implementation that hooks into our deployment service. The idea is simple: the new deployment gets *standby tasks assigned to them until they are caught up*. Once they are caught up
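
The behavior described above builds on Kafka Streams standby replicas, which keep warm copies of state stores on other instances. A minimal sketch of the relevant configuration (the config key `num.standby.replicas` is real; the application id and bootstrap servers are placeholders):

```python
# Hypothetical Kafka Streams configuration enabling standby tasks, so a newly
# deployed instance can catch up on state before taking over active tasks.
streams_config = {
    "application.id": "my-streams-app",  # placeholder
    "bootstrap.servers": "kafka:9092",   # placeholder
    "num.standby.replicas": "1",         # one warm standby per state store
}
print(streams_config["num.standby.replicas"])  # 1
```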

Re: Repeating UNKNOWN_PRODUCER_ID errors for Kafka streams applications

2019-06-10 Thread Guozhang Wang
Hi Pieter, My reasoning is that the most recent segment (called the `active` segment) would not be deleted immediately, since it is still being appended to. I.e., say you have two segments with offset ranges [0, 100) and [100, 180). And if the delete records API is issued to delete any records older tha
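
Guozhang's point can be modeled in a few lines: the delete-records API advances the log start offset, but a whole segment is only eligible for removal once its entire range falls below that offset, and the active (last) segment is never removed. A toy sketch, not broker code:

```python
# Toy model of segment-level deletion after a deleteRecords(offset) call.
def segments_after_delete(segments, delete_before):
    """segments: list of (base_offset, next_offset) ranges, last one active."""
    kept = []
    for i, (base, end) in enumerate(segments):
        is_active = (i == len(segments) - 1)
        if end <= delete_before and not is_active:
            continue  # fully below the new log start offset: removable
        kept.append((base, end))
    return kept

# Two segments [0, 100) and [100, 180); delete records before offset 150:
print(segments_after_delete([(0, 100), (100, 180)], 150))
# -> [(100, 180)]  (first segment removed; active segment retained)
```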

Re: Kafka streams in Kubernetes

2019-06-10 Thread Parthasarathy, Mohan
Matt, I read your email again, and this is the part you point out: > What you also need to take into account is, how often topics are > compacted, and how large the segment size is, because the active segment > is not subject to compaction. Are you saying that compaction aff

Re: WakeupException while commitSync

2019-06-10 Thread Boris Molodenkov
Can anyone answer? On Fri, Jun 7, 2019 at 12:49 PM Boris Molodenkov wrote: > Hello. > > I noticed that sometimes offsets were committed even though commitSync > threw WakeupException. Is it correct consumer behavior? > > Thanks >

Re: Kafka streams in Kubernetes

2019-06-10 Thread Parthasarathy, Mohan
Thanks. That helps me understand why recreating state might take time. -mohan On 6/9/19, 11:50 PM, "Matthias J. Sax" wrote: By default, Kafka Streams does not "close" windows. To handle out-of-order data, windows are maintained until their retention time has passed, and are upda
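
The retention behavior Matthias describes can be illustrated with a toy model: a window keeps accepting late updates until its retention period has elapsed relative to stream time (the 60 s window size and 5 min retention here are assumptions for illustration, not Kafka defaults):

```python
# Toy model: windows are not "closed"; late records still update a window
# as long as that window is within its retention period.
WINDOW_SIZE_MS = 60_000   # assumed 1-minute tumbling windows
RETENTION_MS = 300_000    # assumed 5-minute retention

def window_start(ts):
    """Start of the tumbling window a timestamp falls into."""
    return ts - (ts % WINDOW_SIZE_MS)

def accepts_update(event_ts, stream_time):
    """A late record is applied while its window is still retained."""
    return window_start(event_ts) + RETENTION_MS > stream_time

# Out-of-order record 2 minutes behind stream time: still applied.
print(accepts_update(event_ts=120_000, stream_time=240_000))  # True
# Record whose window's retention has passed: dropped.
print(accepts_update(event_ts=0, stream_time=400_000))        # False
```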

Re: First time building a streaming app and I need help understanding how to build out my use case

2019-06-10 Thread SenthilKumar K
> *When I get a request for all of the messages containing a given user ID, I need to query in to the topic and get the content of those messages. Does that make sense and is it a thing Kafka can do?* - If I understand correctly, your requirement is to query Kafka topics based on key. Exam
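
The suggestion above amounts to consuming the topic and filtering by key, since Kafka itself has no random-access query API (for lookups you'd typically materialize a keyed view, e.g. a state store or external index). A minimal sketch, with a made-up record list and helper name:

```python
# Sketch: "query a topic by key" by scanning consumed (key, value) records.
def messages_for_user(records, user_id):
    """records: iterable of (key, value) pairs as consumed from a topic."""
    return [value for key, value in records if key == user_id]

topic = [("alice", "msg1"), ("bob", "msg2"), ("alice", "msg3")]
print(messages_for_user(topic, "alice"))  # ['msg1', 'msg3']
```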

Re: First time building a streaming app and I need help understanding how to build out my use case

2019-06-10 Thread Simon Calvin
Martin, Thank you very much for your reply. I appreciate the perspective on securing communications with Kafka, but before I get to that point I'm trying to figure out if/how I can implement this use case specifically in Kafka. The point that I'm stuck on is needing to query for specific messag

How spark structured streaming consumers initiated and invoked while reading multi-partitioned kafka topics?

2019-06-10 Thread Shyam P
Hi, Any suggestions regarding the issue below? https://stackoverflow.com/questions/56524921/how-spark-structured-streaming-consumers-initiated-and-invoked-while-reading-mul Thanks, Shyam

Re: First time building a streaming app and I need help understanding how to build out my use case

2019-06-10 Thread Martin Gainty
MG>below From: Simon Calvin Sent: Friday, June 7, 2019 3:39 PM To: users@kafka.apache.org Subject: First time building a streaming app and I need help understanding how to build out my use case Hello, everyone. I feel like I have a use case that is well suite