Ideas to achieve no loss in a 0.10 Publisher?

2018-03-08 Thread R Krishna
We are looking for a way to guarantee no "publisher" loss where waiting until the cluster/network health is resolved is fine. What we are doing right now is to at avoid this loss in favor of at least once guarantee by doing a *future.get() on say 1000th record or 1s (whatever comes first)* and if

RE: Delayed processing

2018-03-08 Thread adrien ruffie
Hi Wim, this topic (processing order) has been cropping up for a while, several article, benchmark, and other think on the subject reaching this conclusion. After that you can, you can ask someone else another opinion on the subject. regards, Adrien De : Wim

Re: Delayed processing

2018-03-08 Thread Wim Van Leuven
Hello Guozhang, we ingest messages that we outgest is user facing datastore, after some additional processing. Due to GDPR, we can only retain that unanonymised data for a maximum period of time. Let's say 6 months. So, right before sending the data to the out topic, we'll branch the data into

Re: Delayed processing

2018-03-08 Thread Wim Van Leuven
Hey Adrien, thank you for the elaborate explanation! We are ingesting call data records here, which due to the nature of a telco network might not arrive in absolute logical order. If I understand your explanation correctly, you are saying that with your setup, Kafka guarantees the processing

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread John Roesler
I think what you're suggesting is to: 1. compile the main streams code, but not the tests 2. compile test-utils (and compile and run the test-utils tests) 3. compile and run the streams tests This works in theory, since the test-utils depends on the main streams code, but not the streams tests.

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread Guozhang Wang
MockProcessorContext is only used in unit tests, and hence we should be able to declare it as a test dependency of `streams` in gradle build file, which is OK. Guozhang On Thu, Mar 8, 2018 at 3:32 PM, John Roesler wrote: > Actually, replacing the MockProcessorContext in

Re: committing offset metadata in kafka streams

2018-03-08 Thread Matthias J. Sax
Thanks for the explanation. Not sure if setting the metadata you want to get committed in punctuation() would be sufficient. But I would think about it in more details if we get a KIP for this. It's correct that flushing and committing offsets is correlated. But it's not related to punctuation.

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread John Roesler
Thanks, Matthias, 1. I can move it into the o.a.k.streams.processor package; that makes sense. 2. I'm expecting most users to use in-memory state stores, so they won't need a state directory. In the "real" code path, the stateDir is extracted from the config by

Re: Delayed processing

2018-03-08 Thread Guozhang Wang
Hello Wim, Not sure if I understand your motivations for delayed processing, could you elaborate a bit more? Do you want to process raw messages, or do you want to process anonymised messages? Guozhang On Thu, Mar 8, 2018 at 12:35 PM, Wim Van Leuven < wim.vanleu...@highestpoint.biz> wrote: >

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread Matthias J. Sax
Isn't MockProcessorContext in o.a.k.test part of the unit-test package but not the main package? This should resolve the dependency issue. -Matthias On 3/8/18 3:32 PM, John Roesler wrote: > Actually, replacing the MockProcessorContext in o.a.k.test could be a bit > tricky, since it would make

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread Matthias J. Sax
Thanks for the KIP John. Couple of minor questions: - What about putting the mock into sub-package `processor` so it's in the same package name as the interface it implements? - What is the purpose of the constructor talking the `File stateDir` argument? The state directory should be encoded in

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread John Roesler
Actually, replacing the MockProcessorContext in o.a.k.test could be a bit tricky, since it would make the "streams" module depend on "streams:test-utils", but "streams:test-utils" already depends on "streams". At first glance, it seems like the options are: 1. leave the two separate

Re: [DISCUSS] KIP-267: Add Processor Unit Test Support to Kafka Streams Test Utils

2018-03-08 Thread John Roesler
Thanks for the review, Guozhang, In response: 1. I missed that! I'll look into it and update the KIP. 2. I was planning to use the real implementation, since folks might register some metrics in the processors and want to verify the values that get recorded. If the concern is about initializing

Re: Kafka 0.10.2.2 release

2018-03-08 Thread Matthias J. Sax
0.10.2.2 was not release. It's unclear atm if it will be. Maybe we need to upgrade your brokers to 0.11, 1.0, or upcoming 1.1 release. -Matthias On 3/8/18 10:58 AM, Ted Yu wrote: > Only found the following from brief search: > >

RE: Delayed processing

2018-03-08 Thread adrien ruffie
Hello Wim, does it matter (I think), because one of the big and principal features of Kafka is: Kafka is to do load balancing of messages and guarantee ordering in a distributed cluster. The order of the messages should be guaranteed, unless several cases: 1] Producer can cause data loss

Delayed processing

2018-03-08 Thread Wim Van Leuven
Hello, I'm wondering how to design a KStreams or regular Kafka application that can hold of processing of messages until a future time. This related to EU's data protection regulation: we can store raw messages for a given time; afterwards we have to store the anonymised message. So, I was

ZK disconnect caused ISR Shrink and Expand in the cluster

2018-03-08 Thread Ojha, Ashish
Hi, We have a setup of 10 node Kafka cluster + 5 node ZK cluster . Kafka version is 0.10.2.1 . We had an issue where a ZK follower lost connection from the ZK leader which triggered a series of ISR Shrink and ISR Expands . This caused some of the partitions to have less number of

Re: Kafka 0.10.2.2 release

2018-03-08 Thread Ted Yu
Only found the following from brief search: http://search-hadoop.com/m/Kafka/uyzND1kSw4Xm3wyj1?subj=0+10+2+2+bug+fix+release FYI On Thu, Mar 8, 2018 at 10:13 AM, Devendar Rao wrote: > Hi, > > We're hitting an issue with log cleaner and I see this is fixed in 0.10.2.2

Kafka 0.10.2.2 release

2018-03-08 Thread Devendar Rao
Hi, We're hitting an issue with log cleaner and I see this is fixed in 0.10.2.2 release as per https://issues.apache.org/jira/browse/KAFKA-5413 . But I don't see 0.10.2.2 release notes in https://kafka.apache.org/downloads was 0.10.2.2 ever released? Thanks, Devendar