Re: Race condition with stream use of Global KTable

2019-04-03 Thread Guozhang Wang
Hi Raman, What I'm not clear on is: since topic-2 is a transformed topic of topic-1 via the "other stream", why do you still need to join it with topic-1? In other words, do topic-1 and topic-2 contain different data, or is topic-2 just storing similar data to topic-1 but in a different …

Re: kafka scaling

2019-04-03 Thread Evelyn Bayes
Hi Ramz, A good rule of thumb has been no more than 4,000 partitions per broker and no more than 100,000 in a cluster. This includes all replicas, and it's related more to Kafka internals than it is to resource usage, so I strongly advise not pushing past these limits. Otherwise, the usual reasons for scaling …
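The arithmetic behind those rules of thumb can be sketched as follows. This is an illustrative helper, not a Kafka API; the default limits are the figures quoted above, and the 3-broker / RF=3 numbers match the cluster described in the question.

```python
# Rough capacity check based on the rules of thumb above:
# at most ~4,000 partition replicas per broker and ~100,000 per cluster.

def partition_budget(brokers: int, replication_factor: int,
                     per_broker_limit: int = 4_000,
                     cluster_limit: int = 100_000) -> int:
    """Max number of topic partitions (not replicas) the cluster can hold
    without breaching either rule of thumb."""
    total_replica_capacity = min(brokers * per_broker_limit, cluster_limit)
    return total_replica_capacity // replication_factor

# A 3-broker cluster with replication factor 3 can hold roughly
# 4,000 partitions before hitting the per-broker guideline.
print(partition_budget(3, 3))   # → 4000
```

Note that with replication factor 3, every partition costs three replica slots, which is why the per-broker count "includes all replicas".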

kafka scaling

2019-04-03 Thread Rammohan Vanteru
Hi users, On what basis should we scale a Kafka cluster, and what would be the symptoms that call for scaling? I have a 3-node Kafka cluster; up to how many partitions can a single broker or the whole cluster support? Any article or knowledge share on scaling Kafka would be helpful. Thanks, Ramz.

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Liam Clarke
And to share my experience of doing something similar: certain messages on our system must not be duplicated, but as they are bounced back to us from third parties, duplication is inevitable. So I deduplicate them using Spark Structured Streaming's flatMapGroupsWithState, deduplicating based on a business key …

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Hans Jespersen
Ok, what you are describing is different from the accidental-duplicate pruning that the idempotent publish feature does. You are describing a situation where multiple independent messages just happen to have the same contents (both key and value). Removing those messages is an application …
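The distinction above can be sketched in plain Python. This is a toy model, not real Kafka code: the idempotent producer works by having the broker track the last sequence number per (producer id, partition) and drop retried batches it already appended, so retries of the *same* send are deduplicated, while independent sends that merely share the same contents are kept.

```python
# Toy model of broker-side idempotent-producer dedup (illustrative only).

class Broker:
    def __init__(self):
        self.log = []
        self.last_seq = {}           # (producer_id, partition) -> last sequence

    def append(self, producer_id, partition, seq, record):
        key = (producer_id, partition)
        if self.last_seq.get(key, -1) >= seq:
            return                   # duplicate retry of a prior send: dropped
        self.last_seq[key] = seq
        self.log.append(record)

broker = Broker()
broker.append("p1", 0, 0, "order-42")
broker.append("p1", 0, 0, "order-42")   # retry of the same send: deduped
broker.append("p1", 0, 1, "order-42")   # independent send, equal content: kept
print(broker.log)                        # → ['order-42', 'order-42']
```

This is why pruning independent messages with identical contents has to happen at the application layer, as the post says: the broker has no way to tell them apart from genuinely new data.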

Kafka SASL auth setup error: Connection to node 0 (localhost/127.0.0.1:9092) terminated during authentication

2019-04-03 Thread Shantanu Deshmukh
Hello everyone, I am trying to set up Kafka SASL authentication on my single-node Kafka on my local machine, version 2. Here's my Kafka broker JAAS file: KafkaServer { org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin" user_admin="admin" …
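A common cause of "terminated during authentication" is a mismatch between the broker's listener/mechanism settings and the client's. As a hedged sketch (the property keys are standard Kafka ones; the values are illustrative for SASL/PLAIN on a single node, not taken from the poster's setup):

```
# server.properties (broker side)
listeners=SASL_PLAINTEXT://localhost:9092
advertised.listeners=SASL_PLAINTEXT://localhost:9092
security.inter.broker.protocol=SASL_PLAINTEXT
sasl.mechanism.inter.broker.protocol=PLAIN
sasl.enabled.mechanisms=PLAIN

# client.properties (passed to e.g. kafka-console-producer --producer.config)
security.protocol=SASL_PLAINTEXT
sasl.mechanism=PLAIN
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="admin" password="admin";
```

The broker-side JAAS file also has to actually be loaded, typically via `KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/kafka_server_jaas.conf"` before starting the broker; if it isn't, or if the client connects without SASL configured, the connection is dropped during the handshake with exactly this error.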

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Dimitry Lvovsky
I've done this using Kafka Streams: specifically, I created a processor and used a key store (a piece of Streams functionality) to save and check for keys, forwarding only messages that were not already in the store. Since the store is in memory, and backed by the local filesystem on the node the processor is …
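The post describes a Kafka Streams Processor (which would be Java); below is a plain-Python sketch of the same forward-only-unseen-keys logic, with a dict standing in for the Streams key-value state store. All names here are illustrative.

```python
# Plain-Python sketch of the dedup processor described above: a dict stands
# in for the Streams state store, a list stands in for context.forward(...).

class DedupProcessor:
    def __init__(self):
        self.store = {}          # stand-in for the key-value state store
        self.forwarded = []      # stand-in for forwarding downstream

    def process(self, key, value):
        if key in self.store:
            return               # key seen before: drop the duplicate
        self.store[key] = value
        self.forwarded.append((key, value))

p = DedupProcessor()
for k, v in [("order-1", "a"), ("order-2", "b"), ("order-1", "c")]:
    p.process(k, v)
print(p.forwarded)               # → [('order-1', 'a'), ('order-2', 'b')]
```

In real Streams the store would be a persistent `KeyValueStore` backed by a changelog topic, so the seen-keys set survives restarts and rebalances, which a plain in-memory dict does not.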

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Vincent Maurin
Hi, The idempotence flag guarantees that the message is produced exactly once on the topic, i.e. that running your command a single time will produce a single message. It is not a uniqueness constraint on the message key; there is no such thing in Kafka. In Kafka, a topic containing the "history" of …
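For reference, the flag being discussed is a producer setting; these are the standard Kafka property names, with values shown only as an illustration:

```
# Producer config: makes retries of the same send safe.
# It does NOT deduplicate independent sends that happen to share a key.
enable.idempotence=true
acks=all
```

Enforcing uniqueness per business key, as the thread concludes, has to be done by the application itself (for example with a state store, as other replies describe).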