Writing a custom Converter for Kafka Connect

2020-02-10 Thread Csaba Galyó
Hello, I have a large compressed file which contains lines of text (delimited by \n). I would like to send the file to a topic with Kafka Connect. I tried writing a custom Converter, but it's not clear how it is supposed to delimit by the \n characters. As I see it, the converter receives an
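A converter in Kafka Connect typically sees one record's bytes at a time, so newline splitting usually belongs in the source task that reads the file rather than in the converter. A minimal Python sketch of the buffering logic such a task would need (the function name `split_records` is illustrative, not part of any Kafka Connect API):

```python
def split_records(chunk: bytes, remainder: bytes = b"") -> tuple[list[bytes], bytes]:
    """Split a buffered chunk into complete \\n-delimited records.

    Returns (complete_lines, leftover_bytes) so a partial trailing line
    can be carried over into the next chunk read from the file.
    """
    data = remainder + chunk
    *lines, leftover = data.split(b"\n")
    return lines, leftover

# A line split across two reads is reassembled via the carried remainder.
lines, rest = split_records(b"first\nseco")
lines2, rest2 = split_records(b"nd\nthird\n", rest)
```

Each complete line would then become one record handed to the framework; the converter only serializes records it is given.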

Re: MM2 for DR

2020-02-10 Thread benitocm
Thanks very much for the response. Could you please elaborate a bit more on "I'd arc in that direction. Instead of migrating A->B->C->D..., active/active is more like having one big cluster"? Another thing that I would like to share is that currently my consumers only consume from one topic

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread John Roesler
Hey all, Sorry for the confusion. Bruno set me straight offline. Previously, we had metrics for each reason for skipping records, and the rationale was that you would monitor the metrics and only turn to the logs if you needed to *debug* unexpected record skipping. Note that skipping records by

Re: MM2 for DR

2020-02-10 Thread Ryanne Dolan
Hello, sounds like you have this all figured out actually. A couple notes: > For now, we just need to handle DR requirements, i.e., we would not need active-active If your infrastructure is sufficiently advanced, active/active can be a lot easier to manage than active/standby. If you are
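In the MirrorMaker 2 property format from KIP-382, active/active amounts to enabling replication in both directions. A minimal sketch (cluster names, bootstrap addresses, and the `.*` topic pattern are placeholders to adapt):

```
# connect-mirror-maker.properties (names and addresses are placeholders)
clusters = A, B
A.bootstrap.servers = a-broker:9092
B.bootstrap.servers = b-broker:9092

# replicate in both directions for active/active
A->B.enabled = true
A->B.topics = .*
B->A.enabled = true
B->A.topics = .*
```

Because MM2 prefixes replicated topics with the source cluster alias (e.g. `A.mytopic` on B), the two directions do not loop back into each other.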

Re: Why would all consumers pull from the same partition?

2020-02-10 Thread Alex Woolford
As Chandrajeet said, the default behavior is to hash on the key to assign the message to a partition. You may be able to distribute the data more evenly across the brokers by changing the partition strategy in the producer. It *is* possible to change the default behavior to, say, round-robin. That
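Switching the producer's partitioning strategy is a configuration change on the producer side. A sketch, assuming Kafka 2.4+ where `RoundRobinPartitioner` ships with the client:

```
# producer.properties fragment: distribute records round-robin
# instead of hashing the key (available since Kafka 2.4)
partitioner.class=org.apache.kafka.clients.producer.RoundRobinPartitioner
```

Note that round-robin gives up per-key ordering, since records with the same key no longer land in the same partition.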

Re: Why would all consumers pull from the same partition?

2020-02-10 Thread Dylan Martin
Oooh, that sounds like our situation. Is there a way to avoid this with kafka configuration? My access to the messages and consumers is limited. From: Chandrajeet Padhy Sent: Monday, February 10, 2020 9:52 AM To: users@kafka.apache.org Subject: RE: Why would

MM2 for DR

2020-02-10 Thread benitocm
Hi, After having a look to the talk https://www.confluent.io/kafka-summit-lon19/disaster-recovery-with-mirrormaker-2-0 and the https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382 I am trying to understand how I would use it in the setup that I have. For now, we

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread Sophie Blee-Goldman
While I agree that seems like it was probably a refactoring mistake, I'm not convinced it isn't the right thing to do. John, can you reiterate the argument for setting it to debug way back when? I would actually present this exact situation as an argument for keeping it as warn, since something

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread John Roesler
Hi, I’m sorry for the trouble. It looks like it was a mistake during https://github.com/apache/kafka/pull/6521 Specifically, while addressing code review comments to change a bunch of other logs from debugs to warnings, that one seems to have been included by accident:

RE: Why would all consumers pull from the same partition?

2020-02-10 Thread Chandrajeet Padhy
The partition is decided based on the message record key. If it's the same, messages will hit the same partition. -Original Message- From: Dylan Martin Sent: Monday, February 10, 2020 12:35 PM To: users@kafka.apache.org Subject: Why would all consumers pull from the same partition? I have a cluster of
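Kafka's default partitioner applies murmur2 to the serialized key and takes it modulo the partition count. A simplified Python illustration of the consequence (using `zlib.crc32` as a stand-in hash, not the real murmur2):

```python
# Simplified illustration of key-based partitioning. Kafka actually
# hashes the serialized key with murmur2; crc32 here is a stand-in
# to show the behavior, not the real algorithm.
import zlib

def pick_partition(key: bytes, num_partitions: int) -> int:
    return zlib.crc32(key) % num_partitions

# The same key always maps to the same partition, so producing every
# message with one fixed key funnels all traffic into one partition.
p1 = pick_partition(b"device-42", 100)
p2 = pick_partition(b"device-42", 100)
assert p1 == p2
```

This is why a constant (or null-then-constant) key across all producers concentrates the load on a single partition and hence a single leader broker.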

Re: Why would all consumers pull from the same partition?

2020-02-10 Thread Piotr Smolinski
Could you share the evidence you have for this scenario? Which Kafka version? I would first identify the partition leader placement. Before Kafka 2.4, all produce and consume requests went to the leader. So if by any chance your leaders got located on the same broker, then this broker would
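Leader placement can be checked with the stock topics tool; a sketch (the topic name and broker address are placeholders, and older releases take `--zookeeper` instead of `--bootstrap-server`):

```
bin/kafka-topics.sh --describe --topic my-topic --bootstrap-server broker:9092
```

The `Leader:` column per partition shows whether the leaders are spread across the brokers or concentrated on one.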

Why would all consumers pull from the same partition?

2020-02-10 Thread Dylan Martin
I have a cluster of 20'ish brokers. One topic has 60'ish consumers and 100 partitions, but the consumers all seem to be hitting the same broker, which makes me think they're all hitting the same partition. What would cause that? I assume I've configured something wrong. Thanks! -Dylan The

Fw: Confusingly unbalanced broker

2020-02-10 Thread Dylan Martin
I'll look into those tools, thanks! I was able to turn on the JMX polling and consumer metrics in kafka-manager. I now know which topic & partition is causing the problem. It's basically 80MB of a single partition on a single topic being hit by 60-odd consumers. Now I need to figure out

Mirror Maker: supplying broker properties with SSL password encrypted

2020-02-10 Thread Chandrajeet Padhy
Is there a way to supply the consumer.config/producer.config property files with the SSL password encrypted? I don't want to store the passwords in plaintext in the filesystem. bin/kafka-run-class.sh kafka.tools.MirrorMaker --consumer.config sourceCluster1Consumer.config --consumer.config
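Depending on the Kafka version, externalized configs (KIP-297, extended to client configs by KIP-421 in 2.3) can keep the secret out of the main properties file. A sketch, assuming the secrets file path and key name shown are placeholders; this indirection protects against casual reading but is not encryption, so the secrets file still needs restrictive permissions:

```
# consumer.config fragment: resolve the password from a separate file
config.providers=file
config.providers.file.class=org.apache.kafka.common.config.provider.FileConfigProvider
ssl.keystore.password=${file:/etc/kafka/secrets.properties:keystore-password}
```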

Re: "Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread Bruno Cadonna
Hi, I am pretty sure this was intentional. All skipped-record log messages are at WARN level. If a lot of your records are skipped on app restart with this log message at WARN level, they were also skipped with the log message at DEBUG level. You simply did not know about it before. With an

"Skipping record for expired segment" in InMemoryWindowStore

2020-02-10 Thread Samek, Jiří
Hi, in https://github.com/apache/kafka/commit/9f5a69a4c2d6ac812ab6134e64839602a0840b87#diff-a5cfe68a5931441eff5f00261653dd10R134 the log level of "Skipping record for expired segment" was changed from debug to warn. Was it an intentional change? Should it be somehow handled by the user? How can a user handle
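If the warning is expected in your workload, one option is to override the log level for just that logger. A sketch, assuming a log4j.properties setup and that the logger name matches the class's package (verify against your Kafka Streams version):

```
# log4j.properties fragment: silence only this store's skip warnings
log4j.logger.org.apache.kafka.streams.state.internals.InMemoryWindowStore=ERROR
```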