Re: Kafka Consumer consuming large number of messages

2016-05-04 Thread Jaikiran Pai
Going by the name of that property (max.partition.fetch.bytes), I'm guessing it's the max fetch bytes per partition of a topic. Are you sure the data you are receiving in that consumer doesn't belong to multiple partitions and hence can exceed the value that's set per partition? By the
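A minimal sketch of the per-partition point (the value is illustrative; this line would sit in the consumer's Properties, as in the original post further down the digest):

    // max.partition.fetch.bytes caps each partition's share of a fetch, so a consumer
    // assigned, say, 6 partitions can still receive roughly 6x this amount per poll.
    props.put("max.partition.fetch.bytes", "1048576");   // illustrative; limit is per partition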

Re: Kafka just ate my homework?

2016-05-04 Thread Salman Ahmed
Which version of Kafka are you on? On Wed, May 4, 2016 at 4:54 PM John Bickerstaff wrote: > Hi, > > I've been working with Kafka a lot recently and have the log retention set > to over two years. > > Today, as I was trying some things I happened to have the log running

Where does the array come from?

2016-05-04 Thread John Bickerstaff
I get the following error when trying to print out the message list on the command line using: kafka-console-consumer.sh --zookeeper 192.168.56.5:2181/kafka --topic statdxSolrXmlDocs --from-beginning [2016-05-04 19:55:03,690] WARN

Getting a list of consumers using KafkaConsumer/KafkaProducer

2016-05-04 Thread marko
Is there a way to get a list of all consumer groups and consumer group offsets using either KafkaConsumer or KafkaProducer (or some other method) in the new Java client? Best regards, Marko www.kafkatool.com
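Listing every group isn't exposed by the 0.9 Java client itself (the kafka-consumer-groups.sh tool covers that), but for a known group id the committed offset per partition can be read with KafkaConsumer.committed(). A hedged sketch, with placeholder broker, group, topic and partition:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndMetadata;
    import org.apache.kafka.common.TopicPartition;

    public class CommittedOffsetSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder
            props.put("group.id", "group-to-inspect");          // placeholder: must be known up front
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("some-topic", 0);  // placeholder
                OffsetAndMetadata committed = consumer.committed(tp);
                System.out.println(tp + " committed: "
                        + (committed == null ? "none" : committed.offset()));
            }
        }
    }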

Kafka just ate my homework?

2016-05-04 Thread John Bickerstaff
Hi, I've been working with Kafka a lot recently and have the log retention set to over two years. Today, as I was trying some things, I happened to have the log running in another window, and on two different VMs Kafka decided to delete all my log messages, thus losing the entire topic's worth of

Re: Backing up Kafka data and using it later?

2016-05-04 Thread John Bickerstaff
Thanks - does that mean that the only way to safely back up Kafka is to have replication? (I have done this partially - I can get the entire topic on the command line, after completely recreating the server, but my code that is intended to do the same thing just hangs) On Wed, May 4, 2016 at

Re: Kafka Consumer consuming large number of messages

2016-05-04 Thread Abhinav Solan
Thanks a lot, Jens, for the reply. One thing is still unclear: is this happening only when we set max.partition.fetch.bytes to a higher value? I am setting it quite low, at only 8192, because I can control the size of the data coming into Kafka, so even after setting this value

Re: Backing up Kafka data and using it later?

2016-05-04 Thread Rad Gruchalski
John, I believe you mean something along the lines of: http://markmail.org/message/f7xb5okr3ujkplk4 I don’t think something like this has been done. Best regards,
Radek Gruchalski
ra...@gruchalski.com

Backing up Kafka data and using it later?

2016-05-04 Thread John Bickerstaff
Hi, I have what is probably an edge use case. I'd like to back up a single Kafka instance such that I can recreate a new server, drop Kafka in, drop the data in, start Kafka -- and have all my data ready to go again for consumers. Is such a thing done? Does anyone have any experience trying

Re: Kafka Consumer consuming large number of messages

2016-05-04 Thread Jens Rantil
Hi, This is a known issue. The 0.10 release will fix this. See https://cwiki.apache.org/confluence/display/KAFKA/KIP-41%3A+KafkaConsumer+Max+Records for some background. Cheers, Jens On Wed, 4 May 2016 19:32, Abhinav Solan wrote: > Hi, > > I am using kafka-0.9.0.1 and
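For reference, the KIP-41 change surfaces in 0.10 as a single consumer property; a small sketch of how it would be set inside the consumer's Properties (the value is illustrative):

    // Available from 0.10 (KIP-41): caps how many records one poll() returns,
    // independent of how many bytes the underlying fetch brought back.
    props.put("max.poll.records", "100");   // illustrative value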

Re: Hash partition of key with skew

2016-05-04 Thread Wesley Chow
We don’t do this on the Kafka side, but for a different system that has similar distribution problems we manually maintain a map of “hot” keys. On the Kafka side, we distribute keys with an even distribution in our largest volume topic, and then squash the data and repartition based on a
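The thread doesn't include the code, but in the Java producer this kind of manual hot-key handling usually lives in a custom Partitioner. A rough sketch under assumed details (the hot-key set and the spread-randomly strategy are invented for illustration; non-hot keys are hashed roughly the way the default partitioner does):

    import java.util.HashSet;
    import java.util.Map;
    import java.util.Set;
    import java.util.concurrent.ThreadLocalRandom;
    import org.apache.kafka.clients.producer.Partitioner;
    import org.apache.kafka.common.Cluster;
    import org.apache.kafka.common.utils.Utils;

    // Sketch: spread known "hot" keys across all partitions, hash everything else.
    public class HotKeyAwarePartitioner implements Partitioner {

        private final Set<String> hotKeys = new HashSet<>();

        @Override
        public void configure(Map<String, ?> configs) {
            // Hypothetical: in practice the hot-key set would be loaded from config
            // or an external store, as described in the thread.
            hotKeys.add("some-hot-key");
        }

        @Override
        public int partition(String topic, Object key, byte[] keyBytes,
                             Object value, byte[] valueBytes, Cluster cluster) {
            int numPartitions = cluster.partitionsForTopic(topic).size();
            if (keyBytes == null) {
                return ThreadLocalRandom.current().nextInt(numPartitions);
            }
            if (key instanceof String && hotKeys.contains(key)) {
                // Hot keys lose per-key ordering but stop overloading a single partition.
                return ThreadLocalRandom.current().nextInt(numPartitions);
            }
            return (Utils.murmur2(keyBytes) & 0x7fffffff) % numPartitions;
        }

        @Override
        public void close() {}
    }

    // Registered on the producer via:
    // props.put("partitioner.class", HotKeyAwarePartitioner.class.getName());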

Re: kafka 0.9 offset unknown after cleanup

2016-05-04 Thread Michael Freeman
Thanks Tom, Thats a lot clearer now. Michael On Wed, May 4, 2016 at 3:27 PM, Tom Crayford wrote: > Michael, > > Increasing the offset retention period means that old consumers will use up > space in the __consumer_offsets topic and in the cache the

Re: Hash partition of key with skew

2016-05-04 Thread Srikanth
Yeah, fixed slicing may help. I'll put more thought into this. You had mentioned that you didn't put custom partitioner into production. Would you mind sharing how you worked around this currently? Srikanth On Tue, May 3, 2016 at 5:43 PM, Wesley Chow wrote: > > > > Upload

Subscribe request

2016-05-04 Thread Qian, Sophie

Re: Hash partition of key with skew

2016-05-04 Thread Srikanth
Having 1 partition and consumer thread per unique key value will result in the hot partition problem. Some keys get a disproportionately high number of records. It should be fine if a consumer has to deal with a few keys; it doesn't have to be a 1:1 mapping. Maybe I should try to solve this some other way.

Re: Kafka Streams: KStream - KTable Left Join

2016-05-04 Thread Matthias J. Sax
+1 I had the same thought and put it on my personal agenda already. -Matthias On 05/04/2016 06:37 PM, Jay Kreps wrote: > Is it possible to make the error message give more of an explanation? > > -Jay > > On Wed, May 4, 2016 at 8:46 AM, Matthias J. Sax > wrote: > >> Hi,

Kafka Consumer consuming large number of messages

2016-05-04 Thread Abhinav Solan
Hi, I am using kafka-0.9.0.1 and have configured the Kafka consumer to fetch 8192 bytes by setting max.partition.fetch.bytes Here are the properties I am using props.put("bootstrap.servers", servers); props.put("group.id", "perf-test"); props.put("offset.storage", "kafka");
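A self-contained sketch of roughly this setup, with the pieces the excerpt cuts off filled in by assumption (deserializers, topic name and servers are placeholders, not the poster's actual values):

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class PerfTestConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder
            props.put("group.id", "perf-test");
            props.put("max.partition.fetch.bytes", "8192");      // per-partition cap from the thread
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("perf-topic"));  // placeholder topic
                while (true) {
                    // Before 0.10 / KIP-41 there is no max.poll.records, so a single poll()
                    // can return every record the fetch brought back.
                    ConsumerRecords<String, String> records = consumer.poll(1000);
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d key=%s%n", record.offset(), record.key());
                    }
                }
            }
        }
    }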

Re: KafkaProducer block on send

2016-05-04 Thread Oleg Zhurakousky
When I get a chance to reproduce the scenario I’ll follow up, but regardless, the real question is: when advertising send as an "async send", how can it possibly block? It’s not async then. What’s the rationale behind that? You are returning the Future (which is good), essentially delegating back to the

Re: KafkaProducer block on send

2016-05-04 Thread Mayuresh Gharat
I am not sure why max.block.ms does not suffice here. Also, waitOnMetadata will block only the first time; later on it will have the metadata. I am not able to understand the motivation here. Can you explain with an example? Thanks, Mayuresh On Wed, May 4, 2016 at 9:55 AM, Dana Powers
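To make the two positions concrete: max.block.ms bounds how long send() may block internally (metadata wait, buffer-full wait), while delivery results arrive asynchronously through the callback. A hedged sketch with placeholder servers, topic, and an illustrative timeout:

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class BoundedBlockingSendSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder
            props.put("max.block.ms", "500");                    // illustrative bound on blocking inside send()
            props.put("key.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer",
                    "org.apache.kafka.common.serialization.StringSerializer");
            try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("example-topic", "key", "value"),
                        (metadata, exception) -> {
                            if (exception != null) {
                                // Asynchronous failure path; the thread's complaint is that the
                                // metadata wait inside send() itself can still block the caller.
                                exception.printStackTrace();
                            }
                        });
            }
        }
    }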

Re: KafkaProducer block on send

2016-05-04 Thread Dana Powers
I think changes of this sort (design changes as opposed to bugs) typically go through a KIP process before work is assigned. You might consider starting a KIP discussion and see if there is interest in pursuing your proposed changes. -Dana On May 4, 2016 7:58 AM, "Oleg Zhurakousky"

Log level for consumer properties

2016-05-04 Thread hsy...@gmail.com
Hi, Right now, when we initialize the Kafka consumer, it always logs the consumer properties at INFO level; can we put that at DEBUG level? I have to periodically create a consumer instance just to pull some metadata for a topic, and I don't want this noisy log. Regards, Siyuan
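There is no consumer-side switch for this; the usual workaround is to raise the level for the config class in the application's logging configuration. A hedged sketch in log4j.properties form (this assumes the property dump is emitted under the ConsumerConfig logger, which is where the 0.9 client appears to log it):

    # Assumed logger name: silences the "ConsumerConfig values: ..." dump logged at INFO
    log4j.logger.org.apache.kafka.clients.consumer.ConsumerConfig=WARN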

Re: Kafka Streams: KStream - KTable Left Join

2016-05-04 Thread Jay Kreps
Is it possible to make the error message give more of an explanation? -Jay On Wed, May 4, 2016 at 8:46 AM, Matthias J. Sax wrote: > Hi, > > I am still new to Kafka Streams by myself, but from my understanding if > you change the key, your partitioning changes, ie, is not

Re: Kafka Streams: KStream - KTable Left Join

2016-05-04 Thread Matthias J. Sax
Hi, I am still new to Kafka Streams myself, but from my understanding, if you change the key, your partitioning changes, i.e., it is not valid anymore. Thus, the join (which assumes co-located data) cannot be performed (this is the reason why sources get set to null). You can write to an
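A minimal sketch of the workaround being described, written against the 0.10-style Streams API (topic names, serdes and the key-extraction helper are invented): re-key the stream, push it through an intermediate topic so it is partitioned on the new key, then join with the table.

    import java.util.Properties;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;

    public class RekeyThenJoinSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "rekey-join-example");  // placeholder
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // placeholder
            props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
            props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());

            KStreamBuilder builder = new KStreamBuilder();
            KStream<String, String> events = builder.stream("events");   // placeholder topics
            KTable<String, String> users = builder.table("users");

            events
                // new key derived from the value, so the original partitioning is no longer valid
                .map((key, value) -> new KeyValue<>(extractUserId(value), value))
                // intermediate topic: repartitions the stream on the new key so it is
                // co-located with the table before joining
                .through("events-by-user")
                .leftJoin(users, (event, user) -> event + " / " + (user == null ? "unknown" : user))
                .to("enriched-events");

            new KafkaStreams(builder, props).start();
        }

        // Hypothetical helper standing in for whatever derives the join key.
        private static String extractUserId(String value) {
            return value.split(",")[0];
        }
    }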

Re: Kafka Monitoring using JMX Mbeans

2016-05-04 Thread Alexis Lê-Quôc
And if you're looking for some background on the various kafka metrics, we put a guide together: https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/ On Wed, May 4, 2016 at 3:37 AM, 马哲超 wrote: > We use jmxcmd as the command line tool. > > 2016-05-04

Error reading consumer offsets

2016-05-04 Thread Mike Tonks
Hi, I'm receiving an error when trying to read consumer offsets using the kafka-consumer-groups script. The error occurs when I consume using the pykafka library, but not when I consume with the kafka-console-consumer (which works fine) Kafka version: 0.9.0.1 Scala version: 2.10 PyKafka

RE: KafkaProducer block on send

2016-05-04 Thread Paolo Patierno
It's sad that after almost one month it's still "unassigned" :-( Paolo Patierno, Senior Software Engineer (IoT) @ Red Hat, Microsoft MVP on Windows Embedded & IoT, Microsoft Azure Advisor. Twitter: @ppatierno Linkedin: paolopatierno Blog: DevExperience > Subject: Re: KafkaProducer block on send >

Re: KafkaProducer block on send

2016-05-04 Thread Oleg Zhurakousky
Indeed it is. Oleg > On May 4, 2016, at 10:54 AM, Paolo Patierno wrote: > > It's sad that after almost one month it's still "unassigned" :-( > > Paolo Patierno, Senior Software Engineer (IoT) @ Red Hat > Microsoft MVP on Windows Embedded & IoT, Microsoft Azure Advisor >

Re: Receiving "The session timeout is not within an acceptable range" but AFAIK it is within range

2016-05-04 Thread Ismael Juma
Hi Mario, A warning should be logged by the consumer if you have unused properties in your config (precisely for this reason). In addition, 0.10 will also have improved error messages for cases such as this and we have increased the default `group.max.session.timeout.ms

RE: Receiving "The session timeout is not within an acceptable range" but AFAIK it is within range

2016-05-04 Thread Mario Ricci
Hi Jaikiran, You are correct - I assumed it was a consumer setting and it is a broker setting. The consumer quietly took the broker setting and didn't throw an exception, and when trying to set my session timeout above the broker's allowed max the error message given is very poor. If it

Re: KafkaProducer block on send

2016-05-04 Thread Oleg Zhurakousky
Sure Here are both: https://issues.apache.org/jira/browse/KAFKA-3539 https://issues.apache.org/jira/browse/KAFKA-3540 On May 4, 2016, at 3:24 AM, Paolo Patierno > wrote: Hi Oleg, can you share the JIRA link here because I totally agree with you.

Kafka Streams: KStream - KTable Left Join

2016-05-04 Thread Gaspar Muñoz
Hi there, I am not able to perform a Left Join between a KStream and KTable in Kafka Streams. Exception in thread "main" org.apache.kafka.streams.errors.TopologyBuilderException: Invalid topology building: KSTREAM-FILTER-03 and KTABLE-SOURCE-05 are not joinable at

Re: kafka 0.9 offset unknown after cleanup

2016-05-04 Thread Tom Crayford
Michael, Increasing the offset retention period means that old consumers will use up space in the __consumer_offsets topic and in the cache the brokers hold on that topic. The cache is in memory, so if you churn consumer groups a lot that could be problematic. The other issue is that booting new

Re: 0.8.x consumer group protocol

2016-05-04 Thread Dana Powers
0.8 clients manage groups by connecting directly to zookeeper and implementing shared group management code. There are no broker APIs used. 0.9 clients manage groups using new kafka broker APIs. These clients no longer connect directly to zookeeper. JoinGroupRequest is an 0.9 api. For an 0.8ish

Re: What makes a message key mandatory and how to turn it off?

2016-05-04 Thread I PVP
Kamal, Thank you. Using log.cleanup.policy=delete solved the issue. -- IPVP From: Kamal C Reply: users@kafka.apache.org > Date: May 4, 2016 at 6:58:34 AM To: users@kafka.apache.org

Re: kafka 0.9 offset unknown after cleanup

2016-05-04 Thread Michael Freeman
Hey Tom, Are there any details on the negative side effects of increasing the offset retention period? I'd like to increase it but want to be aware of the risks. Thanks Michael > On 4 May 2016, at 05:06, Tom Crayford wrote: > > Jun, > > Yep, you got

Re: What makes a message key mandatory and how to turn it off?

2016-05-04 Thread Kamal C
Yes. Use *log.cleanup.policy=delete* if you don't want to compact topics. Reference: https://cwiki.apache.org/confluence/display/KAFKA/Log+Compaction On Wed, May 4, 2016 at 3:24 PM, I PVP wrote: > Kamal, > > Could the log.cleanup.policy=compact on server.properties be the >

Re: What makes a message key mandatory and how to turn it off?

2016-05-04 Thread I PVP
Kamal, Could the log.cleanup.policy=compact on server.properties be the cause ? # /opt/kafka/bin/kafka-topics.sh --zookeeper localhost:2181 --topic user_track --describe Topic:user_track PartitionCount:1 ReplicationFactor:2 Configs: Topic: user_track Partition: 0 Leader: 1003 Replicas:

0.8.x consumer group protocol

2016-05-04 Thread Zaiming Shi
Hi there! I'm investigating what it means to implement the consumer group protocol for 0.8. However, all the documents I can find online are for 0.9, e.g. https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design Also, in the kafka code base, the JOIN_GROUP_REQUEST_V0 schema in 0.8

Re: What makes a message key mandatory and how to turn it off?

2016-05-04 Thread Kamal C
Can you describe your topic configuration using the command below? *sh kafka-topics.sh --zookeeper localhost:2181 --topic --describe* A key for a record is mandatory only for compacted topics. --Kamal On Wed, May 4, 2016 at 2:25 PM, I PVP wrote: > HI all, > > What makes a
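In practice the distinction looks like this (a small sketch; the keys and payloads are placeholders): a record without a key is fine on a delete-policy topic, but on a compacted topic it is rejected, so either supply a key or switch the policy as discussed elsewhere in the thread.

    import org.apache.kafka.clients.producer.ProducerRecord;

    // Sketch: with cleanup.policy=compact every record needs a key;
    // with cleanup.policy=delete the key may be omitted.
    ProducerRecord<String, String> keyed =
            new ProducerRecord<>("user_track", "user-42", "payload");   // accepted either way
    ProducerRecord<String, String> unkeyed =
            new ProducerRecord<>("user_track", "payload");              // rejected if the topic is compacted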

What makes a message key mandatory and how to turn it off?

2016-05-04 Thread I PVP
HI all, What makes a message key mandatory and how do I turn it off? I am migrating the messaging piece of a java application from activemq to kafka. The application was publishing messages to kafka (0.9.0) with no issues while running on a single broker on my dev machine. After turning it into

one broker id exist in topic ISR but does not exist in broker ids

2016-05-04 Thread yuanjia8...@163.com
Hi all, I have a problem where broker id 1 exists in one topic's ISR and is the only one there, but id 1 does not exist in the zookeeper path /brokers/ids. Any idea why this is happening? Is it split-brain? Thanks. LiYuanJia

FailedToSendMessageException on kafka producer

2016-05-04 Thread ram kumar
Hi, I wrote a java client to send data from a file to kafka (this code worked when run locally), but got the following error: $ java -cp kafka-0.0.1-SNAPSHOT.jar KafkaProducerFromFile [2016-05-04 03:39:24,996] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for

RE: KafkaProducer block on send

2016-05-04 Thread Paolo Patierno
Sorry ... the callback is called with the exception, so I can check inside it ... btw, send() shouldn't be blocking. Paolo Patierno, Senior Software Engineer (IoT) @ Red Hat, Microsoft MVP on Windows Embedded & IoT, Microsoft Azure Advisor. Twitter: @ppatierno Linkedin: paolopatierno Blog: DevExperience

Re: Kafka Monitoring using JMX Mbeans

2016-05-04 Thread 马哲超
We use jmxcmd as the command line tool. 2016-05-04 10:47 GMT+08:00 Otis Gospodnetić : > Hi, > > On Mon, Apr 25, 2016 at 4:14 AM, Mudit Kumar wrote: > > > Hi, > > > > Have anyone setup any monitoring using Mbeans ?What kind of command line > >

RE: KafkaProducer block on send

2016-05-04 Thread Paolo Patierno
Hi Oleg, can you share the JIRA link here? I totally agree with you. For me the send() should be totally asynchronous and not block for the max.block.ms timeout. Currently I'm using the overload with a callback that, of course, isn't called if the send() fails due to timeout. In order

RE: Metadata Request Loop?

2016-05-04 Thread Phil Luckhurst
I found that just sending one in the first 5 minutes was enough. To be more precise, it needs to be within your setting for metadata.max.age.ms, which by default is 5 minutes. Phil -Original Message- From: Fumo, Vincent [mailto:vincent_f...@cable.comcast.com] Sent: 29 April 2016 15:57

Re: Suggesstion for mixture of 0.8 and 0.9?

2016-05-04 Thread Flybean
So, is it possible for clients (producer/consumer) on 0.8 and 0.9 to be used together with the same 0.9 brokers? Also, it seems that the need for JDK 7+ comes from Scala (which Kafka is built on), but 0.9 provides a pure Java Consumer/Producer (although just beta). So, is it OK to rebuild the jar file using JDK 1.6?