Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-15 Thread Lance Laursen
Hey Rajiv, Are you using snappy compression? On Tue, Dec 15, 2015 at 12:52 PM, Rajiv Kurian wrote: > We had to revert to 0.8.3 because three of our topics seem to have gotten > corrupted during the upgrade. As soon as we did the upgrade producers to > the three topics I

Re: managing replication between nodes on specific NICs on a host

2015-12-04 Thread Lance Laursen
Hi, You're not going to be able to get it to do what you want. When a publisher queries a list of seed brokers for cluster information, the host names that are passed back in the metadata response are the same hostnames the brokers use to talk to and replicate between each other. Run zkCli.sh
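The truncated advice above points at ZooKeeper; a sketch of what that inspection looks like (broker id and ZooKeeper address are assumptions, not from the thread):

```shell
# See what hostname each broker registers in ZooKeeper (broker id 0 assumed)
./zkCli.sh -server localhost:2181 get /brokers/ids/0
# The "host" field in the returned JSON is the address that both clients
# and replica fetchers will connect to — which is why client-facing and
# replication traffic can't be split onto different NICs this way.
```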

Re: Kafka 0.8.2.1 - how to read from __consumer_offsets topic?

2015-12-03 Thread Lance Laursen
Hi Marina, You can hop onto your brokers and dump your __consumer_offsets logs manually in order to see if anything is in them. Hop on each of your brokers and run the following command: for f in $(find /path/to/kafka-logs/__consumer_offsets-* -name "*\.log"); do
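The loop above is cut off by the archive; a fuller sketch of the same idea, feeding each segment through Kafka's log dump tool (paths and tool flags assumed from 0.8.x tooling, not confirmed by the truncated message):

```shell
# Dump every __consumer_offsets segment on this broker (log dir path assumed)
for f in $(find /path/to/kafka-logs/__consumer_offsets-* -name "*.log"); do
  echo "== $f"
  ./kafka-run-class.sh kafka.tools.DumpLogSegments --deep-iteration --files "$f"
done
```

Empty segments dump no records, so any non-trivial output tells you the topic is actually being written to.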

Re: Custom metadata in message header

2015-11-17 Thread Lance Laursen
Hey Lesong, Unfortunately not. Any metadata you would like to include must be placed in the message payload. This means you must consume each message and process it in order to determine whether it is relevant to your current task. You can achieve some data locality by using keyed messages and

Re: Experiences with corrupted messages

2015-10-01 Thread Lance Laursen
Hey Jörg, Unfortunately when the high level consumer hits a corrupt message, it enters an invalid state and closes. The only way around this is to advance your offset by 1 in order to skip the corrupt message. This is currently not automated. You can catch this exception if you are using the
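With the 0.8 high-level consumer, offsets live in ZooKeeper under /consumers/&lt;group&gt;/offsets/&lt;topic&gt;/&lt;partition&gt;, so "advance the offset by 1" can be done by hand; a sketch (group, topic, partition, and offset values are hypothetical):

```shell
# Stop the consumer group first, then bump the stored offset past the bad message
./zkCli.sh -server localhost:2181 <<'EOF'
get /consumers/mygroup/offsets/mytopic/0
set /consumers/mygroup/offsets/mytopic/0 12346
EOF
```

Restart the consumers afterwards; they resume from the new offset and never touch the corrupt message.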

Re: port already in use error when trying to add topic

2015-09-14 Thread Lance Laursen
This is not a bug. The java process spawned by kafka-topics.sh is trying to bind to 9998 upon start. The java process spawned by kafka-server-start.sh already owns that port. It's doing this because both of these scripts use kafka-run-class.sh and that is where you defined your 'export JMX_PORT'.
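The usual workaround is to override JMX_PORT for the CLI invocation so it doesn't collide with the port the broker already owns; a sketch (9998/9999 are assumed from the thread's setup):

```shell
# Broker already owns 9998, so hand the topic tool a different JMX port
JMX_PORT=9999 ./kafka-topics.sh --zookeeper localhost:2181 --list

# Alternatively, guard the export in kafka-run-class.sh so it only
# applies when the caller hasn't set one:
export JMX_PORT=${JMX_PORT:-9998}
```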

Re: Compression and MirrorMaker

2015-09-01 Thread Lance Laursen
Hi Jörg, You are correct. The producer is wrapping up a batch of messages into one, compressing that one message and slapping a magic byte flag and compression type "snappy" on it, and then sending that single compressed message to your brokers. They hang out on your brokers in compressed format.
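In 0.8.x the codec is a producer-side setting, so for MirrorMaker it belongs in the producer config MirrorMaker is launched with; a config sketch (property name per the 0.8 producer, file name an assumption):

```properties
# producer.properties passed to MirrorMaker via --producer.config
compression.codec=snappy
```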

Re: Message corruption with new Java client + snappy + broker restart

2015-08-14 Thread Lance Laursen
I am also seeing this issue when using the new producer and snappy compression, running mirrormaker (trunk, aug 10 or so). I'm using snappy 1.1.1.7 [2015-08-14 14:15:27,876] WARN Got error produce response with correlation id 5151552 on topic-partition mytopic-56, retrying (2147480801 attempts

Re: Kafka partitioning is pretty much broken

2015-07-15 Thread Lance Laursen
From the FAQ: To reduce # of open sockets, in 0.8.0 ( https://issues.apache.org/jira/browse/KAFKA-1017), when the partitioning key is not specified or null, a producer will pick a random partition and stick to it for some time (default is 10 mins) before switching to another one. So, if there are

Re: Broker holding all files open

2015-06-03 Thread Lance Laursen
Hey Scott, There's not much you can do about this, other than increasing your log.segment.bytes (max 2GB) or lowering your partition counts on your mirrored cluster (probably not the best strategy unless you're dealing with 100,000+ small topics, at which point you should consider a single
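Since every open segment (plus its index) pins a file descriptor, the practical mitigation is raising the broker's FD ceiling and watching actual usage; a sketch (the pgrep match string is an assumption about how the broker process is named):

```shell
# Raise the fd ceiling in the shell / service unit that launches the broker
ulimit -n 100000

# Count descriptors the running broker currently holds
ls /proc/$(pgrep -f kafka.Kafka)/fd | wc -l
```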

Re: aborting a repartition assignment

2015-05-28 Thread Lance Laursen
Hey, Try clearing out /admin/reassign_partitions on your zookeeper. Additionally, your best bet might be to bring up a new broker with the same broker ID as your failed broker. It'll join the cluster and carry on, though I'm not sure what effect having a now-empty partition is going to have. On
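A sketch of clearing that znode (ZooKeeper address assumed; inspect before deleting):

```shell
# Check for, then remove, a stuck reassignment
./zkCli.sh -server localhost:2181 get /admin/reassign_partitions
./zkCli.sh -server localhost:2181 delete /admin/reassign_partitions
```

The controller treats a missing /admin/reassign_partitions as "no reassignment in progress", so deleting it abandons the pending move.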

Re: questions

2015-05-22 Thread Lance Laursen
Maximum size of a partition is defined by log.retention.bytes. You can define this in your server.properties as well as upon topic creation with --config retention.bytes=12345 You can also define log.retention.bytes.per.topic https://kafka.apache.org/08/configuration.html On Fri, May 22, 2015
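The per-topic override can be set at creation time or altered later; a sketch (topic name, sizes, and partition counts are hypothetical; flags per the 0.8 tooling):

```shell
# At creation:
./kafka-topics.sh --zookeeper localhost:2181 --create --topic mytopic \
  --partitions 6 --replication-factor 2 --config retention.bytes=1073741824

# On an existing topic:
./kafka-topics.sh --zookeeper localhost:2181 --alter --topic mytopic \
  --config retention.bytes=1073741824
```

Note that retention.bytes applies per partition, so total on-disk size per replica is roughly partitions × retention.bytes.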

Re: Hitting integer limit when setting log segment.bytes

2015-05-14 Thread Lance Laursen
to the real offset. The key value in index file is of the form Int,Int. Thanks, Mayuresh On Wed, May 13, 2015 at 5:57 PM, Lance Laursen llaur...@rubiconproject.com wrote: Hey folks, Any update on this? On Thu, Apr 30, 2015 at 5:34 PM, Lance Laursen llaur

Re: Hitting integer limit when setting log segment.bytes

2015-05-13 Thread Lance Laursen
Hey folks, Any update on this? On Thu, Apr 30, 2015 at 5:34 PM, Lance Laursen llaur...@rubiconproject.com wrote: Hey all, I am attempting to create a topic which uses 8GB log segment sizes, like so: ./kafka-topics.sh --zookeeper localhost:2181 --create --topic perftest6p2r --partitions 6

Hitting integer limit when setting log segment.bytes

2015-04-30 Thread Lance Laursen
Hey all, I am attempting to create a topic which uses 8GB log segment sizes, like so: ./kafka-topics.sh --zookeeper localhost:2181 --create --topic perftest6p2r --partitions 6 --replication-factor 2 --config max.message.bytes=655360 --config segment.bytes=8589934592 And am getting the following
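The error that follows comes from the config's type: segment.bytes is parsed as a 32-bit signed Int, so its ceiling is 2147483647 (~2GB), well below the 8GB requested above. A quick shell check of the arithmetic:

```shell
# segment.bytes is a 32-bit signed Int, so 2^31 - 1 is the hard ceiling
requested=$((8 * 1024 * 1024 * 1024))   # 8 GB = 8589934592
int_max=$((2**31 - 1))                  # 2147483647, the largest accepted value
echo "requested=$requested ceiling=$int_max"
[ "$requested" -gt "$int_max" ] && echo "8GB segments exceed the Int range; ~2GB is the max"
```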

Re: Monitoring of consumer group lag

2015-03-16 Thread Lance Laursen
Hey Mathias, Kafka Offset Monitor will give you a general idea of where your consumer group(s) are at: http://quantifind.com/KafkaOffsetMonitor/ However, I'm not sure how useful it will be with a large number of topics / turning its output into a script that alerts upon a threshold. Could take
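For scripted threshold alerting, 0.8.2 also ships an offset checker whose tabular output can be parsed directly; a sketch (group name, threshold, and the lag column position are assumptions about that tool's output format):

```shell
# Per-partition lag for one group; pipe through awk to alert above a threshold
./kafka-run-class.sh kafka.tools.ConsumerOffsetChecker \
  --zookeeper localhost:2181 --group mygroup \
  | awk 'NR > 1 && $6 > 10000 {print "lag alert:", $1, $2, $6}'
```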