question about compression

2014-07-21 Thread Bert Corderman
In trying to better understand compression I came across the following http://geekmantra.wordpress.com/2013/03/28/compression-in-kafka-gzip-or-snappy/ “in Kafka 0.8, messages for a partition are served by the leader broker. The leader assigns these unique logical offsets to every message it app

Re: How to recover from a disk full situation in Kafka cluster?

2014-07-21 Thread Connie Yang
It looks like org.apache.kafka.clients.producer.KafkaProducer is not available in 0.8.1.1 client jar. So, we'll stay with kafka.javaapi.producer.Producer implementation. Thanks, Connie On Fri, Jul 18, 2014 at 5:13 PM, Neha Narkhede wrote: > One option is to reduce the value of topic.metadata.

Re: how to ensure strong consistency with reasonable availability

2014-07-21 Thread Scott Clasen
You will probably need 0.8.2 which gives https://issues.apache.org/jira/browse/KAFKA-1028 On Mon, Jul 21, 2014 at 6:37 PM, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) wrote: > Hi everyone, > > With a cluster of 3 brokers and a topic of 3 replicas, we want to achieve > the following two prop

Re: request.required.acks=-1 under high data volume

2014-07-21 Thread Daniel Compton
Interesting, I had missed that. Is it worth updating the documentation to make that more explicit, or do other people find it clear enough? On 22 July 2014 12:47, Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -) < jwu...@bloomberg.net> wrote: > The document says "typical" values, not "valid" value

Re: New Consumer Design

2014-07-21 Thread Robert Withers
...live assignments! :) > On Jul 21, 2014, at 7:44 PM, Robert Withers > wrote: > > Thanks, Jay, for the good summary. Regarding point 2, I would think the > heartbeat would still be desired, to give control over liveness detection > parameters and to directly inform clients when gaining or

Re: New Consumer Design

2014-07-21 Thread Robert Withers
Thanks, Jay, for the good summary. Regarding point 2, I would think the heartbeat would still be desired, to give control over liveness detection parameters and to directly inform clients when gaining or losing a partition (especially when gaining a partition). There would be no barrier and th

how to ensure strong consistency with reasonable availability

2014-07-21 Thread Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -)
Hi everyone, With a cluster of 3 brokers and a topic of 3 replicas, we want to achieve the following two properties: 1. when only one broker is down, there's no message loss, and producers/consumers are not blocked. 2. in other more serious problems, for example, one broker is restarted twice i

Re: request.required.acks=-1 under high data volume

2014-07-21 Thread Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -)
The document says "typical" values, not "valid" values, are 0, 1, -1. In fact any integer will be accepted. From: users@kafka.apache.org At: Jul 21 2014 18:54:56 To: users@kafka.apache.org Subject: Re: request.required.acks=-1 under high data volume In the docs for 0.8.1.1, there are only three
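A minimal sketch of the point made in this reply, assuming the 0.8.x producer semantics discussed in these threads: 0, 1 and -1 are the documented "typical" values, and larger positive integers (such as the acks=2 and acks=3 mentioned elsewhere on the list) are read as a replica count. The helper names describe_acks and producer_properties are hypothetical and are not part of any Kafka client.

```python
def describe_acks(acks: int) -> str:
    """Describe what a request.required.acks value asks the broker to do.

    Assumption: semantics as discussed for the 0.8.x producer. Values not
    mentioned in the threads are deliberately left undescribed.
    """
    if acks == 0:
        return "do not wait for any acknowledgement"
    if acks == 1:
        return "wait for the leader only"
    if acks == -1:
        return "wait for all in-sync replicas"
    if acks > 1:
        return f"wait for {acks} replicas"
    return "not covered by this sketch"


def producer_properties(acks: int) -> str:
    """Render the setting as a line for a producer .properties file."""
    return f"request.required.acks={acks}\n"


if __name__ == "__main__":
    for value in (0, 1, -1, 3):
        print(producer_properties(value).strip(), "->", describe_acks(value))
```

The sketch only formats and describes the setting; it does not validate it, which matches the observation that the producer accepts any integer.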

Re: New Consumer Design

2014-07-21 Thread Jay Kreps
This thread is a bit long, but let me see if I can restate it correctly (not sure I fully follow). There are two suggestions: 1. Allow partial rebalances that move just some partitions. I.e. if a consumer fails and has only one partition, only one other consumer should be affected (the one who pick

Re: New Consumer Design

2014-07-21 Thread Guozhang Wang
Hello Rob, If I get your idea right, the idea is that if the rebalance only changes the ownership of a few consumers in the group, the coordinator can just sync with them and not interrupt the other consumers. I think this approach may work. However it will likely complicate the logic of coo

Re: Kafka cluster setup

2014-07-21 Thread Daniel Compton
Hi Raj, There is a Quickstart document for setting up single- and multi-broker Kafka clusters. It doesn't include multi-node ZK cluster setup though. You can find that in the ZK docs

Re: request.required.acks=-1 under high data volume

2014-07-21 Thread Daniel Compton
In the docs for 0.8.1.1, there are only three options for request.required.acks: {-1, 0, 1}. How is request.required.acks=3 a valid configuration property? Am I reading it incorrectly or are the docs out of date? On 18 July 2014 06:25,

Re: Performance/Stress tools

2014-07-21 Thread Magnus Edenhill
Hi Dayo, the rdkafka_performance tool from librdkafka is useful for performance measurements / stress testing. https://github.com/edenhill/librdkafka It resides in the examples/ directory. https://github.com/edenhill/librdkafka/blob/master/INTRODUCTION.md#test-details Regards, Magnus 2014-07

Kafka cluster setup

2014-07-21 Thread Tanneru, Raj
Sorry for the spam. I am new to Apache Kafka. Can someone point me to a Kafka cluster installation document? I am planning to set up a 3 node ZK quorum and 10 broker cluster. Thanks, Raj Tanneru

Re: Performance/Stress tools

2014-07-21 Thread Dayo Oliyide
Thanks for the pointers guys, I will look into them. --Dayo On Sat, Jul 19, 2014 at 9:07 PM, Steve Morin wrote: > Otis, > Yes this would work for Kafka because it's using to launch containers to > generate load for performance testing. It also works in standalone mode to > run on a single m

Re: Interested in contributing to Kafka?

2014-07-21 Thread Andrew Otto
Oh, BTW, I think Yelp is using this .deb packaging (and shell script) too. On Jul 21, 2014, at 10:16 AM, Andrew Otto wrote: > Hm, curious! > > Would this be useful to contribute upstream? > > https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/bin/kafka > > Wikimedia uses i

Re: message loss for sync producer, acks=2, topic replicas=3

2014-07-21 Thread Jiang Wu (Pricehistory) (BLOOMBERG/ 731 LEX -)
Hi Guozhang, I think such cases are covered by replica.lag.time.max.ms (default 10 seconds). When a broker stops fetching for 10 seconds, an ISR expiration thread will remove it from ISR. Regards, Jiang - Original Message - From: wangg...@gmail.com To: JIANG WU (PRICEHISTORY) (BLOOMBER
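The time-based ISR expiration described in this reply can be sketched as follows. This is an illustration only, not the broker's actual code: the function name expire_isr, the broker ids, and the exact boundary condition are assumptions. The 10-second default is the replica.lag.time.max.ms value quoted in the message.

```python
from typing import Dict, Set


def expire_isr(last_fetch_ms: Dict[str, int], now_ms: int,
               max_lag_ms: int = 10_000) -> Set[str]:
    """Return the followers that remain in the ISR.

    last_fetch_ms maps a (hypothetical) broker id to the timestamp of its
    most recent fetch. A follower whose last fetch is older than max_lag_ms
    is dropped, mirroring replica.lag.time.max.ms.
    """
    return {
        broker
        for broker, fetched_at in last_fetch_ms.items()
        if now_ms - fetched_at <= max_lag_ms
    }


if __name__ == "__main__":
    # broker-1 last fetched 15 s ago, broker-2 fetched 6 s ago.
    remaining = expire_isr({"broker-1": 0, "broker-2": 9_000}, now_ms=15_000)
    print(sorted(remaining))
```

In this sketch a follower that has merely slowed down but still fetches within the window stays in the ISR; only one that stops fetching for longer than the window is removed.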

Re: How to recover from a disk full situation in Kafka cluster?

2014-07-21 Thread Clark Haskins
I assume you ran out of space on your data partitions? Using the partition-reassignment tool can increase the disk space when using time-based retention for topics as this resets the data file time. -Clark Clark Elliott Haskins III LinkedIn DDS Site Reliability Engineer Kafka, Zookeeper, Samza

Re: message loss for sync producer, acks=2, topic replicas=3

2014-07-21 Thread Guozhang Wang
Hello Jiang, Follower replicas can fall out of ISR for reasons like soft failures, etc., besides performance. If you set replica.lag.max.messages properly to prevent follower replicas from falling out of ISR purely due to high producer throughput, then when one of the followers does get slow or sto

Re: Some doubts regarding kafka config parameters

2014-07-21 Thread Jun Rao
Those are good questions. See my answers inlined below. Thanks, Jun On Fri, Jul 18, 2014 at 1:33 PM, shweta khare wrote: > hi, > > I have the following doubts regarding some kafka config parameters: > > For example if I have a Throughput topic with replication factor 1 and a > single partitio

Re: Interested in contributing to Kafka?

2014-07-21 Thread Andrew Otto
Hm, curious! Would this be useful to contribute upstream? https://github.com/wikimedia/operations-debs-kafka/blob/debian/debian/bin/kafka Wikimedia uses it instead of the myriad of bin/*.sh scripts that come with Kafka. We didn’t want to build a .deb package that installed 16ish short shell s