Re: Zookeeper

Jay Kreps Sat, 05 Nov 2011 14:29:07 -0700

That's correct. The option is primarily for testing purposes.

Sent from my iPhone


On Nov 5, 2011, at 2:03 PM, Mark <static.void....@gmail.com> wrote:

> Ok, so no matter what, ZooKeeper is still required when using Kafka. One just
> has the option to load-balance producer => broker connections via either
> ZooKeeper or a load balancer.
> 
> Is that correct? If so, I think I finally got it :)
> 
> On 11/5/11 1:29 PM, Jay Kreps wrote:
>> It is also worth mentioning that this is just for producers; consumers
>> always use ZooKeeper for load balancing and coordination. Logically this
>> makes sense: partitioning production is trivial if you don't care about
>> the semantics of key=>partition assignment, but partitioning consumption
>> is more complex because you need to divide up the partitions amongst the
>> set of all consumers exactly.
>> 
>> -jay
>> 
>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps<jay.kr...@gmail.com>  wrote:
>> 
>>> The motivation here is that literally every production process at
>>> LinkedIn sends messages to Kafka as part of either user tracking or
>>> operational monitoring or both. We are wary of adding that many zk
>>> connections and watches, so we run this first tier through a simple L2 load
>>> balancer that just randomly balances connections over brokers. The good
>>> part about this is that we can do zookeeper upgrades without redeploying
>>> all the production apps to upgrade their zk jar.
>>> 
>>> As Neha says, the zk producer is used for key-based partitioning by the
>>> smaller number of producers who need that.
>>> 
>>> -Jay
>>> 
>>> 
>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>>> 
>>>> Mark,
>>>> 
>>>> Most publishers at LinkedIn use a hardware load balancer approach.
>>>> These are configured to do a TCP healthcheck that monitors if the
>>>> kafka port on a broker is working. If it is, then requests are
>>>> forwarded to the broker. Some publishers though are using the software
>>>> load balancer based on zookeeper. Those applications want to do some
>>>> key based partitioning of data.
>>>> 
>>>> Thanks,
>>>> Neha
>>>> 
>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark<static.void....@gmail.com>  wrote:
>>>>> Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
>>>>> instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
>>>>>> broker.list is used in the producer property file. One caveat is that the
>>>>>> broker.list approach doesn't do a health check, which means that if a
>>>>>> broker goes down, the client could still try to send messages to it. At
>>>>>> LinkedIn, we rely on a load balancer to do the health check for us. The
>>>>>> zk-based producer, on the other hand, does do a health check.
>>>>>> 
>>>>>> You can find out more details about our ZK design in the design page on
>>>>>> the website or in the paper at
>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>>>>>>
>>>>>> Jun
>>>>>> 
>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark <static.void....@gmail.com> wrote:
>>>>>>> I just noticed that there is an option to not use ZooKeeper and instead
>>>>>>> one can use a static list of brokers (#9 on
>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
>>>>>>> Do I put this list in server.properties?
>>>>>>> 
>>>>>>> It doesn't seem like you save much either way as you have to either
>>>>>>>  a) list out all the nodes in the zookeeper quorum in zookeeper.properties
>>>>>>>  b) list out static brokers in server.properties.
>>>>>>> 
>>>>>>> What are the benefits of using ZooKeeper over a static list? Can someone
>>>>>>> also explain how Kafka uses ZooKeeper?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> 
>>>
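
Jay's point about producers versus consumers can be illustrated with a small sketch. It is not Kafka's code: the function names, the CRC32 hash, and the round-robin assignment are all assumptions chosen for illustration. It only shows why a producer can pick a destination on its own, while a consumer needs a shared view of every other consumer (the coordination the thread says ZooKeeper provides).

```python
import random
import zlib


def pick_random_broker(brokers):
    """Producer, no key semantics: any broker will do (hypothetical helper)."""
    return random.choice(brokers)


def pick_partition_for_key(key, num_partitions):
    """Producer, key-based: the same key always maps to the same partition.

    CRC32 is an illustrative choice of hash, not necessarily what Kafka uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Consumer side: give every partition exactly one owner.

    Every consumer must see the same full lists to compute the same answer,
    which is why this step needs coordination. Round-robin over sorted names
    is an assumption made for this sketch.
    """
    if not consumers:
        raise ValueError("at least one consumer is required")
    ordered_consumers = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered_consumers}
    for index, partition in enumerate(sorted(partitions)):
        owner = ordered_consumers[index % len(ordered_consumers)]
        assignment[owner].append(partition)
    return assignment
```

The two producer functions need only local information, which is why a plain load balancer is enough for producers that do not care about key placement. The consumer function depends on the complete, agreed-upon consumer list, so a stale or partial list would leave partitions unowned or owned twice.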
