Re: Zookeeper

Jay Kreps Sat, 05 Nov 2011 13:29:58 -0700

It is also worth mentioning that this is just for producers; consumers
always use zookeeper for load balancing and co-ordination. Logically this
makes sense--partitioning production is trivial if you don't care about
the semantics of key=>partition assignment, but partitioning consumption is
more complex because you need to divide up the partitions amongst the set
of all consumers exactly.
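
To make the asymmetry concrete, here is a minimal sketch in Python. It is
illustrative only and is not Kafka's actual code: the function names, the
CRC32 hash, and the range-style split are all assumptions chosen for the
example. A producer can pick a partition on its own, but a consumer's share
depends on the full, agreed-upon list of consumers, which is the kind of
shared state zookeeper provides.

```python
import random
import zlib


def pick_partition_random(num_partitions):
    # Producer side, no key semantics: any partition will do,
    # so no coordination with other producers is needed.
    return random.randrange(num_partitions)


def pick_partition_by_key(key, num_partitions):
    # Producer side, key-based: the same key always maps to the
    # same partition (CRC32 is an arbitrary choice for this sketch).
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    # Consumer side: every partition must get exactly one owner.
    # Each consumer can compute this independently, but only if all
    # of them see the same membership list.
    if not consumers:
        raise ValueError("need at least one consumer")
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional partition.
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment
```

If a consumer joins or leaves, every remaining consumer has to recompute
its share from the new membership list, which is why consumers need a
coordination service while keyless producers do not.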


-jay

On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps <jay.kr...@gmail.com> wrote:

> The motivation here is that literally every production process at
> LinkedIn sends messages to Kafka as part of either user tracking or
> operational monitoring or both. We are wary of adding that many zk
> connections and watches, so we run this first tier through a simple L2 load
> balancer that just randomly balances connections over brokers. The good
> part about this is that we can do zookeeper upgrades without redeploying
> all the production apps to upgrade their zk jar.
>
> As Neha says, the zk producer is used for key-based partitioning by the
> smaller number of producers who need that.
>
> -Jay
>
>
> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>
>> Mark,
>>
>> Most publishers at LinkedIn use a hardware load balancer approach.
>> These are configured to do a TCP healthcheck that monitors if the
>> kafka port on a broker is working. If it is, then requests are
>> forwarded to the broker. Some publishers though are using the software
>> load balancer based on zookeeper. Those applications want to do some
>> key based partitioning of data.
>>
>> Thanks,
>> Neha
>>
>> On Sat, Nov 5, 2011 at 11:49 AM, Mark <static.void....@gmail.com> wrote:
>> > Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
>> > instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
>> >
>> > Thanks
>> >
>> > On 11/4/11 7:09 PM, Jun Rao wrote:
>> >>
>> >> broker.list is used in the producer property file. One caveat is that
>> >> the broker.list approach doesn't do a health check, which means that if
>> >> a broker goes down, the client could still try to send messages to it.
>> >> At LinkedIn, we rely on a load balancer to do the health check for us.
>> >> The zk-based producer, on the other hand, does do a health check.
>> >>
>> >> You can find out more details about our ZK design in the design page on
>> >> the website or in the paper at:
>> >>
>> >> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>> >>
>> >> Jun
>> >>
>> >> On Fri, Nov 4, 2011 at 6:52 PM, Mark <static.void....@gmail.com> wrote:
>> >>
>> >>> I just noticed that there is an option to not use Zookeeper and
>> >>> instead one can use a static list of brokers (#9 on
>> >>> http://incubator.apache.org/kafka/quickstart.html).
>> >>> Do I put this list in server.properties?
>> >>>
>> >>> It doesn't seem like you save much either way as you have to either
>> >>>  a) list out all the nodes in the zookeeper quorum in
>> >>>     zookeeper.properties
>> >>>  b) list out static brokers in server.properties.
>> >>>
>> >>> What are the benefits of using ZooKeeper over a static list? Can
>> >>> someone also explain how Kafka uses ZooKeeper?
>> >>>
>> >>> Thanks
>> >>>
>> >>>
>> >
>>
>
>
