sure, we are not in production yet, so things might still
change, but our current setup is as follows:

- no zookeeper
- single kafka broker
- second kafka broker as standby
- logs are rsynced to standby every 5 minutes
- topics not (yet) partitioned
- multithreaded jruby consumer
- each thread with separate kafka client instance
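
A minimal sketch of the last two points, in Python rather than JRuby, and not our actual consumer code. `StubClient` is a hypothetical stand-in for a Kafka client (it is not the kafka-rb API), and "one thread per topic" is an assumption made only to keep the example small. The point is that every thread constructs its own client and never shares it:

```python
import queue
import threading


class StubClient:
    """Hypothetical stand-in for a Kafka client; not a real library API."""

    def __init__(self, messages):
        self._messages = list(messages)

    def fetch(self):
        # Return the next batch of up to two messages, or [] when drained.
        batch, self._messages = self._messages[:2], self._messages[2:]
        return batch


def run_consumers(topics):
    """Start one thread per topic; each thread owns a private client."""
    results = queue.Queue()

    def worker(name, messages):
        client = StubClient(messages)  # created inside the thread, never shared
        while True:
            batch = client.fetch()
            if not batch:
                break
            for msg in batch:
                results.put((name, msg))

    threads = [
        threading.Thread(target=worker, args=(name, msgs))
        for name, msgs in topics.items()
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # All workers have finished, so draining the queue here is safe.
    collected = []
    while not results.empty():
        collected.append(results.get())
    return sorted(collected)
```

Keeping the client private to its thread avoids any locking around the connection, at the cost of one connection per thread.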

cheers
tim


On 2011-11-06, at 18:05 , Mark wrote:

> Tim,
> 
> Would you mind explaining how you use Kafka? Basically the general overview 
> of the messages/events you are capturing and how you go about processing 
> them. We will also be using kafka-rb so I'm particularly interested in how 
> others are using it.
> 
> - M
> 
> On 11/5/11 11:49 PM, Tim Lossen wrote:
>> we are using kafka entirely without zookeeper, and it is working
>> fine so far: single kafka broker, ruby consumers without coordination.
>> 
>> tim
>> 
>> 
>> On 2011-11-05, at 22:03 , Mark wrote:
>> 
>>> Ok, so no matter what ZooKeeper is still required when using Kafka. One 
>>> just has the option to either loadbalance producer =>  broker connections 
>>> via ZooKeeper or a Loadbalancer.
>>> 
>>> Is that correct? If so, I think I finally got it :)
>>> 
>>> On 11/5/11 1:29 PM, Jay Kreps wrote:
>>>> It is also worth mentioning that this is just for producers, consumers
>>>> always use zookeeper for load balancing and co-ordination. Logically this
>>>> makes sense--partitioning production is trivial if you don't care about
>>>> semantics of key=>partition assignment, but partitioning consumption is
>>>> more complex because you need to divide up the partitions amongst the set
>>>> of all consumers exactly.
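
To illustrate why consumption needs coordination, here is a minimal sketch of dividing partitions so that each one is owned by exactly one consumer. The function name and the range-style split are assumptions for illustration and are not claimed to be Kafka's actual assignment algorithm. The hard part in practice is that all consumers must agree on the same membership list, which is the role ZooKeeper plays in the description above:

```python
def assign_partitions(partitions, consumers):
    """Give every partition to exactly one consumer, as evenly as possible."""
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    if not consumers:
        raise ValueError("need at least one consumer")

    # Each consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for index, consumer in enumerate(consumers):
        count = base + (1 if index < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment
```

Because the inputs are sorted first, any two consumers that see the same partition and consumer lists compute the same result independently.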
>>>> 
>>>> -jay
>>>> 
>>>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps<jay.kr...@gmail.com>   wrote:
>>>> 
>>>>> The motivation here is that literally every production process at
>>>>> LinkedIn sends messages to Kafka as part of either user tracking or
>>>>> operational monitoring or both. We are wary of adding that many zk
>>>>> connections and watches, so we run this first tier through a simple L2 load
>>>>> balancer that just randomly balances connections over brokers. The good
>>>>> part about this is that we can do zookeeper upgrades without redeploying
>>>>> all the production apps to upgrade their zk jar.
>>>>> 
>>>>> As Neha says, the zk producer is used for key-based partitioning by the
>>>>> smaller number of producers who need that.
>>>>> 
>>>>> -Jay
>>>>> 
>>>>> 
>>>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede<neha.narkh...@gmail.com> wrote:
>>>>> 
>>>>>> Mark,
>>>>>> 
>>>>>> Most publishers at LinkedIn use a hardware load balancer approach.
>>>>>> These are configured to do a TCP healthcheck that monitors if the
>>>>>> kafka port on a broker is working. If it is, then requests are
>>>>>> forwarded to the broker. Some publishers though are using the software
>>>>>> load balancer based on zookeeper. Those applications want to do some
>>>>>> key based partitioning of data.
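
A minimal sketch of the kind of TCP healthcheck described above: a broker counts as healthy if a TCP connection to its port can be opened. The function names and the one-second timeout are assumptions for illustration, not the configuration of any particular load balancer:

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def healthy_brokers(brokers, timeout=1.0):
    """Filter a list of (host, port) pairs down to those accepting connections."""
    return [(host, port) for host, port in brokers if port_is_open(host, port, timeout)]
```

A check like this only shows that the port accepts connections; it does not prove the broker can actually serve requests.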
>>>>>> 
>>>>>> Thanks,
>>>>>> Neha
>>>>>> 
>>>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark<static.void....@gmail.com>   wrote:
>>>>>>> Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
>>>>>>> instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
>>>>>>>> broker.list is used in the producer property file. One caveat is that the
>>>>>>>> broker.list approach doesn't do healthcheck. Which means that if a broker
>>>>>>>> goes down, the client could still try to send messages to it. At LinkedIn,
>>>>>>>> we rely on a load balancer to do healthcheck for us. The zk-based producer,
>>>>>>>> on the other hand, does health check.
>>>>>>>> 
>>>>>>>> You can find out more details about our ZK design in our design page in the
>>>>>>>> website or the paper in
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>>>>>>>> 
>>>>>>>> Jun
>>>>>>>> 
>>>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark<static.void....@gmail.com> wrote:
>>>>>>>>> I just noticed that there is an option to not use Zookeeper and instead
>>>>>>>>> one can use a static list of brokers (#9 on
>>>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
>>>>>>>>> Do I put this list in server.properties?
>>>>>>>>> 
>>>>>>>>> It doesn't seem like you save much either way as you have to either
>>>>>>>>>  a) list out all the nodes in the zookeeper quorum in zookeeper.properties
>>>>>>>>>  b) list out static brokers in server.properties.
>>>>>>>>> 
>>>>>>>>> What are the benefits of using ZooKeeper over a static list? Can someone
>>>>>>>>> also explain how Kafka uses ZooKeeper?
>>>>>>>>> 
>>>>>>>>> Thanks
>>>>>>>>> 
>>>>>>>>> 
>> --
>> http://tim.lossen.de
>> 
>> 
>> 

--
http://tim.lossen.de


