sure, we are not in production yet, so things might still change, but our current setup is as follows:
- no zookeeper
- single kafka broker
- second kafka broker as standby
- logs are rsynced to standby every 5 minutes
- topics not (yet) partitioned
- multithreaded jruby consumer
- each thread with separate kafka client instance

cheers
tim

On 2011-11-06, at 18:05 , Mark wrote:

> Tim,
>
> Would you mind explaining how you use Kafka? Basically the general overview
> of the messages/events you are capturing and how you go about processing
> them. We will also be using kafka-rb so I'm particularly interested in how
> others are using it.
>
> - M
>
> On 11/5/11 11:49 PM, Tim Lossen wrote:
>> we are using kafka entirely without zookeeper, and it is working
>> fine so far: single kafka broker, ruby consumers without coordination.
>>
>> tim
>>
>>
>> On 2011-11-05, at 22:03 , Mark wrote:
>>
>>> Ok, so no matter what ZooKeeper is still required when using Kafka. One
>>> just has the option to either loadbalance producer => broker connections
>>> via ZooKeeper or a Loadbalancer.
>>>
>>> Is that correct? If so, I think I finally got it :)
>>>
>>> On 11/5/11 1:29 PM, Jay Kreps wrote:
>>>> It is also worth mentioning that this is just for producers, consumers
>>>> always use zookeeper for load balancing and co-ordination. Logically this
>>>> makes sense--partitioning production is trivial if you don't care about
>>>> semantics of key=>partition assignment, but partitioning consumption is
>>>> more complex because you need to divide up the partitions amongst the set
>>>> of all consumers exactly.
>>>>
>>>> -jay
>>>>
>>>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps<jay.kr...@gmail.com> wrote:
>>>>
>>>>> The motivation here is that literally every production process at
>>>>> LinkedIn sends messages to Kafka as part of either user tracking or
>>>>> operational monitoring or both. We are wary of adding that many zk
>>>>> connections and watches, so we run this first tier through a simple L2
>>>>> load balancer that just randomly balances connections over brokers. The
>>>>> good part about this is that we can do zookeeper upgrades without
>>>>> redeploying all the production apps to upgrade their zk jar.
>>>>>
>>>>> As Neha says, the zk producer is used for key-based partitioning by the
>>>>> smaller number of producers who need that.
>>>>>
>>>>> -Jay
>>>>>
>>>>>
>>>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha
>>>>> Narkhede<neha.narkh...@gmail.com> wrote:
>>>>>
>>>>>> Mark,
>>>>>>
>>>>>> Most publishers at LinkedIn use a hardware load balancer approach.
>>>>>> These are configured to do a TCP healthcheck that monitors if the
>>>>>> kafka port on a broker is working. If it is, then requests are
>>>>>> forwarded to the broker. Some publishers though are using the software
>>>>>> load balancer based on zookeeper. Those applications want to do some
>>>>>> key based partitioning of data.
>>>>>>
>>>>>> Thanks,
>>>>>> Neha
>>>>>>
>>>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark<static.void....@gmail.com> wrote:
>>>>>>> Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
>>>>>>> instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
>>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
>>>>>>>> broker.list is used in the producer property file. One caveat is that
>>>>>>>> the broker.list approach doesn't do healthcheck. Which means that if a
>>>>>>>> broker goes down, the client could still try to send messages to it. At
>>>>>>>> LinkedIn, we rely on a load balancer to do healthcheck for us. The
>>>>>>>> zk-based producer, on the other hand, does health check.
>>>>>>>>
>>>>>>>> You can find out more details about our ZK design in our design page in
>>>>>>>> the website or the paper in
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
>>>>>>>> .
>>>>>>>>
>>>>>>>> Jun
>>>>>>>>
>>>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark<static.void....@gmail.com> wrote:
>>>>>>>>> I just noticed that there is an option to not use Zookeeper and instead
>>>>>>>>> one can use a static list of brokers (#9 on
>>>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
>>>>>>>>> Do i put this list in server.properties?
>>>>>>>>>
>>>>>>>>> It doesn't seem like you save much either way as you have to either
>>>>>>>>> a) list out all the nodes in the zookeeper quorum in
>>>>>>>>> zookeeper.properties
>>>>>>>>> b) list out static brokers in server.properties.
>>>>>>>>>
>>>>>>>>> What are the benefits of using ZooKeeper over a static list? Can someone
>>>>>>>>> also explain how Kafka uses ZooKeeper?
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>> --
>> http://tim.lossen.de
>>
>>
>>

--
http://tim.lossen.de
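
For readers following the thread: the thread says that a static broker.list does no health check, and that the load balancer at LinkedIn probes the kafka port over TCP and spreads connections randomly over live brokers. The rough Python sketch below mimics that behaviour on the client side. It is only an illustration, not LinkedIn's setup and not part of any Kafka client. The names `is_alive` and `pick_broker`, the hostnames, and the port are assumptions invented for this example.

```python
import random
import socket


def is_alive(host, port, timeout=1.0):
    """TCP health check: True if a connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def pick_broker(brokers, check=is_alive, rng=random):
    """Return a random (host, port) pair among brokers passing the check.

    `brokers` is a static list of (host, port) tuples, playing the role of
    a hand-maintained broker list. Raises RuntimeError if none respond.
    """
    healthy = [broker for broker in brokers if check(*broker)]
    if not healthy:
        raise RuntimeError("no healthy broker available")
    return rng.choice(healthy)


# Hypothetical static broker list; hostnames and port are placeholders.
BROKERS = [("kafka1.example.com", 9092), ("kafka2.example.com", 9092)]
```

A producer would call `pick_broker(BROKERS)` before opening its connection and again after a send failure. Note that this only covers broker selection; key-based partitioning, as mentioned in the thread, is what the zookeeper-based producer is used for.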