Hi, Tim,

Thanks for sharing this. As part of the replication work (KAFKA-50), partitions will become logical and their physical locations will be registered in ZK. This will make it difficult to use Kafka without ZK. Overall, I think that simplifies the client. However, if you have any concerns, please comment on the mailing list or in the JIRA.

Jun

On Mon, Nov 7, 2011 at 12:27 AM, Tim Lossen <t...@lossen.de> wrote:

> sure, we are not in production yet, so things might still
> change, but our current setup is as follows:
>
> - no zookeeper
> - single kafka broker
> - second kafka broker as standby
> - logs are rsynced to standby every 5 minutes
> - topics not (yet) partitioned
> - multithreaded jruby consumer
> - each thread with separate kafka client instance
>
> cheers
> tim
>
>
> On 2011-11-06, at 18:05 , Mark wrote:
>
> > Tim,
> >
> > Would you mind explaining how you use Kafka? Basically the general
> > overview of the messages/events you are capturing and how you go about
> > processing them. We will also be using kafka-rb so I'm particularly
> > interested in how others are using it.
> >
> > - M
> >
> > On 11/5/11 11:49 PM, Tim Lossen wrote:
> >> we are using kafka entirely without zookeeper, and it is working
> >> fine so far: single kafka broker, ruby consumers without coordination.
> >>
> >> tim
> >>
> >>
> >> On 2011-11-05, at 22:03 , Mark wrote:
> >>
> >>> Ok, so no matter what ZooKeeper is still required when using Kafka.
> >>> One just has the option to either loadbalance producer => broker
> >>> connections via ZooKeeper or a Loadbalancer.
> >>>
> >>> Is that correct? If so, I think I finally got it :)
> >>>
> >>> On 11/5/11 1:29 PM, Jay Kreps wrote:
> >>>> It is also worth mentioning that this is just for producers, consumers
> >>>> always use zookeeper for load balancing and co-ordination. Logically this
> >>>> makes sense--partitioning production is trivial if you don't care about
> >>>> semantics of key=>partition assignment, but partitioning consumption is
> >>>> more complex because you need to divide up the partitions amongst the set
> >>>> of all consumers exactly.
> >>>>
> >>>> -jay
> >>>>
> >>>> On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps<jay.kr...@gmail.com> wrote:
> >>>>
> >>>>> The motivation here is that literally every production process at
> >>>>> LinkedIn sends messages to Kafka as part of either user tracking or
> >>>>> operational monitoring or both. We are wary of adding that many zk
> >>>>> connections and watches, so we run this first tier through a simple L2 load
> >>>>> balancer that just randomly balances connections over brokers. The good
> >>>>> part about this is that we can do zookeeper upgrades without redeploying
> >>>>> all the production apps to upgrade their zk jar.
> >>>>>
> >>>>> As Neha says, the zk producer is used for key-based partitioning by the
> >>>>> smaller number of producers who need that.
> >>>>>
> >>>>> -Jay
> >>>>>
> >>>>>
> >>>>> On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede<neha.narkh...@gmail.com> wrote:
> >>>>>
> >>>>>> Mark,
> >>>>>>
> >>>>>> Most publishers at LinkedIn use a hardware load balancer approach.
> >>>>>> These are configured to do a TCP healthcheck that monitors if the
> >>>>>> kafka port on a broker is working. If it is, then requests are
> >>>>>> forwarded to the broker. Some publishers though are using the software
> >>>>>> load balancer based on zookeeper. Those applications want to do some
> >>>>>> key based partitioning of data.
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Neha
> >>>>>>
> >>>>>> On Sat, Nov 5, 2011 at 11:49 AM, Mark<static.void....@gmail.com> wrote:
> >>>>>>> Sorry but I'm a bit confused now. So at LinkedIn you use a loadbalancer
> >>>>>>> instead of ZooKeeper or do you use it in conjunction with ZooKeeper?
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> On 11/4/11 7:09 PM, Jun Rao wrote:
> >>>>>>>> broker.list is used in the producer property file. One caveat is that the
> >>>>>>>> broker.list approach doesn't do healthcheck.
> >>>>>>>> Which means that if a broker
> >>>>>>>> goes down, the client could still try to send messages to it. At LinkedIn,
> >>>>>>>> we rely on a load balancer to do healthcheck for us. The zk-based producer,
> >>>>>>>> on the other hand, does health check.
> >>>>>>>>
> >>>>>>>> You can find out more details about our ZK design in our design page in the
> >>>>>>>> website or the paper in
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
> >>>>>>>>
> >>>>>>>> Jun
> >>>>>>>>
> >>>>>>>> On Fri, Nov 4, 2011 at 6:52 PM, Mark<static.void....@gmail.com> wrote:
> >>>>>>>>> I just noticed that there is an option to not use Zookeeper and instead
> >>>>>>>>> one can use a static list of brokers (#9 on
> >>>>>>>>> http://incubator.apache.org/kafka/quickstart.html).
> >>>>>>>>> Do I put this list in server.properties?
> >>>>>>>>>
> >>>>>>>>> It doesn't seem like you save much either way as you have to either
> >>>>>>>>> a) list out all the nodes in the zookeeper quorum in zookeeper.properties
> >>>>>>>>> b) list out static brokers in server.properties.
> >>>>>>>>>
> >>>>>>>>> What are the benefits of using ZooKeeper over a static list? Can someone
> >>>>>>>>> also explain how Kafka uses ZooKeeper?
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>>
> >> --
> >> http://tim.lossen.de
> >>
> > --
> http://tim.lossen.de
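
Jay's point in the thread, that spreading production over brokers is easy while dividing partitions among consumers needs exact coordination, can be illustrated with a rough Python sketch. This is not Kafka's implementation or the kafka-rb API. The function names (pick_random, pick_by_key, assign_partitions), the CRC32 hash, and the round-robin assignment are all assumptions chosen for illustration only.

```python
import random
import zlib


def pick_random(brokers):
    """Producer without key semantics: any broker in the list will do.

    Roughly what a load balancer or a static broker list gives you.
    """
    return random.choice(brokers)


def pick_by_key(key, brokers):
    """Producer with key semantics: the same key maps to the same broker
    for as long as the broker list itself does not change.

    CRC32 is an arbitrary stable hash picked for this sketch.
    """
    return brokers[zlib.crc32(key.encode("utf-8")) % len(brokers)]


def assign_partitions(partitions, consumers):
    """Consumer side: every partition must be owned by exactly one consumer.

    All consumers need the same view of both lists to compute the same
    result, which is the coordination problem the thread attributes to
    ZooKeeper. Round-robin over sorted inputs is a hypothetical scheme.
    """
    ordered = sorted(consumers)
    owners = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        owners[ordered[index % len(ordered)]].append(partition)
    return owners
```

With five partitions and two consumers, assign_partitions(range(5), ["c2", "c1"]) gives c1 the partitions 0, 2, 4 and c2 the partitions 1, 3. If one consumer worked from a stale consumer list, two consumers could claim the same partition or a partition could go unread, which is why the consumer side is harder than the producer side.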