Tim,
Would you mind explaining how you use Kafka? Basically the general
overview of the messages/events you are capturing and how you go about
processing them. We will also be using kafka-rb so I'm particularly
interested in how others are using it.
- M
On 11/5/11 11:49 PM, Tim Lossen wrote:
we are using kafka entirely without zookeeper, and it is working
fine so far: single kafka broker, ruby consumers without coordination.
tim
On 2011-11-05, at 22:03 , Mark wrote:
OK, so no matter what, ZooKeeper is still required when using Kafka. One just
has the option to load-balance producer => broker connections either via
ZooKeeper or via a load balancer.
Is that correct? If so, I think I finally got it :)
On 11/5/11 1:29 PM, Jay Kreps wrote:
It is also worth mentioning that this is just for producers; consumers
always use zookeeper for load balancing and co-ordination. Logically this
makes sense--partitioning production is trivial if you don't care about
the semantics of key=>partition assignment, but partitioning consumption is
more complex because you need to divide up the partitions amongst the set
of all consumers exactly, with every partition owned by one consumer.
-jay
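A minimal sketch of the "divide up the partitions exactly" problem Jay
describes. This is not Kafka's rebalancing code; it only shows why every
consumer needs the same view of who else is consuming. The names
assign_partitions, "broker-0", "consumer-a" and "consumer-b" are made up for
illustration, and the consumer list is passed in by hand, whereas in a
coordinated setup it would come from a shared registry such as ZooKeeper.

```python
def assign_partitions(partitions, consumers):
    """Split partitions into contiguous ranges, one range per consumer.

    Every partition goes to exactly one consumer: none is shared and
    none is left out.
    """
    consumers = sorted(consumers)    # all consumers must agree on the order
    partitions = sorted(partitions)
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional partition.
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# Hypothetical layout: one broker with 5 partitions, two consumers.
partitions = [("broker-0", p) for p in range(5)]
assignment = assign_partitions(partitions, ["consumer-a", "consumer-b"])
```

With this input, consumer-a ends up with partitions 0 to 2 and consumer-b
with partitions 3 and 4. If the two consumers disagreed about the consumer
list, their ranges could overlap or leave gaps, which is the coordination
problem.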
On Sat, Nov 5, 2011 at 1:19 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
The motivation here is that literally every production process at
LinkedIn sends messages to Kafka as part of either user tracking or
operational monitoring or both. We are wary of adding that many zk
connections and watches, so we run this first tier through a simple L2 load
balancer that just randomly balances connections over brokers. The good
part about this is that we can do zookeeper upgrades without redeploying
all the production apps to upgrade their zk jar.
As Neha says, the zk producer is used for key-based partitioning by the
smaller number of producers who need that.
-Jay
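To illustrate the difference between the two producer styles mentioned
above, here is a rough sketch, not taken from the Kafka producer itself.
The function names, the key "member-42" and the partition count are all
assumptions made for the example.

```python
import random
import zlib


def random_partition(num_partitions):
    """Pick any partition; fine when key=>partition semantics do not matter."""
    return random.randrange(num_partitions)


def keyed_partition(key, num_partitions):
    """Map a key to a fixed partition so messages for that key stay together."""
    # crc32 is stable across processes; Python's built-in hash() is salted
    # per process for strings, so it would not give a repeatable mapping.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


num_partitions = 4  # hypothetical partition count
same_key = {keyed_partition("member-42", num_partitions) for _ in range(100)}
```

The keyed version only gives a stable mapping while num_partitions stays
the same, so a producer doing this needs an up-to-date view of the
partitions, whereas the random version works behind any load balancer.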
On Sat, Nov 5, 2011 at 11:56 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
Mark,
Most publishers at LinkedIn use a hardware load balancer approach.
The load balancers are configured to do a TCP health check that monitors
whether the kafka port on a broker is working. If it is, then requests are
forwarded to the broker. Some publishers, though, use the software
load balancer based on zookeeper. Those applications want to do some
key-based partitioning of data.
Thanks,
Neha
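A TCP health check of the kind described above can be approximated in a few
lines. This is only a sketch of the idea, not how any particular hardware
load balancer implements it; broker_port_is_up and healthy_brokers are
hypothetical helper names.

```python
import socket


def broker_port_is_up(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts and name resolution failures.
        return False


def healthy_brokers(brokers, timeout=1.0):
    """Keep only the (host, port) pairs that currently accept connections."""
    return [(h, p) for (h, p) in brokers if broker_port_is_up(h, p, timeout)]
```

Note that this only shows the port accepts connections. It says nothing
about whether the broker can actually serve requests.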
On Sat, Nov 5, 2011 at 11:49 AM, Mark <static.void....@gmail.com> wrote:
Sorry but I'm a bit confused now. So at LinkedIn, do you use a load balancer
instead of ZooKeeper, or do you use it in conjunction with ZooKeeper?
Thanks
On 11/4/11 7:09 PM, Jun Rao wrote:
broker.list is used in the producer property file. One caveat is that the
broker.list approach doesn't do a health check, which means that if a broker
goes down, the client could still try to send messages to it. At LinkedIn,
we rely on a load balancer to do the health check for us. The zk-based
producer, on the other hand, does do a health check.
You can find out more details about our ZK design in the design page on the
website or in the paper at
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+papers+and+presentations
Jun
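A small sketch of the caveat above: with a static list, the client picks
from whatever is configured, whether or not the broker is alive. The
id:host:port layout of the list is an assumption about the property format
(check the producer config docs for your version), and the host names and
helper names are made up.

```python
import random


def parse_broker_list(value):
    """Parse 'id:host:port,id:host:port' into (id, host, port) tuples."""
    brokers = []
    for entry in value.split(","):
        broker_id, host, port = entry.strip().split(":")
        brokers.append((int(broker_id), host, int(port)))
    return brokers


def pick_broker(brokers, rng=random):
    """Pick a broker at random. No health check: a dead broker can be chosen."""
    return rng.choice(brokers)


# Hypothetical hosts; the id:host:port layout is an assumed format.
brokers = parse_broker_list("0:kafka1.example.com:9092, 1:kafka2.example.com:9092")
target = pick_broker(brokers)
```

Nothing here removes a broker from the list when it goes down, which is the
gap that a load balancer or the zk-based producer fills.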
On Fri, Nov 4, 2011 at 6:52 PM, Mark <static.void....@gmail.com> wrote:
I just noticed that there is an option to not use ZooKeeper and instead
use a static list of brokers (#9 on
http://incubator.apache.org/kafka/quickstart.html).
Do I put this list in server.properties?
It doesn't seem like you save much either way, as you have to either
a) list out all the nodes in the zookeeper quorum in zookeeper.properties, or
b) list out static brokers in server.properties.
What are the benefits of using ZooKeeper over a static list? Can someone
also explain how Kafka uses ZooKeeper?
Thanks
--
http://tim.lossen.de