Would you mind explaining how you go about:

(1) partitioning and load balancing data across a cluster of machines


On 12/2/11 6:42 AM, Jay Kreps wrote:
I think there are two things here: (1) partitioning and load balancing data
across a cluster of machines, and (2) replicating each message on N
machines. We do (1) but not (2). We are working on (2), as Jun says.

-Jay
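
To illustrate point (1) above, here is a minimal sketch of key-based partitioning across several servers. It is not Kafka's actual partitioner: the function names, the broker names, and the use of CRC32 are assumptions made for illustration. Because there is no replication (point 2), each message in this sketch lives on exactly one broker.

```python
import zlib

def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index using a stable hash."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    return zlib.crc32(key) % num_partitions

# Hypothetical layout: two brokers, each hosting two partitions of one topic.
partitions = [
    ("broker-1", 0),
    ("broker-1", 1),
    ("broker-2", 0),
    ("broker-2", 1),
]

def route(key: bytes):
    """Return the (broker, partition) pair that should receive this key."""
    return partitions[choose_partition(key, len(partitions))]
```

The same key always maps to the same (broker, partition) pair, so load spreads across servers while per-key ordering is preserved.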

On Thu, Dec 1, 2011 at 5:29 PM, Jun Rao<jun...@gmail.com>  wrote:

No, multiple servers in each cluster.

Jun

On Thu, Dec 1, 2011 at 4:48 PM, Mark<static.void....@gmail.com>  wrote:

So at LinkedIn you only use 1 Kafka server?


On 12/1/11 9:12 AM, Jun Rao wrote:

Mark,

See my inlined answers below.

Thanks,

Jun

On Thu, Dec 1, 2011 at 8:28 AM, Mark<static.void....@gmail.com> wrote:
  - Does Kafka support pattern matching?
  There is no server-side filtering in Kafka right now.
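
  Since there is no server-side filtering, pattern matching would have to
  happen in the consumer after messages are fetched. A minimal sketch,
  assuming messages arrive as plain strings; `filter_messages` and the sample
  data are hypothetical and do not use any Kafka client API:

```python
import re
from typing import Iterable, Iterator

def filter_messages(messages: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield only the messages whose text matches the regular expression."""
    regex = re.compile(pattern)
    for message in messages:
        if regex.search(message):
            yield message

# Stand-in for messages a consumer has already fetched from a topic.
fetched = [
    "user=alice action=login",
    "user=bob action=logout",
    "user=carol action=login",
]

logins = list(filter_messages(fetched, r"action=login$"))
```

  Every message still crosses the network; the filter only reduces what the
  application processes.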

  - What are the limitations of one Kafka server in terms of number of
topics and number of consumers?

  There is no hard limit. However, at LinkedIn, we are dealing with hundreds
of topics and tens of consumers. A large number of topics/consumers could be
limited by ZK capacity and OS capacity (e.g., open file handles). Also, if
a consumer consumes a large number of topics, the time to balance load will
be longer.
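
  As a rough way to check the OS-side limit mentioned above, the sketch below
  reads the per-process open-file limit on a Unix system with Python's
  standard `resource` module. The `expected_open_files` estimate and the 80%
  headroom factor are assumptions you would supply yourself; nothing here is
  derived from Kafka.

```python
import resource

def open_file_limits():
    """Return the (soft, hard) limit on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)

def fits_within_limit(expected_open_files: int, headroom: float = 0.8) -> bool:
    """Check a hypothetical file-count estimate against the soft limit.

    headroom leaves a margin for sockets and other descriptors.
    """
    soft, _ = open_file_limits()
    if soft == resource.RLIM_INFINITY:
        return True
    return expected_open_files <= soft * headroom
```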


  - Can you load balance publishing/subscribing across multiple Kafka
servers to increase redundancy?


  It's possible, but it's not something that's built in now. We do plan to
support intra-cluster replication. See the design in
https://issues.apache.org/jira/browse/KAFKA-50

  - Other than lack of map/reduce support, how does Kafka differ from, say,
Redis Pub/Sub? (http://redis.io/topics/pubsub)


  Don't know about Redis Pub/Sub. However, Kafka differs from some other
pub/sub/messaging systems in that it focuses more on scalability,
efficiency, and throughput.


  - Would anyone mind sharing their Kafka setup in terms of both
functionality/usage and architecture... basically more in depth than the
usual "Kafka serves our real-time X"
(https://cwiki.apache.org/confluence/display/KAFKA/Powered+By).
Having concrete use cases on the wiki could help gain adoption, especially
among new users of the pub/sub paradigm, by showing what pub/sub
real-time messaging can accomplish.


  Yes, we will update the wiki later.

  - Any good papers on what problems pub/sub in general can solve?

  Some of the design and usage of Kafka can be found in this paper:
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf

Thanks



