Re: General questions on functionality and usage

Mark Fri, 02 Dec 2011 10:10:00 -0800

Would you mind explaining how you go about:

(1) partitioning and load balancing data across a cluster of machines



On 12/2/11 6:42 AM, Jay Kreps wrote:

I think there are two things here: (1) partitioning and load balancing data
across a cluster of machines, and (2) replicating each message on N
machines. We do (1) but not (2). We are working on (2), as Jun says.

-Jay
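As an illustration of point (1) above, here is a minimal sketch of key-based partitioning plus a round-robin spread of partitions over consumers. It is not taken from Kafka's code base, and it is not claimed to be the algorithm Kafka uses; the function names `pick_partition` and `assign_partitions`, the CRC32 hash, and the round-robin policy are all assumptions chosen for illustration.

```python
import zlib


def pick_partition(key: str, num_partitions: int) -> int:
    """Map a message key to a partition index (hypothetical helper)."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 gives the same value in every process, unlike Python's
    # built-in hash() for strings, which is randomized per process.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(partitions, consumers):
    """Spread partitions over consumers round-robin (assumed policy)."""
    ordered = sorted(consumers)
    if not ordered:
        raise ValueError("at least one consumer is required")
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


if __name__ == "__main__":
    print(pick_partition("user-42", 4))
    print(assign_partitions(range(6), ["c2", "c1"]))
```

Messages with the same key always land on the same partition, and each consumer ends up with a roughly equal share of partitions. Note that none of this provides replication, which is point (2).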

On Thu, Dec 1, 2011 at 5:29 PM, Jun Rao<jun...@gmail.com>  wrote:

No, multiple servers in each cluster.

Jun

On Thu, Dec 1, 2011 at 4:48 PM, Mark<static.void....@gmail.com>  wrote:

So at LinkedIn you use only one Kafka server?


On 12/1/11 9:12 AM, Jun Rao wrote:

Mark,

See my inlined answers below.

Thanks,

Jun

On Thu, Dec 1, 2011 at 8:28 AM, Mark<static.void....@gmail.com>  wrote:

  - Does Kafka support pattern matching?

  There is no server-side filtering in Kafka right now.
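Since filtering is not done on the server, a consumer would have to receive every message on a topic and apply the pattern itself. A minimal sketch of that idea, assuming messages have already been fetched and decoded to strings; `filter_messages` is a hypothetical helper, not a Kafka API:

```python
import re


def filter_messages(messages, pattern):
    """Keep only the messages whose text matches the regular expression.

    This runs on the consumer side, after the messages have been fetched.
    """
    compiled = re.compile(pattern)
    return [message for message in messages if compiled.search(message)]


if __name__ == "__main__":
    fetched = ["error: disk full", "info: ok", "error: network down"]
    print(filter_messages(fetched, r"^error"))
```

The cost of this approach is that unmatched messages still cross the network before they are discarded.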


  - What are the limitations of one Kafka server in terms of number of
topics and number of consumers?

  There is no hard limit. However, at LinkedIn, we are dealing with
hundreds of topics and tens of consumers. A large number of
topics/consumers could be limited by ZK capacity and OS capacity (e.g.,
open file handles). Also, if a consumer consumes a large number of
topics, the time to balance load will be longer.


  - Can you load balance publishing/subscribing across multiple Kafka
servers to increase redundancy?


  It's possible, but it's not something that's built in now. We do plan to
support intra-cluster replication. See the design in
https://issues.apache.org/jira/browse/KAFKA-50


  - Other than lack of map/reduce support, how does Kafka differ from, say,
Redis Pub/Sub? (http://redis.io/topics/pubsub)


  Don't know about Redis Pub/Sub. However, Kafka differs from some other
pub/sub/messaging systems in that it focuses more on scalability,
efficiency, and throughput.


  - Would anyone mind sharing their Kafka setup in terms of both
functionality/usage and architecture... basically more in depth than the
usual "Kafka serves our real-time X"
(https://cwiki.apache.org/confluence/display/KAFKA/Powered+By).
Having concrete use cases on the wiki could help gain adoption,
especially among new users of the pub/sub paradigm, by showing what
pub/sub real-time messaging can accomplish.


  Yes, we will update the wiki later.


  - Any good papers on what problems pub/sub in general can solve?


  Some of the design and usage of Kafka can be found in this paper:
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf


Thanks
