I think there are two things here: (1) partitioning and load balancing data across a cluster of machines, and (2) replicating each message on N machines. We do (1) but not (2). We are working on (2), as Jun says.
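To make the distinction concrete, here is a minimal Python sketch. It is not Kafka's code: the function names, the broker names, and the round-robin replica placement are all assumptions chosen only for illustration.

```python
import zlib
from typing import List


def partition_for(key: str, num_partitions: int) -> int:
    """(1) Partitioning: deterministically map a message key to one partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def replicas_for(partition: int, brokers: List[str], n: int) -> List[str]:
    """(2) Replication: pick n distinct brokers to hold copies of a partition.

    Hypothetical placement: walk the broker list starting at the partition index.
    """
    if n > len(brokers):
        raise ValueError("replication factor cannot exceed number of brokers")
    return [brokers[(partition + i) % len(brokers)] for i in range(n)]


# Hypothetical three-machine cluster, one partition per broker.
brokers = ["broker-0", "broker-1", "broker-2"]
p = partition_for("user-42", num_partitions=len(brokers))

owner = brokers[p]                       # (1) only: a single copy on one machine
copies = replicas_for(p, brokers, n=2)   # (1) + (2): the same message on N machines
```

With (1) alone, losing `owner` makes that partition's messages unavailable; with (2), the second entry in `copies` still holds them.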
-Jay

On Thu, Dec 1, 2011 at 5:29 PM, Jun Rao <jun...@gmail.com> wrote:

> No, multiple servers in each cluster.
>
> Jun
>
> On Thu, Dec 1, 2011 at 4:48 PM, Mark <static.void....@gmail.com> wrote:
>
>> So at LinkedIn you only use 1 Kafka server?
>>
>> On 12/1/11 9:12 AM, Jun Rao wrote:
>>
>>> Mark,
>>>
>>> See my inlined answers below.
>>>
>>> Thanks,
>>>
>>> Jun
>>>
>>> On Thu, Dec 1, 2011 at 8:28 AM, Mark <static.void....@gmail.com> wrote:
>>>
>>>> - Does Kafka support pattern matching?
>>>
>>> There is no server-side filtering in Kafka right now.
>>>
>>>> - What are the limitations of one Kafka server in terms of number
>>>> of topics and number of consumers?
>>>
>>> There is no hard limit. However, at LinkedIn, we are dealing with
>>> hundreds of topics and tens of consumers. Large # of
>>> topics/consumers could be limited by ZK capacity and OS capacity
>>> (e.g., open file handles). Also, if a consumer consumes a large
>>> number of topics, time to balance load will be longer.
>>>
>>>> - Can you load balance publishing/subscribing across multiple
>>>> Kafka servers to increase redundancy?
>>>
>>> It's possible, but it's not something that's built-in now. We do
>>> plan to support intra-cluster replication. See the design in
>>> https://issues.apache.org/jira/browse/KAFKA-50
>>>
>>>> - Other than lack of map/reduce support, how does Kafka differ
>>>> from, say, Redis Pub/Sub? (http://redis.io/topics/pubsub)
>>>
>>> Don't know about Redis Pub/Sub. However, Kafka differs from some
>>> other pub/sub/messaging systems in that it focuses more on
>>> scalability, efficiency, and throughput.
>>>
>>>> - Would anyone mind sharing their Kafka setup in terms of both
>>>> functionality/usage and architecture... basically more in depth
>>>> than the usual "Kafka serves our real-time X"
>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Powered+By).
>>>>
>>>> Having concrete use cases on the wiki could help gain adoption,
>>>> especially among new users of the pub/sub paradigm, by showing
>>>> what pub/sub real-time messaging can accomplish.
>>>
>>> Yes, we will update the wiki later.
>>>
>>>> - Any good papers on what problems pub/sub in general can solve?
>>>
>>> Some of the design and usage of Kafka can be found in this paper:
>>> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
>>>
>>>> Thanks