Mark,

A topic can have multiple partitions spread over multiple brokers. Those partitions are evenly assigned to consumers within a group for parallel consumption.
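
As a rough illustration of the even assignment described above, here is a minimal sketch in Python. It is not Kafka's actual rebalancing code: the function name `assign_partitions`, the `(broker, partition)` tuples, and the consumer names are all hypothetical, and the contiguous-range split is just one simple way to spread partitions evenly.

```python
def assign_partitions(partitions, consumers):
    """Split partitions into contiguous, near-equal ranges, one per consumer.

    Hypothetical helper for illustration only; not part of any Kafka API.
    """
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    if not consumers:
        return {}
    # Every consumer gets `base` partitions; the first `extra` get one more.
    base, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        count = base + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# Assumed layout: one topic with 3 partitions on each of 2 brokers.
partitions = [(broker, p) for broker in (0, 1) for p in range(3)]
group = ["consumer-a", "consumer-b", "consumer-c", "consumer-d"]
assignment = assign_partitions(partitions, group)

for consumer, owned in assignment.items():
    print(consumer, owned)
```

With 6 partitions and 4 consumers, two consumers own two partitions each and two own one each, so every partition is read by exactly one consumer in the group.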

Thanks,

Jun

On Fri, Dec 2, 2011 at 10:09 AM, Mark <static.void....@gmail.com> wrote:

> Would you mind explaining how you go about:
>
> (1) partitioning and load balancing data across a cluster of machines
>
> On 12/2/11 6:42 AM, Jay Kreps wrote:
>
>> I think there are two things here: (1) partitioning and load balancing
>> data across a cluster of machines, and (2) replicating each message on N
>> machines. We do (1) but not (2). We are working on (2), as Jun says.
>>
>> -Jay
>>
>> On Thu, Dec 1, 2011 at 5:29 PM, Jun Rao <jun...@gmail.com> wrote:
>>
>>> No, multiple servers in each cluster.
>>>
>>> Jun
>>>
>>> On Thu, Dec 1, 2011 at 4:48 PM, Mark <static.void....@gmail.com> wrote:
>>>
>>>> So at LinkedIn you only use 1 Kafka server?
>>>>
>>>> On 12/1/11 9:12 AM, Jun Rao wrote:
>>>>
>>>>> Mark,
>>>>>
>>>>> See my inlined answers below.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Jun
>>>>>
>>>>> On Thu, Dec 1, 2011 at 8:28 AM, Mark <static.void....@gmail.com> wrote:
>>>>>
>>>>>> - Does Kafka support pattern matching?
>>>>>
>>>>> There is no server-side filtering in Kafka right now.
>>>>>
>>>>>> - What are the limitations of one Kafka server in terms of number of
>>>>>> topics and number of consumers?
>>>>>
>>>>> There is no hard limit. However, at LinkedIn, we are dealing with
>>>>> hundreds of topics and tens of consumers. Large # of topics/consumers
>>>>> could be limited by ZK capacity and OS capacity (e.g., open file
>>>>> handles). Also, if a consumer consumes a large number of topics, time
>>>>> to balance load will be longer.
>>>>>
>>>>>> - Can you load balance publishing/subscribing across multiple Kafka
>>>>>> servers to increase redundancy?
>>>>>
>>>>> It's possible, but it's not something that's built-in now. We do plan
>>>>> to support intra-cluster replication. See the design in
>>>>> https://issues.apache.org/jira/browse/KAFKA-50
>>>>>
>>>>>> - Other than lack of map/reduce support, how does Kafka differ from,
>>>>>> say, Redis Pub/Sub? (http://redis.io/topics/pubsub)
>>>>>
>>>>> Don't know about Redis Pub/Sub. However, Kafka differs from some other
>>>>> pub/sub/messaging systems in that it focuses more on scalability,
>>>>> efficiency, and throughput.
>>>>>
>>>>>> - Would anyone mind sharing their Kafka setup in terms of both
>>>>>> functionality/usage and architecture... basically more in depth than
>>>>>> the usual "Kafka serves our real-time X"
>>>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Powered+By).
>>>>>> Having concrete use cases on the wiki could help gain adoption,
>>>>>> especially among new users of the pub/sub paradigm, by showing what
>>>>>> pub/sub real-time messaging can accomplish.
>>>>>
>>>>> Yes, we will update the wiki later.
>>>>>
>>>>>> - Any good papers on what problems pub/sub in general can solve?
>>>>>
>>>>> Some of the design and usage of Kafka can be found in this paper:
>>>>> http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
>>>>>
>>>>>> Thanks