Re: Kafka/ZK Cluster Example

Jun Rao Thu, 12 Jan 2012 09:10:53 -0800

For 1), roughly speaking, hosts in a ZK cluster replicate among themselves
synchronously. So having multiple ZK hosts improves reliability. ZK
tolerates k failures with 2k+1 hosts.
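
To make the 2k+1 arithmetic concrete, here is a minimal sketch. The function names are made up for illustration and are not part of ZK or Kafka; the only assumption is that the ensemble needs a strict majority of its hosts to keep serving.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with this many hosts."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """Hosts that can fail while a strict majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for hosts in (1, 2, 3, 4, 5, 7):
        print(f"{hosts} host(s): quorum={quorum_size(hosts)}, "
              f"tolerates {tolerated_failures(hosts)} failure(s)")
```

Under that assumption, a single host tolerates no failures, and an even-sized ensemble tolerates no more failures than the odd size just below it, which is why ensembles of 3 or 5 hosts on separate machines are the usual choice.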


For 2), that's exactly how our ZK-based producer works.
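
As a rough illustration of the idea (not the actual producer code), the sketch below uses a plain dict as a stand-in for ZK. The "/brokers/ids/<id>" paths with "host:port" values and both helper names are assumptions made for this example. A real producer would read the registry through a ZK client and watch it for changes.

```python
import random

# Hypothetical snapshot of the broker registry. A dict stands in for ZK here;
# the path layout and "host:port" value format are assumptions.
ZK_SNAPSHOT = {
    "/brokers/ids/0": "10.0.0.1:9092",
    "/brokers/ids/1": "10.0.0.2:9092",
}

BROKER_PREFIX = "/brokers/ids/"


def live_brokers(snapshot):
    """Map broker id -> (host, port) for every broker registered in the snapshot."""
    brokers = {}
    for path, data in snapshot.items():
        if path.startswith(BROKER_PREFIX):
            host, port = data.rsplit(":", 1)
            brokers[int(path[len(BROKER_PREFIX):])] = (host, int(port))
    return brokers


def pick_broker(brokers, rng=random):
    """Pick one registered broker at random; fail loudly if none are registered."""
    if not brokers:
        raise RuntimeError("no brokers are currently registered")
    broker_id = rng.choice(sorted(brokers))
    return broker_id, brokers[broker_id]


if __name__ == "__main__":
    print(pick_broker(live_brokers(ZK_SNAPSHOT)))
    # Simulate broker 0 failing: its registration disappears from the registry,
    # so the producer only sees broker 1 from then on.
    remaining = {p: d for p, d in ZK_SNAPSHOT.items() if p != "/brokers/ids/0"}
    print(pick_broker(live_brokers(remaining)))
```

The point of the design is that the producer never needs a fixed broker list: it sends only to brokers that are currently registered, so a failed broker drops out of the candidates on its own.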

Thanks,

Jun

On Wed, Jan 11, 2012 at 3:35 PM, Christian Carollo <ccaro...@gmail.com> wrote:

> This leads to two more questions…
>
> 1) Maybe I am not understanding what a ZK cluster typically looks like or
> is made up of.  If I have more than one ZK service/instance running on a
> single node that doesn't sound like it is more reliable when there is a
> server failure.
>
> On the other hand, if I have one ZK on one node and another on another
> node, even as a hot standby via mirroring, that seems like a more reliable
> solution.  I think I must be missing something, am I?
>
> 2) Can the client producer interrogate the ZK service and determine if it
> is available and/or if one or more brokers are available?  And if so, get
> their connection information from ZK so that the producer can intelligently
> send messages to the right brokers?  If this is possible, the client
> producer could handle failure cases and contact a different (hot-standby)
> ZK or broker?
>
> Thanks
> Christian
>
> On Jan 11, 2012, at 3:16 PM, Felix GV wrote:
>
> > As I understand it, you cannot use a mirrored Kafka cluster as a hot
> > fail-over.
> >
> > You could probably use it as a manual fail-over, but I don't know the
> > complexity involved in doing that.
> >
> > Also, if your source cluster fails while producers are putting data into
> > it, there will be an "unconsumed window" of data that is lost. This
> > corresponds to the data that the embedded consumer in the mirrored
> > cluster did not have time to consume from the source cluster.
> >
> > All in all, the mirrored cluster is akin to asynchronous replication,
> > without any hot fail-over capability. Thus, it provides data redundancy
> > (outside of the unconsumed window described above) but no extra
> > availability (unless you count manual interventions).
> >
> > KAFKA-50 <https://issues.apache.org/jira/browse/KAFKA-50>, on the other
> > hand, will provide both asynchronous AND synchronous replication
> > (although the latter will incur a latency penalty) and will be able to
> > use the replicas (data redundancy) as hot fail-overs.
> >
> > Depending on your personal definition of "highly reliable" (whether it
> > includes data redundancy and/or availability), I think that should
> > probably answer your question...?
> >
> > To all the Kafka experts: please correct me if the above explanations are
> > incorrect :) !
> >
> > --
> > Felix
> >
> >
> >
> > On Wed, Jan 11, 2012 at 5:53 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> It's just that the mirroring logic depends on ZK being available most of
> >> the time.
> >>
> >> Jun
> >>
> >> On Wed, Jan 11, 2012 at 2:35 PM, Christian Carollo <ccaro...@gmail.com>
> >> wrote:
> >>
> >>> I see.  But if I used that configuration and then did the mirroring you
> >>> suggested, would that be enough, in your opinion, to be considered
> >>> highly reliable?
> >>>
> >>> Christian
> >>>
> >>>
> >>> On Jan 11, 2012, at 2:32 PM, Jun Rao wrote:
> >>>
> >>>>> For example, can I have one ZK instance and one broker on one machine
> >>>>> and that is enough to define a ZK cluster and a Kafka Cluster?
> >>>>
> >>>> Yes, although with a single ZK instance you don't get ZK's reliability.
> >>>>
> >>>> Jun
> >>>>
> >>>>
> >>>> On Wed, Jan 11, 2012 at 2:06 PM, Christian Carollo <ccaro...@gmail.com>
> >>>> wrote:
> >>>>
> >>>>> Jun,
> >>>>>
> >>>>> I don't think I asked my question the right way.
> >>>>>
> >>>>> What I am trying to understand is what are the minimum constituent
> >>>>> parts of a Kafka cluster?
> >>>>>
> >>>>> Based on your last email, I am now wondering what are the minimum
> >>>>> constituent parts of a ZK cluster as well as a Kafka cluster?
> >>>>>
> >>>>> For example, can I have one ZK instance and one broker on one machine
> >>>>> and that is enough to define a ZK cluster and a Kafka Cluster?
> >>>>>
> >>>>> Thanks,
> >>>>> Christian
> >>>>>
> >>>>>
> >>>>> On Jan 11, 2012, at 1:50 PM, Jun Rao <jun...@gmail.com> wrote:
> >>>>>
> >>>>>> Christian,
> >>>>>>
> >>>>>> A Kafka cluster contains a ZK cluster and a list of brokers. When a
> >>>>>> consumer subscribes to a topic in a Kafka cluster, it consumes data
> >>>>>> stored in all brokers in that cluster.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Jun
> >>>>>>
> >>>>>> On Tue, Jan 10, 2012 at 11:28 PM, Christian Carollo
> >>>>>> <ccaro...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Thank you Jun, that is quite helpful.  I have a question about Kafka
> >>>>>>> Clusters.  What are the minimum number and types of services that
> >>>>>>> must be running to make up a Kafka Cluster?
> >>>>>>>
> >>>>>>> I ask this because the diagrams (in the Kafka Mirroring document)
> >>>>>>> allude to a multiple broker environment. However, since each broker
> >>>>>>> does not appear to provide redundancy (as of today) to any of the
> >>>>>>> other brokers in a given zookeeper service, it seems like a Kafka
> >>>>>>> Cluster is nothing more than a grouping of a single zookeeper
> >>>>>>> instance with a single Kafka broker. Is this the correct
> >>>>>>> understanding?
> >>>>>>>
> >>>>>>> Thanks,
> >>>>>>> Christian
> >>>>>>>
> >>>>>>> On Jan 10, 2012, at 8:47 AM, Jun Rao wrote:
> >>>>>>>
> >>>>>>>> With 0.7, you can set up inter-cluster replication
> >>>>>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring).
> >>>>>>>>
> >>>>>>>> For the future 0.8 release, we are working on intra-cluster
> >>>>>>>> replication support and details can be found at
> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-50
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Jun
> >>>>>>>>
> >>>>>>>> On Mon, Jan 9, 2012 at 9:52 PM, Christian Carollo
> >>>>>>>> <ccaro...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> I am looking to implement Kafka in a production environment. However,
> >>>>>>>>> I haven't found any documentation or examples that discuss how to
> >>>>>>>>> build a redundant implementation.  Is there any documentation out
> >>>>>>>>> there (blogs, articles, etc.) that describes how we can implement
> >>>>>>>>> such a system with Kafka 0.6 or 0.7?
> >>>>>>>>>
> >>>>>>>>> Also, is there a timeframe the community is shooting for to release
> >>>>>>>>> 0.8 w/ replication?
> >>>>>>>>>
> >>>>>>>>> Thanks
> >>>>>>>>> Christian
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>
> >>>
> >>>
> >>
>
>
