Re: Kafka/ZK Cluster Example

Jun Rao Thu, 12 Jan 2012 09:16:05 -0800

Felix,

We use RAID too. One potential problem with RAID is that if you replace a
broken disk, RAID goes into rebuild mode. This could significantly slow
down I/O and make a broker not fully functional for new requests. Adding
more mirrors doesn't alleviate this problem.
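
As a rough back-of-the-envelope sketch (the disk size and rebuild rate below are made-up numbers, not measurements from any real array), the rebuild window in a simple model depends on the size of the replaced disk and the rate at which the array can resync it. The number of mirrors does not appear in it:

```python
def rebuild_hours(disk_bytes: float, rebuild_bytes_per_sec: float) -> float:
    """Rough estimate of the time to resync one replaced mirror disk.

    Assumes the whole disk is copied at a constant rate; real arrays
    throttle the rebuild when they are also serving live I/O.
    """
    if rebuild_bytes_per_sec <= 0:
        raise ValueError("rebuild rate must be positive")
    return disk_bytes / rebuild_bytes_per_sec / 3600.0


# Hypothetical figures: a 2 TB disk resyncing at 50 MB/s.
hours = rebuild_hours(2e12, 50e6)
print(f"estimated rebuild time: {hours:.1f} hours")
```

During that window the broker is sharing its disks with the rebuild, so request latency can suffer for hours even though no data has been lost.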


Jun

On Wed, Jan 11, 2012 at 3:50 PM, Felix GV <fe...@mate1inc.com> wrote:

> We've been thinking about this stuff a lot recently, at work.
>
> We've had some HD failures in our Kafka cluster. I don't know all the
> details, but from what I heard, the HDs were mirrored in RAID but several
> of them failed within a short interval and the array did not have time to
> fully rebuild itself, so we lost all of that data from the Kafka cluster.
> Thankfully, the data was being consumed in near real time, so we only
> really lost a small unconsumed window of data.
>
> Now, we're wondering what we could improve to prevent this scenario in the
> future. I investigated Kafka mirroring, but since it relies on consuming
> data, the risk of losing the unconsumed window is still there. If we
> had consumers that were more batch-oriented (like Hadoop) rather than
> real-time, the benefits of a mirrored Kafka cluster would be greater, but
> for our use cases, where data is consumed near real-time, we would still
> lose as much data as before. Am I right?
>
> KAFKA-50, with sync replication, would have solved our problem, but until
> that's done, what are our options?
>
> I came to the conclusion that simply adding more mirrored copies in our
> RAID arrays would be the most cost-effective way to give us both more
> availability and more redundancy. This doesn't deal with the scenario where
> a machine fails and becomes unavailable, in which case the data on it would
> be temporarily unavailable but not lost (although, again, there could be a
> small window of uncommitted data). However, in terms of protection against
> data loss from HD failures, it seems like the best option for now, no?
>
> It doesn't feel right to just throw more hardware at problems hehe... but I
> guess sometimes it's the only choice :) ...
>
> Please tell me if that makes sense!
>
> --
> Felix
>
>
>
> On Wed, Jan 11, 2012 at 6:16 PM, Felix GV <fe...@mate1inc.com> wrote:
>
> > As I understand it, you cannot use a mirrored Kafka cluster as a hot
> > fail-over.
> >
> > You could probably use it as a manual fail-over, but I don't know the
> > complexity involved in doing that.
> >
> > Also, if your source cluster fails while producers were putting data into
> > it, there will be an "unconsumed window" of data that is lost. This
> > corresponds to the data that the embedded consumer in the mirrored
> > cluster did not have time to consume from the source cluster.
> >
> > All in all, the mirrored cluster is akin to asynchronous replication,
> > without any hot fail-over capability. Thus, it provides data redundancy
> > (outside of the unconsumed window described above) but no extra
> > availability (unless you count manual interventions).
> >
> > KAFKA-50 <https://issues.apache.org/jira/browse/KAFKA-50>, on the other
> > hand, will provide both asynchronous AND synchronous replication
> > (although the latter will incur a latency penalty) and will be able to
> > use the replicas (data redundancy) as hot fail-overs.
> >
> > Depending on your personal definition of "highly reliable" (whether it
> > includes data redundancy and/or availability), I think that should
> > probably answer your question...?
> >
> > To all the Kafka experts: please correct me if the above explanations are
> > incorrect :) !
> >
> > --
> > Felix
> >
> >
> >
> >
> > On Wed, Jan 11, 2012 at 5:53 PM, Jun Rao <jun...@gmail.com> wrote:
> >
> >> It's just that the mirroring logic depends on ZK to be available most of
> >> the time.
> >>
> >> Jun
> >>
> >> On Wed, Jan 11, 2012 at 2:35 PM, Christian Carollo <ccaro...@gmail.com> wrote:
> >>
> >> > I see.  But if I used that configuration and then did the mirroring you
> >> > suggested, would that be enough, in your opinion, to be considered
> >> > highly reliable?
> >> >
> >> > Christian
> >> >
> >> >
> >> > On Jan 11, 2012, at 2:32 PM, Jun Rao wrote:
> >> >
> >> > >> For example, can I have one ZK instance and one broker on one machine
> >> > >> and that is enough to define a ZK cluster and a Kafka Cluster?
> >> > >
> >> > > Yes, although with a single instance you don't get ZK's reliability.
> >> > >
> >> > > Jun
> >> > >
> >> > >
> >> > > On Wed, Jan 11, 2012 at 2:06 PM, Christian Carollo <ccaro...@gmail.com> wrote:
> >> > >
> >> > >> Jun,
> >> > >>
> >> > >> I don't think I asked my question the right way.
> >> > >>
> >> > >> What I am trying to understand is what are the minimum constituent
> >> > >> parts of a Kafka cluster?
> >> > >>
> >> > >> Based on your last email, I am now wondering what are the minimum
> >> > >> constituent parts of a ZK cluster as well as a Kafka cluster?
> >> > >>
> >> > >> For example, can I have one ZK instance and one broker on one machine
> >> > >> and that is enough to define a ZK cluster and a Kafka Cluster?
> >> > >>
> >> > >> Thanks,
> >> > >> Christian
> >> > >>
> >> > >>
> >> > >> On Jan 11, 2012, at 1:50 PM, Jun Rao <jun...@gmail.com> wrote:
> >> > >>
> >> > >>> Christian,
> >> > >>>
> >> > >>> A Kafka cluster contains a ZK cluster and a list of brokers. When a
> >> > >>> consumer subscribes to a topic in a Kafka cluster, it consumes data
> >> > >>> stored in all brokers in that cluster.
> >> > >>>
> >> > >>> Thanks,
> >> > >>>
> >> > >>> Jun
> >> > >>>
> >> > >>> On Tue, Jan 10, 2012 at 11:28 PM, Christian Carollo <ccaro...@gmail.com> wrote:
> >> > >>>
> >> > >>>> Thank you Jun, that is quite helpful.  I have a question about Kafka
> >> > >>>> Clusters.  What are the minimum number and types of services that must
> >> > >>>> be running to make up a Kafka Cluster?
> >> > >>>>
> >> > >>>> I ask this because the diagrams (in the Kafka Mirroring document) allude
> >> > >>>> to a multiple-broker environment; however, since each broker does not
> >> > >>>> appear to provide redundancy (as of today) to any of the other brokers
> >> > >>>> in a given ZooKeeper service, it seems like a Kafka Cluster is nothing
> >> > >>>> more than a grouping of a single ZooKeeper instance with a single Kafka
> >> > >>>> broker. Is this the correct understanding?
> >> > >>>>
> >> > >>>> Thanks,
> >> > >>>> Christian
> >> > >>>>
> >> > >>>> On Jan 10, 2012, at 8:47 AM, Jun Rao wrote:
> >> > >>>>
> >> > >>>>> With 0.7, you can set up inter-cluster replication
> >> > >>>>> (https://cwiki.apache.org/confluence/display/KAFKA/Kafka+mirroring).
> >> > >>>>>
> >> > >>>>> For the future 0.8 release, we are working on intra-cluster replication
> >> > >>>>> support and details can be found at
> >> > >>>>> https://issues.apache.org/jira/browse/KAFKA-50
> >> > >>>>>
> >> > >>>>> Thanks,
> >> > >>>>>
> >> > >>>>> Jun
> >> > >>>>>
> >> > >>>>> On Mon, Jan 9, 2012 at 9:52 PM, Christian Carollo <ccaro...@gmail.com> wrote:
> >> > >>>>>
> >> > >>>>>> I am looking to implement Kafka in a production environment; however, I
> >> > >>>>>> haven't found any documentation or examples that discuss how to build
> >> > >>>>>> a redundant implementation.  Is there any documentation out there
> >> > >>>>>> (blogs, articles, etc.) that describes how we can implement such a
> >> > >>>>>> system with Kafka 0.6 or 0.7?
> >> > >>>>>>
> >> > >>>>>> Also, is there a timeframe the community is shooting for to release
> >> > >>>>>> 0.8 w/ replication?
> >> > >>>>>>
> >> > >>>>>> Thanks
> >> > >>>>>> Christian
> >> > >>>>>>
> >> > >>>>
> >> > >>>>
> >> > >>
> >> >
> >> >
> >>
> >
> >
>
