Re: confused info about region-regionserver locality

Dave Wang Thu, 04 Apr 2013 11:51:13 -0700

I think the order will matter if you run with, say, a replication factor of 2.

- Dave



On Thu, Apr 4, 2013 at 11:30 AM, lars hofhansl <[email protected]> wrote:

> >> When the write request returns to the client there will be a local
> >> copy, a copy on another machine in the same rack, and a copy on a
> >> machine in a different rack, who cares about the ordering inside the
> >> pipeline?
> > Not necessarily. There might not be any additional copy on a different
> > machine on the same rack. BUT.. As you said, who cares ;) As long as
> > we have the local copy and some replicas.
>
> Really? Doesn't the whole pipeline have to be successful in order to
> return success to the client?
> (I might be confused :) )
>
>
>
> ________________________________
>  From: Jean-Marc Spaggiari <[email protected]>
> To: [email protected]; lars hofhansl <[email protected]>
> Sent: Thursday, April 4, 2013 11:24 AM
> Subject: Re: confused info about region-regionserver locality
>
> >Isn't this done via pipelining anyway?
> Yes, it's the way it's done.
>
> >So there's no notion of ordering with respect to the 1st, 2nd, and 3rd
> >block, either all writes go through the pipeline or none do.
> Still correct.
>
> > When the write request returns to the client there will be a local copy,
> > a copy on another machine in the same rack, and a copy on a machine in a
> > different rack, who cares about the ordering inside the pipeline?
> Not necessarily. There might not be any additional copy on a different
> machine on the same rack. BUT.. As you said, who cares ;) As long as
> we have the local copy and some replicas.
>
> I have updated the documentation already. I will open the JIRA and
> submit. I have also added subsequent replicas in case replication
> factor is > 3.
>
> JM
>
> 2013/4/4 lars hofhansl <[email protected]>:
> > Isn't this done via pipelining anyway?
> > So there's no notion of ordering with respect to the 1st, 2nd, and 3rd
> > block, either all writes go through the pipeline or none do.
> >
> > When the write request returns to the client there will be a local copy,
> > a copy on another machine in the same rack, and a copy on a machine in a
> > different rack, who cares about the ordering inside the pipeline?
> >
> >
> > Seems it would also be inefficient to pipeline from the local rack to
> > another one and then in the same pipeline back into the local rack
> > (more load on the switch connecting the racks with no benefit).
> >
> > I'll double check.
> >
> >
> > -- Lars
> >
> >
> >
> > ________________________________
> >  From: Jean-Marc Spaggiari <[email protected]>
> > To: [email protected]
> > Sent: Thursday, April 4, 2013 8:25 AM
> > Subject: Re: confused info about region-regionserver locality
>
>
> >
> > Hi,
> >
> > I think you're right and the documentation needs to be updated.
> >
> > The 3rd replica is written on a random node in the same rack as the
> > 2nd replica. I will double check. Can you please open a JIRA so this
> > is updated?
> >
> > JM
> >
> > 2013/4/4 KIM JUN YOUNG <[email protected]>:
> >> Hi All.
> >>
> >> There is some confusion about region-regionserver locality.
> >>
> >> From the current documentation:
> >>
> >> http://hbase.apache.org/book/regions.arch.html
> >> 9.7.3. Region-RegionServer Locality
> >> Over time, Region-RegionServer locality is achieved via HDFS block
> >> replication. The HDFS client does the following by default when choosing
> >> locations to write replicas:
> >>
> >> First replica is written to local node
> >> Second replica is written to another node in same rack
> >> Third replica is written to a node in another rack (if sufficient nodes)
> >>
> >>
> >> But my understanding is different.
> >> HDFS writes the block replicas as follows:
> >>
> >>         first, local node
> >>         second, another node in another rack
> >>         third, random another node in same rack
> >>
> >> Does the documentation need to be changed, or am I missing something?
>
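
The placement order discussed in this thread (first replica on the local node, second on a node in another rack, third on a different node in the same rack as the second) can be sketched as a small simulation. This is only an illustrative model, not HDFS code: the function name, the `topology` dictionary, and the fallback rules for small clusters are assumptions made for the example, and actual behavior may differ between HDFS versions. It also shows why ordering matters with a replication factor of 2: the second slot decides whether the block survives the loss of a rack.

```python
import random


def choose_replica_targets(writer, topology, replication=3, rng=None):
    """Return an ordered list of nodes for one block's replicas.

    `topology` maps node name -> rack name (a hypothetical stand-in for
    rack awareness). `writer` is the node the client is running on.
    """
    rng = rng or random.Random()
    targets = [writer]  # 1st replica: the writer's own node

    def rack(node):
        return topology[node]

    def pick(predicate):
        # Choose a random unused node that satisfies the predicate.
        candidates = [n for n in sorted(topology)
                      if n not in targets and predicate(n)]
        return rng.choice(candidates) if candidates else None

    if replication >= 2:
        # 2nd replica: prefer a node on a different rack than the 1st.
        second = pick(lambda n: rack(n) != rack(writer)) or pick(lambda n: True)
        if second is not None:
            targets.append(second)

    if replication >= 3 and len(targets) == 2:
        if rack(targets[1]) != rack(writer):
            # 3rd replica: a different node on the 2nd replica's rack.
            third = pick(lambda n: rack(n) == rack(targets[1]))
        else:
            # The 2nd replica fell back to the local rack; try another rack.
            third = pick(lambda n: rack(n) != rack(writer))
        third = third or pick(lambda n: True)
        if third is not None:
            targets.append(third)

    # Replicas beyond the 3rd: any remaining node (simplified assumption).
    while len(targets) < replication:
        extra = pick(lambda n: True)
        if extra is None:
            break
        targets.append(extra)
    return targets[:replication]


# Example cluster: two racks with two nodes each (made-up names).
topology = {"a": "rack1", "b": "rack1", "c": "rack2", "d": "rack2"}
print(choose_replica_targets("a", topology, replication=3))
print(choose_replica_targets("a", topology, replication=2))
```

With a replication factor of 2, this ordering puts the second copy on the remote rack, whereas the ordering in the old documentation would have kept both copies on the local rack.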
