Re: Questions about HBase Cluster Replication

Otis Gospodnetic Thu, 03 Mar 2011 14:04:42 -0800

Here it is: https://issues.apache.org/jira/browse/HBASE-3597


I think we'll have the opportunity to test out cluster replication and provide 
feedback soon.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Jean-Daniel Cryans <[email protected]>
> To: [email protected]
> Sent: Thu, March 3, 2011 3:41:04 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> Yep, it just occurred to me while answering you :) I'm the only dev
> who worked on the replication stuff, so any contribution, or just testing
> out the software, is really appreciated.
> 
> J-D
> 
> On Thu, Mar 3, 2011 at 12:10 PM, Otis Gospodnetic
> <[email protected]> wrote:
> > Aha, so the fact that the age doesn't change when replication keeps
> > retrying is really a bug?
> >
> >  Otis
> >
> >
> >
> >
> > ----- Original Message ----
> >> From: Jean-Daniel Cryans <[email protected]>
> >> To: [email protected]
> >> Sent: Thu, March 3, 2011 2:17:08 PM
> >> Subject: Re: Questions about HBase Cluster Replication
> >>
> >> No, it's the age in ms:
> >>
> >> ageOfLastAppliedOp.set(System.currentTimeMillis() - timestamp);
> >>
> >> And the timestamp is the one given to the HLogEdit, not the timestamp
> >> of the cell.
> >>
> >> J-D
> >>
> >> On Thu, Mar 3, 2011 at 11:13 AM, Otis Gospodnetic
> >> <[email protected]> wrote:
> >> > Is that really the *age*, or is it really the *timestamp* of the last
> >> > successful log shipment?
> >> > If so, one could calculate the real age with age = now() -
> >> > ageOfLastShippedOnWhichIsReallyTimestamp. And that would be useful to
> >> > have.
> >> >
> >> > Otis
> >> > ----
> >> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> >> > Lucene ecosystem search :: http://search-lucene.com/
> >> >
> >> >
> >> >
> >> > ----- Original Message ----
> >> >> From: Jean-Daniel Cryans <[email protected]>
> >> >> To: [email protected]
> >> >> Sent: Thu, March 3, 2011 12:21:09 PM
> >> >> Subject: Re: Questions about HBase Cluster Replication
> >> >>
> >> >> It's a work in progress; that information is currently published by
> >> >> every region server in the master cluster (since it's push
> >> >> replication, not pull) through JMX under the name
> >> >> "ageOfLastShippedOp". It's really not perfect, though, since if it
> >> >> fails to replicate and starts retrying then the age won't change but
> >> >> the actual lag will go up. Also, it will have to be revisited when we
> >> >> add multiple slaves, since you don't really want to publish the same
> >> >> metric for multiple slaves... it really wouldn't work.
> >> >>
> >> >> J-D
> >> >>
> >> >> On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <[email protected]> wrote:
> >> >> > Actually, how far behind replication is w.r.t. edit logs is different
> >> >> > than how out of sync they are, but you get the idea.
> >> >> >
> >> >> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <[email protected]> wrote:
> >> >> >> One more question for the FAQ:
> >> >> >>
> >> >> >> 6. Is it possible for an admin to tell just how out of sync the two
> >> >> >> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> >> >> >> SLAVE STATUS?
> >> >> >>
> >> >> >>
> >> >> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans
> >> >> >> <[email protected]> wrote:
> >> >> >>> Although, I would add that this feature is still experimental so who
> >> >> >>> knows :)
> >> >> >>>
> >> >> >>> I think the worst that happened to us was that replication was broken
> >> >> >>> (see the jira where if the master loses its zk session with the slave
> >> >> >>> zk ensemble, it requires an HBase restart on the master side) for a
> >> >> >>> few days because of maintenance of the link between the two
> >> >> >>> datacenters which took more than a minute. When we finally did
> >> >> >>> restart the master cluster, it had to process about 2 TB of
> >> >> >>> HLogs... those ICVs can really generate a lot of data!
> >> >> >>>
> >> >> >>> J-D
> >> >> >>>
> >> >> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans
> >> >> >>> <[email protected]> wrote:
> >> >> >>>>> 5. If one is adding replication on the *production* Master cluster,
> >> >> >>>>> what's the worst thing that can happen to this Master cluster?
> >> >> >>>>> Nothing scary other than changing configs + interruption during a
> >> >> >>>>> restart? (which is currently still bad because of region
> >> >> >>>>> assignments?)
> >> >> >>>>>
> >> >> >>>>
> >> >> >>>> The replication code is pretty much encapsulated from the rest of the
> >> >> >>>> region server code, it won't mess with your Puts or change your
> >> >> >>>> birthday date.
> >> >> >>>>
> >> >> >>>> With 0.90 the regions are reassigned where they were before, so it's
> >> >> >>>> really just the block cache that gets screwed.
> >> >> >>>>
> >> >> >>>> J-D
> >> >> >>>>
> >> >> >>>
> >> >> >>
> >> >> >
> >> >>
> >> >
> >>
> >
>
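
Illustration of the age metric discussed in this thread. This is a minimal
sketch written for this archive, not HBase source code. It assumes, as J-D
describes, that the age is set to "now minus the write time of the edit"
and that this happens only when a shipment succeeds. The class and method
names (ReplicationAgeSketch, onShipped, publishedAge, lagSince) are
hypothetical, and timestamps are passed in explicitly rather than read from
System.currentTimeMillis() so the behavior is easy to follow.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration only; names and structure are assumptions.
class ReplicationAgeSketch {
    // Mirrors the metric named in the thread: age in ms of the last shipped edit.
    private final AtomicLong ageOfLastShippedOp = new AtomicLong(0);
    // Write time of the last edit that shipped successfully (assumed bookkeeping).
    private final AtomicLong lastShippedWriteTime = new AtomicLong(0);

    // Called only after a successful shipment, so the age is frozen while retrying.
    void onShipped(long editWriteTimeMs, long nowMs) {
        lastShippedWriteTime.set(editWriteTimeMs);
        ageOfLastShippedOp.set(nowMs - editWriteTimeMs);
    }

    // What a metrics reader would see: unchanged until the next successful shipment.
    long publishedAge() {
        return ageOfLastShippedOp.get();
    }

    // Age recomputed at read time (now - timestamp), as Otis suggests in the thread.
    long lagSince(long nowMs) {
        return nowMs - lastShippedWriteTime.get();
    }

    public static void main(String[] args) {
        ReplicationAgeSketch m = new ReplicationAgeSketch();
        m.onShipped(1_000L, 1_250L);               // edit written at t=1000, shipped at t=1250
        System.out.println(m.publishedAge());      // 250
        // 60 seconds of failed retries follow; onShipped is never called.
        System.out.println(m.publishedAge());      // still 250
        System.out.println(m.lagSince(61_250L));   // 60250
    }
}
```

Recomputing at read time keeps growing during retries, but it would also
grow on an idle master with nothing to ship, so it is an upper bound on the
lag rather than an exact figure.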
