Is that really the *age*, or is it really the *timestamp* of the last successful log shipment?
If so, one could calculate the real age with age = now() -
ageOfLastShippedOp (whose value is really a timestamp). And that would be useful to have.
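
If that interpretation is right, the calculation could be sketched roughly like this. The metric name comes from J-D's reply below; the millisecond unit, the timestamp interpretation, and the helper function itself are assumptions for illustration, not documented HBase semantics:

```python
import time

def replication_lag_seconds(age_of_last_shipped_op_ms, now_ms=None):
    """Return the real replication lag in seconds, *assuming* the JMX
    metric "ageOfLastShippedOp" is actually the epoch timestamp (in
    milliseconds) of the last successfully shipped edit -- a guess
    from this thread, not the documented meaning."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return (now_ms - age_of_last_shipped_op_ms) / 1000.0
```

For example, a metric value of 1000 ms read at now_ms=61000 would give a lag of 60 seconds.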

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/



----- Original Message ----
> From: Jean-Daniel Cryans <[email protected]>
> To: [email protected]
> Sent: Thu, March 3, 2011 12:21:09 PM
> Subject: Re: Questions about HBase Cluster Replication
> 
> It's a work in progress. That information is currently published by
> every region server in the master cluster (since it's push
> replication, not pull) through JMX under the name
> "ageOfLastShippedOp". It's really not perfect though, since if it
> fails to replicate and starts retrying then the age won't change but
> the actual lag will go up. Also, it will have to be revisited when we
> add multiple slaves, since you don't really want to publish the same
> metric for multiple slaves... it really wouldn't work.
> 
> J-D
> 
> On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <[email protected]> wrote:
> > Actually, how far behind replication is w.r.t. edit logs is different
> > than how out of sync they are, but you get the idea.
> >
> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <[email protected]> wrote:
> >> One more question for the FAQ:
> >>
> >> 6. Is it possible for an admin to tell just how out of sync the two
> >> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> >> SLAVE STATUS?
> >>
> >>
> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>> Although, I would add that this feature is still experimental so who knows :)
> >>>
> >>> I think the worst that happened to us was that replication was broken
> >>> (see the jira where, if the master loses its zk session with the slave
> >>> zk ensemble, it requires an HBase restart on the master side) for a few
> >>> days because of maintenance of the link between the two datacenters,
> >>> which took more than a minute. When we finally did restart the master
> >>> cluster, it had to process about 2 TB of HLogs... those ICVs can
> >>> really generate a lot of data!
> >>>
> >>> J-D
> >>>
> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>>>> 5. If one is adding replication on the *production* Master cluster, what's
> >>>>> the worst thing that can happen to this Master cluster? Nothing scary
> >>>>> other than changing configs + interruption during a restart? (which is
> >>>>> currently still bad because of region assignments?)
> >>>>>
> >>>>
> >>>> The replication code is pretty well encapsulated from the rest of the
> >>>> region server code; it won't mess with your Puts or change your
> >>>> birthday.
> >>>>
> >>>> With 0.90 the regions are reassigned where they were before, so it's
> >>>> really just the block cache that gets screwed.
> >>>>
> >>>> J-D
> >>>>
> >>>
> >>
> >
> 
