Is that really the *age*, or is it really the *timestamp* of the last successful log shipment? If it is a timestamp, one could calculate the real age with age = now() - ageOfLastShippedOnWhichIsReallyTimestamp. And that would be useful to have.
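
To make the arithmetic concrete, here is a minimal sketch. It assumes the published value is an epoch timestamp in milliseconds, which is exactly what I'm asking about and is not confirmed. The helper name `replication_lag_ms` and the sample numbers are made up for illustration and are not part of HBase.

```python
import time


def replication_lag_ms(last_shipped_ts_ms, now_ms=None):
    """Milliseconds elapsed since the last successful shipment.

    last_shipped_ts_ms: ASSUMED to be an epoch timestamp in milliseconds.
    now_ms: current time in milliseconds; defaults to the local wall clock.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Clamp at zero so clock skew between hosts never yields a negative age.
    return max(0, now_ms - last_shipped_ts_ms)


# Hypothetical values: the last successful shipment happened at t = 1,000,000 ms.
last_shipped = 1_000_000
lag_shortly_after = replication_lag_ms(last_shipped, now_ms=1_005_000)
# The timestamp stays put while shipping keeps failing, but the lag keeps growing.
lag_while_retrying = replication_lag_ms(last_shipped, now_ms=1_060_000)
```

Computed this way, the lag would keep increasing while replication is stuck retrying, because the timestamp stops advancing while now() does not.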
Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

----- Original Message ----
> From: Jean-Daniel Cryans <[email protected]>
> To: [email protected]
> Sent: Thu, March 3, 2011 12:21:09 PM
> Subject: Re: Questions about HBase Cluster Replication
>
> It's a work in progress, that information is currently published by
> every region server in the master cluster (since it's push
> replication, not pull) through JMX under the name
> "ageOfLastShippedOp". It's really not perfect though, since if it
> fails to replicate and starts retrying then the age won't change but
> the actual lag will go up. Also it will have to be revisited when we
> add multiple slaves since you don't really want to publish the same
> metric for multiple slaves... it really wouldn't work.
>
> J-D
>
> On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <[email protected]> wrote:
> > Actually, how far behind replication is w.r.t. edit logs is different
> > than how out of sync they are, but you get the idea.
> >
> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <[email protected]> wrote:
> >> One more question for the FAQ:
> >>
> >> 6. Is it possible for an admin to tell just how out of sync the two
> >> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> >> SLAVE STATUS?
> >>
> >>
> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>> Although, I would add that this feature is still experimental so who
> >>> knows :)
> >>>
> >>> I think the worst that happened to us was that replication was broken
> >>> (see the jira where if the master loses it's zk session with the slave
> >>> zk ensemble, it requires a HBase restart on the master side) for a few
> >>> days because of maintenance of the link between the two datacenters
> >>> which took more than a minute. When we finally did restart the master
> >>> cluster, it had to process about 2TBs of HLogs... those ICVs can
> >>> really generate a lot of data!
> >>>
> >>> J-D
> >>>
> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >>>>> 5. If one is adding replication on the *production* Master cluster,
> >>>>> what's the worst thing that can happen to this Master cluster?
> >>>>> Nothing scary other than changing configs + interruption during a
> >>>>> restart? (which is currently still bad because of region
> >>>>> assignments?)
> >>>>>
> >>>>
> >>>> The replication code is pretty much encapsulated from the rest of the
> >>>> region server code, it won't mess with your Puts or change your
> >>>> birthday date.
> >>>>
> >>>> With 0.90 the regions are reassigned where they were before, so it's
> >>>> really just the block cache that gets screwed.
> >>>>
> >>>> J-D
> >>>>
> >>>
> >>
> >
