Aha, so the fact that the age doesn't change when replication keeps retrying is really a bug?
Otis

----- Original Message ----
> From: Jean-Daniel Cryans <[email protected]>
> To: [email protected]
> Sent: Thu, March 3, 2011 2:17:08 PM
> Subject: Re: Questions about HBase Cluster Replication
>
> No it's the age in ms:
>
> ageOfLastAppliedOp.set(System.currentTimeMillis() - timestamp);
>
> And the timestamp is the one given to the HLogEdit, not the timestamp
> of the cell.
>
> J-D
>
> On Thu, Mar 3, 2011 at 11:13 AM, Otis Gospodnetic
> <[email protected]> wrote:
> > Is that really the *age* or really the *timestamp* of last successful log shipment?
> > If so, one could calculate the real age with age = now() -
> > ageOfLastShippedOnWhichIsReallyTimestamp . And that would be useful to have.
> >
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/
> >
> > ----- Original Message ----
> >> From: Jean-Daniel Cryans <[email protected]>
> >> To: [email protected]
> >> Sent: Thu, March 3, 2011 12:21:09 PM
> >> Subject: Re: Questions about HBase Cluster Replication
> >>
> >> It's a work in progress, that information is currently published by
> >> every region server in the master cluster (since it's push
> >> replication, not pull) through JMX under the name
> >> "ageOfLastShippedOp". It's really not perfect though, since if it
> >> fails to replicate and starts retrying then the age won't change but
> >> the actual lag will go up. Also it will have to be revisited when we
> >> add multiple slaves since you don't really want to publish the same
> >> metric for multiple slaves... it really wouldn't work.
> >>
> >> J-D
> >>
> >> On Thu, Mar 3, 2011 at 9:10 AM, Bill Graham <[email protected]> wrote:
> >> > Actually, how far behind replication is w.r.t. edit logs is different
> >> > than how out of sync they are, but you get the idea.
> >> >
> >> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <[email protected]> wrote:
> >> >> One more question for the FAQ:
> >> >>
> >> >> 6. Is it possible for an admin to tell just how out of sync the two
> >> >> clusters are? Something like Seconds_Behind_Master in MySQL's SHOW
> >> >> SLAVE STATUS?
> >> >>
> >> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >>> Although, I would add that this feature is still experimental so who knows :)
> >> >>>
> >> >>> I think the worst that happened to us was that replication was broken
> >> >>> (see the jira where if the master loses its zk session with the slave
> >> >>> zk ensemble, it requires an HBase restart on the master side) for a few
> >> >>> days because of maintenance of the link between the two datacenters
> >> >>> which took more than a minute. When we finally did restart the master
> >> >>> cluster, it had to process about 2TBs of HLogs... those ICVs can
> >> >>> really generate a lot of data!
> >> >>>
> >> >>> J-D
> >> >>>
> >> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <[email protected]> wrote:
> >> >>>>> 5. If one is adding replication on the *production* Master cluster, what's the
> >> >>>>> worst thing that can happen to this Master cluster? Nothing scary other than
> >> >>>>> changing configs + interruption during a restart? (which is currently still bad
> >> >>>>> because of region assignments?)
> >> >>>>>
> >> >>>>
> >> >>>> The replication code is pretty much encapsulated from the rest of the
> >> >>>> region server code, it won't mess with your Puts or change your
> >> >>>> birthday date.
> >> >>>>
> >> >>>> With 0.90 the regions are reassigned where they were before, so it's
> >> >>>> really just the block cache that gets screwed.
> >> >>>>
> >> >>>> J-D
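
For anyone reading this thread later, here is a rough sketch of why an age that is only updated on a successful shipment stops moving while replication retries. It is not the HBase code: the class, method names, and the "oldest pending" bookkeeping are all hypothetical, and the only line taken from the thread is the `now - timestamp` calculation. The current time is passed in as a parameter so the behaviour is easy to follow.

```java
/**
 * Hypothetical stand-in for the replication age metric discussed in this
 * thread. Not HBase code; names are made up for illustration.
 * Not thread-safe; a real metric would need synchronization.
 */
class ReplicationAgeSketch {
    // Published metric: age in ms of the last edit that was shipped successfully.
    private long ageOfLastShippedOp = 0L;
    // Log timestamp of the oldest edit still waiting on a retry, or -1 if none.
    private long oldestPendingTimestamp = -1L;

    /** Called only after a successful shipment, like the line quoted by J-D. */
    void onShipped(long editTimestamp, long now) {
        ageOfLastShippedOp = now - editTimestamp;
        oldestPendingTimestamp = -1L;
    }

    /** Called when a shipment fails and will be retried; the published age is untouched. */
    void onShipFailed(long editTimestamp) {
        if (oldestPendingTimestamp < 0) {
            oldestPendingTimestamp = editTimestamp;
        }
    }

    /** The value a monitoring client would read; frozen while retries are in progress. */
    long publishedAge() {
        return ageOfLastShippedOp;
    }

    /** Assumed alternative: measure from the oldest unshipped edit, so the lag keeps growing. */
    long actualLag(long now) {
        return oldestPendingTimestamp < 0 ? ageOfLastShippedOp : now - oldestPendingTimestamp;
    }
}
```

With this sketch, shipping an edit stamped 1000 at time 1500 publishes an age of 500. If an edit stamped 2000 then fails and keeps retrying, `publishedAge()` still returns 500 at time 10000, while `actualLag(10000)` returns 8000. Whether the real fix should look anything like `actualLag` is an open question.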
