Yep, it just occurred to me while answering you :) I'm the only dev
who has worked on the replication stuff, so any contribution, or even
just testing out the software, is really appreciated.

J-D
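
A rough sketch of the behaviour discussed in the thread below: the age is
computed as "now minus the edit's timestamp" only when a shipment succeeds,
so it stays frozen while replication retries. The class and method names
here are hypothetical and are not HBase's actual metrics code; the second
getter only illustrates one possible fix, recomputing the age at read time.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch; not the real HBase replication metrics class. */
final class ReplicationLagSketch {
    // Timestamp (ms since epoch) of the last edit shipped successfully; -1 if none yet.
    private final AtomicLong lastShippedEditTimestamp = new AtomicLong(-1L);
    // Age recorded at ship time, like the gauge described in the thread.
    private final AtomicLong ageOfLastShippedOp = new AtomicLong(0L);

    /** Call only after a batch of edits has shipped successfully. */
    void onShipped(long editTimestamp, long now) {
        lastShippedEditTimestamp.set(editTimestamp);
        ageOfLastShippedOp.set(now - editTimestamp);
    }

    /** Value set at the last successful shipment; frozen during retries. */
    long ageOfLastShippedOp() {
        return ageOfLastShippedOp.get();
    }

    /** Age recomputed when read, so it keeps growing while shipping fails. */
    long currentLag(long now) {
        long ts = lastShippedEditTimestamp.get();
        return ts < 0 ? 0L : now - ts;
    }
}
```

One caveat with this sketch: currentLag() also grows when the cluster is
simply idle and there is nothing new to ship, so it would likely need to be
combined with a check for pending edits before being read as real lag.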

On Thu, Mar 3, 2011 at 12:10 PM, Otis Gospodnetic
<[email protected]> wrote:
> Aha, so the fact that the age doesn't change when replication keeps retrying
> is really a bug?
>
> Otis
>
>
>
>
> ----- Original Message ----
>> From: Jean-Daniel Cryans <[email protected]>
>> To: [email protected]
>> Sent: Thu, March 3, 2011 2:17:08 PM
>> Subject: Re: Questions about HBase Cluster Replication
>>
>> No, it's the age in ms:
>>
>> ageOfLastAppliedOp.set(System.currentTimeMillis()  - timestamp);
>>
>> And the timestamp is the one given to the HLogEdit, not the  timestamp
>> of the cell.
>>
>> J-D
>>
>> On Thu, Mar 3, 2011 at 11:13 AM,  Otis Gospodnetic
>> <[email protected]>  wrote:
>> > Is that really the *age*, or is it really the *timestamp* of the last
>> > successful log shipment?
>> > If so, one could calculate the real age with
>> > age = now() - ageOfLastShippedOnWhichIsReallyTimestamp. And that would be
>> > useful to have.
>> >
>> > Otis
>> > ----
>> > Sematext :: http://sematext.com/ :: Solr -  Lucene - Nutch
>> > Lucene ecosystem search :: http://search-lucene.com/
>> >
>> >
>> >
>> > ----- Original  Message ----
>> >> From: Jean-Daniel Cryans <[email protected]>
>> >> To: [email protected]
>> >> Sent:  Thu, March 3, 2011 12:21:09 PM
>> >> Subject: Re: Questions about HBase  Cluster Replication
>> >>
>> >> It's a work in progress. That information is currently published by
>> >> every region server in the master cluster (since it's push
>> >> replication, not pull) through JMX under the name
>> >> "ageOfLastShippedOp". It's really not perfect though, since if it
>> >> fails to replicate and starts retrying then the age won't change but
>> >> the actual lag will go up. Also, it will have to be revisited when we
>> >> add multiple slaves, since you don't really want to publish the same
>> >> metric for multiple slaves... it really wouldn't work.
>> >>
>> >> J-D
>> >>
>> >> On Thu, Mar  3, 2011 at 9:10 AM, Bill Graham <[email protected]>  
>> >> wrote:
>> >> >  Actually, how far behind replication is w.r.t. edit  logs is different
>> >> >  than how out of sync they are, but you get  the idea.
>> >> >
>> >> > On Thu, Mar 3, 2011 at 9:07 AM, Bill Graham <[email protected]> wrote:
>> >> >> One more question for the FAQ:
>> >>  >>
>> >> >> 6. Is  it possible for an admin to tell just how  out of sync the two
>> >> >>  clusters are? Something like  Seconds_Behind_Master in MySQL's SHOW
>> >> >>  SLAVE  STATUS?
>> >> >>
>> >> >>
>> >> >> On Wed, Mar 2, 2011 at 9:32 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >> >>> Although, I would add that this feature is still experimental so
>> >> >>> who knows :)
>> >> >>>
>> >> >>> I think the worst that happened to us was that replication was
>> >> >>> broken (see the jira where if the master loses its zk session with
>> >> >>> the slave zk ensemble, it requires an HBase restart on the master
>> >> >>> side) for a few days because of maintenance of the link between the
>> >> >>> two datacenters which took more than a minute. When we finally did
>> >> >>> restart the master cluster, it had to process about 2 TB of
>> >> >>> HLogs... those ICVs can really generate a lot of data!
>> >>  >>>
>> >> >>> J-D
>> >> >>>
>> >> >>> On Wed, Mar 2, 2011 at 9:25 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >> >>>>> 5. If one is adding replication on the *production* Master cluster,
>> >> >>>>> what's the worst thing that can happen to this Master cluster?
>> >> >>>>> Nothing scary other than changing configs + interruption during a
>> >> >>>>> restart? (which is currently still bad because of region
>> >> >>>>> assignments?)
>> >> >>>>>
>> >>  >>>>
>> >> >>>> The replication code is pretty much encapsulated from the rest of
>> >> >>>> the region server code; it won't mess with your Puts or change your
>> >> >>>> birthday date.
>> >> >>>>
>> >> >>>> With 0.90 the regions are reassigned where they were before, so
>> >> >>>> it's really just the block cache that gets screwed.
>> >> >>>>
>> >> >>>>   J-D
>> >> >>>>
>> >> >>>
>> >>  >>
>> >> >
>> >>
>> >
>>
>
