Hi Himanshu:

My team is particularly interested in cyclic replication, so I have enabled master-master replication (each cluster has the other cluster as its replication peer), although the replication went in only one direction (from cluster A to cluster B) in the test. I didn't run stop_replication on the other cluster, if that is what you mean by disabling the replication.
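In case it helps, the setup on my side looked roughly like this from the HBase shell (the table name, column family, and ZooKeeper quorums below are just placeholders, and the exact syntax may differ slightly depending on the HBase version):

  # On cluster A, add cluster B as a replication peer (peer id '1'):
  add_peer '1', 'zkB1,zkB2,zkB3:2181:/hbase'

  # On cluster B, add cluster A back as a peer (this is what makes it cyclic):
  add_peer '1', 'zkA1,zkA2,zkA3:2181:/hbase'

  # On both clusters, mark the column family as replicated:
  disable 'test_table'
  alter 'test_table', {NAME => 'f1', REPLICATION_SCOPE => 1}
  enable 'test_table'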
Thanks!

Jerry

On 2012-05-01, at 10:08 PM, Himanshu Vashishtha wrote:

> Yeah, I should have mentioned that: it's master-master, and on cdh4b1.
> But replication on that specific slave table is disabled (so,
> effectively it's master-slave for this test).
>
> Is this the same as yours (replication config wise), or shall I enable
> replication on the destination table too?
>
> Thanks,
> Himanshu
>
> On Tue, May 1, 2012 at 8:01 PM, Jerry Lam <[email protected]> wrote:
>> Hi Himanshu:
>>
>> Thanks for following up! I did look at the logs and there were some
>> exceptions. I'm not sure if those exceptions contributed to the problem I
>> saw a week ago.
>> I am aware of the latency between the time the master says "Nothing to
>> replicate" and the time it actually takes to replicate on the slave.
>> I remember waiting 12 hours for the replication to finish (i.e. starting
>> the test before leaving the office and checking the result the next day),
>> and the data were still not fully replicated.
>>
>> By the way, is your test running with master-slave replication or
>> master-master replication?
>>
>> I will resume this again. I was busy with something else for the past week
>> or so.
>>
>> Best Regards,
>>
>> Jerry
>>
>> On 2012-05-01, at 6:41 PM, Himanshu Vashishtha wrote:
>>
>>> Hello Jerry,
>>>
>>> Did you try this again?
>>>
>>> Whenever you try next, can you please share the logs somehow?
>>>
>>> I tried replicating your scenario today, but no luck. I used the same
>>> workload you have copied here; the master cluster has 5 nodes and the
>>> slave has just 2 nodes; I made tiny regions of 8MB (memstore flushing at
>>> 8MB too), so that I have around 1200+ regions even for 200k rows; I ran
>>> the workload with 16, 24 and 32 client threads, but the verifyrep
>>> mapreduce job says it's good.
>>> Yes, I ran the verifyrep command after seeing the "there is nothing to
>>> replicate" message on all the regionservers; sometimes it was a bit
>>> slow.
>>>
>>>
>>> Thanks,
>>> Himanshu
>>>
>>> On Mon, Apr 23, 2012 at 11:57 AM, Jean-Daniel Cryans
>>> <[email protected]> wrote:
>>>>> I will try your suggestion today with master-slave replication enabled
>>>>> from Cluster A -> Cluster B.
>>>>
>>>> Please do.
>>>>
>>>>> Last Friday, I tried to limit the variability/the moving parts of the
>>>>> replication components. I reduced the size of Cluster B to have only 1
>>>>> regionserver and had Cluster A replicate data from one region only
>>>>> without region splitting (therefore I have a 1-to-1 region replication
>>>>> setup). During the benchmark, I moved the region between different
>>>>> regionservers in Cluster A (note there are still 3 regionservers in
>>>>> Cluster A). I ran this test 5 times and no data were lost. Does it
>>>>> mean something? My feeling is there are some glitches/corner cases that
>>>>> have not been covered in cyclic replication (or HBase replication in
>>>>> general). Note that this happens only when the load is high.
>>>>
>>>> And have you looked at the logs? Any obvious exceptions coming up?
>>>> Replication uses the normal HBase client to insert the data on the
>>>> other cluster, and this is what handles regions moving around.
>>>>
>>>>>
>>>>> By the way, why do we need to have a ZooKeeper not handled by HBase for
>>>>> replication to work (it is described in the HBase documentation)?
>>>>
>>>> It says you *should* do it, not that you *need* to do it :)
>>>>
>>>> But basically replication is zk-heavy and getting a better
>>>> understanding of it starts with handling it yourself.
>>>>
>>>> J-D
>>
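P.S. The verifyrep job Himanshu mentions above is the VerifyReplication MapReduce job that ships with HBase. In case anyone else wants to run the same check, the invocation I would expect looks roughly like this (the peer id and table name are placeholders, and it also accepts --starttime/--stoptime to bound the scan):

  # Run on the source (master) cluster; compares rows against peer '1':
  hbase org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication '1' 'test_table'

The job reports GOODROWS/BADROWS counters in its output, so a non-zero BADROWS count is what would point at missing data on the slave.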
