[ https://issues.apache.org/jira/browse/HBASE-12770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358796#comment-15358796 ]

Phil Yang commented on HBASE-12770:
-----------------------------------

Hi all,
This is an old issue; I'm going to pick up [~cuijianwei]'s work. We can 
greatly reduce replication delay by balancing the number of queues across 
region servers. I'll upload a new patch whose logic may differ a little from 
Jianwei's. The basic idea is:

When we get an RSRemoved event, we run a repeatable job (read the list of 
queues on the dead RS and try to transfer one randomly chosen queue) until 
the list is empty. Between jobs we sleep for several seconds to let other 
region servers move some queues: a shorter sleep if we failed to claim a 
queue last time, a longer one if we moved a queue (two fixed sleep times, no 
randomness). If hbase.zookeeper.useMulti is disabled, the lock stays at the 
RS level (different from Jianwei's patch, for compatibility).

Although we still may not balance the number of queues perfectly, I think 
this is much better than the current behavior.

Any thoughts? Thanks.

> Don't transfer all the queued hlogs of a dead server to the same alive server
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-12770
>                 URL: https://issues.apache.org/jira/browse/HBASE-12770
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>            Reporter: Jianwei Cui
>            Assignee: Phil Yang
>            Priority: Minor
>         Attachments: HBASE-12770-trunk.patch
>
>
> When a region server goes down (or the cluster restarts), all of its hlog 
> queues are transferred to the same live region server. In a shared cluster, 
> we might create several peers replicating data to different peer clusters, 
> and lots of hlogs can be queued up for these peers for several reasons: some 
> peers might be disabled, errors from a peer cluster might block replication, 
> or the replication sources may fail to read some hlogs because of HDFS 
> problems. If such a server then goes down or is restarted, another live 
> server takes over all of the dead server's replication jobs. This can put 
> heavy pressure on the live server's resources (network, disk reads), and it 
> is not fast enough to drain the queued hlogs. And if that live server goes 
> down in turn, all of its replication jobs, including those taken over from 
> other dead servers, are once again transferred wholesale to yet another live 
> server; this can leave a single server with a huge number of queued hlogs 
> (in our shared cluster, we have seen one server with thousands of hlogs 
> queued for replication). As an alternative, would it be reasonable for a 
> live server to transfer only one peer's hlogs from the dead server at a 
> time, as sketched below? Other live region servers would then have the 
> opportunity to transfer the hlogs of the remaining peers, which should also 
> help the queued hlogs be processed faster. Any discussion is welcome.
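
For reference, the quoted proposal boils down to a selection step like the 
following (a sketch only, assuming the dead server's queues have already been 
grouped by peer; the map input and this class are hypothetical, not HBase 
APIs):

{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Map;

/**
 * Sketch of the quoted proposal: a live server claims the hlog queues of
 * only one peer from the dead server, leaving the other peers' queues for
 * other region servers to pick up. Hypothetical, for illustration only.
 */
public final class OnePeerAtATime {

  /** Pick a single peer whose queues this server will transfer. */
  static List<String> pickOnePeersQueues(Map<String, List<String>> queuesByPeer) {
    for (Map.Entry<String, List<String>> e : queuesByPeer.entrySet()) {
      return e.getValue(); // claim the first peer's queues; leave the rest
    }
    return Collections.emptyList(); // dead server has no queues left
  }
}
{code}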



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
