[ 
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15447847#comment-15447847
 ] 

Vikas Vishwakarma commented on HBASE-16499:
-------------------------------------------

Hi [~churromorales], similar logic is already present in this expression in 
HBaseInterClusterReplicationEndpoint:

int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
replicationSinkMgr.getSinks().size());

Using max instead of min can be a problem if the cluster size or the number of 
batches available to ship is less than the max constant value. We saw a good 
improvement by increasing replication.source.ratio for our DR setup and cluster 
size. I am just trying to verify that a slightly higher ratio is also fine in 
the general case by testing it against clusters of different sizes. That 
testing is currently backlogged because of setup availability. The maximum is 
limited by this.maxThreads, so changing the ratio won't affect the behavior for 
large clusters much in this case either, and it gives the flexibility to 
fine-tune for different setups. 
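
To make the min-versus-max point concrete, here is a minimal stand-alone 
sketch (not the HBase source). The class and method names are hypothetical, 
the input values are illustrative, and since the thread does not say which 
min would be swapped, the max variant assumes both are replaced:

```java
// Hypothetical stand-alone sketch; names and values are illustrative only.
final class ShipperThreadSketch {

    // Mirrors the quoted expression:
    // min(min(maxThreads, entries/100 + 1), sinks)
    static int withMin(int maxThreads, int entryCount, int sinkCount) {
        return Math.min(Math.min(maxThreads, entryCount / 100 + 1), sinkCount);
    }

    // Assumed reading of the "max" suggestion: both min calls replaced.
    static int withMax(int maxThreads, int entryCount, int sinkCount) {
        return Math.max(Math.max(maxThreads, entryCount / 100 + 1), sinkCount);
    }

    public static void main(String[] args) {
        // Small cluster: 2 sinks, 250 entries, maxThreads = 10.
        System.out.println("min, small cluster: " + withMin(10, 250, 2));
        System.out.println("max, small cluster: " + withMax(10, 250, 2));
        // Large cluster: 50 sinks, 5000 entries, maxThreads = 10.
        System.out.println("min, large cluster: " + withMin(10, 5000, 50));
    }
}
```

With these inputs the min form yields 2 threads on the small cluster and 10 
on the large one (capped by maxThreads), while the max form would ask for 10 
threads even though only 2 sinks and about 3 batches' worth of entries exist.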



> slow replication for small HBase clusters
> -----------------------------------------
>
>                 Key: HBASE-16499
>                 URL: https://issues.apache.org/jira/browse/HBASE-16499
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Vikas Vishwakarma
>            Assignee: Vikas Vishwakarma
>             Fix For: 0.98.20
>
>
> For small clusters (10-20 nodes) we recently observed that replication 
> progresses very slowly when we do bulk writes, and a lot of lag accumulates 
> in AgeOfLastShipped / SizeOfLogQueue. From the logs we observed that the 
> number of threads used for shipping WAL edits in parallel comes from the 
> following expression in HBaseInterClusterReplicationEndpoint:
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>       replicationSinkMgr.getSinks().size());
> ... 
>       for (int i=0; i<n; i++) {
>         entryLists.add(new ArrayList<HLog.Entry>(entries.size()/n+1)); // <-- batch size
>       }
> ...
>         for (int i=0; i<entryLists.size(); i++) {
>          .....
>             // RuntimeExceptions encountered here bubble up and are handled
>             // in ReplicationSource
>             pool.submit(createReplicator(entryLists.get(i), i)); // <-- concurrency
>             futures++;
>           }
>         }
> maxThreads is fixed and configurable, and since we take the min of the three 
> values, n is decided by replicationSinkMgr.getSinks().size() when we have 
> enough edits to replicate.
> replicationSinkMgr.getSinks().size() is in turn decided by 
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio", 
> DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small 
> clusters of 10-20 RegionServers the value we get for numSinks, and hence n, 
> is very small, like 1 or 2. This substantially reduces the pool concurrency 
> used for shipping WAL edits in parallel, effectively slowing down replication 
> for small clusters and causing a lot of lag to accumulate in 
> AgeOfLastShipped. Sometimes it takes tens of hours to clear the entire 
> replication queue even after the client has finished writing on the source 
> side. 
> We are running tests by varying replication.source.ratio and have seen a 
> multi-fold improvement in total replication time (will update the results 
> here). I wanted to propose here that we also increase the default value for 
> replication.source.ratio so that we have sufficient concurrency even for 
> small clusters. We figured this out after a lot of iterations and debugging, 
> so a slightly higher default will probably save others the trouble. 
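
The numSinks arithmetic quoted above can be checked with a small stand-alone 
sketch. The class and method names are hypothetical, and the 0.5 ratio is 
only an illustrative value; the issue does not state which higher default is 
being proposed:

```java
// Hypothetical stand-alone sketch of the quoted numSinks computation.
final class SinkCountSketch {

    // Same arithmetic as the quoted line:
    // (int) Math.ceil(slaveAddresses.size() * ratio)
    static int numSinks(int slaveCount, float ratio) {
        return (int) Math.ceil(slaveCount * ratio);
    }

    public static void main(String[] args) {
        int[] clusterSizes = {10, 20};
        float[] ratios = {0.1f, 0.5f}; // 0.1 is the current default; 0.5 is illustrative
        for (int size : clusterSizes) {
            for (float ratio : ratios) {
                System.out.println(size + " RegionServers, ratio " + ratio
                        + " -> numSinks = " + numSinks(size, ratio));
            }
        }
    }
}
```

At the 10% default, clusters of 10 and 20 RegionServers get 1 and 2 sinks, 
which matches the "1 or 2" noted in the description; at the illustrative 0.5 
ratio the same clusters get 5 and 10.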



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
