[
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vikas Vishwakarma reassigned HBASE-16499:
-----------------------------------------
Assignee: Ashish Singhi (was: Vikas Vishwakarma)
> slow replication for small HBase clusters
> -----------------------------------------
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Vikas Vishwakarma
> Assignee: Ashish Singhi
> Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.0
>
> Attachments: HBASE-16499.patch
>
>
> For small clusters (10-20 nodes) we recently observed that replication
> progresses very slowly when we do bulk writes, with a lot of lag accumulating
> in AgeOfLastShipped / SizeOfLogQueue. From the logs we observed that the
> number of threads used for shipping WAL edits in parallel comes from the
> following code in HBaseInterClusterReplicationEndpoint:
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>         replicationSinkMgr.getSinks().size());
> ...
> for (int i=0; i<n; i++) {
>     entryLists.add(new ArrayList<HLog.Entry>(entries.size()/n+1));  <-- batch size
> }
> ...
> for (int i=0; i<entryLists.size(); i++) {
>     ...
>     // RuntimeExceptions encountered here bubble up and are handled in ReplicationSource
>     pool.submit(createReplicator(entryLists.get(i), i));  <-- concurrency
>     futures++;
> }
> }
> maxThreads is fixed and configurable, and since we take the minimum of the
> three values, n ends up being decided by replicationSinkMgr.getSinks().size()
> whenever there are enough edits to replicate.
> replicationSinkMgr.getSinks().size() is decided based on
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is
> this.ratio = conf.getFloat("replication.source.ratio", DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small
> clusters of 10-20 RegionServers the value we get for numSinks, and hence n,
> is very small (1 or 2); see the worked example below. This substantially
> reduces the pool concurrency used for shipping WAL edits in parallel,
> effectively slowing down replication for small clusters and causing a lot of
> lag accumulation in AgeOfLastShipped. Sometimes it takes tens of hours to
> clear the entire replication queue even after the client has finished writing
> on the source side.
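>
> To make the arithmetic concrete, here is a small self-contained sketch that
> simply replays the two formulas quoted above with illustrative numbers; the
> backlog size and the maxThreads value of 10 (the usual
> replication.source.maxthreads default) are assumptions for illustration only:
>
> public class ReplicationConcurrencyExample {
>   public static void main(String[] args) {
>     int maxThreads = 10;      // assumed replication.source.maxthreads
>     int backlog = 50_000;     // assumed number of queued WAL entries to ship
>     float ratio = 0.1f;       // DEFAULT_REPLICATION_SOURCE_RATIO (10%)
>     for (int peerRegionServers : new int[] {10, 20}) {
>       // numSinks = ceil(slaveAddresses.size() * ratio)
>       int numSinks = (int) Math.ceil(peerRegionServers * ratio);
>       // n = min(min(maxThreads, entries.size()/100 + 1), numSinks)
>       int n = Math.min(Math.min(maxThreads, backlog / 100 + 1), numSinks);
>       System.out.printf("peer RS=%d -> numSinks=%d, shipping threads n=%d, batch size ~%d%n",
>           peerRegionServers, numSinks, n, backlog / n + 1);
>     }
>     // Prints: 10 RS -> numSinks=1, n=1; 20 RS -> numSinks=2, n=2.
>     // A 50,000-entry backlog is shipped through only 1-2 parallel batches.
>   }
> }
>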
> We are running tests by varying replication.source.ratio and have seen a
> multi-fold improvement in total replication time (will update the results
> here). I want to propose that we also increase the default value of
> replication.source.ratio so that we have sufficient concurrency even for
> small clusters; a configuration sketch follows below. We figured this out
> only after a lot of iterations and debugging, so a slightly higher default
> will probably save others the same trouble.
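>
> Even if the default stays as-is, individual deployments can override the
> ratio on the source side, since it is read from the source cluster's
> configuration via conf.getFloat("replication.source.ratio", ...) as quoted
> above (so setting it in hbase-site.xml on the source RegionServers should
> take effect). A minimal sketch of the effect, using 0.5 purely as an
> illustrative value rather than a tested recommendation:
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
>
> public class RaiseSourceRatio {
>   public static void main(String[] args) {
>     Configuration conf = HBaseConfiguration.create();
>     // Illustrative override; in a real deployment this would live in hbase-site.xml.
>     conf.setFloat("replication.source.ratio", 0.5f);
>     int peerRegionServers = 10;   // assumed peer cluster size
>     float ratio = conf.getFloat("replication.source.ratio", 0.1f);
>     int numSinks = (int) Math.ceil(peerRegionServers * ratio);
>     System.out.println("numSinks with ratio " + ratio + " = " + numSinks);  // -> 5
>   }
> }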
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)