[
https://issues.apache.org/jira/browse/HBASE-16499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16418580#comment-16418580
]
Hadoop QA commented on HBASE-16499:
-----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 54s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 6m 13s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 6s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 5m 7s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 19m 45s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 10m 37s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 56m 50s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-16499 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12916755/HBASE-16499.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c1fbadb041b1 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / d8b550fabc |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12200/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12200/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12200/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
> slow replication for small HBase clusters
> -----------------------------------------
>
> Key: HBASE-16499
> URL: https://issues.apache.org/jira/browse/HBASE-16499
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Vikas Vishwakarma
> Assignee: Vikas Vishwakarma
> Priority: Major
> Fix For: 3.0.0, 2.1.0, 2.0.0
>
> Attachments: HBASE-16499.patch
>
>
> For small clusters (10-20 nodes) we recently observed that replication
> progresses very slowly during bulk writes, and a lot of lag accumulates in
> AgeOfLastShipped / SizeOfLogQueue. From the logs we observed that the number
> of threads used for shipping WAL edits in parallel comes from the following
> computation in HBaseInterClusterReplicationEndpoint:
> int n = Math.min(Math.min(this.maxThreads, entries.size()/100+1),
>     replicationSinkMgr.getSinks().size());
> ...
> for (int i=0; i<n; i++) {
>   entryLists.add(new ArrayList<HLog.Entry>(entries.size()/n+1)); // <-- batch size
> }
> ...
> for (int i=0; i<entryLists.size(); i++) {
>   .....
>   // RuntimeExceptions encountered here bubble up and are handled in ReplicationSource
>   pool.submit(createReplicator(entryLists.get(i), i)); // <-- concurrency
>   futures++;
> }
> }
> maxThreads is fixed and configurable, and since we take the min of the three
> values, n is determined by replicationSinkMgr.getSinks().size() when we have
> enough edits to replicate.
> replicationSinkMgr.getSinks().size() is in turn determined by
> int numSinks = (int) Math.ceil(slaveAddresses.size() * ratio);
> where ratio is this.ratio = conf.getFloat("replication.source.ratio",
> DEFAULT_REPLICATION_SOURCE_RATIO);
> Currently DEFAULT_REPLICATION_SOURCE_RATIO is set to 10%, so for small
> clusters of 10-20 RegionServers the value we get for numSinks, and hence
> n, is very small (1 or 2). This substantially reduces the pool concurrency
> used for shipping WAL edits in parallel, effectively slowing down replication
> for small clusters and causing a lot of lag to accumulate in AgeOfLastShipped.
> Sometimes it takes tens of hours to clear the entire replication queue
> even after the client has finished writing on the source side.
> We are running tests that vary replication.source.ratio and have seen a
> multi-fold improvement in total replication time (will update the results
> here). I propose that we also increase the default value of
> replication.source.ratio so that we have sufficient concurrency even for
> small clusters. We figured this out only after a lot of iterations and
> debugging, so a slightly higher default will probably save others the trouble.
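
The thread-count arithmetic in the quoted description can be checked with a small standalone sketch. It only mirrors the two quoted expressions; the class name SinkMath, the helper names, and the sample values for maxThreads (10) and batch size (5000 entries) are assumptions chosen for illustration, not values taken from HBase.

```java
// Standalone sketch, not HBase code: reproduces the two expressions quoted in the issue.
class SinkMath {
    // Mirrors: (int) Math.ceil(slaveAddresses.size() * ratio)
    static int numSinks(int slaveCount, float ratio) {
        return (int) Math.ceil(slaveCount * ratio);
    }

    // Mirrors: Math.min(Math.min(maxThreads, entries.size()/100+1), sinks.size())
    static int shipperThreads(int maxThreads, int entryCount, int sinkCount) {
        return Math.min(Math.min(maxThreads, entryCount / 100 + 1), sinkCount);
    }

    public static void main(String[] args) {
        int maxThreads = 10;   // assumed value for illustration
        int entryCount = 5000; // assumed batch large enough that entries do not limit n
        for (float ratio : new float[] {0.1f, 0.5f}) {
            for (int servers : new int[] {10, 20}) {
                int sinks = numSinks(servers, ratio);
                int n = shipperThreads(maxThreads, entryCount, sinks);
                System.out.printf("ratio=%.1f servers=%d sinks=%d n=%d%n",
                        ratio, servers, sinks, n);
            }
        }
    }
}
```

Under these assumptions, a ratio of 0.1 gives n = 1 for 10 peer RegionServers and n = 2 for 20, matching the "1 or 2" reported above, while a ratio of 0.5 gives n = 5 and n = 10 for the same cluster sizes.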