KarthikP created HBASE-16015:
--------------------------------

             Summary: Usability - VerifyReplication performance is too slow
                 Key: HBASE-16015
                 URL: https://issues.apache.org/jira/browse/HBASE-16015
             Project: HBase
          Issue Type: Improvement
          Components: Usability
            Reporter: KarthikP
            Priority: Critical

VerifyReplication is very slow on a geo-replicated cluster. Digging into the code, I found that the input scanner caching for requests to the target cluster defaults to 1 when hbase.mapreduce.scan.cachedrows is not set. The default should either be changed to a more sensible value, or the option should be documented in the usage message so that users know to set it, for example:

-Dhbase.mapreduce.scan.cachedrows=100

{code:title=TableInputFormat.java|borderStyle=solid}
public static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";
{code}

{code:title=VerifyReplication.java|borderStyle=solid}
Configuration conf = context.getConfiguration();
final Scan scan = new Scan();
scan.setCaching(conf.getInt(TableInputFormat.SCAN_CACHEDROWS, 1));
{code}

If there is agreement, I will add the following lines to the printUsage method:

{code:title=VerifyReplication.java|borderStyle=solid}
System.err.println("For performance consider the following option, Input scanner caching for source to target cluster request\n"
    + "-Dhbase.mapreduce.scan.cachedrows=100");
{code}
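
To illustrate why a caching value of 1 hurts so much across regions, here is a rough back-of-the-envelope sketch. It is not HBase code: it assumes a simplified model in which each scanner RPC returns at most `caching` rows, so a scan needs roughly ceil(rows / caching) round trips. The class name, the row count, and the 50 ms round-trip time are all hypothetical, and the model ignores other limits such as maximum result size.

```java
// Simplified model (an assumption, not HBase internals): every scanner RPC
// returns at most `caching` rows, so the scan needs ceil(rows / caching) calls.
final class ScanCachingSketch {

    /** Approximate number of scanner RPCs needed to read `rows` rows. */
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        // Integer ceiling division.
        return (rows + caching - 1) / caching;
    }

    /** Approximate time spent on network latency alone, in milliseconds. */
    static long latencyMillis(long rows, int caching, long rttMillis) {
        return roundTrips(rows, caching) * rttMillis;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical number of rows to verify
        long rttMillis = 50L;   // hypothetical cross-region round-trip time

        for (int caching : new int[] {1, 100, 1000}) {
            System.out.printf("caching=%d -> %d RPCs, ~%d s of network latency%n",
                    caching,
                    roundTrips(rows, caching),
                    latencyMillis(rows, caching, rttMillis) / 1000);
        }
    }
}
```

Under these made-up numbers, caching=1 costs 1,000,000 round trips while caching=100 costs 10,000, which is why exposing -Dhbase.mapreduce.scan.cachedrows in the usage text seems worthwhile. Larger values trade memory on the client and region server for fewer round trips, so the right setting depends on row size.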