KarthikP created HBASE-16015:
--------------------------------
Summary: Usability - VerifyReplication performance is too slow
Key: HBASE-16015
URL: https://issues.apache.org/jira/browse/HBASE-16015
Project: HBase
Issue Type: Improvement
Components: Usability
Reporter: KarthikP
Priority: Critical
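VerifyReplication is very slow on a geo-replicated cluster. Digging into the
code, I found that the input scanner caching for requests to the target
cluster defaults to 1, so the scanner fetches only one row per request.
This default should be raised to a more sensible value, or the option should
at least be documented in the usage output so users can set it themselves: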
-Dhbase.mapreduce.scan.cachedrows=100
{code:title=TableInputFormat.java|borderStyle=solid}
public static final String SCAN_CACHEDROWS = "hbase.mapreduce.scan.cachedrows";
{code}
{code:title=VerifyReplication.java|borderStyle=solid}
Configuration conf = context.getConfiguration();
final Scan scan = new Scan();
scan.setCaching(conf.getInt(TableInputFormat.SCAN_CACHEDROWS, 1));
{code}
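To illustrate why a caching value of 1 hurts on a geo-replicated setup, here is a rough, self-contained sketch. It is not HBase code. It assumes that reading N rows takes roughly ceil(N / caching) scanner round trips and ignores result-size limits and the open/close calls. The class name, row count and round-trip time are hypothetical.

```java
public class ScanRoundTrips {

    // Approximate number of scanner round trips needed to read `rows` rows
    // when each round trip returns at most `caching` rows (ceiling division).
    static long roundTrips(long rows, int caching) {
        if (rows < 0 || caching < 1) {
            throw new IllegalArgumentException("rows must be >= 0 and caching >= 1");
        }
        return (rows + caching - 1) / caching;
    }

    public static void main(String[] args) {
        long rows = 1_000_000L; // hypothetical number of rows to verify
        long rttMillis = 100L;  // hypothetical cross-datacenter round-trip time

        // Compare the current default (1) with the value suggested in this issue (100).
        for (int caching : new int[] {1, 100}) {
            long trips = roundTrips(rows, caching);
            long latencySeconds = trips * rttMillis / 1000;
            System.out.println("caching=" + caching + ": " + trips
                + " round trips, about " + latencySeconds + " s of network latency");
        }
    }
}
```

Under these assumptions the number of round trips to the target cluster drops by the caching factor, which is why the default matters most when the peer cluster is far away.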
If there is agreement, I will add the following lines to the printUsage method:
{code:title=VerifyReplication.java|borderStyle=solid}
System.err.println("For performance, consider the following option, which sets the input "
    + "scanner caching for source to target cluster requests:\n"
    + " -Dhbase.mapreduce.scan.cachedrows=100");
{code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)