[ https://issues.apache.org/jira/browse/HBASE-17871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15958211#comment-15958211 ]
Ted Yu commented on HBASE-17871:
--------------------------------
{code}
HBASE-17871 patch is being downloaded at Wed Apr 5 22:38:21 UTC 2017 from
https://issues.apache.org/jira/secure/attachment/12862187/after.png ->
Downloaded
ERROR: Unsure how to process HBASE-17871.
{code}
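In the future, please attach the patch after attaching the pictures, so that the patch is the most recent attachment. Here the QA run downloaded after.png instead of the patch.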
> scan#setBatch(int) call leads wrong result of VerifyReplication
> ---------------------------------------------------------------
>
> Key: HBASE-17871
> URL: https://issues.apache.org/jira/browse/HBASE-17871
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.4.0
> Reporter: Tomu Tsuruhara
> Assignee: Tomu Tsuruhara
> Priority: Minor
> Attachments: after.png, beforethepatch.png,
> HBASE-17871.master.001.patch, HBASE-17871.master.002.patch,
> HBASE-17871.master.003.patch
>
>
> The VerifyReplication tool printed weird logs.
> {noformat}
> 2017-04-03 23:30:50,252 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> CONTENT_DIFFERENT_ROWS, rowkey=a00001001930000
> 2017-04-03 23:30:50,280 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001001930000
> 2017-04-03 23:30:50,387 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> CONTENT_DIFFERENT_ROWS, rowkey=a00001003850000
> 2017-04-03 23:30:50,414 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001003850000
> 2017-04-03 23:30:50,480 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> CONTENT_DIFFERENT_ROWS, rowkey=a00001005320000
> 2017-04-03 23:30:50,508 ERROR [main]
> org.apache.hadoop.hbase.mapreduce.replication.VerifyReplication:
> ONLY_IN_PEER_TABLE_ROWS, rowkey=a00001005320000
> {noformat}
> Here, each bad row was marked as both {{CONTENT_DIFFERENT_ROWS}} and
> {{ONLY_IN_PEER_TABLE_ROWS}}.
> This should never happen, so I took a look at the code and found a
> scan.setBatch call.
> {code}
>   @Override
>   public void map(ImmutableBytesWritable row, final Result value,
>       Context context)
>       throws IOException {
>     if (replicatedScanner == null) {
>       ...
>       final Scan scan = new Scan();
>       scan.setBatch(batch);
> {code}
> As stated in HBASE-16376, a {{scan#setBatch(int)}} call implicitly allows scan
> results to be partial.
> Since {{VerifyReplication}} assumes that each {{scanner.next()}} call returns
> an entire row, partial results break the comparison logic.
> We should avoid the setBatch call here.
> Thanks to RPC chunking (explained in this blog post:
> https://blogs.apache.org/hbase/entry/scan_improvements_in_hbase_1),
> I think removing it is safe and acceptable.
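
To illustrate why one row can end up under both counters, here is a minimal, self-contained sketch. It is not the actual VerifyReplication code. The class BatchMismatchSketch, the Row type, and the batched and verify helpers are hypothetical names, and the loop is only a simplified model of a merge-style, row-by-row comparison. It assumes the source side returns whole rows while the peer side returns chunks of at most `batch` cells.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified model of a row-by-row comparison; NOT the real HBase code. */
class BatchMismatchSketch {

    /** Hypothetical stand-in for an HBase Result: a row key plus its cell values. */
    static final class Row {
        final String key;
        final List<String> cells;
        Row(String key, List<String> cells) { this.key = key; this.cells = cells; }
    }

    /** Splits every row into chunks of at most `batch` cells, as a batched scan may do. */
    static List<Row> batched(List<Row> rows, int batch) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            for (int i = 0; i < r.cells.size(); i += batch) {
                int end = Math.min(i + batch, r.cells.size());
                out.add(new Row(r.key, new ArrayList<>(r.cells.subList(i, end))));
            }
        }
        return out;
    }

    /** Merge-style comparison that assumes each peer entry is one complete row. */
    static List<String> verify(List<Row> source, List<Row> peer) {
        List<String> log = new ArrayList<>();
        int p = 0; // index of the current entry on the peer side
        for (Row s : source) {
            while (true) {
                if (p >= peer.size()) {
                    log.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key);
                    break;
                }
                Row cur = peer.get(p);
                int cmp = s.key.compareTo(cur.key);
                if (cmp == 0) {
                    // Same key: compare contents, then advance the peer side.
                    if (!s.cells.equals(cur.cells)) {
                        log.add("CONTENT_DIFFERENT_ROWS " + s.key);
                    }
                    p++;
                    break;
                } else if (cmp < 0) {
                    log.add("ONLY_IN_SOURCE_TABLE_ROWS " + s.key);
                    break;
                } else {
                    // Peer entry sorts before the source row: reported as peer-only.
                    log.add("ONLY_IN_PEER_TABLE_ROWS " + cur.key);
                    p++;
                }
            }
        }
        for (; p < peer.size(); p++) {
            log.add("ONLY_IN_PEER_TABLE_ROWS " + peer.get(p).key);
        }
        return log;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
            new Row("row1", Arrays.asList("c1", "c2", "c3")),
            new Row("row2", Arrays.asList("c1")));
        System.out.println(verify(rows, rows));             // whole rows on both sides
        System.out.println(verify(rows, batched(rows, 2))); // peer side batched
    }
}
```

In this model, identical data compared with whole rows on both sides reports nothing. With the peer side batched at 2 cells, the first chunk of row1 fails the content comparison, and the leftover chunk is then seen while the source has already moved on to row2, so row1 is also reported as peer-only. That is the same pairing as in the logs above.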