[
https://issues.apache.org/jira/browse/PHOENIX-3817?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16689850#comment-16689850
]
Akshita Malhotra commented on PHOENIX-3817:
-------------------------------------------
[~vincentpoon] Apologies for the late reply, and thanks for the detailed
review.
1a) Target-only rows that fall outside the source scan boundaries (as defined
by the select query) are unaccounted for. The tool will not detect those rows
because it assumes the source is the source of truth, i.e. the comparison
benchmark. I have gone back and forth on whether the tool should handle such
cases. The HBase VerifyReplication tool does not seem to handle this scenario
either, but I will keep it in mind as an extension.
1b) Same reasoning as above. This is definitely possible: for example, for a
given select scan the target might have some extra rows (an extra key range)
that might not be fully present on the source side.
2) Thanks for pointing this out. I will add tests that exercise the secondary
indexes and see whether I need to prevent the optimization.
3) This is definitely a great service protection; I will add it.
>A future enhancement might be to write the bad/missing rowkeys to a Phoenix
>table, such that they're queryable.
That sounds good! I will create a new Jira for that.
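To illustrate points 1a and 1b above, here is a minimal sketch of a
source-driven comparison. It is not the code from the patch: it uses plain
Python dicts in place of the source and target tables, and the function name
and counter names are hypothetical. Because only the rows returned by the
source query drive the loop, a row that exists only on the target is never
visited and therefore never counted.

```python
from collections import Counter


def verify_against_source(source_rows, target_rows):
    """Compare target against source, treating source as the benchmark.

    source_rows: dict of rowkey -> value returned by the query on the source.
    target_rows: dict of rowkey -> value present on the target.
    Counter names are hypothetical, chosen only for this illustration.
    """
    counters = Counter()
    # Only source rowkeys are iterated; target-only keys are never examined.
    for rowkey, value in source_rows.items():
        if rowkey not in target_rows:
            counters["only_in_source"] += 1
        elif target_rows[rowkey] != value:
            counters["content_different"] += 1
        else:
            counters["good"] += 1
    return counters


source = {"a1": 1, "a2": 2, "a3": 3}
# "b9" exists only on the target, outside the source's key range.
target = {"a1": 1, "a2": 20, "b9": 9}

result = verify_against_source(source, target)
for name in sorted(result):
    print(name, result[name])
```

Every source row lands in exactly one counter, so the counters sum to the
number of source rows, while the extra target row "b9" appears in none of
them. Detecting it would need a second pass driven by the target side.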
> VerifyReplication using SQL
> ---------------------------
>
> Key: PHOENIX-3817
> URL: https://issues.apache.org/jira/browse/PHOENIX-3817
> Project: Phoenix
> Issue Type: Improvement
> Reporter: Alex Araujo
> Assignee: Akshita Malhotra
> Priority: Minor
> Fix For: 4.15.0
>
> Attachments: PHOENIX-3817-final.patch, PHOENIX-3817-final2.patch,
> PHOENIX-3817.v1.patch, PHOENIX-3817.v2.patch, PHOENIX-3817.v3.patch,
> PHOENIX-3817.v4.patch, PHOENIX-3817.v5.patch, PHOENIX-3817.v6.patch,
> PHOENIX-3817.v7.patch
>
>
> Certain use cases may copy or replicate a subset of a table to a different
> table or cluster. For example, application topologies may map data for
> specific tenants to different peer clusters.
> It would be useful to have a Phoenix VerifyReplication tool that accepts an
> SQL query, a target table, and an optional target cluster. The tool would
> compare data returned by the query on the different tables and update various
> result counters (similar to HBase's VerifyReplication).