palashc commented on code in PR #21:
URL: https://github.com/apache/phoenix-site/pull/21#discussion_r3276626457
##########
app/pages/_docs/docs/_mdx/(multi-page)/features/phoenix-sync-table.mdx:
##########
@@ -0,0 +1,159 @@
+---
+title: "PhoenixSyncTable Tool"
+description: "Detect data divergence between a source and a target Phoenix table across two HBase clusters via a chunked hash comparison driven by MapReduce."
+---
+
+`PhoenixSyncTableTool` is a MapReduce-based divergence detector for Phoenix
+tables that are replicated (or migrated) between two HBase clusters. It
+compares chunks of source and target data without transferring full rows over
+the network and records any chunk whose hashes disagree to a Phoenix system
+table for later inspection. Available in Phoenix 5.3.1
+([PHOENIX-7751](https://issues.apache.org/jira/browse/PHOENIX-7751)).
+
+The tool is conceptually similar to HBase's `HashTable`/`SyncTable` pair but
+is Phoenix-aware (respects TTL, `CURRENT_SCN`, tenant id, indexes, and the
+column-encoding scheme) and runs as a **single** MapReduce job with no HDFS
+intermediate. Output is a Phoenix table, queryable with SQL.
+
+`PhoenixSyncTableTool` performs **detection only** in 5.3.1; it does not
+modify the target cluster.

Review Comment:
   Done, used the wording you provided.
