[
https://issues.apache.org/jira/browse/HIVE-18988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16458066#comment-16458066
]
Sankar Hariappan commented on HIVE-18988:
-----------------------------------------
Test failures are irrelevant to the patch.
> Support bootstrap replication of ACID tables
> --------------------------------------------
>
> Key: HIVE-18988
> URL: https://issues.apache.org/jira/browse/HIVE-18988
> Project: Hive
> Issue Type: Sub-task
> Components: HiveServer2, repl
> Affects Versions: 3.0.0
> Reporter: Sankar Hariappan
> Assignee: Sankar Hariappan
> Priority: Major
> Labels: ACID, DR, pull-request-available, replication
> Fix For: 3.1.0
>
> Attachments: HIVE-18988.01.patch, HIVE-18988.02.patch,
> HIVE-18988.03.patch, HIVE-18988.04.patch, HIVE-18988.05.patch,
> HIVE-18988.06.patch
>
>
> Bootstrapping ACID tables needs special handling to replicate a stable
> state of the data.
> - If the ACID feature is enabled, perform the bootstrap dump for ACID tables
> within a read txn.
> -> Dump table/partition metadata.
> -> Get the list of valid data files for a table using the same logic as a
> read txn does.
> -> Dump the latest ValidWriteIdList as per the current read txn.
> - Set a valid last replication state such that it doesn't miss any txn
> opened after the bootstrap dump is triggered.
> - If any ongoing txns were opened before the bootstrap dump was triggered,
> it is not guaranteed that the open_txn event is captured for them. Also, if
> these txns were opened for streaming ingest, the dumped ACID table data may
> include data of open txns, which impacts snapshot isolation at the target.
> To avoid that, the bootstrap dump should wait for these txns to finish, up
> to a timeout (new configuration: hive.repl.bootstrap.dump.open.txn.timeout).
> After the timeout, force abort those txns and continue.
> - If any force-aborted txn belongs to a streaming ingest, the dumped ACID
> table data may contain aborted data too. So it is necessary to replicate
> the aborted write ids to the target to mark that data invalid for any
> readers.
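
The wait-then-abort step in the description can be illustrated with a small, self-contained Java sketch. It is not taken from the patch: TxnView, InMemoryTxns and waitForOpenTxns are hypothetical names invented here, and the timeout is passed in as plain milliseconds rather than read from hive.repl.bootstrap.dump.open.txn.timeout. The sketch only shows the shape of the logic: snapshot the txns open when the dump is triggered, poll until they finish or the timeout expires, then force abort the rest.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentSkipListSet;

public class BootstrapDumpSketch {

    /** Hypothetical stand-in for the metastore's txn view; not a Hive API. */
    interface TxnView {
        Set<Long> openTxns();    // ids of the txns that are open right now
        void abort(long txnId);  // force abort one txn
    }

    /**
     * Waits up to timeoutMs for the txns that were open when the dump was
     * triggered, then force aborts whichever of them are still open.
     * Txns opened later are never added to the pending set.
     * Returns the ids of the txns that had to be aborted.
     */
    static List<Long> waitForOpenTxns(TxnView txns, long timeoutMs, long pollMs) {
        Set<Long> pending = new TreeSet<>(txns.openTxns());
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!pending.isEmpty() && System.nanoTime() < deadline) {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break; // stop waiting early; remaining txns are aborted below
            }
            pending.retainAll(txns.openTxns()); // drop txns that have finished
        }
        List<Long> aborted = new ArrayList<>();
        for (long id : pending) {
            txns.abort(id);
            aborted.add(id);
        }
        return aborted;
    }

    /** In-memory TxnView used only to exercise the sketch. */
    static class InMemoryTxns implements TxnView {
        final Set<Long> open = new ConcurrentSkipListSet<>();
        final Set<Long> aborted = new ConcurrentSkipListSet<>();

        public Set<Long> openTxns() {
            return new TreeSet<>(open);
        }

        public void abort(long txnId) {
            if (open.remove(txnId)) {
                aborted.add(txnId);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        InMemoryTxns txns = new InMemoryTxns();
        txns.open.add(7L);
        txns.open.add(9L);
        // Txn 7 finishes on its own shortly after the dump starts.
        Thread committer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException ignored) {
                // nothing to clean up in this demo
            }
            txns.open.remove(7L);
        });
        committer.start();
        List<Long> aborted = waitForOpenTxns(txns, 300, 20);
        committer.join();
        System.out.println("force-aborted: " + aborted);
    }
}
```

The list returned by waitForOpenTxns corresponds to the txns whose write ids would then need to be replicated as aborted, per the last point of the description.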