Harshal Patel created HIVE-28772:
------------------------------------
Summary: Clear REPL_TXN_MAP table on DR when deleting replication
policy
Key: HIVE-28772
URL: https://issues.apache.org/jira/browse/HIVE-28772
Project: Hive
Issue Type: Bug
Components: repl
Reporter: Harshal Patel
Assignee: Harshal Patel
Currently, when scheduled queries are used for repl dump and repl load, a
transaction can, by design, span multiple incremental replication runs.
When an OPEN_TXN event is replicated, the DR side adds an entry to the
REPL_TXN_MAP table to keep track of the open transaction for that database.
However, if the user deletes the policy and drops the database on the DR to
re-bootstrap before COMMIT_TXN/ABORT_TXN is replayed, the REPL_TXN_MAP table
is not cleaned up.
This can lead to two major issues:
# It can bring down the AcidHouseKeeper cleaner services, which clean up
transaction-related tables in the HMS. The service reads the database name from
REPL_TXN_MAP, tries to get the information for that database, and fails with a
"database doesn't exist" error. This can bloat the HMS.
# If the user creates a policy quickly, before AcidHouseKeeper kicks in, the
newly created database will be marked as repl incompatible by AcidHouseKeeper
when it cleans up the TXNS and REPL_TXN_MAP tables after 11 days.
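
The cleanup this issue asks for can be illustrated with a minimal sketch. It is not the Hive implementation: it models REPL_TXN_MAP as an in-memory SQLite table, and the column names (RTM_REPL_POLICY, RTM_SRC_TXN_ID, RTM_TARGET_TXN_ID) and all helper functions are assumptions made for illustration only. The idea is simply that deleting a policy should also delete every mapping row recorded under it, so that no row is left pointing at a dropped database.

```python
import sqlite3


def create_repl_txn_map(conn):
    # Simplified stand-in for the metastore table; column names are assumed.
    conn.execute(
        "CREATE TABLE REPL_TXN_MAP ("
        " RTM_REPL_POLICY TEXT NOT NULL,"
        " RTM_SRC_TXN_ID INTEGER NOT NULL,"
        " RTM_TARGET_TXN_ID INTEGER NOT NULL,"
        " PRIMARY KEY (RTM_REPL_POLICY, RTM_SRC_TXN_ID))"
    )


def record_open_txn(conn, policy, src_txn_id, target_txn_id):
    # Hypothetical helper: what replaying a replicated OPEN_TXN might record.
    conn.execute(
        "INSERT INTO REPL_TXN_MAP VALUES (?, ?, ?)",
        (policy, src_txn_id, target_txn_id),
    )


def clear_repl_txn_map(conn, policy):
    # Hypothetical cleanup to run when the policy is deleted:
    # remove every mapping row for that policy and report how many were removed.
    cur = conn.execute(
        "DELETE FROM REPL_TXN_MAP WHERE RTM_REPL_POLICY = ?", (policy,)
    )
    conn.commit()
    return cur.rowcount


def remaining_policies(conn):
    # List the policies that still have open-transaction mappings.
    rows = conn.execute(
        "SELECT DISTINCT RTM_REPL_POLICY FROM REPL_TXN_MAP"
        " ORDER BY RTM_REPL_POLICY"
    )
    return [row[0] for row in rows]
```

In this sketch, calling clear_repl_txn_map(conn, policy) before the database is dropped leaves mappings for other policies untouched, and calling it a second time removes nothing, so the cleanup is safe to repeat.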
--
This message was sent by Atlassian Jira
(v8.20.10#820010)