Harshal Patel created HIVE-28772:
------------------------------------

             Summary: Clear REPL_TXN_MAP table on DR when deleting replication policy
                 Key: HIVE-28772
                 URL: https://issues.apache.org/jira/browse/HIVE-28772
             Project: Hive
          Issue Type: Bug
          Components: repl
            Reporter: Harshal Patel
            Assignee: Harshal Patel


Currently, if you create scheduled queries for repl dump and repl load, then by 
design, during incremental replication, a transaction can span multiple 
replication runs.

When an OPEN_TXN event is replicated, the DR side adds an entry to the 
REPL_TXN_MAP table to keep track of open transactions for that database.

However, if the user deletes the policy and drops the database on the DR to 
re-bootstrap before the matching COMMIT_TXN/ABORT_TXN is replayed, the 
REPL_TXN_MAP table is not cleaned up.

This can lead to two major issues:
 # It can bring down the AcidHouseKeeper cleaner services, which clean up 
transaction-related tables in the HMS. The service reads the database name from 
REPL_TXN_MAP, tries to fetch the information for that database, and fails with 
a "database doesn't exist" error. This can bloat up the HMS.
 # If the user creates a new policy quickly, before AcidHouseKeeper kicks in, 
the newly created database will be marked as repl incompatible by 
AcidHouseKeeper when it cleans up the TXNS and REPL_TXN_MAP tables after 11 days.
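
The sequence above can be illustrated with a toy in-memory model. This is only 
a sketch to make the failure mode concrete: ToyMetastore, its methods, and the 
clear_repl_txn_map flag are hypothetical names invented for this example, not 
Hive or HMS APIs, and the real REPL_TXN_MAP schema and AcidHouseKeeper logic 
are more involved.

```python
class NoSuchDatabase(Exception):
    """Stands in for the "database doesn't exist" error."""


class ToyMetastore:
    """Hypothetical in-memory stand-in for the DR-side metastore."""

    def __init__(self):
        self.databases = set()
        # (database name, source txn id) -> target txn id.
        # Loosely mirrors the role of REPL_TXN_MAP; not the real schema.
        self.repl_txn_map = {}
        self._next_txn_id = 1

    def create_database(self, db):
        self.databases.add(db)

    def replay_open_txn(self, db, src_txn_id):
        # Replaying OPEN_TXN records a mapping for the open transaction.
        target_txn_id = self._next_txn_id
        self._next_txn_id += 1
        self.repl_txn_map[(db, src_txn_id)] = target_txn_id
        return target_txn_id

    def replay_commit_txn(self, db, src_txn_id):
        # Replaying COMMIT_TXN/ABORT_TXN removes the mapping again.
        self.repl_txn_map.pop((db, src_txn_id), None)

    def drop_database(self, db, clear_repl_txn_map=False):
        self.databases.discard(db)
        if clear_repl_txn_map:
            # The cleanup this issue asks for: remove rows of the dropped db.
            self.repl_txn_map = {
                key: txn for key, txn in self.repl_txn_map.items()
                if key[0] != db
            }

    def housekeeper_pass(self):
        # Simplified housekeeper: look up the database of every mapped row.
        for db, _src_txn_id in self.repl_txn_map:
            if db not in self.databases:
                raise NoSuchDatabase(db)
        return len(self.repl_txn_map)


def run_scenario(clear_repl_txn_map):
    """OPEN_TXN is replayed, then the db is dropped before COMMIT_TXN."""
    hms = ToyMetastore()
    hms.create_database("sales")
    hms.replay_open_txn("sales", 101)
    hms.drop_database("sales", clear_repl_txn_map=clear_repl_txn_map)
    try:
        hms.housekeeper_pass()
        return "ok"
    except NoSuchDatabase:
        return "housekeeper failed"
```

In this model, run_scenario(False) leaves an orphaned row behind and the 
housekeeper pass fails on it, while run_scenario(True) removes the row as part 
of the drop and the pass completes. If the database is re-created before the 
housekeeper runs, the orphaned row is still present and now points at the new 
database, which corresponds to the second issue.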



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
