ashutosh-bapat commented on a change in pull request #581: HIVE-21529 : Bootstrap ACID tables as part of incremental dump.
URL: https://github.com/apache/hive/pull/581#discussion_r275215849
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/repl/ReplDumpTask.java
##########
@@ -397,22 +469,13 @@ private String getValidWriteIdList(String dbName, String tblName, String validTx
     return openTxns;
   }
-  String getValidTxnListForReplDump(Hive hiveDb) throws HiveException {
-    // Key design point for REPL DUMP is to not have any txns older than current txn in which dump runs.
-    // This is needed to ensure that Repl dump doesn't copy any data files written by any open txns
-    // mainly for streaming ingest case where one delta file shall have data from committed/aborted/open txns.
-    // It may also have data inconsistency if the on-going txns doesn't have corresponding open/write
-    // events captured which means, catch-up incremental phase won't be able to replicate those txns.
-    // So, the logic is to wait for configured amount of time to see if all open txns < current txn is
-    // getting aborted/committed. If not, then we forcefully abort those txns just like AcidHouseKeeperService.
-    ValidTxnList validTxnList = getTxnMgr().getValidTxns();
-    long timeoutInMs = HiveConf.getTimeVar(conf,
-            HiveConf.ConfVars.REPL_BOOTSTRAP_DUMP_OPEN_TXN_TIMEOUT, TimeUnit.MILLISECONDS);
-    long waitUntilTime = System.currentTimeMillis() + timeoutInMs;
+  ValidTxnList getValidTxnListForReplDump(Hive hiveDb, ValidTxnList validTxnList,

Review comment:
See my explanation above about validTxnList being passed in. Earlier, the transaction snapshot was obtained within getValidTxnListForReplDump(), so it was fine to return a String representation. Now that we are passing the snapshot to this function, it's easier to return it as is if required. Furthermore, the caller can then use the snapshot object as is or convert it to a string as required, which gives the caller a bit of flexibility. That's why I have changed the return type to the snapshot instead of a string, which would need to be parsed back to get the snapshot object if that's required.
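To illustrate the trade-off described above, here is a minimal sketch that is not taken from the PR. `TxnSnapshot`, its `serialize()`/`parse()` methods, the `"hwm:open,open"` string format, and both `getSnapshot...` helpers are hypothetical stand-ins invented for this example; they are not Hive's `ValidTxnList` API. The sketch only contrasts a helper that returns a string (forcing a parse when the object is needed) with one that returns the snapshot object (leaving the conversion to the caller).

```java
import java.util.Arrays;

// Hypothetical sketch: none of these names come from Hive.
public class SnapshotReturnSketch {

    /** Minimal immutable stand-in for a transaction snapshot. */
    static final class TxnSnapshot {
        final long highWatermark;
        final long[] openTxns;

        TxnSnapshot(long highWatermark, long... openTxns) {
            this.highWatermark = highWatermark;
            this.openTxns = openTxns.clone();
        }

        /** Serializes to "hwm:open1,open2" (format invented for this sketch). */
        String serialize() {
            StringBuilder sb = new StringBuilder().append(highWatermark).append(':');
            for (int i = 0; i < openTxns.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(openTxns[i]);
            }
            return sb.toString();
        }

        /** Parses the format produced by serialize(). */
        static TxnSnapshot parse(String s) {
            String[] parts = s.split(":", -1);
            long hwm = Long.parseLong(parts[0]);
            long[] open = parts[1].isEmpty()
                    ? new long[0]
                    : Arrays.stream(parts[1].split(",")).mapToLong(Long::parseLong).toArray();
            return new TxnSnapshot(hwm, open);
        }

        boolean isOpen(long txnId) {
            for (long open : openTxns) {
                if (open == txnId) {
                    return true;
                }
            }
            return false;
        }
    }

    /** Old shape: returns a string, so a caller needing the object must parse it back. */
    static String getSnapshotAsString(TxnSnapshot snapshot) {
        return snapshot.serialize();
    }

    /** New shape: returns the object; a caller serializes only if it needs a string. */
    static TxnSnapshot getSnapshotAsObject(TxnSnapshot snapshot) {
        return snapshot;
    }

    public static void main(String[] args) {
        TxnSnapshot snapshot = new TxnSnapshot(10, 3, 7);

        // Caller that needs the object: the string-returning shape costs a round trip.
        TxnSnapshot roundTripped = TxnSnapshot.parse(getSnapshotAsString(snapshot));
        TxnSnapshot direct = getSnapshotAsObject(snapshot);
        System.out.println(roundTripped.isOpen(7) + " " + direct.isOpen(7));

        // Caller that needs the string: one explicit conversion at the call site.
        System.out.println(direct.serialize());
    }
}
```

With only two call sites, each caller can pick the form it needs without the helper committing to one representation.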
Since there are only two callers, it's an easier change to make now than later, in case the number of callers increases.