SourabhBadhya commented on code in PR #4091:
URL: https://github.com/apache/hive/pull/4091#discussion_r1167991523
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList
createValidTxnListForCleaner(GetOpenTxnsResponse txns
bitSet.set(0, abortedTxns.length);
//add ValidCleanerTxnList? - could be problematic for all the places that
read it from
// string as they'd have to know which object to instantiate
- return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark,
Long.MAX_VALUE);
+ return new ValidReadTxnList(abortedTxns, bitSet, highWatermark,
Long.MAX_VALUE);
+ }
+
+ public static ValidTxnList
createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long
minOpenTxn) {
+ long highWatermark = minOpenTxn - 1;
+ long[] exceptions = new long[txns.getOpen_txnsSize()];
+ int i = 0;
+ BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+ // getOpen_txns() guarantees that the list contains only aborted & open
txns.
+ // exceptions list must contain both txn types since validWriteIdList
filters out the aborted ones and valid ones for that table.
+ // If a txn is not in exception list, it is considered as a valid one and
thought of as an uncompacted write.
+ // See TxnHandler#getValidWriteIdsForTable() for more details.
+ for(long txnId : txns.getOpen_txns()) {
Review Comment:
This loop is bounded by highWatermark: it stops at the first txn id above it.
Its main purpose is to build the exception list.
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList
createValidTxnListForCleaner(GetOpenTxnsResponse txns
bitSet.set(0, abortedTxns.length);
//add ValidCleanerTxnList? - could be problematic for all the places that
read it from
// string as they'd have to know which object to instantiate
- return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark,
Long.MAX_VALUE);
+ return new ValidReadTxnList(abortedTxns, bitSet, highWatermark,
Long.MAX_VALUE);
+ }
+
+ public static ValidTxnList
createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long
minOpenTxn) {
+ long highWatermark = minOpenTxn - 1;
+ long[] exceptions = new long[txns.getOpen_txnsSize()];
+ int i = 0;
+ BitSet abortedBits = BitSet.valueOf(txns.getAbortedBits());
+ // getOpen_txns() guarantees that the list contains only aborted & open
txns.
+ // exceptions list must contain both txn types since validWriteIdList
filters out the aborted ones and valid ones for that table.
+ // If a txn is not in exception list, it is considered as a valid one and
thought of as an uncompacted write.
+ // See TxnHandler#getValidWriteIdsForTable() for more details.
+ for(long txnId : txns.getOpen_txns()) {
+ if(txnId > highWatermark) {
+ break;
+ }
+ exceptions[i] = txnId;
+ i++;
+ }
+ exceptions = Arrays.copyOf(exceptions, i);
+ //add ValidCleanerTxnList? - could be problematic for all the places that
read it from
Review Comment:
Removed it. Done.
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/txn/TxnUtils.java:
##########
@@ -82,7 +82,29 @@ public static ValidTxnList
createValidTxnListForCleaner(GetOpenTxnsResponse txns
bitSet.set(0, abortedTxns.length);
//add ValidCleanerTxnList? - could be problematic for all the places that
read it from
// string as they'd have to know which object to instantiate
- return new ValidReadTxnList(abortedTxns, bitSet, highWaterMark,
Long.MAX_VALUE);
+ return new ValidReadTxnList(abortedTxns, bitSet, highWatermark,
Long.MAX_VALUE);
+ }
+
+ public static ValidTxnList
createValidTxnListForAbortedTxnCleaner(GetOpenTxnsResponse txns, long
minOpenTxn) {
Review Comment:
I have renamed `createValidTxnListForCleaner` to
`createValidTxnListForCompactionCleaner`. It differs from
`createValidTxnListForAbortedTxnCleaner` mainly in that here we don't truncate
the abortedBits, which seems unnecessary. We are also not concerned if open
txns from other tables are present in this list (open txns on the same table
are handled, since highWatermark is set to the min open txn for that table
minus 1). We just create an exception list bounded by highWatermark and use it
to create the validWriteIdList.
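
For illustration, here is a minimal standalone sketch of how the exception list is bounded by highWatermark. It is not the Hive code: the class `ExceptionListSketch`, the helper `buildExceptions`, and the sample txn ids are hypothetical, and it assumes the incoming ids are sorted in ascending order, which the early `break` relies on.

```java
import java.util.Arrays;

public class ExceptionListSketch {

    // Hypothetical helper (not part of Hive): keeps every txn id that is
    // at or below highWatermark = minOpenTxn - 1.
    // Assumption: sortedTxnIds is in ascending order, so the loop can stop
    // at the first id above the watermark.
    static long[] buildExceptions(long[] sortedTxnIds, long minOpenTxn) {
        long highWatermark = minOpenTxn - 1;
        long[] exceptions = new long[sortedTxnIds.length];
        int i = 0;
        for (long txnId : sortedTxnIds) {
            if (txnId > highWatermark) {
                break; // all remaining ids are above the watermark
            }
            exceptions[i++] = txnId;
        }
        // Trim the array to the number of ids actually collected.
        return Arrays.copyOf(exceptions, i);
    }

    public static void main(String[] args) {
        // Made-up ids standing in for the open and aborted txns.
        long[] openAndAborted = {3, 5, 8, 9, 12};
        long[] exceptions = buildExceptions(openAndAborted, 9);
        System.out.println(Arrays.toString(exceptions)); // prints [3, 5, 8]
    }
}
```

If the input were not sorted, the `break` would have to become a `continue` to avoid skipping ids below the watermark.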
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
For additional commands, e-mail: [email protected]