[
https://issues.apache.org/jira/browse/HIVE-12439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196731#comment-15196731
]
Eugene Koifman commented on HIVE-12439:
---------------------------------------
1. CompactionTxnHandler.cleanEmptyAborted() - why rewrite this query?
"String s = "select txn_id from TXNS where " +
"txn_id not in (select tc_txnid from TXN_COMPONENTS) and " +
"txn_state = '" + TXN_ABORTED + "'";"
The IN clause here takes a subquery rather than a list of values, so it is not
(and in fact cannot be) subject to the 1000 or any other limit.
Also, your rewrite dropped this statement:
"LOG.info("Removed " + rc + " empty Aborted transactions: " + txnIdBatch + " from TXNS");"
This is a critical debug/support log statement - it logs the actual txn IDs
that were cleared.
2. TxnHandler.openTxns()
" if (i > first) {
valuesClause.append(", ");
}
"
This will generate a query with "values,(..." if the previous "if" (the one
checking METASTORE_DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE) executes.
Also, a nit: this class has quoteString() and quoteChar() for generating SQL
with string values.
3. TxnHandler.timeOutLocks() - why does this need a suffix at all? The extra
parentheses seem redundant.
4. TxnHandler.abortTxns() - there seems to be a redundant set of parentheses
wrapping the IN clause. Why is this necessary?
5. TestTxnUtils - I think this test is very limited. It would be better to also
add some tests that actually execute the new queries in a DB (Derby in
practice), in particular for the case where the limits set by the 2 new
properties are exceeded. I think that would provide better test coverage.
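
To illustrate the separator problem in item 2, here is a minimal sketch of one way to split rows across several INSERT statements without ever emitting a leading comma. It is not the code from the patch: the class name ValuesClauseBatcher, the method buildInserts() and the parameter maxRowsPerStatement are hypothetical, and the parameter only stands in for whatever limit METASTORE_DIRECT_SQL_MAX_ELEMENTS_VALUES_CLAUSE supplies. The idea is to decide on the ", " separator from the number of rows already in the current batch, not from the global loop index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only; not part of TxnHandler.
class ValuesClauseBatcher {
    /**
     * Builds one or more INSERT statements from pre-formatted rows.
     *
     * @param prefix statement text up to and including "values ",
     *               e.g. "insert into T (a, b) values "
     * @param rows   rows already rendered as "(v1, v2)"
     * @param maxRowsPerStatement assumed per-statement row limit (must be >= 1)
     */
    static List<String> buildInserts(String prefix, List<String> rows,
                                     int maxRowsPerStatement) {
        if (maxRowsPerStatement < 1) {
            throw new IllegalArgumentException("maxRowsPerStatement must be >= 1");
        }
        List<String> statements = new ArrayList<>();
        StringBuilder valuesClause = new StringBuilder();
        int rowsInBatch = 0;
        for (String row : rows) {
            if (rowsInBatch == maxRowsPerStatement) {
                // Current batch is full: emit it and start a fresh one.
                statements.add(prefix + valuesClause);
                valuesClause.setLength(0);
                rowsInBatch = 0;
            }
            // Separator depends on the batch position, so the first row
            // of every batch is never preceded by a comma.
            if (rowsInBatch > 0) {
                valuesClause.append(", ");
            }
            valuesClause.append(row);
            rowsInBatch++;
        }
        if (rowsInBatch > 0) {
            // Emit the final, possibly partial, batch.
            statements.add(prefix + valuesClause);
        }
        return statements;
    }
}
```

With three rows and a limit of 2, this yields two statements, the first with two rows and the second with one, and neither starts its values list with ", ". A test of this shape, followed by actually running the generated statements against Derby, is the kind of coverage item 5 asks for.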
> CompactionTxnHandler.markCleaned() and TxnHandler.openTxns() misc improvements
> ------------------------------------------------------------------------------
>
> Key: HIVE-12439
> URL: https://issues.apache.org/jira/browse/HIVE-12439
> Project: Hive
> Issue Type: Improvement
> Components: Metastore, Transactions
> Affects Versions: 1.0.0
> Reporter: Eugene Koifman
> Assignee: Wei Zheng
> Attachments: HIVE-12439.1.patch
>
>
> # add a safeguard to make sure IN clause is not too large; break up by txn id
> to delete from TXN_COMPONENTS where tc_txnid in ...
> # TxnHandler.openTxns() - use 1 insert with many rows in values() clause,
> rather than 1 DB roundtrip per row