[ https://issues.apache.org/jira/browse/TEPHRA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15645751#comment-15645751 ]
ASF GitHub Bot commented on TEPHRA-35:
--------------------------------------

Github user poornachandra commented on a diff in the pull request:

    https://github.com/apache/incubator-tephra/pull/19#discussion_r86889731

    --- Diff: tephra-core/src/main/java/org/apache/tephra/util/TxUtils.java ---
    @@ -149,4 +149,15 @@ private static long getMaxTTL(Map<byte[], Long> ttlByFamily) {
       public static boolean isPreExistingVersion(long version) {
         return version < MAX_NON_TX_TIMESTAMP;
       }
    +
    +  /**
    +   * Returns the maximum transaction that can be removed from the invalid list for the state represented by the given
    +   * transaction.
    +   */
    +  public static long getPruneUpperBound(Transaction tx) {
    +    long maxInvalidTx =
    +      tx.getInvalids().length > 0 ? tx.getInvalids()[tx.getInvalids().length - 1] : Transaction.NO_TX_IN_PROGRESS;
    +    long firstInProgress = tx.getFirstInProgress();
    +    return Math.min(maxInvalidTx, firstInProgress - 1);
    --- End diff --

    Good catch, updated the code to use the current read pointer in such a case


> Prune invalid transaction set once all data for a given invalid transaction
> has been dropped
> --------------------------------------------------------------------------------------------
>
>                 Key: TEPHRA-35
>                 URL: https://issues.apache.org/jira/browse/TEPHRA-35
>             Project: Tephra
>          Issue Type: New Feature
>            Reporter: Gary Helmling
>            Assignee: Poorna Chandra
>            Priority: Blocker
>         Attachments: ApacheTephraAutomaticInvalidListPruning-v2.pdf
>
>
> In addition to dropping the data from invalid transactions we need to be able
> to prune the invalid set of any transactions where data cleanup has been
> completely performed. Without this, the invalid set will grow indefinitely
> and become a greater and greater cost to in-progress transactions over time.
> To do this correctly, the TransactionDataJanitor coprocessor will need to
> maintain some bookkeeping for the transaction data that it removes, so that
> the transaction manager can reason about when all of a given transaction's
> data has been removed. Only at this point can the transaction manager safely
> drop the transaction ID from the invalid set.
> One approach would be for the TransactionDataJanitor to update a table
> marking when a major compaction was performed on a region and what
> transaction IDs were filtered out. Once all regions in a table containing the
> transaction data have been compacted, we can remove the filtered out
> transaction IDs from the invalid set. However, this will need to cope with
> changing region names due to splits, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
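
The bookkeeping approach outlined in the issue description can be sketched roughly as follows. This is a minimal, hypothetical illustration and not Tephra's implementation: the class PruneBookkeeping, its method names, and the use of plain region-name strings are all assumptions made for the example. It also assumes that each major compaction reports a single upper bound, meaning that every invalid transaction ID at or below that bound has been filtered out of the region.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.NavigableSet;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; none of these names come from Tephra.
class PruneBookkeeping {
    // Region name -> highest bound reported by a major compaction of that region.
    private final Map<String, Long> boundByRegion = new HashMap<>();

    // Called (in this sketch) after a major compaction has filtered out all
    // invalid transaction data with IDs <= filteredUpTo from the region.
    void recordCompaction(String region, long filteredUpTo) {
        boundByRegion.merge(region, filteredUpTo, Long::max);
    }

    // Returns the invalid IDs that every live region has already filtered out.
    // If any live region has no record yet, nothing is considered prunable.
    NavigableSet<Long> prunable(Set<String> liveRegions, NavigableSet<Long> invalid) {
        if (liveRegions.isEmpty()) {
            return new TreeSet<>();
        }
        long bound = Long.MAX_VALUE;
        for (String region : liveRegions) {
            Long regionBound = boundByRegion.get(region);
            if (regionBound == null) {
                return new TreeSet<>(); // region not compacted yet (e.g. a new split daughter)
            }
            bound = Math.min(bound, regionBound);
        }
        // Only IDs at or below the smallest per-region bound are safe to drop.
        return new TreeSet<>(invalid.headSet(bound, true));
    }

    public static void main(String[] args) {
        PruneBookkeeping bookkeeping = new PruneBookkeeping();
        bookkeeping.recordCompaction("r1", 100L);
        bookkeeping.recordCompaction("r2", 80L);

        NavigableSet<Long> invalid = new TreeSet<>(Arrays.asList(50L, 90L, 120L));
        Set<String> liveRegions = new HashSet<>(Arrays.asList("r1", "r2"));

        System.out.println(bookkeeping.prunable(liveRegions, invalid)); // prints [50]
    }
}
```

In this sketch a region that appears under a new name after a split has no entry, so nothing is pruned until that region has itself been compacted. That is a conservative way to cope with changing region names; a real implementation would also need to clean up records for regions that no longer exist.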