[ https://issues.apache.org/jira/browse/TEPHRA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630903#comment-15630903 ]
ASF GitHub Bot commented on TEPHRA-35:
--------------------------------------

GitHub user poornachandra opened a pull request:

    https://github.com/apache/incubator-tephra/pull/19

    Save compaction state for pruning invalid list

    JIRA - https://issues.apache.org/jira/browse/TEPHRA-35

    Adds ability to save prune upper bound from the transaction snapshot
    used for compaction.

    Note that the first two commits are re-factoring existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/poornachandra/incubator-tephra feature/transaction-pruning

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tephra/pull/19.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19

----
commit 3075cb3cf1b2b52c8946f18e9adec21e8a90d589
Author: poorna <poo...@cask.co>
Date:   2016-10-28T22:12:23Z

    Save compaction state for pruning invalid list

commit be048335024fe03ec567090f0dc2c121d9bff08a
Author: poorna <poo...@cask.co>
Date:   2016-10-29T00:46:01Z

    Refactor existing test

commit 40ab5259722e8e138524c81b90bac2a16d455d24
Author: poorna <poo...@cask.co>
Date:   2016-11-01T03:59:23Z

    Refactor createTable to not add transaction co-processor by default

----

> Prune invalid transaction set once all data for a given invalid transaction
> has been dropped
> --------------------------------------------------------------------------------------------
>
>                 Key: TEPHRA-35
>                 URL: https://issues.apache.org/jira/browse/TEPHRA-35
>             Project: Tephra
>          Issue Type: New Feature
>            Reporter: Gary Helmling
>            Assignee: Poorna Chandra
>            Priority: Blocker
>         Attachments: ApacheTephraAutomaticInvalidListPruning-v2.pdf
>
>
> In addition to dropping the data from invalid transactions we need to be able
> to prune the invalid set of any transactions where data cleanup has been
> completely performed. Without this, the invalid set will grow indefinitely
> and become a greater and greater cost to in-progress transactions over time.
>
> To do this correctly, the TransactionDataJanitor coprocessor will need to
> maintain some bookkeeping for the transaction data that it removes, so that
> the transaction manager can reason about when all of a given transaction's
> data has been removed. Only at this point can the transaction manager safely
> drop the transaction ID from the invalid set.
>
> One approach would be for the TransactionDataJanitor to update a table
> marking when a major compaction was performed on a region and what
> transaction IDs were filtered out. Once all regions in a table containing the
> transaction data have been compacted, we can remove the filtered out
> transaction IDs from the invalid set. However, this will need to cope with
> changing region names due to splits, etc.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
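
For readers of the archive, the bookkeeping idea described in the issue can be illustrated with a small in-memory sketch. This is not Tephra's API: the class `InvalidListPruningSketch` and its methods are hypothetical. It also assumes a simplification suggested by the pull request summary: each region records a single "prune upper bound" at major compaction, rather than the exact set of filtered IDs. A real implementation would persist this state in a table and would have to handle region splits and merges.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical illustration only; names and structure are assumptions, not Tephra code. */
final class InvalidListPruningSketch {
    // Region name -> highest prune upper bound recorded by a major compaction of that region.
    private final Map<String, Long> pruneUpperBoundByRegion = new HashMap<>();

    /** Conceptually called by the janitor coprocessor after a major compaction of a region. */
    void recordCompaction(String region, long pruneUpperBound) {
        // Keep the largest bound seen so far for this region.
        pruneUpperBoundByRegion.merge(region, pruneUpperBound, Math::max);
    }

    /**
     * Returns the invalid transaction IDs that can be dropped: those at or below the
     * smallest bound recorded across all current regions. If any current region has
     * no recorded compaction (for example a new daughter region after a split),
     * nothing is prunable yet.
     */
    SortedSet<Long> prunable(Set<String> currentRegions, SortedSet<Long> invalidIds) {
        SortedSet<Long> result = new TreeSet<>();
        if (currentRegions.isEmpty()) {
            return result;
        }
        long bound = Long.MAX_VALUE;
        for (String region : currentRegions) {
            Long recorded = pruneUpperBoundByRegion.get(region);
            if (recorded == null) {
                return result; // this region has not been compacted yet
            }
            bound = Math.min(bound, recorded);
        }
        for (long id : invalidIds) {
            if (id <= bound) {
                result.add(id);
            }
        }
        return result;
    }
}
```

In this sketch, an unknown region name blocks pruning entirely. That is a conservative stand-in for the "changing region names due to splits" concern raised in the issue, not a solution to it.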