[ https://issues.apache.org/jira/browse/TEPHRA-35?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15630903#comment-15630903 ]

ASF GitHub Bot commented on TEPHRA-35:
--------------------------------------

GitHub user poornachandra opened a pull request:

    https://github.com/apache/incubator-tephra/pull/19

    Save compaction state for pruning invalid list

    JIRA - https://issues.apache.org/jira/browse/TEPHRA-35
    
    Adds the ability to save the prune upper bound from the transaction snapshot used for compaction.
    
    Note that the first two commits refactor existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/poornachandra/incubator-tephra feature/transaction-pruning

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-tephra/pull/19.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19
    
----
commit 3075cb3cf1b2b52c8946f18e9adec21e8a90d589
Author: poorna <poo...@cask.co>
Date:   2016-10-28T22:12:23Z

    Save compaction state for pruning invalid list

commit be048335024fe03ec567090f0dc2c121d9bff08a
Author: poorna <poo...@cask.co>
Date:   2016-10-29T00:46:01Z

    Refactor existing test

commit 40ab5259722e8e138524c81b90bac2a16d455d24
Author: poorna <poo...@cask.co>
Date:   2016-11-01T03:59:23Z

    Refactor createTable to not add transaction co-processor by default

----


> Prune invalid transaction set once all data for a given invalid transaction has been dropped
> --------------------------------------------------------------------------------------------
>
>                 Key: TEPHRA-35
>                 URL: https://issues.apache.org/jira/browse/TEPHRA-35
>             Project: Tephra
>          Issue Type: New Feature
>            Reporter: Gary Helmling
>            Assignee: Poorna Chandra
>            Priority: Blocker
>         Attachments: ApacheTephraAutomaticInvalidListPruning-v2.pdf
>
>
> In addition to dropping the data from invalid transactions, we need to be
> able to prune the invalid set of any transactions whose data cleanup has
> been fully completed. Without this, the invalid set will grow indefinitely
> and impose an ever greater cost on in-progress transactions over time.
>
> To do this correctly, the TransactionDataJanitor coprocessor will need to
> maintain some bookkeeping for the transaction data that it removes, so that
> the transaction manager can reason about when all of a given transaction's
> data has been removed. Only at that point can the transaction manager safely
> drop the transaction ID from the invalid set.
>
> One approach would be for the TransactionDataJanitor to update a table
> recording when a major compaction was performed on a region and which
> transaction IDs were filtered out. Once all regions of a table containing
> the transaction data have been compacted, we can remove the filtered-out
> transaction IDs from the invalid set. However, this will need to cope with
> region names changing due to splits, etc.
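
The bookkeeping described above can be illustrated with a minimal sketch. This is not Tephra's implementation: the class InvalidListPruneSketch and all of its methods are hypothetical, the state lives in an in-memory map instead of an HBase table, and a "prune upper bound" is assumed here to mean that all data for invalid transactions with IDs at or below the bound has been removed from the region. Region splits, which the issue calls out as a complication, are not handled.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; none of these names exist in Tephra.
class InvalidListPruneSketch {
  // Region name -> prune upper bound recorded at that region's latest major compaction.
  private final Map<String, Long> pruneUpperBoundByRegion = new HashMap<>();

  // Conceptually called by a compaction hook after a major compaction completes.
  void recordCompaction(String regionName, long pruneUpperBound) {
    // Keep the highest bound seen for the region.
    pruneUpperBoundByRegion.merge(regionName, pruneUpperBound, Math::max);
  }

  // Returns the minimum bound across all current regions, or -1 if the region
  // list is empty or any region has not recorded a compaction yet.
  long computePruneUpperBound(Collection<String> currentRegions) {
    if (currentRegions.isEmpty()) {
      return -1L;
    }
    long bound = Long.MAX_VALUE;
    for (String region : currentRegions) {
      Long regionBound = pruneUpperBoundByRegion.get(region);
      if (regionBound == null) {
        return -1L; // This region may still hold data from invalid transactions.
      }
      bound = Math.min(bound, regionBound);
    }
    return bound;
  }

  // Returns a copy of the invalid set without the IDs at or below the bound.
  static SortedSet<Long> prune(SortedSet<Long> invalid, long bound) {
    SortedSet<Long> remaining = new TreeSet<>();
    for (long txId : invalid) {
      if (txId > bound) {
        remaining.add(txId);
      }
    }
    return remaining;
  }
}
```

Taking the minimum across regions is the conservative choice in this sketch: an ID is dropped from the invalid set only once every region reports that it has been cleaned up to at least that ID.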



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
