[ 
https://issues.apache.org/jira/browse/JCR-1138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12528377
 ] 

Thomas Mueller commented on JCR-1138:
-------------------------------------

To better support garbage collection for the data store, I suggest adding a new 
method to AbstractBundlePersistenceManager:

    /**
     * Get all node ids.
     * A typical application will call this method multiple times, where 'after'
     * is the last node id read. The maxCount parameter defines the maximum
     * number of node ids returned, 0 meaning no limit. The order of the node
     * ids is specific to the given persistence manager. Items that are added
     * concurrently may not be included.
     *
     * @param after the lower limit, or null for no limit.
     * @param maxCount the maximum number of node ids to return, or 0 for no
     *            limit.
     * @return an iterator over the node ids.
     * @throws ItemStateException if an error occurs while loading.
     */
    public abstract NodeIdIterator getAllNodeIds(NodeId after, int maxCount)
            throws ItemStateException;

I would add this only to the bundle persistence managers, because those are the 
most important ones (in my view).

This method would then be called from the garbage collection process (or 
periodically from a background thread, with a low maxCount and enough sleep 
time between calls). After all nodes have been processed, the objects in the 
data store that were never visited during the scan are deleted. This is better 
than the current mechanism because it can be restarted: only the last visited 
node id needs to be persisted. It is also more efficient, as the persistence 
manager can return the data in the order in which it is stored (which is easy 
for BundleFsPersistenceManager).
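
As a rough sketch, the scanning loop on the caller side could look like the 
following. Everything in it is an assumption made for illustration: node ids 
and data store ids are plain strings, NodeIdSource is a hypothetical stand-in 
for the proposed getAllNodeIds(after, maxCount), getReferencedDataIds is a 
made-up helper, and marks are kept in an in-memory set (for a restartable scan 
the marks would have to survive a restart as well).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

public class DataStoreGcSketch {

    /** Hypothetical stand-in for the proposed persistence manager method. */
    public interface NodeIdSource {
        /** Returns up to maxCount node ids after 'after' (0 = no limit). */
        List<String> getAllNodeIds(String after, int maxCount);

        /** Hypothetical helper: data store ids referenced by a node. */
        Set<String> getReferencedDataIds(String nodeId);
    }

    /** In-memory source that returns node ids in sorted order. */
    public static class InMemorySource implements NodeIdSource {
        private final TreeMap<String, Set<String>> nodes = new TreeMap<>();

        public void addNode(String nodeId, String... dataIds) {
            nodes.put(nodeId, new HashSet<>(Arrays.asList(dataIds)));
        }

        @Override
        public List<String> getAllNodeIds(String after, int maxCount) {
            // 'after' is exclusive; null means start from the beginning.
            Iterable<String> ids = (after == null)
                    ? nodes.keySet()
                    : nodes.tailMap(after, false).keySet();
            List<String> result = new ArrayList<>();
            for (String id : ids) {
                if (maxCount > 0 && result.size() >= maxCount) {
                    break;
                }
                result.add(id);
            }
            return result;
        }

        @Override
        public Set<String> getReferencedDataIds(String nodeId) {
            return nodes.get(nodeId);
        }
    }

    /**
     * Scans one batch and marks the referenced data store ids.
     * Returns the last node id visited, or null if no nodes were left.
     */
    public static String scanBatch(NodeIdSource source, String after,
            int maxCount, Set<String> marked) {
        String last = null;
        for (String id : source.getAllNodeIds(after, maxCount)) {
            marked.addAll(source.getReferencedDataIds(id));
            last = id;
        }
        return last;
    }

    /** Runs a full scan in batches and returns the unreferenced data ids. */
    public static Set<String> findGarbage(NodeIdSource source,
            Set<String> dataStoreIds, int batchSize) {
        Set<String> marked = new HashSet<>();
        String after = null;
        while (true) {
            String last = scanBatch(source, after, batchSize, marked);
            if (last == null) {
                break; // all nodes processed
            }
            // A restartable scan would persist 'last' here and could
            // sleep before fetching the next batch.
            after = last;
        }
        Set<String> garbage = new TreeSet<>(dataStoreIds);
        garbage.removeAll(marked);
        return garbage;
    }

    public static void main(String[] args) {
        InMemorySource source = new InMemorySource();
        source.addNode("n1", "a");
        source.addNode("n2", "b");
        source.addNode("n3", "a", "c");
        Set<String> store = new HashSet<>(Arrays.asList("a", "b", "c", "d", "e"));
        System.out.println(findGarbage(source, store, 2));
    }
}
```

The only state carried between batches is the last visited node id, which is 
the point of the proposal: after an interruption, the scan can continue from 
that id instead of starting over.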

What do you think, is this approach OK? 
Thomas

> Data store garbage collection
> -----------------------------
>
>                 Key: JCR-1138
>                 URL: https://issues.apache.org/jira/browse/JCR-1138
>             Project: Jackrabbit
>          Issue Type: Improvement
>          Components: core
>            Reporter: Thomas Mueller
>            Assignee: Thomas Mueller
>
> Currently the data store garbage collection needs to be run manually. It 
> should be simpler to use (maybe tool based), or automatic.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
