Thank you... yes, it seems we have a simple solution: do the GC on a single node using just a TransientRepository that's configured to use a slightly specialized version of the repository.xml. The plan is to have it run as a simple standalone app triggered by a cron job.
The repository.xml is specialized only in that it has a unique cluster id (which of course it needs) and a datasource with concrete information in it, rather than the JNDI-based one that all the other cluster participants use because they are appserver based.

Thanks again,

-- Langley

On Thu, May 26, 2011 at 8:09 AM, Thomas Mueller <[email protected]> wrote:

> Hi,
>
> The way garbage collection works, I don't see a potential problem if you
> run garbage collection concurrently.
>
> When garbage collection is running, each file that is accessed is
> 'touched' (the last modified time is changed to the current time). If you
> run it concurrently, this still will happen. At the end of the GC, old
> files (untouched files) are deleted.
>
> So it shouldn't be a problem. Of course I would avoid running it
> concurrently, because it's enough to run it on one cluster node (it's
> simply a waste of time to run it concurrently).
>
> Regards,
> Thomas
>
>
> On 5/26/11 1:22 PM, "John Langley" <[email protected]> wrote:
>
> >First off, thanks to the writers of this great little description of how
> >to do garbage collection, and to Fabian for pointing it out.
> >http://wiki.apache.org/jackrabbit/DataStore#Data_Store_Garbage_Collection
> >
> >My next question concerns running garbage collection in a cluster. If I
> >had a number of identical nodes running in a cluster, each of them
> >periodically running a garbage collection task, where the periods may
> >overlap... say node 1 starts and then, in the middle of either the mark
> >or the sweep, node 2 starts its mark or perhaps even overlaps its
> >sweep.... what will the consequences be? Will they "collide", i.e. will
> >there be unexpected errors (explicit exception based errors) or
> >mis-behaviors (implicit non-identified errors)?
> >
> >Of course, the alternative is to guarantee that only one node in the
> >cluster is responsible for the periodic mark and sweep.
> >
> >Thanks in advance for any pointers or insights. This community has been
> >GREAT at responding to questions with very helpful solutions and bug
> >fixes.
> >
> >-- Langley
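
For anyone following along, here is a rough sketch of the touch-then-delete behaviour Thomas describes, to show why two overlapping runs should not delete a file that is still in use. It is a toy in-memory model, not Jackrabbit code: `ToyDataStoreGc`, `Run`, `touch`, `deleteOlderThan` and the logical clock are all made-up names for illustration, and it assumes each run deletes only records whose last-modified time is older than that run's own start time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Toy model only: none of these names are Jackrabbit classes.
class ToyDataStoreGc {
    // Record id -> last-modified time, using a logical clock for determinism.
    private final Map<String, Long> lastModified = new HashMap<>();
    private long clock = 0;

    synchronized long tick() { return ++clock; }

    synchronized void add(String id) { lastModified.put(id, tick()); }

    // Mark step: touching a referenced record moves its timestamp forward.
    synchronized void touch(String id) {
        if (lastModified.containsKey(id)) {
            lastModified.put(id, tick());
        }
    }

    // Sweep step: delete every record not touched since startTime.
    synchronized int deleteOlderThan(long startTime) {
        int deleted = 0;
        Iterator<Long> it = lastModified.values().iterator();
        while (it.hasNext()) {
            if (it.next() < startTime) {
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    // Surviving record ids in sorted order.
    synchronized List<String> ids() {
        return new ArrayList<>(new TreeSet<>(lastModified.keySet()));
    }

    // One GC run, as a single (hypothetical) cluster node would perform it.
    static final class Run {
        private final ToyDataStoreGc store;
        private final long startTime;

        Run(ToyDataStoreGc store) {
            this.store = store;
            this.startTime = store.tick();
        }

        void mark(List<String> referenced) { referenced.forEach(store::touch); }

        int sweep() { return store.deleteOlderThan(startTime); }
    }

    // Node 2 starts while node 1 is still in the middle of its mark phase.
    static List<String> overlappingRuns() {
        ToyDataStoreGc store = new ToyDataStoreGc();
        store.add("a");
        store.add("b");
        store.add("c"); // "c" is not referenced by anything

        Run node1 = new Run(store);
        node1.mark(Arrays.asList("a"));      // node 1 is halfway through its mark
        Run node2 = new Run(store);          // node 2 starts in the middle
        node2.mark(Arrays.asList("a", "b"));
        node1.mark(Arrays.asList("b"));      // node 1 finishes its mark
        node1.sweep();                       // removes only "c"
        node2.sweep();                       // nothing left to remove
        return store.ids();
    }

    public static void main(String[] args) {
        System.out.println(overlappingRuns()); // prints [a, b]
    }
}
```

In this model every run touches all referenced records after its own start time and before its own sweep, so an overlapping run can only push timestamps further forward. The second run mostly repeats work the first has already done, which fits Thomas's point that running it on more than one node is a waste of time rather than a hazard.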
