I think this use case makes sense; such logic (for a distributed / ref-counted deletion policy) would make a nice contribution ... it's the "proper" way to delete commits when multiple nodes are in use (vs. e.g. using a timeout-based deletion policy).
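A rough sketch of the bookkeeping such a ref-counted policy would need, in plain Java with no Lucene dependency. The class CommitRefCounts and its method names are made up for illustration and are not a Lucene API; the in-memory map stands in for whatever shared store holds the counts across nodes. In a real policy the generations would presumably come from IndexCommit.getGeneration() on the list handed to onInit()/onCommit() (documented as sorted oldest first), and each selected commit would be removed with IndexCommit.delete().

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper (not part of Lucene): tracks how many sessions hold
 * each commit generation and decides which commits are safe to delete.
 */
class CommitRefCounts {
    // generation -> number of sessions currently using that commit
    private final Map<Long, Integer> refs = new HashMap<>();

    /** A session starts using the commit with this generation. */
    synchronized void acquire(long generation) {
        refs.merge(generation, 1, Integer::sum);
    }

    /** A session (finished or crashed) stops using the commit. */
    synchronized void release(long generation) {
        Integer count = refs.get(generation);
        if (count == null) {
            throw new IllegalStateException("generation " + generation + " is not held");
        }
        if (count == 1) {
            refs.remove(generation);
        } else {
            refs.put(generation, count - 1);
        }
    }

    /**
     * Given all commit generations sorted oldest to newest, return those
     * that are unreferenced and are not the latest commit.
     */
    synchronized List<Long> deletable(List<Long> generationsOldestFirst) {
        List<Long> result = new ArrayList<>();
        int last = generationsOldestFirst.size() - 1;
        for (int i = 0; i < last; i++) { // the newest commit is never a candidate
            long gen = generationsOldestFirst.get(i);
            if (!refs.containsKey(gen)) {
                result.add(gen);
            }
        }
        return result;
    }
}
```

With commits 1, 2 and 3 on disk and two sessions holding commit 2, deletable() returns only commit 1; once both sessions release, it returns commits 1 and 2, and commit 3 is kept because it is the latest.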
You can actually do it today: call IndexWriter.deleteUnusedFiles. That visits the deletion policy, which gives you a chance to delete commit points. It does mean you have to set a real deletion policy on the writer, which in turn goes and checks the reference counts across all nodes.

Mike McCandless

http://blog.mikemccandless.com

On Wed, Jun 6, 2012 at 7:16 AM, Colin Goodheart-Smithe
<colings86....@googlemail.com> wrote:
> I was looking at the Lucene API for IndexCommit and noticed that the
> JavaDoc states that
>
> *'Decision that a commit-point should be deleted is taken by the
> IndexDeletionPolicy<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html>
> in effect and therefore this should only be called by its
> onInit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onInit(java.util.List)>
> or
> onCommit()<http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexDeletionPolicy.html#onCommit(java.util.List)>
> methods.'*
> (http://lucene.apache.org/core/3_6_0/api/all/org/apache/lucene/index/IndexCommit.html#delete())
>
> I was wondering why this is the case and whether deleting IndexCommits
> outside of an IndexDeletionPolicy is actually a bad idea?
>
> To put some context around this, I am looking to implement a deletion
> policy which is independent of the IndexWriter commit and more dependent
> on Processes using particular Commit points being finished with them.
> The logic would look something like the following, and state would be
> stored in something like ZooKeeper so I can make use of ephemeral nodes
> and watcher events:
>
> - IndexWriters would have a NoDeletionPolicy set
> - Each time a process opens a session it registers an ephemeral node
> - The session is assigned the current (latest) commit point
> - Each time a process removes the node (either through crashing or
>   having finished the job) a watch event is fired where a separate
>   process will delete the commit point the process was using if no other
>   processes are using the commit point and if it is not the latest
>   commit point
>
> Processes may have fairly long running sessions, so across all the
> processes a reasonable number of commit points might be in use. I don't
> really want to have to wait for a commit from the IndexWriter (which may
> not happen for a while) to clear up the older commit points I no longer
> need. Would this logic pose any issues given that it is going to be
> deleting Commit points outside of the IndexDeletionPolicy?