Hello all.
I recently ran into a problem where errors during indexing or optimization (perhaps related to running out of disk space) left me with a working index in a directory, but with additional, partial segment files that were not needed. To find the ~40 files to keep out of the ~900 files in the directory, I ended up dumping the segments file and noting that only 5 segments were in fact "live". The index is a non-compound index using FSDirectory.

Is there some way to ask the index which files belong to it? If not, would it be possible to add one? I'd be willing to submit code if it made sense to people. I looked first at FSDirectory itself, thinking that its "list()" method should return the subset of index-related files, but looking deeper, it seems that Directory sits at a lower level, abstracting simple I/O, and thus wouldn't "know".

So, any thoughts? Would it make sense to have some form of clean operation on IndexWriter? I hesitate because there doesn't seem to be a charter saying that only Lucene files may exist in the directory, so what is ideal for my application (since I know I won't mingle other files) might not be ideal for all. Would it be fair to look for known Lucene extensions and file naming signatures to identify unused files that might be failed or dead segments?

Thanks,
-George
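
As a rough illustration of that last idea, here is a minimal sketch in plain Java. It is not existing Lucene API: the class name StaleSegmentFileFinder, its methods, and the sample file names are all hypothetical, and the extension list is an assumption about what a non-compound index of this era writes, so it would need checking against the file formats of the Lucene version in use. It assumes the names of the live segments have already been read from the segments file by some other means.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Lucene: flags files that look like dead segment files. */
final class StaleSegmentFileFinder {

    // ASSUMPTION: per-segment extensions of a non-compound index.
    // Verify against the index file format docs for your Lucene version.
    private static final Set<String> KNOWN_EXTENSIONS = new HashSet<>(Arrays.asList(
            "fnm", "fdx", "fdt", "tii", "tis", "frq", "prx", "del", "tvx", "tvd", "tvf"));

    // ASSUMPTION: segment files are named "_<segment>.<extension>", e.g. "_a.fnm".
    private static final Pattern SEGMENT_FILE = Pattern.compile("^(_[0-9a-z]+)\\.([0-9a-z]+)$");

    /** Returns the segment name (e.g. "_a") if the file looks like a Lucene segment file, else null. */
    static String segmentNameOf(String fileName) {
        Matcher m = SEGMENT_FILE.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        String ext = m.group(2);
        // Per-field norms files are assumed to use extensions f0, f1, f2, ...
        boolean known = KNOWN_EXTENSIONS.contains(ext) || ext.matches("f\\d+");
        return known ? m.group(1) : null;
    }

    /**
     * Returns the files that look like Lucene segment files but whose segment
     * is not in liveSegments. Files that do not match the naming signature
     * (including foreign files) are never reported.
     */
    static List<String> findStaleFiles(Collection<String> fileNames, Set<String> liveSegments) {
        List<String> stale = new ArrayList<>();
        for (String name : fileNames) {
            String segment = segmentNameOf(name);
            if (segment != null && !liveSegments.contains(segment)) {
                stale.add(name);
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Hypothetical directory listing and live-segment set, for illustration only.
        List<String> listing = Arrays.asList(
                "segments", "deletable", "_a.fnm", "_a.f0", "_b.fdt", "_b.tis", "notes.txt");
        Set<String> live = new HashSet<>(Arrays.asList("_a"));
        System.out.println(findStaleFiles(listing, live));
    }
}
```

The sketch only reports candidates and deletes nothing, which sidesteps the concern about other files sharing the directory: anything that does not match both the naming pattern and a known extension is left alone.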