Hi. Just a thought: should you be calling IndexReader#decRef in your code once you are done working with the directory? I see it happening in Solr on close: https://github.com/apache/solr/blob/ec8f23622b04de80c1dcb85638a73f9d9566d1bf/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L575
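
For illustration, here is a toy model of the reference-counting idea behind that question. Lucene's IndexReader exposes incRef(), decRef() and getRefCount(), and every extra incRef() needs a matching decRef() before the reader lets go of its files. The class below is a hypothetical stand-in written in plain Java, not Lucene's actual implementation; the names RefCountSketch, CountedReader and filesReleased are made up for this sketch.

```java
/** Toy model of reference counting; NOT Lucene's actual implementation. */
class RefCountSketch {

    /** Hypothetical stand-in for a reader that pins segment files while open. */
    static final class CountedReader {
        // Starts at 1: whoever opened the reader holds the first reference.
        private int refCount = 1;
        private boolean filesReleased = false;

        /** Take an additional reference, e.g. for a second consumer. */
        synchronized void incRef() {
            if (refCount <= 0) {
                throw new IllegalStateException("reader is already closed");
            }
            refCount++;
        }

        /** Drop one reference; the files are released only when the count hits 0. */
        synchronized void decRef() {
            if (refCount <= 0) {
                throw new IllegalStateException("too many decRef calls");
            }
            refCount--;
            if (refCount == 0) {
                filesReleased = true; // only now could the files be cleaned up
            }
        }

        synchronized int getRefCount() {
            return refCount;
        }

        synchronized boolean filesReleased() {
            return filesReleased;
        }
    }

    public static void main(String[] args) {
        CountedReader reader = new CountedReader();
        reader.incRef();                            // second holder: count 1 -> 2
        reader.decRef();                            // first holder done: count 2 -> 1
        System.out.println(reader.filesReleased()); // prints "false"
        reader.decRef();                            // last holder done: count 1 -> 0
        System.out.println(reader.filesReleased()); // prints "true"
    }
}
```

If the same pattern holds in the reindexing program, a single unmatched incRef() (or a reader that is never closed) would be enough to keep the count above 0 and the old segments on disk.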
best,
alex

On Sun, Sep 10, 2023 at 7:06 PM Rahul Goswami <rahul196...@gmail.com> wrote:

> Shawn,
> Thanks for looking further into this. Although many of our Solr instances
> do run on Windows servers, for testing this particular reindexing program,
> I have been running it on Linux to get the OS variable out of the equation
> for now. The behavior I described in my original email occurs on Linux.
> After enough troubleshooting (and code reading), it seems like there is a
> ref count maintained internally at the Lucene level which is not going down
> to 0 thereby making the segments ineligible for deletion.
>
> What is baffling is that even after the reader is closed and I am done
> processing all the required segments, when I issue a commit through the
> code, it still doesn't have any effect.
>
> Only two things help with the cleanup...i) Solr restart ii) Core reload.
> And unfortunately neither of these approaches are practical for my use case
> since I can't wait for the whole processing to finish before reclaiming the
> space, especially when some of the cores are 3-4 TB large.
>
> Thanks,
> Rahul
>
> On Sat, Sep 2, 2023 at 4:45 PM Shawn Heisey <apa...@elyograg.org> wrote:
>
> > On 9/1/23 16:30, Rahul Goswami wrote:
> > > Thanks for your response. To your question about locking, I am not
> > > doing anything explicitly here. If you are alluding to deleting the
> > > write.lock file and opening a new IndexWriter, I am not doing that.
> > > Only an IndexReader.
> > >
> > > Are you suggesting opening an IndexReader from within Solr could
> > > interfere with Solr's working and in turn file deletions? I think an
> > > answer to this question would really help me understand what is going
> > > wrong.
> >
> > I don't know what exactly the effects are of opening just a reader with
> > Lucene.
> >
> > I had another thought, and then I did a little searching on my list
> > archive to see if I could answer a question: What OS is this on?
> >
> > Other messages you've written say that you're running on Windows.
> >
> > Windows does something that on the surface sounds like a good thing: If
> > a file is open in ANY mode, including read-only, Windows will not allow
> > that file to be deleted.
> >
> > So I think the problem here is that you've got a Lucene program keeping
> > those segment files open, so when the Lucene code running in Solr tries
> > to delete them as a normal part of commit operations, it can't.
> >
> > If you were running this on pretty much any other OS, you probably
> > wouldn't be having this problem. Other operating systems like Linux
> > allow file deletion even if the file is open elsewhere. The file will
> > continue to exist on the filesystem until the last program that has it
> > open exits or closes the file, at which time the filesystem will finish
> > the deletion.
> >
> > If you have to stick with Windows, then you're going to have to do
> > something after your program closes its reader to trigger Lucene's
> > auto-cleanup of segments. I believe a Solr index reload would
> > accomplish that. Another way might be to index a dummy document, delete
> > that document, and issue a commit.
> >
> > Thanks,
> > Shawn