Hi.

Just a thought: should you be calling IndexReader#decRef in your code once
you are done working with the directory?
I see it happening in Solr on close:
https://github.com/apache/solr/blob/ec8f23622b04de80c1dcb85638a73f9d9566d1bf/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L575
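
To illustrate what I mean, here is a toy sketch in plain Java. It is not
Lucene code: ToyReader and its fields are made up for this example and only
mirror the incRef/decRef/close shape of IndexReader as I understand it (the
creator holds one reference, and close() gives back exactly one). If your
code, or something it calls, takes an extra reference and never returns it,
closing the reader alone would not bring the count to 0.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of a reference-counted reader. This is NOT Lucene's IndexReader;
// the class and its members are hypothetical, for illustration only.
class ToyReader implements AutoCloseable {
    // The creator of the reader implicitly holds the first reference.
    private final AtomicInteger refCount = new AtomicInteger(1);
    private boolean closed = false;
    private boolean filesReleased = false;

    // Take an additional reference; must be paired with decRef().
    void incRef() {
        if (refCount.get() <= 0) {
            throw new IllegalStateException("reader already released");
        }
        refCount.incrementAndGet();
    }

    // Give a reference back; the underlying files are released only at 0.
    void decRef() {
        int rc = refCount.decrementAndGet();
        if (rc == 0) {
            filesReleased = true;
        } else if (rc < 0) {
            throw new IllegalStateException("too many decRef calls");
        }
    }

    // close() returns only the creator's reference, and only once.
    @Override
    public void close() {
        if (!closed) {
            closed = true;
            decRef();
        }
    }

    int getRefCount() { return refCount.get(); }

    boolean filesReleased() { return filesReleased; }

    public static void main(String[] args) {
        // Case 1: an extra reference is taken but never given back.
        ToyReader leaked = new ToyReader();
        leaked.incRef();
        leaked.close();
        System.out.println("leaked:   refCount=" + leaked.getRefCount()
                + " released=" + leaked.filesReleased());

        // Case 2: every incRef() is paired with a decRef() in finally.
        ToyReader balanced = new ToyReader();
        balanced.incRef();
        try {
            // ... work with the reader here ...
        } finally {
            balanced.decRef();
        }
        balanced.close();
        System.out.println("balanced: refCount=" + balanced.getRefCount()
                + " released=" + balanced.filesReleased());
    }
}
```

In this toy model the first reader ends with refCount=1 and its files still
held, while the second ends at 0 and releases them. Whether that is what is
happening in your program is only a guess on my part.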

best,
alex


On Sun, Sep 10, 2023 at 7:06 PM Rahul Goswami <rahul196...@gmail.com> wrote:

> Shawn,
> Thanks for looking further into this. Although many of our Solr instances
> do run on Windows servers, for testing this particular reindexing program,
> I have been running it on Linux to get the OS variable out of the equation
> for now. The behavior I described in my original email occurs on Linux.
> After enough troubleshooting (and code reading), it seems like there is a
> ref count maintained internally at the Lucene level which is not going down
> to 0, thereby making the segments ineligible for deletion.
>
> What is baffling is that even after the reader is closed and I am done
> processing all the required segments, when I issue a commit through the
> code, it still doesn't have any effect.
>
> Only two things help with the cleanup: (i) a Solr restart, (ii) a core reload.
> Unfortunately, neither of these approaches is practical for my use case
> since I can't wait for the whole processing to finish before reclaiming the
> space, especially when some of the cores are 3-4 TB large.
>
> Thanks,
> Rahul
>
> On Sat, Sep 2, 2023 at 4:45 PM Shawn Heisey <apa...@elyograg.org> wrote:
>
> > On 9/1/23 16:30, Rahul Goswami wrote:
> > > > Thanks for your response. To your question about locking, I am not
> > > > doing anything explicitly here. If you are alluding to deleting the
> > > > write.lock file and opening a new IndexWriter, I am not doing that.
> > > > Only an IndexReader.
> > > >
> > > > Are you suggesting opening an IndexReader from within Solr could
> > > > interfere with Solr's working and in turn file deletions? I think an
> > > > answer to this question would really help me understand what is going
> > > > wrong.
> >
> > I don't know what exactly the effects are of opening just a reader with
> > Lucene.
> >
> > I had another thought, and then I did a little searching on my list
> > archive to see if I could answer a question:  What OS is this on?
> >
> > Other messages you've written say that you're running on Windows.
> >
> > Windows does something that on the surface sounds like a good thing:  If
> > a file is open in ANY mode, including read-only, Windows will not allow
> > that file to be deleted.
> >
> > So I think the problem here is that you've got a Lucene program keeping
> > those segment files open, so when the Lucene code running in Solr tries
> > to delete them as a normal part of commit operations, it can't.
> >
> > If you were running this on pretty much any other OS, you probably
> > wouldn't be having this problem.  Other operating systems like Linux
> > allow file deletion even if the file is open elsewhere.  The file will
> > continue to exist on the filesystem until the last program that has it
> > open exits or closes the file, at which time the filesystem will finish
> > the deletion.
> >
> > If you have to stick with Windows, then you're going to have to do
> > something after your program closes its reader to trigger Lucene's
> > auto-cleanup of segments.  I believe a Solr index reload would
> > accomplish that.  Another way might be to index a dummy document, delete
> > that document, and issue a commit.
> >
> > Thanks,
> > Shawn
> >
> >
>
