Solr has good test coverage verifying that it closes most things that
should be closed, including cores.  Still... I could see UNLOAD being
enhanced to insist the core be closed after a few minutes.
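
To make the idea concrete, here is a rough sketch of a reference-counted
wait that gives up after a deadline instead of spinning forever. It is
not Solr's code: the class RefCountedCore, its methods, and the timeout
and poll parameters are all hypothetical names chosen for illustration,
and it only assumes what is described below (a loop that polls isClosed,
which in turn checks whether the reference count has reached zero).

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted core; NOT Solr's SolrCore. */
public class RefCountedCore {
    // Starts at 1: the owner holds the initial reference.
    private final AtomicInteger refCount = new AtomicInteger(1);

    /** Take an additional reference (e.g. a request using the core). */
    public void open() { refCount.incrementAndGet(); }

    /** Release one reference. */
    public void close() { refCount.decrementAndGet(); }

    /** Closed means no references remain. */
    public boolean isClosed() { return refCount.get() <= 0; }

    /** Number of references still outstanding. */
    public int getOpenCount() { return refCount.get(); }

    /**
     * Releases the caller's reference, then polls isClosed() until it is
     * true or timeoutMillis has elapsed.
     *
     * @return true if the core closed in time, false if the wait timed
     *         out or the thread was interrupted
     */
    public boolean closeAndWait(long timeoutMillis, long pollMillis) {
        close();
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!isClosed()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // assumption: the caller decides how to escalate
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                return false;
            }
        }
        return true;
    }
}
```

With something along these lines, a caller that gets false back could
log getOpenCount() and fail the async request rather than leave it
pending. What the right escalation would be in Solr itself is an open
question.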

On Tue, Dec 10, 2024 at 2:17 PM Zack Kendall <zachariahkend...@gmail.com>
wrote:

> We have scripts that use the Solr Replica management APIs. The scripts use
> the async parameter and poll for it to be finished.
>
> Fairly regularly the DELETEREPLICA action will *never* finish.
>
> I have eventually enabled enough logging to see that it is spinning on
> this:
>
> > INFO
>  (parallelCoreAdminExecutor-19-thread-4-processing-n:myHost:8984_solr
> x:my_colleciton_shard105_0_replica_n2695 OFYOHGJY3554330096761208 UNLOAD) [
>   ] o.a.s.c.SolrCore Core my_colleciton_shard105_0_replica_n2695 is not yet
> closed, waiting 100 ms before checking again.
>
> We have left this for tens of MINUTES (I see a recent example in our logs
> of this spinning for 25 minutes) without it progressing on its own. When we
> notice this we have to restart the Solr process, which seems to correct the
> state for practical purposes and move on. This manual intervention is very
> painful.
>
> The log statement appears to come from the SolrCore class, in the
> closeAndWait method
> <https://github.com/apache/solr/blob/33b74e65caf46062737bbc6bc3507a39b1049f67/solr/core/src/java/org/apache/solr/core/SolrCore.java#L1536-L1539>
> (called by the unload method). It has a while loop checking `isClosed`, and
> `isClosed` just checks whether the reference count is 0.
>
> So the question is what could cause references to not go to zero for such a
> long period of time? Any way to get visibility on what references are
> remaining? Is this a known or documented issue anywhere?
>
> Thanks
>
