[ 
https://issues.apache.org/jira/browse/SOLR-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151118#comment-16151118
 ] 

Shawn Heisey commented on SOLR-11297:
-------------------------------------

Repeating something I said on the mailing list, and expanding on it:

I have never seen messages for this problem on my "build" cores or my "broker" 
cores, only on "live" cores.  The build cores are used for full index rebuilds 
and then swapped with live cores.  They do not receive any requests unless a 
(rare) full index rebuild is underway.  Because full index rebuilds involve 
DIH, the build cores almost never receive requests (other than DIH status 
requests) after a Solr restart.  The broker cores have the shards parameter in 
the request handler "defaults" section, so clients do not need to know that 
they are accessing a distributed index.

I think there may be a race condition.  My best guess is that the vulnerable 
window opens when the core opens its searcher (or possibly slightly before or 
after that moment), and closes when the core is fully functional.  If requests 
for the index arrive during that window, Solr appears to try to create the core 
again.  The second creation fails because the first is already underway and has 
opened a searcher.  In my case, it only seems to affect shard requests, but 
that might simply be because requests to the broker cores don't involve the 
local index in that core (which is empty), only the indexes on the shard cores.
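
To illustrate the kind of failure I am guessing at, here is a minimal sketch in 
plain Java.  It is not Solr or Lucene code, and the class and method names 
(DoubleOpenSketch, secondAttemptRejected, demo) are made up for this example.  
It only shows the general mechanism: once one part of a JVM holds a lock on a 
file, a second attempt from the same JVM is rejected instead of waiting.  I am 
assuming the real problem behaves in a similar way when a core is created a 
second time while the first creation still holds the index lock.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DoubleOpenSketch {

    /**
     * Takes a lock on lockFile (standing in for the first core load), then
     * tries to lock the same file again from the same JVM (standing in for
     * the hypothesized second core load). Returns true if the second
     * attempt is rejected while the first lock is still held.
     */
    static boolean secondAttemptRejected(Path lockFile) throws IOException {
        try (FileChannel first = FileChannel.open(
                 lockFile, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
             FileLock held = first.lock()) {
            try (FileChannel second = FileChannel.open(
                     lockFile, StandardOpenOption.WRITE)) {
                // java.nio refuses overlapping locks within one JVM.
                second.tryLock();
                return false;
            } catch (OverlappingFileLockException e) {
                return true;
            }
        }
    }

    /** Runs the sketch against a lock file in a fresh temporary directory. */
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("core-data");
            return secondAttemptRejected(dir.resolve("write.lock"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("second attempt rejected: " + demo());
    }
}
```

If the guess is right, the first load is never harmed by this; only the second 
attempt fails, which would match the affected cores continuing to work.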

I typically see "Lock held" messages in the log for more cores than are shown 
in the admin UI as having failed to start, and sometimes the message appears 
MANY times for one core.  I think there is an example of that in the log that I 
attached to the issue.


> Message "Lock held by this virtual machine" during startup.  Solr is trying 
> to start some cores twice
> -----------------------------------------------------------------------------------------------------
>
>                 Key: SOLR-11297
>                 URL: https://issues.apache.org/jira/browse/SOLR-11297
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 6.6
>            Reporter: Shawn Heisey
>            Assignee: Erick Erickson
>         Attachments: solr6_6-startup.log
>
>
> Sometimes when Solr is restarted, I get some "lock held by this virtual 
> machine" messages in the log, and the admin UI has messages about a failure 
> to open a new searcher.  It doesn't happen on all cores, and the list of 
> cores that have the problem changes on subsequent restarts.  The cores that 
> exhibit the problem work just fine: the first core load succeeds, and the 
> failure to open a new searcher comes from a second load attempt on the same 
> core.
> None of the cores in the system are sharing an instanceDir or dataDir.  This 
> has been verified several times.
> The index is sharded manually, and the servers are not running in cloud mode.



