[
https://issues.apache.org/jira/browse/SOLR-11297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151118#comment-16151118
]
Shawn Heisey commented on SOLR-11297:
-------------------------------------
Repeating something I said on the mailing list, and expanding on it:
I have never seen messages for this problem on my "build" cores or my "broker"
cores, only on "live" cores. The build cores are used for full index rebuilds
and then swapped with live cores. They do not receive any requests unless a
(rare) full index rebuild is underway. Because full index rebuilds go through
DIH, the build cores almost never receive requests (other than DIH status
requests) after a Solr restart. The broker cores have the shards parameter in
the request handler "defaults" section, so clients do not need to know that
they are accessing a distributed index.
I think there may be a race condition. My best guess is that the window opens
when the core opens its searcher (or possibly slightly offset from that
moment) and closes when the core is fully functional. If requests for the
index arrive during that window, Solr appears to try to create the core
again. The second creation fails because the first is already underway and has
opened a searcher. In my case, it only seems to affect shard requests, but
this might simply be because requests to the broker cores don't involve the
local index in that core (which is empty), only the indexes on the shard
cores.
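To make the suspected sequence concrete, here is a minimal sketch in plain
Java. It is not Solr or Lucene code: the class, the methods, the lock set and
the path are all hypothetical stand-ins, and it only assumes that index locks
are tracked per JVM, so a second attempt to load the same index while the
first still holds the lock fails with the message from the issue title.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class DoubleCoreLoadSketch {
    // Hypothetical stand-in for a per-JVM registry of locked index directories.
    private static final Set<String> HELD_LOCKS = ConcurrentHashMap.newKeySet();

    /** Takes the lock for an index directory; fails if this JVM already holds it. */
    public static void obtainLock(String indexDir) {
        if (!HELD_LOCKS.add(indexDir)) {
            throw new IllegalStateException(
                "Lock held by this virtual machine: " + indexDir);
        }
    }

    /** Releases the lock, as a core would when it closes. */
    public static void releaseLock(String indexDir) {
        HELD_LOCKS.remove(indexDir);
    }

    /** Simulates one core load attempt and reports the outcome as text. */
    public static String loadCore(String indexDir) {
        try {
            obtainLock(indexDir);
            return "loaded";
        } catch (IllegalStateException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Hypothetical path, for illustration only.
        String dir = "/hypothetical/live-core/data/index";

        // Normal load during startup: takes the lock and succeeds.
        String first = loadCore(dir);

        // Suspected second load, triggered by a request that arrives before
        // the core is fully functional: the lock is already held.
        String second = loadCore(dir);

        System.out.println("first:  " + first);
        System.out.println("second: " + second);
    }
}
```

In this sketch the first load keeps working after the second one fails, which
matches what I see: the cores reported as failed are actually usable.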
I typically see "Lock held" messages in the log for more cores than are shown
in the admin UI as having failed to start, and sometimes the message appears
MANY times for one core. I think there is an example of that in the log that I
attached to the issue.
> Message "Lock held by this virtual machine" during startup. Solr is trying
> to start some cores twice
> -----------------------------------------------------------------------------------------------------
>
> Key: SOLR-11297
> URL: https://issues.apache.org/jira/browse/SOLR-11297
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Affects Versions: 6.6
> Reporter: Shawn Heisey
> Assignee: Erick Erickson
> Attachments: solr6_6-startup.log
>
>
> Sometimes when Solr is restarted, I get some "lock held by this virtual
> machine" messages in the log, and the admin UI has messages about a failure
> to open a new searcher. It doesn't happen on all cores, and the list of
> cores that have the problem changes on subsequent restarts. The cores that
> exhibit the problems are working just fine -- the first core load is
> successful, the failure to open a new searcher is on a second core load
> attempt, which fails.
> None of the cores in the system are sharing an instanceDir or dataDir. This
> has been verified several times.
> The index is sharded manually, and the servers are not running in cloud mode.