[
https://issues.apache.org/jira/browse/SOLR-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hoss Man updated SOLR-10726:
----------------------------
Attachment: SOLR-10726.semaphore.newsearcher.test.patch
semaphore.newsearcher.test.log.txt
FWIW...
I was attempting to write a test that would prove/disprove if waitSearcher=true
actually worked in SolrCloud, by having a 'newSearcher' event listener that
used a semaphore to try and detect if/when a newSearcher was being warmed after
the client's commit call had already been returned.
I ran into some weird problems, and in mentioning them in passing to shalin, he
pointed me to this jira.
I'm attaching a patch showing what i have at the moment -- it doesn't really do
much towards my current goal, but it does help demonstrate a few weird things
about when/how newSearchers are being opened in SolrCloud that seem relevant to
the related problems shalin mentioned when creating this jira...
* I had to put some special code in to do an initial commit (on the empty
index) to work around the fact that evidently SolrCore will re-open a
newSearcher after the very first commit -- even if no documents have been added
to its index.
** BUT: This doesn't happen on _every_ SolrCore ??? ... it seems to be an "N-1"
situation, where N is the total number of cores. i.e.: in a 2 shard collection
with repFactor=2, apparently only 3 of the cores open a newSearcher on this
(empty) commit
** see the usage of {{nocommit_HACK_ON_HACK_nocommit_seriously_nocommit}} for
details
* Once the test actually starts adding docs to the index, things work
predictably -- for a bit...
** The test sequentially does an add followed by a commit, and verifies (using
the semaphore) that only 2 replicas (presumably of the shard the added document
belongs to) open a newSearcher
** in reality, eventually a commit happens where every SolrCore re-opens a
newSearcher (even though nothing in the index has changed on the 2 nodes of the
other shards) and there aren't enough permits in the semaphore.
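
For anyone who hasn't opened the patch, the permit bookkeeping described above can be sketched roughly as follows. This is only a minimal, self-contained illustration and not the code in the attached patch: the class and method names are hypothetical, and the 'newSearcher' callback is simulated with a plain method call instead of a real Solr event listener.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Hypothetical sketch (not Solr's API, not the attached patch): the test hands
 * out one permit per newSearcher it expects, and every simulated newSearcher
 * event tries to take one. An event that finds no permit is "unexpected".
 */
public class NewSearcherPermitSketch {
    // Starts empty; the test releases permits before each commit.
    private final Semaphore expectedOpens = new Semaphore(0);
    private final AtomicInteger unexpectedOpens = new AtomicInteger();

    /** Test side: declare how many cores may open a newSearcher next. */
    public void expect(int cores) {
        expectedOpens.release(cores);
    }

    /** Listener side: called once per core that opens a newSearcher. */
    public boolean onNewSearcher() {
        boolean hadPermit = expectedOpens.tryAcquire();
        if (!hadPermit) {
            unexpectedOpens.incrementAndGet();
        }
        return hadPermit;
    }

    public int leftoverPermits() {
        return expectedOpens.availablePermits();
    }

    public int unexpectedOpens() {
        return unexpectedOpens.get();
    }

    public static void main(String[] args) {
        NewSearcherPermitSketch sketch = new NewSearcherPermitSketch();
        // Assumed scenario: 2 shards x repFactor=2, so 4 cores in total.
        // After add + commit we expect only the 2 replicas of one shard.
        sketch.expect(2);
        // Simulate the commit where every core re-opens a newSearcher.
        for (int core = 0; core < 4; core++) {
            sketch.onNewSearcher();
        }
        System.out.println("unexpected newSearcher opens: " + sketch.unexpectedOpens());
        System.out.println("leftover permits: " + sketch.leftoverPermits());
    }
}
```

In this sketch, leftover permits after a commit would mean fewer cores opened a newSearcher than expected, while a non-zero unexpected count corresponds to the "not enough permits" case above.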
----
I'm not planning to pursue this at the moment, but i wanted to share it in case
it can serve as a useful starting point for anyone else who wants to look into
figuring out why it's happening and/or reducing how often SolrCloud is opening
newSearchers.
> SolrCloud opens multiple searchers on replica creation/startup
> --------------------------------------------------------------
>
> Key: SOLR-10726
> URL: https://issues.apache.org/jira/browse/SOLR-10726
> Project: Solr
> Issue Type: Bug
> Security Level: Public(Default Security Level. Issues are Public)
> Components: search, SolrCloud
> Affects Versions: 6.5.1
> Reporter: Shalin Shekhar Mangar
> Labels: difficulty-medium, impact-high
> Attachments: semaphore.newsearcher.test.log.txt,
> SOLR-10726.semaphore.newsearcher.test.patch
>
>
> I was investigating some curious behavior reported by a customer around first
> searcher event listeners and multiple searchers being opened when adding a
> new replica.
> Turns out that if you add a new replica to SolrCloud:
> 1) Searchers are opened at least twice and possibly a third time
> 2) the first time is because of a new core coming online and opening searcher
> on an empty index -- only firstSearcher event listeners are fired here
> 3) second time is after replication is complete and we have new index files
> available -- firstSearcher event listeners are fired again because the old
> searcher opened on core load has already been closed and disposed so this is
> technically again a first searcher
> 4) third time happens after documents buffered during recovery are replayed
> -- if there was no indexing happening on leader then this step is skipped --
> a newSearcher event is fired here because we had already opened a searcher in
> the last step
> Now if instead of a new replica, a solr node is restarted then there can be
> up to four searcher opens -- the additional open is because of log replay on
> startup.
> So Solr spends a lot of time on unnecessary warming/autowarming on searchers
> that are discarded. It is not just warming because sometimes plugins such as
> SpellCheckComponent and SuggestComponent can also tie in to these listener
> events.
> We can probably cut a few of them or at least defer the decision of whether
> to fire these listeners to places such as RecoveryStrategy which have a
> better idea of whether it is worth it.