Mike Drob commented on SOLR-9330:

You're right about there still being a race condition in my solution. For some 
reason I wasn't thinking about what happens in that case.

Re: JMX exception - yes, it's the same thing. We have a custom JMX handler on 
top of the mbean endpoint that has run into this same issue. I should have 
cleaned up the commit message there.

[~werder], the reload operation isn't the only place where this can throw 
exceptions - it can also happen during core delete or during shutdown. So if we 
add synchronization, we will need to look at {{CoreContainer::shutdown}}, 
{{SolrCores::close}}, etc. I don't have tests written for all of these, but 
it's a nearly identical stack trace, given that reload is essentially unload + 

> Race condition between core reload and statistics request
> ---------------------------------------------------------
>                 Key: SOLR-9330
>                 URL: https://issues.apache.org/jira/browse/SOLR-9330
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>    Affects Versions: 5.5
>            Reporter: Andrey Kudryavtsev
>         Attachments: SOLR-9330.patch, SOLR-9390.patch, SOLR-9390.patch, 
> SOLR-9390.patch, too_sync.patch
> It so happened that we executed these two requests consecutively in Solr 5.5:
> * Core reload: /admin/cores?action=RELOAD&core=_coreName_
> * Check core statistics: /_coreName_/admin/mbeans?stats=true
> And sometimes the second request ends with this error:
> {code}
> ERROR org.apache.solr.servlet.HttpSolrCall - null:org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
>       at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:274)
>       at org.apache.lucene.index.StandardDirectoryReader.getVersion(StandardDirectoryReader.java:331)
>       at org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:119)
>       at org.apache.lucene.index.FilterDirectoryReader.getVersion(FilterDirectoryReader.java:119)
>       at org.apache.solr.search.SolrIndexSearcher.getStatistics(SolrIndexSearcher.java:2404)
>       at org.apache.solr.handler.admin.SolrInfoMBeanHandler.addMBean(SolrInfoMBeanHandler.java:164)
>       at org.apache.solr.handler.admin.SolrInfoMBeanHandler.getMBeanInfo(SolrInfoMBeanHandler.java:134)
>       at org.apache.solr.handler.admin.SolrInfoMBeanHandler.handleRequestBody(SolrInfoMBeanHandler.java:65)
>       at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
>       at org.apache.solr.core.SolrCore.execute(SolrCore.java:2082)
>       at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:670)
>       at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:458)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:225)
>       at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:183)
> {code}
> If my understanding of SolrCore internals is correct, it happens because of 
> the asynchronous nature of the reload request:
> * The new searcher is "registered" in a separate thread
> * The old searcher is closed in that same separate thread, and only after the 
> new one is registered
> * When the old searcher is closing, it removes itself from the map of MBeans
> * If a statistics request happens before the old searcher is completely 
> removed from everywhere, an exception can happen.
> What do you think about introducing a new parameter for the reload request 
> that makes it fully synchronized? Basically, it would force the reload to call {code}  
> SolrCore#getSearcher(boolean forceNew, boolean returnSearcher, final Future[] 
> waitSearcher, boolean updateHandlerReopens) {code} with waitSearcher != null
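
The sequence described in the issue can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Solr code: `ReloadRaceSketch`, `ToySearcher`, `stats()` and `reloadThenStats()` are hypothetical names, an `IllegalStateException` stands in for Lucene's `AlreadyClosedException`, and the two latches are test hooks that hold the race window open so the outcome is repeatable. Blocking on the returned future plays the role that a non-null waitSearcher is proposed to play; the sketch does not call the real `SolrCore#getSearcher`.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

// Toy model only: these classes imitate the behaviour described above and are not Solr code.
class ReloadRaceSketch {

    // Stand-in for a searcher whose statistics read from an index reader.
    static final class ToySearcher {
        final String name;
        private volatile boolean readerClosed;

        ToySearcher(String name) { this.name = name; }

        void closeReader() { readerClosed = true; }

        // Like the getVersion() call in the trace, this fails once the reader is closed.
        String getStatistics() {
            if (readerClosed) {
                throw new IllegalStateException("this IndexReader is closed");
            }
            return "open";
        }
    }

    // Stand-in for the map of MBeans that the statistics handler walks.
    private final Map<String, ToySearcher> mbeans = new ConcurrentHashMap<>();

    // Hypothetical statistics request: asks every registered bean for its stats.
    String stats() {
        Map<String, String> out = new TreeMap<>();
        try {
            for (ToySearcher s : mbeans.values()) {
                out.put(s.name, s.getStatistics());
            }
        } catch (IllegalStateException e) {
            return "error: " + e.getMessage();
        }
        return out.toString();
    }

    // Runs one reload and one statistics request; returns what the request saw.
    static String reloadThenStats(boolean waitForSearcher) {
        ReloadRaceSketch core = new ReloadRaceSketch();
        ToySearcher old = new ToySearcher("searcher-1");
        core.mbeans.put(old.name, old);

        CountDownLatch oldReaderClosed = new CountDownLatch(1);
        CountDownLatch mayDeregister = new CountDownLatch(1);

        // Background part of the reload, in the order listed in the issue.
        CompletableFuture<Void> reload = CompletableFuture.runAsync(() -> {
            ToySearcher fresh = new ToySearcher("searcher-2");
            core.mbeans.put(fresh.name, fresh);   // new searcher is registered
            old.closeReader();                    // old searcher starts closing
            oldReaderClosed.countDown();
            await(mayDeregister);                 // test hook: holds the race window open
            core.mbeans.remove(old.name);         // old searcher leaves the MBean map
        });

        if (waitForSearcher) {
            mayDeregister.countDown();            // no artificial stall in this mode
            reload.join();                        // block until the reload has fully finished
            return core.stats();
        }
        await(oldReaderClosed);                   // land inside the window on purpose
        try {
            return core.stats();
        } finally {
            mayDeregister.countDown();            // let the background work finish
            reload.join();
        }
    }

    private static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(reloadThenStats(false)); // request lands mid-reload
        System.out.println(reloadThenStats(true));  // request waits for the reload
    }
}
```

In this model the request that does not wait hits the closed searcher while it is still registered, and the request that waits sees only the new searcher. Note that waiting protects only the caller that issued the reload; a statistics request from another client could still arrive inside the window.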
