[
https://issues.apache.org/jira/browse/SOLR-7069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Forest Soup updated SOLR-7069:
------------------------------
Description:
When querying a collection that has a core in "down" state, if the request is
sent to the server hosting the "down" core while that server itself is active,
the query does not fail over to the good replica of the same shard on another
server.
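The failover expected here can be illustrated with a minimal sketch. This is
not Solr's routing code: the Replica class, the pickActiveReplica method, the
core name collection5_shard1_replica1, and the node URLs are all hypothetical
and exist only to show the expected behavior, which is to skip a local replica
that is "down" and route to an active replica of the same shard.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class ReplicaFailoverSketch {

    /** Hypothetical, simplified view of one shard replica; not a Solr class. */
    static final class Replica {
        final String coreName;
        final String shard;
        final String nodeUrl;
        final String state; // "active" or "down" in this sketch

        Replica(String coreName, String shard, String nodeUrl, String state) {
            this.coreName = coreName;
            this.shard = shard;
            this.nodeUrl = nodeUrl;
            this.state = state;
        }
    }

    /**
     * Returns an active replica of the given shard, preferring one on the
     * local node and otherwise falling back to the first active remote one.
     */
    static Optional<Replica> pickActiveReplica(List<Replica> replicas,
                                               String shard,
                                               String localNodeUrl) {
        Replica remote = null;
        for (Replica r : replicas) {
            if (!r.shard.equals(shard) || !"active".equals(r.state)) {
                continue; // wrong shard, or replica is not serving queries
            }
            if (r.nodeUrl.equals(localNodeUrl)) {
                return Optional.of(r); // healthy local replica: use it
            }
            if (remote == null) {
                remote = r; // remember the first healthy remote replica
            }
        }
        return Optional.ofNullable(remote);
    }

    public static void main(String[] args) {
        // Assumed layout: the local replica is down, its peer is active.
        List<Replica> replicas = Arrays.asList(
            new Replica("collection5_shard1_replica2", "shard1", "http://node-a:8080/solr", "down"),
            new Replica("collection5_shard1_replica1", "shard1", "http://node-b:8080/solr", "active"));

        Optional<Replica> target =
            pickActiveReplica(replicas, "shard1", "http://node-a:8080/solr");
        System.out.println(
            target.map(r -> r.coreName + " @ " + r.nodeUrl).orElse("no active replica"));
    }
}
```

With the layout assumed in main, the request received on node-a would be
routed to the replica on node-b instead of failing on the local "down" core.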
The steps to put a core into "down" state on an active server are:
1. Delete the content of the data folder of the core.
2. Restart the Solr server on which the core is located.
The core then shows as "down" while the other cores on the same server are
still active. See attached picture.
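Step 1 above can be scripted for a test environment. The following is a
minimal sketch using only the JDK; the class name EmptyCoreDataDir and the
method deleteContents are hypothetical, and the data folder path must be
supplied by the caller. It is destructive and is meant only for reproducing
the issue on a disposable node. Step 2 (restarting the server) is not covered.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class EmptyCoreDataDir {

    /** Deletes everything inside dataDir but keeps dataDir itself. */
    static void deleteContents(Path dataDir) throws IOException {
        List<Path> toDelete;
        try (Stream<Path> walk = Files.walk(dataDir)) {
            toDelete = walk
                .filter(p -> !p.equals(dataDir))
                // Reverse order lists children before their parent directories.
                .sorted(Comparator.reverseOrder())
                .collect(Collectors.toList());
        }
        for (Path p : toDelete) {
            Files.delete(p);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: EmptyCoreDataDir <path-to-core-data-folder>");
            return;
        }
        deleteContents(Paths.get(args[0]));
    }
}
```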
When we issue a query to the collection and send the request to the server
containing the "down" core, we receive the following error:
HTTP Status 500 - {msg=SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher,trace=org.apache.solr.common.SolrException: SolrCore 'collection5_shard1_replica2' is not available due to init failure: Error opening new searcher
    at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:827)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:309)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:217)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
    at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1040)
    at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:607)
    at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:316)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
    at java.lang.Thread.run(Thread.java:804)
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:844)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:630)
    at org.apache.solr.core.ZkContainer.createFromZk(ZkContainer.java:244)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:595)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:258)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:250)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:482)
    at java.util.concurrent.FutureTask.run(FutureTask.java:273)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1156)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:626)
    ... 1 more
Caused by: org.apache.solr.common.SolrException: Error opening new searcher
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1521)
    at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1633)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:827)
    ... 11 more
Caused by: java.io.FileNotFoundException: /mnt/solrdata1/solr/home/collection5_shard1_replica2/data/index/_12x.si (No such file or directory)
    at java.io.RandomAccessFile.<init>(RandomAccessFile.java:252)
    at org.apache.lucene.store.MMapDirectory.openInput(MMapDirectory.java:193)
    at org.apache.lucene.store.NRTCachingDirectory.openInput(NRTCachingDirectory.java:233)
    at org.apache.lucene.codecs.lucene46.Lucene46SegmentInfoReader.read(Lucene46SegmentInfoReader.java:49)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:340)
    at org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:404)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:843)
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:694)
    at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:400)
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:741)
    at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:77)
    at org.apache.solr.update.SolrIndexWriter.create(SolrIndexWriter.java:64)
    at org.apache.solr.update.DefaultSolrCoreState.createMainIndexWriter(DefaultSolrCoreState.java:267)
    at org.apache.solr.update.DefaultSolrCoreState.getIndexWriter(DefaultSolrCoreState.java:110)
    at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1484)
    ... 13 more
,code=500}
was: When querying a collection with a core in "down" state, if we send the
request to the server containing the "down" core, while the server is active,
it cannot failover to the good replica of same shard on another server.
> A down core(shard replica) on an active node cannot failover the query to its
> good peer
> ---------------------------------------------------------------------------------------
>
> Key: SOLR-7069
> URL: https://issues.apache.org/jira/browse/SOLR-7069
> Project: Solr
> Issue Type: Bug
> Components: SolrCloud
> Affects Versions: 4.7
> Reporter: Forest Soup