Thanks for your reply.

2> In production, many documents arrive for indexing within a single second. If I change the hard commit interval to 60 seconds, hard commits will run at much shorter intervals than they do now. Is that okay for performance?

3> We cannot set the soft commit to 60 seconds as of now, because our product is built for NRT search: changes must be visible immediately after indexing.
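One idea from our side: instead of any client issuing explicit commits, we could pass commitWithin on every add, so visibility stays near real time but the window is controlled in one place and can be stretched later. A rough SolrJ 5.x sketch of what I mean (the zkHost string, collection, and id are placeholders, not our real values):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CommitWithinSketch {
    public static void main(String[] args) throws Exception {
        // placeholder ZooKeeper ensemble; we index through ZooKeeper, not directly to a shard
        try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
            client.setDefaultCollection("product");
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-1");
            // no explicit commit() call; Solr makes the doc searchable within 1000 ms
            // (commitWithin triggers a soft commit by default)
            client.add(doc, 1000);
        }
    }
}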
I have also analysed my production logs further. The replica became the leader, and some time afterwards documents still arrived at the new leader from the old one, as if the new leader were still only a replica. My production logs:

Shard log (10.102.119.85):

2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1 x:product] o.a.s.u.StreamingSolrClients error
org.apache.solr.common.SolrException: Service Unavailable

request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)

Replica log (10.101.111.80):

2019-04-08 12:52:19.717 INFO (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [ ] o.a.s.c.c.ZkStateReader A live node change: [WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/live_nodes], has occurred - updating... (live nodes size: [4])
2019-04-08 12:52:19.725 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Running the leader process for shard=shard1 and weAreReplacement=true and leaderVoteWait=180000
2019-04-08 12:52:19.727 INFO (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [ ] o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
2019-04-08 12:52:19.764 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext Checking if I should try and be the leader.
2019-04-08 12:52:19.765 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext My last published State was Active, it's okay to be the leader.
2019-04-08 12:52:19.766 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I may be the new leader - try and sync
2019-04-08 12:52:19.767 WARN (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.RecoveryStrategy Stopping recovery for core=[product] coreNodeName=[core_node3]
2019-04-08 12:52:20.291 INFO (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [ ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:20.532 INFO (zkCallback-4-thread-207-processing-n:10.101.111.80:8983_solr) [ ] o.a.s.c.c.ZkStateReader A cluster state change: [WatchedEvent state:SyncConnected type:NodeDataChanged path:/clusterstate.json], has occurred - updating... (live nodes size: [3])
2019-04-08 12:52:22.274 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync replicas to http://10.101.111.80:8983/solr/product/
2019-04-08 12:52:22.274 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy Sync Success - now sync replicas to me
2019-04-08 12:52:22.274 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.SyncStrategy http://10.101.111.80:8983/solr/product/ has no replicas
2019-04-08 12:52:22.275 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContextBase Creating leader registration node /collections/product/leaders/shard1/leader after winning as /collections/product/leader_elect/shard1/election/245953614547976270-core_node3-n_0000000001
2019-04-08 12:52:22.384 INFO (zkCallback-4-thread-202-processing-n:10.101.111.80:8983_solr) [c:product s:shard1 r:core_node3 x:product] o.a.s.c.ShardLeaderElectionContext I am the new leader: http://10.101.111.80:8983/solr/product/ shard1
2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
2019-04-08 12:54:01.294 INFO (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product] webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)

First, for what reason did the replica become the leader? And if the replica did become the leader, why, two minutes later, were documents still being forwarded to the new leader as if it were a replica? We index documents through ZooKeeper rather than directly against a shard.
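For reference, this is roughly how the leader recorded in ZooKeeper can be read back with SolrJ while the problem is happening (a sketch only; the zkHost string is a placeholder for our real ensemble):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.cloud.ZkStateReader;

public class LeaderCheck {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient("zk1:2181,zk2:2181,zk3:2181")) {
            client.connect(); // initialise the cluster state reader
            ZkStateReader reader = client.getZkStateReader();
            // ask ZooKeeper which core it currently records as leader of product/shard1
            String leaderUrl = reader.getLeaderUrl("product", "shard1", 4000);
            System.out.println("ZooKeeper's current shard1 leader: " + leaderUrl);
        }
    }
}

If this disagrees with the node that is still sending FROMLEADER updates, that would confirm the old leader kept acting on stale cluster state.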
________________________________
From: Erick Erickson <erickerick...@gmail.com>
Sent: Friday, April 12, 2019 8:52 PM
To: vishal patel
Subject: Re: Shard and replica went down in Solr 6.1.0

What was unclear about the statement "These config options will _not_ open any new searchers. Period."?

You _cannot_ be opening new searchers automatically; some client somewhere _must_ be issuing a commit. And you shouldn't do that. Until you find the external source that's issuing the commit, you'll make no progress on this.

Is this an explicit enough recommendation?

1> Find the source of the commits and stop doing that.
2> Change your hard commit interval to 60 seconds. Keep openSearcher set to false.
3> Change your soft commit to 60 seconds (this will imply that you can't search docs for a minute).
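To be concrete about 1>, the client-side pattern to hunt for looks something like this hypothetical SolrJ sketch (not your actual code, obviously):

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CommitPerUpdateAntiPattern {
    public static void main(String[] args) throws Exception {
        try (CloudSolrClient client = new CloudSolrClient("zk1:2181")) {
            client.setDefaultCollection("product");
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-2");
            client.add(doc);
            client.commit(); // <- the culprit: every explicit commit opens a new searcher
        }
    }
}

Any client committing like that on every update will keep opening searchers no matter what your autoCommit and autoSoftCommit settings say.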
This may or may not be the root of the "Request says it is coming from the leader" message. But it's certainly a contributing factor. We can identify this positively as a bad configuration, so let's eliminate it and _then_ look for more issues.

Best,
Erick

> On Apr 11, 2019, at 10:36 PM, vishal patel <vishalpatel200...@outlook.com> wrote:
>
> Thanks for your reply.
>
> Actually, all caches are removed in my solrconfig.xml, so the autowarm counts do not matter for us.
>
> And I read your given link about the hard commit and soft commit concepts.
> My production scenario is index-heavy and query-heavy [Near Real Time].
> So we set the hard commit max time to 10 minutes and the soft commit to -1 for each collection. [27 collections in production]
> Is that okay for production? Please give some input.
>
> And for what reason is the shard going down? How can we know that from the log?
>
> And please tell me the meaning of the error below, because I cannot find this type of error on any web page.
>
> Shard-1-replica
> -----------------
> 2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
> 2019-04-08 12:54:01.294 INFO (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product] webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
> 2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
> at
>
> Shard-1
> ------------------
> 2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3 x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
> org.apache.solr.common.SolrException: Service Unavailable
>
> First we got the error in the replica, then in the shard.
>
> From: Erick Erickson <erickerick...@gmail.com>
> Sent: Friday, April 12, 2019 12:04 AM
> To: vishal patel
> Subject: Re: Shard and replica went down in Solr 6.1.0
>
> Well, it leads to "too many on deck searchers", obviously.
>
> Here's most of what you want to know about commits and the like:
> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> I always push back on any setting that opens a new searcher every second. You might as well set all your autowarm counts in solrconfig.xml to zero; autowarming isn't doing you any good. You're also creating a lot of garbage churn, etc.
>
> Set your soft commit to as long as you can. If you can set expectations like "it may take up to a minute before your changes are visible"...
>
> > On Apr 11, 2019, at 10:09 AM, vishal patel <vishalpatel200...@outlook.com> wrote:
> >
> > Actually, in our application bulky documents need to be indexed, and at the same time we want to see those documents.
> > So in production we keep the auto commit at 10 minutes and the auto soft commit at 1 second.
> > Is that okay?
> >
> > From: Erick Erickson <erickerick...@gmail.com>
> > Sent: Thursday, April 11, 2019 10:23:00 PM
> > To: vishal patel
> > Subject: Re: Shard and replica went down in Solr 6.1.0
> >
> > We're not quite on the same page.
> >
> > These config options will _not_ open any new searchers. Period. They're not the source of your max warming searchers error. Therefore _somebody_ is issuing commits. You need to find out who and stop them.
> >
> > Then change your settings in solrconfig to:
> > 1> Remove maxDocs from autoCommit. It's probably leading to useless work.
> > 2> I'd shorten my maxTime in autoCommit to, say, 1 minute. This isn't very important, but 10 minutes is unnecessary.
> > 3> Change your autoSoftCommit to as long as your application can tolerate, say 10 minutes if possible.
> > 4> Find out who is issuing commits and stop them.
> >
> > With these settings, unless you have outrageous "autowarmCount" settings in solrconfig.xml for the caches, you should not see any overlapping on-deck searchers. I usually start with autowarmCount settings in the 10-16 range.
> >
> > Best,
> > Erick
> >
> > > On Apr 11, 2019, at 5:40 AM, vishal patel <vishalpatel200...@outlook.com> wrote:
> > >
> > > Thanks Erick,
> > > I got your point. As per you, Solr will not go down due to the "performance warning", and there is no need to change the maxDocs value. You talked about the number of searchers, but in solrconfig.xml there are only 2: [<maxWarmingSearchers>2</maxWarmingSearchers>]
> > >
> > > In production we have 27 collections, 2 shards and 2 replicas, and 3 ZooKeepers, and more than 30000 documents are indexed within 10 minutes.
> > > How can I know how many searchers are open at a specific time? (A sketch for this follows below.)
> > > As per my understanding, a Solr searcher opens on a soft or hard commit. Am I right? My commit times are below:
> > > <autoCommit>
> > > <maxTime>600000</maxTime>
> > > <maxDocs>20000</maxDocs>
> > > <openSearcher>false</openSearcher>
> > > </autoCommit>
> > >
> > > <autoSoftCommit>
> > > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > > </autoSoftCommit>
> > >
> > > I do not write any code for opening the Solr index searcher, and I cannot find any error related to that.
> > >
> > > All my collections have the same configuration, meaning the same hard and soft commit times for all 27 collections. Is it an issue if two or more collections come up for hard commit at the same time?
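> > > To check the open searchers at a given moment, I think the core MBeans handler can be queried from SolrJ, something like the sketch below (the host and core in the URL are placeholders); currently open searchers should show up as Searcher entries in the CORE category:
> > >
> > > import org.apache.solr.client.solrj.SolrQuery;
> > > import org.apache.solr.client.solrj.impl.HttpSolrClient;
> > >
> > > public class SearcherStats {
> > >     public static void main(String[] args) throws Exception {
> > >         // point at one core directly; the URL is a placeholder
> > >         try (HttpSolrClient client = new HttpSolrClient("http://10.102.119.85:8983/solr/product")) {
> > >             SolrQuery q = new SolrQuery();
> > >             q.setRequestHandler("/admin/mbeans"); // core-level MBeans report
> > >             q.set("cat", "CORE");                 // CORE category includes searcher entries
> > >             q.set("stats", "true");
> > >             System.out.println(client.query(q).getResponse());
> > >         }
> > >     }
> > > }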
> > > Below, again I am sending more accurate log details.
> > >
> > > Shard-1 Log ::
> > > --------------
> > > 2019-04-08 12:54:01.395 INFO (commitScheduler-102-thread-1) [c:collection1 s:shard1 r:core_node1 x:collection1] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.395 INFO (commitScheduler-118-thread-1) [c:collection2 s:shard1 r:core_node1 x:collection2] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.395 INFO (commitScheduler-110-thread-1) [c:collection3 s:shard1 r:core_node1 x:collection3] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.394 INFO (commitScheduler-109-thread-1) [c:collection4 s:shard1 r:core_node1 x:collection4] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > > 2019-04-08 12:54:01.394 INFO (commitScheduler-99-thread-1) [c:collection5 s:shard1 r:core_node1 x:collection5] o.a.s.u.DirectUpdateHandler2 start commit{,optimize=false,openSearcher=false,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
> > >
> > > 2019-04-08 12:54:01.405 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product x:product r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product) [c:product s:shard1 r:core_node1 x:product] o.a.s.u.StreamingSolrClients error
> > > org.apache.solr.common.SolrException: Service Unavailable
> > >
> > > request: http://10.101.111.80:8983/solr/product/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct%2F&wt=javabin&version=2
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
> > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > Shard-1-replica-1 Log ::
> > > --------------
> > > 2019-04-08 12:54:01.294 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.DistributedUpdateProcessor Request says it is coming from leader, but we are the leader: update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2
> > > 2019-04-08 12:54:01.294 INFO (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.u.p.LogUpdateProcessorFactory [product] webapp=/solr path=/update params={update.distrib=FROMLEADER&distrib.from=http://10.102.119.85:8983/solr/product/&wt=javabin&version=2}{} 0 0
> > > 2019-04-08 12:54:01.295 ERROR (qtp1239731077-1022464) [c:product s:shard1 r:core_node3 x:product] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: Request says it is coming from leader, but we are the leader
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:621)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:392)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:320)
> > > at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:679)
> > > at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
> > > at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:97)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:179)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readIterator(JavaBinUpdateRequestCodec.java:135)
> > > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:274)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$1.readNamedList(JavaBinUpdateRequestCodec.java:121)
> > > at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:239)
> > > at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:157)
> > > at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:186)
> > > at org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:107)
> > > at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:54)
> > > at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
> > > at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
> > > at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:156)
> > > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2036)
> > > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:657)
> > > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:464)
> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
> > > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
> > > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1668)
> > > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:581)
> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
> > > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
> > > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)
> > > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1160)
> > > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:511)
> > > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
> > > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1092)
> > > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
> > > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)
> > > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)
> > > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)
> > > at org.eclipse.jetty.server.Server.handle(Server.java:518)
> > > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:308)
> > > at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:244)
> > > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:273)
> > > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:95)
> > > at org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)
> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceAndRun(ExecuteProduceConsume.java:246)
> > > at org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:156)
> > > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:654)
> > > at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:572)
> > > at java.lang.Thread.run(Thread.java:745)
> > >
> > > After this, one by one, all the collections went down, and we cannot identify why this ERROR came or what it means.
> > >
> > > My versions in production:
> > > zookeeper : 3.4.6
> > > java : 8
> > > solrj : 5.2.0
> > >
> > > Please guide me...
> > >
> > > From: Erick Erickson <erickerick...@gmail.com>
> > > Sent: Wednesday, April 10, 2019 8:35 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: Re: Shard and replica went down in Solr 6.1.0
> > >
> > > Solr will not go down due to a "performance warning". All that means is that you're opening too many searchers in a short time. If you're somehow opening a huge number of searchers, then maybe, but IIRC you can't really open more than two. And don't think you can "fix" this by upping maxWarmingSearchers; that'll only make the problem worse, as every new searcher chews up memory that's kept until the old searcher is done with outstanding requests.
> > >
> > > Which is weird, because with those settings you shouldn't be opening _any_ new searchers. So my guess is that some external client is doing that, and this is usually an anti-pattern. Don't do it, please. Just set your soft commit to what you need and leave it at that. As long as possible. Here are all the gory details:
> > > https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
> > >
> > > I'd also not bother with maxDocs; that's much less predictable than just setting a reasonable time for autocommit. A minute or two is usually reasonable, especially when openSearcher is false.
> > >
> > > Not sure what's really generating that error; take a look at all your other Solr logs to see if there's a cause.
> > >
> > > Best,
> > > Erick
> > >
> > > > On Apr 10, 2019, at 5:21 AM, vishal patel <vishalpatel200...@outlook.com> wrote:
> > > >
> > > > I have 2 shards and 2 replicas of Solr 6.1.0.
> > > > One shard and one replica went down, and I got the ERROR below:
> > > >
> > > > 2019-04-08 12:54:01.469 INFO (commitScheduler-131-thread-1) [c:products s:shard1 r:core_node1 x:product1] o.a.s.s.SolrIndexSearcher Opening [Searcher@24b9127f[product1] main]
> > > > 2019-04-08 12:54:01.468 INFO (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy SolrDeletionPolicy.onCommit: commits: num=2
> > > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he5,generation=22541}
> > > > commit{dir=G:\SolrCloud\solr1\server\solr\product2\data\index.20180412060518798,segFN=segments_he6,generation=22542}
> > > > 2019-04-08 12:54:01.556 INFO (commitScheduler-110-thread-1) [c:product2 s:shard1 r:core_node1 x:product2] o.a.s.c.SolrDeletionPolicy newest commit generation = 22542
> > > > 2019-04-08 12:54:01.465 WARN (commitScheduler-136-thread-1) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.c.SolrCore [product3] PERFORMANCE WARNING: Overlapping onDeckSearchers=2
> > > >
> > > > 2019-04-08 12:54:01.534 ERROR (updateExecutor-2-thread-36358-processing-http:////10.101.111.80:8983//solr//product3 x:product3 r:core_node1 n:10.102.119.85:8983_solr s:shard1 c:product3) [c:product3 s:shard1 r:core_node1 x:product3] o.a.s.u.StreamingSolrClients error
> > > > org.apache.solr.common.SolrException: Service Unavailable
> > > >
> > > > request: http://10.101.111.80:8983/solr/product3/update?update.distrib=FROMLEADER&distrib.from=http%3A%2F%2F10.102.119.85%3A8983%2Fsolr%2Fproduct3%2F&wt=javabin&version=2
> > > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.sendUpdateStream(ConcurrentUpdateSolrClient.java:320)
> > > > at org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient$Runner.run(ConcurrentUpdateSolrClient.java:185)
> > > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$22(ExecutorUtil.java:229)
> > > > at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$3/30175207.run(Unknown Source)
> > > > at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> > > > at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> > > > at java.lang.Thread.run(Thread.java:745)
> > > >
> > > > Note: product1, product2, and product3 are my collections.
> > > >
> > > > In my solrconfig.xml:
> > > > <autoCommit>
> > > > <maxTime>600000</maxTime>
> > > > <maxDocs>20000</maxDocs>
> > > > <openSearcher>false</openSearcher>
> > > > </autoCommit>
> > > >
> > > > <autoSoftCommit>
> > > > <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> > > > </autoSoftCommit>
> > > > <maxWarmingSearchers>2</maxWarmingSearchers>
> > > >
> > > > Many documents were committed at that time, and I found many commitScheduler threads in the log.
> > > > Is it possible that Solr went down due to the warning "PERFORMANCE WARNING: Overlapping onDeckSearchers=2"?
> > > > Do I need to update my autoCommit or maxWarmingSearchers?