Re: Meet CorruptIndexException while shutdown one node in Solr cloud

2017-09-20 Thread wg85907
Hi Erick,
    Thanks for your advice that setting openSearcher to true is unnecessary
in my case. As for the CorruptIndexException issue, I think Solr should
handle it quite well too, because I always shut down Tomcat gracefully.
    Recently I ran a couple of tests on this issue. When I keep posting
update requests to Solr and stop one of the three Tomcat nodes in a
single-shard cluster, it is easy to reproduce the CorruptIndexException, no
matter whether the stopped node is the leader or a replica. So I think this
is a bug in Solr. Any idea how I can avoid hitting this issue? For example,
could I remove the node from ZooKeeper before stopping it? Also, please let
me know whether rebooting the Tomcat nodes is the only way to resolve the
memory issue. If I can control the field cache size, then the reboots are
unnecessary.
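
    To be concrete about removing the replica before stopping the node, this
is roughly what I have in mind, using the Collections API DELETEREPLICA
action through SolrJ. The ZooKeeper address, collection, shard, and replica
names below are just placeholders, and this is only a sketch of the idea,
not something we run today:

import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.CloudSolrServer;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class RemoveReplicaBeforeShutdown {
    public static void main(String[] args) throws SolrServerException, IOException {
        // Placeholder ZooKeeper ensemble address.
        CloudSolrServer server = new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        try {
            // DELETEREPLICA removes the replica from the cluster state so the
            // remaining nodes stop routing updates to it before we stop Tomcat.
            ModifiableSolrParams params = new ModifiableSolrParams();
            params.set("action", "DELETEREPLICA");
            params.set("collection", "mycollection"); // placeholder collection
            params.set("shard", "shard1");            // placeholder shard
            params.set("replica", "core_node3");      // placeholder replica name
            QueryRequest request = new QueryRequest(params);
            request.setPath("/admin/collections");
            NamedList<Object> response = server.request(request);
            System.out.println("Collections API response: " + response);
        } finally {
            server.shutdown();
        }
    }
}

After the Tomcat node is back up, I would add the replica again with the
ADDREPLICA action. Please tell me whether this is a reasonable approach.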

        Below is the trace from when Tomcat starts up and first hits the
CorruptIndexException:
2017-09-19 10:18:57,614 ERROR [RecoveryThread][RQ-Init] (SolrException.java:142) - SnapPull failed: org.apache.solr.common.SolrException: Error opening new searcher
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
        at org.apache.solr.handler.SnapPuller.openNewSearcherAndUpdateCommitPoint(SnapPuller.java:673)
        at org.apache.solr.handler.SnapPuller.fetchLatestIndex(SnapPuller.java:493)
        at org.apache.solr.handler.ReplicationHandler.doFetch(ReplicationHandler.java:337)
        at org.apache.solr.cloud.RecoveryStrategy.replicate(RecoveryStrategy.java:163)
        at org.apache.solr.cloud.RecoveryStrategy.doRecovery(RecoveryStrategy.java:447)
        at org.apache.solr.cloud.RecoveryStrategy.run(RecoveryStrategy.java:235)
Caused by: org.apache.lucene.index.CorruptIndexException: liveDocs.count()=10309577 info.docCount=15057819 info.getDelCount()=4748252 (filename=_4y65a_13g.del)
        at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:96)
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
        at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
        at org.apache.lucene.index.ReadersAndUpdates.getReadOnlyClone(ReadersAndUpdates.java:238)
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:104)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:422)
        at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:279)
        at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
        ... 7 more


Regards.
Geng, Wei 






Meet CorruptIndexException while shutdown one node in Solr cloud

2017-09-15 Thread wg85907
Hi team,
    Currently I am using Solr 4.10 in Tomcat. I have a one-shard SolrCloud
cluster with 3 replicas and set the heap size to 15 GB on each node. Because
we have a large data volume and a high query rate, we keep hitting frequent
full GCs. We checked this and found that a lot of the memory is used by
Solr's field cache. To work around it, we started rebooting the Tomcat
instances one by one on a schedule. We don't kill any process; we run
"catalina.sh stop" to shut down Tomcat gracefully. To keep messages from
piling up, we receive messages from users all the time and send an update
request to Solr as soon as a new message arrives, which means Solr may
receive update requests during shutdown. I think that is the reason we get
the CorruptIndexException: since we started these reboots, we see it all the
time. The trace is as below:
2017-09-14 04:25:49,241 ERROR [commitScheduler-15-thread-1][R31609] (CommitTracker) - auto commit error...: org.apache.solr.common.SolrException: Error opening new searcher
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1565)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:1677)
        at org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:607)
        at org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.lucene.index.CorruptIndexException: liveDocs.count()=33574 info.docCount=34156 info.getDelCount()=584 (filename=_1uvck_k.del)
        at org.apache.lucene.codecs.lucene40.Lucene40LiveDocsFormat.readLiveDocs(Lucene40LiveDocsFormat.java:96)
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:116)
        at org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:144)
        at org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:282)
        at org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3271)
        at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3262)
        at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:421)
        at org.apache.lucene.index.StandardDirectoryReader.doOpenIfChanged(StandardDirectoryReader.java:279)
        at org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
        at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
        ... 10 more


        Since we shut down Solr gracefully, I think Solr should be robust
enough to handle this case. Please give me some advice about why this
happens and what we can do to avoid it. P.S. Below is part of our
solrconfig.xml content:


6
true


1000
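
    Related to the field cache memory pressure above, one change I am
considering is enabling docValues on the fields we sort and facet on, so
that this data is read from the docValues files instead of being loaded into
the on-heap FieldCache. The field names and types below are only an example,
not our real schema, and I know this would require a full reindex:

<!-- Example only: docValues keeps sort/facet data out of the on-heap
     FieldCache (supported for string, numeric, and date fields in Solr 4.x). -->
<field name="timestamp" type="tdate"  indexed="true" stored="false" docValues="true"/>
<field name="category"  type="string" indexed="true" stored="false" docValues="true"/>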


Regards,
Geng, Wei





Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-20 Thread wg85907
Hi Walter, Shawn,
    Thanks for your quick replies; the information you provided is really
helpful. Now I know how to find the right way to resolve my issue.
Regards,
Geng, Wei 





Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-18 Thread wg85907
I don't mean that my ZooKeeper cluster is rebooting frequently; I just want
to make sure my query service stays stable when the ZooKeeper cluster has an
issue or is rebooted. I will run some tests to check whether there is a
problem here. Maybe the current ZooKeeper client already handles this case
well; hacking the client will always be the last choice.
Regards,
Geng, Wei





Re: Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-17 Thread wg85907
Hi Shawn,
    Thanks for your detailed explanation. The reason I want to shut down the
CloudSolrServer instance and create a new one is that I am concerned about
whether it can successfully reconnect to the ZooKeeper ensemble if the
ZooKeeper cluster has a problem and is rebooted. I will run the related
tests against version 6.5.0, which is the version I want to upgrade to, and
if there is any issue I will report it to you and your team as you
suggested. In any case, I will abandon the approach of shutting down/closing
the CloudSolrServer instance and creating a new one. The alternative option
is to manage the ZooKeeper connection myself by extending the
ZkClientClusterStateProvider class.
Regards,
Geng, Wei





Create too many zookeeper connections when recreate CloudSolrServer instance

2017-07-14 Thread wg85907
Hi Community,
    I use Solr 4.10.2 as our indexing tool and a singleton CloudSolrServer
instance to query it. When we hit an exception, for example when the current
Solr server does not respond, I create a new CloudSolrServer instance and
shut down the old one. Many query threads share the same CloudSolrServer
instance. In one case, thread A hits an exception, creates a new
CloudSolrServer instance, and begins shutting down the current one; from the
Solr code I know the first step of shutdown is to close the ZooKeeper
connection. At the same time, thread B may still be querying with this
instance, and the first step of a query is to check the ZooKeeper connection
and create a new one if it does not exist. Thread A then proceeds with the
shutdown, and the ZooKeeper connection created by thread B is left behind
with nothing using it. Because of this, we accumulate more and more
ZooKeeper connections until we cannot create a new one and see the exception
below on the ZooKeeper server side:

2017-07-06 09:42:37,595 [myid:5] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:10199:NIOServerCnxnFactory@193] - Too many connections from /169.171.87.37 - max is 60
    So I just want to know whether I am using CloudSolrServer in the wrong
way, and whether you have any suggestions on how to meet this requirement.
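
    One idea I have is to swap the shared reference to a fresh instance
first and only shut down the old instance after in-flight queries have had
time to finish, roughly like the sketch below. The holder class, field
names, collection name, and the 30-second delay are all my own placeholders,
not Solr API; only CloudSolrServer itself is real:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class SolrClientHolder {
    private final String zkHost;
    private final AtomicReference<CloudSolrServer> current =
            new AtomicReference<CloudSolrServer>();
    private final ScheduledExecutorService reaper =
            Executors.newSingleThreadScheduledExecutor();

    public SolrClientHolder(String zkHost) {
        this.zkHost = zkHost;
        this.current.set(newServer());
    }

    private CloudSolrServer newServer() {
        CloudSolrServer server = new CloudSolrServer(zkHost);
        server.setDefaultCollection("mycollection"); // placeholder collection
        return server;
    }

    // Query threads always read the current reference.
    public CloudSolrServer get() {
        return current.get();
    }

    // Called by a query thread that decides its instance is broken.
    public void replace(final CloudSolrServer broken) {
        // Only the thread that wins the compareAndSet swaps the instance, so a
        // burst of failures does not create a burst of new ZooKeeper connections.
        CloudSolrServer fresh = newServer();
        if (current.compareAndSet(broken, fresh)) {
            // Delay the shutdown so threads that already picked up the old
            // reference can finish their queries instead of racing the shutdown
            // and leaking the ZooKeeper connection they re-open mid-query.
            reaper.schedule(new Runnable() {
                public void run() {
                    broken.shutdown();
                }
            }, 30, TimeUnit.SECONDS);
        } else {
            fresh.shutdown(); // another thread already swapped in a new instance
        }
    }
}

Does a delayed shutdown like this look reasonable, or is there a better
pattern for replacing a shared CloudSolrServer instance?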
Regards,
Geng, Wei


