Khalid Alharbi created SOLR-9591:
------------------------------------
Summary: Shards and replicas go down when indexing large number of
files
Key: SOLR-9591
URL: https://issues.apache.org/jira/browse/SOLR-9591
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Components: SolrCloud
Affects Versions: 5.5.2
Reporter: Khalid Alharbi
Solr shards and replicas go down when indexing a large number of text files
using the default [extracting request
handler|https://cwiki.apache.org/confluence/x/c4DxAQ].
{code}
curl 'http://localhost:8983/solr/myCollection/update/extract?literal.id=someId' \
  -F "myfile=@/data/file1.txt"
{code}
I then commit after indexing 5,000 files using:
{code}
curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
{code}
This was on Solr (SolrCloud) version 5.5.2 with an external ZooKeeper cluster
of five nodes. I also tried this on a single-node SolrCloud with the embedded
ZooKeeper, but the collection went down as well. In both cases the error message
is always "ERROR null DistributedUpdateProcessor ClusterState says we are the
leader, but locally we don't think so".
I managed to come up with a workaround that let me index over 400K files
without replicas going down with that error message. The workaround is to
index 5K files, restart Solr, wait for the shards and replicas to become active,
then index the next 5K files, and repeat until all files are indexed.
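
The workaround loop can be sketched roughly as the bash script below. This is
only an illustration, not the exact script I ran. The restart command
(RESTART_CMD), the "doc-N" ids, the *.txt glob and the 5-second poll interval
are all assumptions to adapt to your install. Replica state is read from the
Collections API CLUSTERSTATUS action with a simple grep, which is a rough check
rather than a full health test.

```shell
#!/usr/bin/env bash
# Sketch of the batch-and-restart workaround. All defaults are assumptions.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"
COLLECTION="${COLLECTION:-myCollection}"
BATCH_SIZE="${BATCH_SIZE:-5000}"
# Assumed restart command; replace with whatever restarts your node(s).
RESTART_CMD="${RESTART_CMD:-bin/solr restart -c -p 8983}"

# Number of batches needed for a file count (ceiling division).
batch_count() {
  echo $(( ($1 + $2 - 1) / $2 ))
}

# Reads CLUSTERSTATUS JSON on stdin; succeeds only if at least one
# "state" field is present and every one of them is "active".
all_active() {
  local states
  states=$(grep -o '"state" *: *"[a-z_]*"')
  [ -n "$states" ] && ! grep -qv '"active"' <<<"$states"
}

# Post one file to the extracting request handler under the given id.
index_file() {
  curl -sf "$SOLR_URL/$COLLECTION/update/extract?literal.id=$2" \
    -F "myfile=@$1" > /dev/null
}

commit() {
  curl -sf "$SOLR_URL/$COLLECTION/update?commit=true&wt=json" > /dev/null
}

# Restart Solr, then poll until all shards and replicas report active.
# Note: this waits forever if the collection never recovers.
restart_and_wait() {
  $RESTART_CMD || return 1
  local status_url="$SOLR_URL/admin/collections?action=CLUSTERSTATUS&collection=$COLLECTION&wt=json"
  until curl -sf "$status_url" | all_active; do
    sleep 5
  done
}

main() {
  local dir=$1 n=0 f
  for f in "$dir"/*.txt; do
    [ -e "$f" ] || continue
    index_file "$f" "doc-$n" || { echo "indexing failed: $f" >&2; return 1; }
    n=$((n + 1))
    # After each full batch: commit, restart, wait for recovery.
    if [ $((n % BATCH_SIZE)) -eq 0 ]; then
      commit && restart_and_wait || return 1
    fi
  done
  commit
}

# Only run when a data directory is passed, e.g.: ./index_batches.sh /data
if [ "$#" -gt 0 ]; then
  main "$@"
fi
```

With the default batch size, 400K files means 80 restart cycles, so this is a
stopgap rather than a fix.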
If this is not enough to investigate the issue, I will be happy to provide
more details.