[ https://issues.apache.org/jira/browse/SOLR-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15545446#comment-15545446 ]

Khalid Alharbi commented on SOLR-9591:
--------------------------------------

{quote}
Does the issue happen if you commit after a smaller number of documents? Say 50 
or 100 or 1000?
{quote}
Yes, I did try committing after indexing 500 documents in a loop, but the shards 
went down on the fifth iteration (after indexing and committing around 2,500 
docs). Interestingly, when the collection is empty, it can index up to 20,000 
docs without an issue; the problem only surfaces once the collection already 
holds over 20K docs.
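For reference, the batch loop was roughly like this (collection name, document 
IDs, and file paths are placeholders; it uses the same extract and commit calls 
as in the description below):
{code}
#!/bin/bash
# Index 500 files per batch via the extracting handler, then commit.
# Collection name, document IDs, and file layout are illustrative.
i=0
for f in /data/*.txt; do
  curl "http://localhost:8983/solr/myCollection/update/extract?literal.id=doc$i" \
       -F "myfile=@$f"
  i=$((i + 1))
  if [ $((i % 500)) -eq 0 ]; then
    curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
  fi
done
{code}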

{quote}
What is your heap size set to?
{quote}
The heap size is 5g.
Running {{./bin/solr status}} shows: {{"memory":"2.1 GB (%44) of 4.8 GB"}}
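For reference, a 5g heap corresponds to starting the node with the -m flag; a 
rough equivalent of how this node was started (ZooKeeper hosts are placeholders):
{code}
# Start the SolrCloud node with a 5g min/max heap (ZK hosts are placeholders).
./bin/solr start -c -z zk1:2181,zk2:2181,zk3:2181 -m 5g
# Report node status, including JVM heap usage.
./bin/solr status
{code}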

> Shards and replicas go down when indexing large number of files
> ---------------------------------------------------------------
>
>                 Key: SOLR-9591
>                 URL: https://issues.apache.org/jira/browse/SOLR-9591
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: SolrCloud
>    Affects Versions: 5.5.2
>            Reporter: Khalid Alharbi
>         Attachments: solr_log_20161002_1504
>
>
> Solr shards and replicas go down when indexing a large number of text files 
> using the default [extracting request 
> handler|https://cwiki.apache.org/confluence/x/c4DxAQ].
> {code}
> curl 'http://localhost:8983/solr/myCollection/update/extract?literal.id=someId' \
>      -F "myfile=@/data/file1.txt"
> {code}
> and committing after indexing 5,000 files using:
> {code}
> curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
> {code}
> This was on Solr (SolrCloud) version 5.5.2 with an external ZooKeeper ensemble 
> of five nodes. I also tried this on a single-node SolrCloud with the embedded 
> ZooKeeper, but the collection went down as well. In both cases the error 
> message is always "ERROR null DistributedUpdateProcessor ClusterState says we 
> are the leader, but locally we don't think so".
> I came up with a workaround that let me index over 400K files without replicas 
> going down with that error: index 5K files, restart Solr, wait for the shards 
> and replicas to become active again, then index the next 5K files, and repeat.
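> The restart-and-wait step between batches was along these lines (the restart 
> flags and the status check are illustrative, not my exact commands):
> {code}
> # After committing a 5K batch: restart the node and wait for recovery.
> # ZooKeeper hosts and the polling interval are placeholders.
> ./bin/solr restart -c -z zk1:2181,zk2:2181,zk3:2181
> # Crude wait: poll CLUSTERSTATUS until no replica reports down or recovering.
> while curl -s 'http://localhost:8983/solr/admin/collections?action=CLUSTERSTATUS&wt=json' \
>       | grep -qE '"state":"(down|recovering)"'; do
>   sleep 10
> done
> {code}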
> If this is not enough to investigate the issue, I will be happy to provide 
> more details.


