[ https://issues.apache.org/jira/browse/SOLR-9591?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15543419#comment-15543419 ]
Khalid Alharbi commented on SOLR-9591:
--------------------------------------

Thank you, Kevin, for looking into this.

{quote} Are these plain text files or something else? {quote}

Yes, they are plain text files containing the Android framework API calls used by apps. Each text file belongs to one Android app and lists all the Android API methods that the app uses.

{quote} Does it happen with any set of text files you tried? {quote}

No. I initially thought so, but after restarting Solr, the files that had previously failed were indexed without any problem. If I do not restart Solr, indexing keeps failing with the aforementioned error message.

{quote} Are these text files publicly available? {quote}

Yes, but only 60 text files: [GitHub Link|https://github.com/sikuli/sieveable/tree/master/fixtures/code]

{quote} Since you trying single node with embedded, does this same error occur with the latest Solr release? {quote}

I tried this on every Solr 5.x release from 5.1.0 to 5.5.2.

{quote} Can you provide the full stack trace? {quote}

Sure, I have attached one of the log files. Below is the error that keeps occurring over and over while I index those files.

{code}
2016-10-02 14:13:29.194 ERROR (qtp263094610-15) [c:my_collection s:shard1 r:core_node2 x:my_collection_shard1_replica1] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: ClusterState says we are the leader (http://XX.XX.XX.XX:8983/solr/my_collection_shard1_replica1), but locally we don't think so. Request came from null
	at org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:622)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:385)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:315)
	at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:671)
	at org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
	at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
	at org.apache.solr.update.processor.AbstractDefaultValueUpdateProcessorFactory$DefaultValueUpdateProcessor.processAdd(AbstractDefaultValueUpdateProcessorFactory.java:93)
	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:126)
	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:131)
	at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:237)
	at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:69)
	at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:155)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:2102)
	at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:654)
	at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:460)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:257)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:208)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.eclipse.jetty.server.Server.handle(Server.java:499)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
{code}


> Shards and replicas go down when indexing large number of files
> ---------------------------------------------------------------
>
>                 Key: SOLR-9591
>                 URL: https://issues.apache.org/jira/browse/SOLR-9591
>             Project: Solr
>          Issue Type: Bug
>   Security Level: Public (Default Security Level. Issues are Public)
>       Components: SolrCloud
>    Affects Versions: 5.5.2
>         Reporter: Khalid Alharbi
>      Attachments: solr_log_20161002_1504
>
>
> Solr shards and replicas go down when indexing a large number of text files
> using the default [extracting request handler|https://cwiki.apache.org/confluence/x/c4DxAQ]:
> {code}
> curl 'http://localhost:8983/solr/myCollection/update/extract?literal.id=someId' -F "myfile=@/data/file1.txt"
> {code}
> and committing after indexing 5,000 files using:
> {code}
> curl 'http://localhost:8983/solr/myCollection/update?commit=true&wt=json'
> {code}
> This was on Solr (SolrCloud) 5.5.2 with an external ZooKeeper ensemble of five
> nodes. I also tried it on a single-node SolrCloud with the embedded ZooKeeper,
> but the collection went down as well. In both cases the error message is always
> "ERROR null DistributedUpdateProcessor ClusterState says we are the leader,
> but locally we don't think so".
> I managed to come up with a workaround that let me index over 400K files
> without replicas going down with that error: index 5K files, restart Solr,
> wait for shards and replicas to become active, then index the next 5K files,
> and repeat.
> If this is not enough to investigate this issue, I will be happy to provide
> more details.
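For reference, the batching-and-restart workaround described in the issue can be scripted. The following is a minimal sketch, not the exact commands used: it assumes a single-node SolrCloud started from the Solr install directory with bin/solr, a collection named myCollection, plain-text files under /data, and it uses the ping handler as a rough stand-in for "wait until shards and replicas are active again".

{code}
#!/usr/bin/env bash
# Sketch of the workaround: index files in batches of 5K, commit each batch,
# restart Solr, wait until it responds again, then continue with the next batch.
# Assumed values: collection "myCollection", files under /data, script run from
# the Solr install directory so that bin/solr is on a relative path.

SOLR_URL="http://localhost:8983/solr"
COLLECTION="myCollection"
BATCH_SIZE=5000
count=0

for f in /data/*.txt; do
  id=$(basename "$f" .txt)
  # Post the file to the extracting request handler. The "@" makes curl upload
  # the file contents rather than the literal path string.
  curl -s "$SOLR_URL/$COLLECTION/update/extract?literal.id=$id" \
       -F "myfile=@$f" > /dev/null

  count=$((count + 1))
  if (( count % BATCH_SIZE == 0 )); then
    # Commit the batch, restart Solr in cloud mode, and wait until it answers
    # pings again before indexing the next batch.
    curl -s "$SOLR_URL/$COLLECTION/update?commit=true&wt=json" > /dev/null
    bin/solr restart -c
    until curl -sf "$SOLR_URL/$COLLECTION/admin/ping?wt=json" > /dev/null; do
      sleep 5
    done
  fi
done

# Final commit for the last partial batch.
curl -s "$SOLR_URL/$COLLECTION/update?commit=true&wt=json" > /dev/null
{code}

A successful ping does not strictly guarantee that every replica is active, so a stricter check (for example, polling the Collections API CLUSTERSTATUS output) could replace the ping loop if needed.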