Kevin,

Take a look at
http://lucene.472066.n3.nabble.com/updating-docs-in-solr-cloud-hangs-td4067388.html
and https://issues.apache.org/jira/browse/SOLR-4816. I had the same issue that
you're reporting for a while; then I applied the patch from SOLR-4816 to my
clients and the problems went away. If you don't feel like applying the patch,
it looks like it should be included in the 4.5 release. Also note that the
problem happens more frequently when the replication factor is greater than 1.

Thanks,
Greg

-----Original Message-----
From: kevin.osb...@cbsinteractive.com [mailto:kevin.osb...@cbsinteractive.com] 
On Behalf Of Kevin Osborn
Sent: Tuesday, September 03, 2013 4:16 PM
To: solr-user
Subject: Solr Cloud hangs when replicating updates

I was having problems updating SolrCloud with a large batch of records. The 
records are coming in bursts with lulls between updates.

At first, I just tried large updates of 100,000 records at a time.
Eventually, this caused Solr to hang. When hung, I can still query Solr.
But I cannot do any deletes or other updates to the index.

At first, my updates were going as SolrJ CSV posts. I have also tried local
file updates and had similar results. I finally slowed things down to just use
SolrJ's Update feature, which is basically just JavaBin. I am also sending
just 100 records at a time from 10 threads. Again, it eventually hung.
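
As a rough illustration of the "100 at a time in 10 threads" pattern described
above, here is a minimal sketch. It is not the actual client code: BatchedSender,
BatchSink, and the 30-second timeout are hypothetical names and values, and the
sink stands in for whatever SolrJ call does the real update. The timeout on each
batch means a hung send is reported instead of blocking the caller forever.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class BatchedSender {

    // Hypothetical stand-in for the real update call (e.g. a SolrJ add).
    interface BatchSink {
        void send(List<String> batch) throws Exception;
    }

    // Split a list into consecutive chunks of at most `size` items.
    static <T> List<List<T>> partition(List<T> items, int size) {
        List<List<T>> batches = new ArrayList<>();
        for (int i = 0; i < items.size(); i += size) {
            int end = Math.min(i + size, items.size());
            batches.add(new ArrayList<>(items.subList(i, end)));
        }
        return batches;
    }

    // Send all docs in batches over a fixed pool; returns how many docs
    // were acknowledged before the per-batch timeout expired.
    static int sendAll(List<String> docs, int batchSize, int threads, BatchSink sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Integer>> results = new ArrayList<>();
        for (List<String> batch : partition(docs, batchSize)) {
            results.add(pool.submit(() -> {
                sink.send(batch);
                return batch.size();
            }));
        }
        int sent = 0;
        for (Future<Integer> result : results) {
            try {
                // Assumed timeout; a batch that hangs is counted as not sent.
                sent += result.get(30, TimeUnit.SECONDS);
            } catch (Exception e) {
                System.err.println("batch failed or timed out: " + e);
            }
        }
        pool.shutdownNow(); // interrupt any sender that is still stuck
        return sent;
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            docs.add("doc" + i);
        }
        int sent = sendAll(docs, 100, 10, batch -> { /* no-op sink */ });
        System.out.println("sent " + sent + " of " + docs.size());
    }
}
```

If the returned count is lower than the number of documents, at least one batch
failed or timed out, which gives a concrete point at which to capture a jstack
trace.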

Sometimes, Solr hangs in the first couple of chunks. Other times, it hangs 
right away.

These are my commit settings:

<autoCommit>
  <maxTime>15000</maxTime>
  <maxDocs>5000</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
<autoSoftCommit>
  <maxTime>30000</maxTime>
</autoSoftCommit>

I have tried quite a few variations with the same results. I also tried various
JVM settings with the same results. The only thing that seems to help is
reducing the cluster size from 2 to 1.

I also did a jstack trace. I did not see any explicit deadlocks, but I did see 
quite a few threads in WAITING or TIMED_WAITING. It is typically something like 
this:

  java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x000000074039a450> (a java.util.concurrent.Semaphore$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:834)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:994)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1303)
        at java.util.concurrent.Semaphore.acquire(Semaphore.java:317)
        at org.apache.solr.util.AdjustableSemaphore.acquire(AdjustableSemaphore.java:61)
        at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:418)
        at org.apache.solr.update.SolrCmdDistributor.submit(SolrCmdDistributor.java:368)
        at org.apache.solr.update.SolrCmdDistributor.flushAdds(SolrCmdDistributor.java:300)
        at org.apache.solr.update.SolrCmdDistributor.distribAdd(SolrCmdDistributor.java:139)
        at org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:474)
        at org.apache.solr.handler.loader.CSVLoaderBase.doAdd(CSVLoaderBase.java:395)
        at org.apache.solr.handler.loader.SingleThreadedCSVLoader.addDoc(CSVLoader.java:44)
        at org.apache.solr.handler.loader.CSVLoaderBase.load(CSVLoaderBase.java:364)
        at org.apache.solr.handler.loader.CSVLoader.load(CSVLoader.java:31)
        at org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
        at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
        at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904)
        at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
        at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
        at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
        at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:533)
        at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
        at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
        at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
        at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
        at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
        at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
        at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
        at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)

It basically appears that Solr gets stuck while trying to acquire a semaphore 
that never becomes available.
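
For readers unfamiliar with this failure mode, the sketch below shows one
generic way a semaphore can end up never becoming available: a permit that is
acquired but not released. This is an illustration only. It is not Solr's
SolrCmdDistributor or AdjustableSemaphore code, the trace above does not
establish that this is what happens inside Solr, and SemaphoreStall, leakyTask,
and canStillSubmit are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Illustrative only: NOT Solr's code, just the general pattern.
public class SemaphoreStall {

    // Hypothetical task: takes a permit and, on the failure path,
    // returns without giving it back.
    static void leakyTask(Semaphore permits, boolean fail) {
        permits.acquireUninterruptibly();
        if (fail) {
            return; // bug: release() is skipped, so the permit is lost
        }
        permits.release();
    }

    // Leaks up to `failures` permits, then reports whether a new caller
    // could still get a permit within a short timeout.
    static boolean canStillSubmit(int maxPermits, int failures) {
        Semaphore permits = new Semaphore(maxPermits);
        // Cap at maxPermits so the demo itself never blocks forever.
        for (int i = 0; i < Math.min(failures, maxPermits); i++) {
            leakyTask(permits, true);
        }
        try {
            // A plain acquire() would park indefinitely here, as in the
            // WAITING (parking) state shown by jstack.
            boolean ok = permits.tryAcquire(200, TimeUnit.MILLISECONDS);
            if (ok) {
                permits.release();
            }
            return ok;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canStillSubmit(2, 1)); // prints "true"
        System.out.println(canStillSubmit(2, 2)); // prints "false"
    }
}
```

Once every permit has been lost, any thread calling acquire() without a timeout
waits indefinitely, which would look like the parked threads in the trace.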

Anyone have any ideas? This is definitely causing major problems for us.

--
*KEVIN OSBORN*
LEAD SOFTWARE ENGINEER
CNET Content Solutions
OFFICE 949.399.8714
CELL 949.310.4677      SKYPE osbornk
5 Park Plaza, Suite 600, Irvine, CA 92614