If both queries _and_ updates are slow, it's hard to see how upping the number of threads would help overall. Hmmm, you also reported that the CPUs didn't seem to be stressed, so it's worth a try; perhaps there's some kind of blocking going on....
Best,
Erick

On Mon, Sep 19, 2016 at 5:33 PM, Rallavagu <rallav...@gmail.com> wrote:
> Hi Erick,
>
> Would increasing (or adjusting) the update threads help, as per this JIRA
> ("Allow the number of threads in ConcurrentUpdateSolrClient/StreamingSolrClients
> to be configurable by a system property")?
>
> https://issues.apache.org/jira/browse/SOLR-8500
>
> Thanks
>
> On 9/19/16 8:30 AM, Erick Erickson wrote:
>>
>> Hmmm, not sure, and also not sure what to suggest next. QTimes
>> measure only the search time, not, say, time waiting for the request
>> to get serviced.
>>
>> I'm afraid the next suggestion is to throw a profiler at it 'cause
>> nothing jumps out at me...
>>
>> Best,
>> Erick
>>
>> On Fri, Sep 16, 2016 at 10:23 AM, Rallavagu <rallav...@gmail.com> wrote:
>>>
>>> Comments inline...
>>>
>>> On 9/16/16 10:15 AM, Erick Erickson wrote:
>>>>
>>>> Well, the next thing I'd look at is CPU activity. If you're flooding
>>>> the system with updates there'll be CPU contention.
>>>
>>> Monitoring does not suggest any high CPU, but as you can see from the
>>> vmstat output, "user" CPU is a bit high during updates that are taking
>>> time (34 user, 65 idle).
>>>
>>>> And there are a number of things you can do that make updates in
>>>> particular much less efficient, from committing very frequently
>>>> (sometimes combined with excessive autowarm parameters) and the like.
>>>
>>> softCommit is set to 10 minutes, autowarm count is set to 0, and commit
>>> is set to 15 sec for NRT.
>>>
>>>> There are a series of ideas that might trigger an "aha" moment:
>>>> https://wiki.apache.org/solr/SolrPerformanceFactors
>>>
>>> Reviewed this document and made a few changes accordingly a while ago.
>>>
>>>> But the crude measure is just to look at CPU usage when updates happen,
>>>> or just before. Are you running hot with queries alone, then add an
>>>> update burden?
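[For reference, the commit policy described above (hard commit every 15 s, soft commit every 10 minutes, autowarm 0) would look roughly like this in solrconfig.xml. A sketch only: the values come from the thread, but openSearcher=false is an assumption consistent with common NRT practice, not something stated here.]

```xml
<!-- Sketch of the commit settings described in the thread.
     openSearcher=false is assumed, not stated by the poster. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>15000</maxTime>          <!-- hard commit every 15 s -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>600000</maxTime>         <!-- soft commit every 10 min -->
  </autoSoftCommit>
</updateHandler>
```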
>>>
>>> Essentially, it was high QTimes for queries that got me looking into
>>> logs, the system, etc., and I could correlate update slowness with
>>> search slowness. The other time QTimes go high is right after
>>> softCommit, which is expected.
>>>
>>> Wondering what causes update threads to wait and whether it has any
>>> impact on search at all. I had a couple more CPUs added but I still see
>>> similar behavior.
>>>
>>> Thanks.
>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Fri, Sep 16, 2016 at 9:19 AM, Rallavagu <rallav...@gmail.com> wrote:
>>>>>
>>>>> Erick,
>>>>>
>>>>> Was monitoring GC activity and couldn't align GC pauses to this
>>>>> behavior. Also, vmstat shows no swapping or CPU I/O wait. However,
>>>>> whenever I see high update response times (and corresponding high
>>>>> QTimes for searches), vmstat shows a series of "waiting to runnable"
>>>>> processes in the "r" column of the "procs" section.
>>>>>
>>>>> https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png
>>>>>
>>>>> procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
>>>>>  r  b   swpd     free    inact   active   si   so   bi    bo    in    cs us sy id wa st CDT
>>>>>  2  0  71068 18688496 2526604 24204440    0    0    0     0  1433   462 27  1 73  0  0 2016-09-16 11:02:32
>>>>>  1  0  71068 18688180 2526600 24204568    0    0    0     0  1388   404 26  1 74  0  0 2016-09-16 11:02:33
>>>>>  1  0  71068 18687928 2526600 24204568    0    0    0     0  1354   401 25  0 75  0  0 2016-09-16 11:02:34
>>>>>  1  0  71068 18687800 2526600 24204572    0    0    0     0  1311   397 25  0 74  0  0 2016-09-16 11:02:35
>>>>>  1  0  71068 18687164 2527116 24204844    0    0    0     0  1770   702 31  1 69  0  0 2016-09-16 11:02:36
>>>>>  1  0  71068 18686944 2527108 24204908    0    0    0    52  1266   421 26  0 74  0  0 2016-09-16 11:02:37
>>>>> 12  1  71068 18682676 2528560 24207116    0    0    0   280  2388   934 34  1 65  0  0 2016-09-16 11:02:38
>>>>>  2  1  71068 18651340 2530820 24233368    0    0    0  1052 10258  5696 82  5 13  0  0 2016-09-16 11:02:39
>>>>>  5  0  71068 18648600 2530112 24235060    0    0    0  1988  7261  3644 84  2 13  1  0 2016-09-16 11:02:40
>>>>>  9  1  71068 18647804 2530580 24236076    0    0    0  1688  7031  3575 84  2 13  1  0 2016-09-16 11:02:41
>>>>>  1  0  71068 18647628 2530364 24236256    0    0    0   680  7065  4463 61  3 35  1  0 2016-09-16 11:02:42
>>>>>  1  0  71068 18646344 2531204 24236536    0    0    0    44  6422  4922 35  3 63  0  0 2016-09-16 11:02:43
>>>>>  2  0  71068 18644460 2532196 24237440    0    0    0     0  6561  5056 25  3 72  0  0 2016-09-16 11:02:44
>>>>>  0  0  71068 18661900 2531724 24218764    0    0    0     0  7312 10050 11  3 86  0  0 2016-09-16 11:02:45
>>>>>  2  0  71068 18649400 2532228 24229800    0    0    0     0  7211  6222 34  3 63  0  0 2016-09-16 11:02:46
>>>>>  0  0  71068 18648280 2533440 24230300    0    0    0   108  3936  3381 20  1 79  0  0 2016-09-16 11:02:47
>>>>>  0  0  71068 18648156 2533212 24230684    0    0    0    12  1279  1681  2  0 97  0  0 2016-09-16 11:02:48
>>>>>
>>>>> Captured a stack trace, including timing, for one of the update threads.
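[An aside on the vmstat samples above: the burst pattern — the run queue "r" jumping from 1-2 to 9-12 while updates are slow — can be flagged mechanically. A minimal sketch; the CPU count of 4 is an assumption, not something stated in the thread.]

```java
// Sketch: flag vmstat samples where the run queue ("r" column) exceeds
// the core count -- the burst pattern in the samples above. Sustained
// r well above the core count indicates CPU contention, not I/O wait.
public class RunQueueCheck {
    static final int CPUS = 4; // assumption: adjust to the host's core count

    // Returns the run-queue depth from a vmstat data line, or -1 for
    // header/banner lines that don't start with a number.
    static int parseRunQueue(String line) {
        String[] cols = line.trim().split("\\s+");
        try {
            return Integer.parseInt(cols[0]);
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    public static void main(String[] args) {
        String[] samples = {
            "procs -----------memory---------- ---swap--",
            " 2  0  71068 18688496 2526604 24204440",
            "12  1  71068 18682676 2528560 24207116"
        };
        for (String s : samples) {
            int r = parseRunQueue(s);
            if (r > CPUS) {
                System.out.println("run queue saturated: r=" + r);
            }
        }
    }
}
```

Fed the samples above, only the 11:02:38-style line (r=12) trips the threshold.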
>>>>>
>>>>> org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15 ms, total time = 30782 ms)
>>>>>  Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time = 30767 ms)
>>>>>   Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time = 30767 ms)
>>>>>    org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total time = 30767 ms)
>>>>>     org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms, total time = 30767 ms)
>>>>>      org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total time = 30767 ms)
>>>>>       org.apache.solr.handler.RequestHandlerBase:handleRequest:156 (method time = 0 ms, total time = 30767 ms)
>>>>>        org.apache.solr.handler.ContentStreamHandlerBase:handleRequestBody:70 (method time = 0 ms, total time = 30767 ms)
>>>>>         org.apache.solr.handler.UpdateRequestHandler$1:load:95 (method time = 0 ms, total time = 23737 ms)
>>>>>          org.apache.solr.handler.loader.XMLLoader:load:178 (method time = 0 ms, total time = 23737 ms)
>>>>>           org.apache.solr.handler.loader.XMLLoader:processUpdate:251 (method time = 0 ms, total time = 23737 ms)
>>>>>            org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:processAdd:104 (method time = 0 ms, total time = 23737 ms)
>>>>>             org.apache.solr.update.processor.DistributedUpdateProcessor:processAdd:702 (method time = 0 ms, total time = 23737 ms)
>>>>>              org.apache.solr.update.processor.DistributedUpdateProcessor:versionAdd:1011 (method time = 0 ms, total time = 23477 ms)
>>>>>               org.apache.solr.update.processor.DistributedUpdateProcessor:getUpdatedDocument:1114 (method time = 0 ms, total time = 51 ms)
>>>>>                org.apache.solr.update.processor.AtomicUpdateDocumentMerger:merge:110 (method time = 51 ms, total time = 51 ms)
>>>>>               org.apache.solr.update.processor.DistributedUpdateProcessor:doLocalAdd:924 (method time = 0 ms, total time = 23426 ms)
>>>>>                org.apache.solr.update.processor.UpdateRequestProcessor:processAdd:49 (method time = 0 ms, total time = 23426 ms)
>>>>>                 org.apache.solr.update.processor.RunUpdateProcessor:processAdd:69 (method time = 0 ms, total time = 23426 ms)
>>>>>                  org.apache.solr.update.DirectUpdateHandler2:addDoc:169 (method time = 0 ms, total time = 23426 ms)
>>>>>                   org.apache.solr.update.DirectUpdateHandler2:addDoc0:207 (method time = 0 ms, total time = 23426 ms)
>>>>>                    org.apache.solr.update.DirectUpdateHandler2:doNormalUpdate:275 (method time = 0 ms, total time = 23426 ms)
>>>>>                     org.apache.lucene.index.IndexWriter:updateDocument:1477 (method time = 0 ms, total time = 8551 ms)
>>>>>                      org.apache.lucene.index.DocumentsWriter:updateDocument:450 (method time = 0 ms, total time = 8551 ms)
>>>>>                       org.apache.lucene.index.DocumentsWriterPerThread:updateDocument:234 (method time = 0 ms, total time = 8551 ms)
>>>>>                        org.apache.lucene.index.DefaultIndexingChain:processDocument:300 (method time = 0 ms, total time = 8551 ms)
>>>>>                         org.apache.lucene.index.DefaultIndexingChain:processField:344 (method time = 0 ms, total time = 8551 ms)
>>>>>                          org.apache.lucene.index.DefaultIndexingChain$PerField:invert:613 (method time = 0 ms, total time = 4098 ms)
>>>>>                           org.apache.lucene.analysis.util.FilteringTokenFilter:incrementToken:51 (method time = 0 ms, total time = 4098 ms)
>>>>>                            org.apache.lucene.analysis.synonym.SynonymFilter:incrementToken:627 (method time = 0 ms, total time = 4098 ms)
>>>>>                             org.apache.lucene.analysis.synonym.SynonymFilter:parse:396 (method time = 0 ms, total time = 4098 ms)
>>>>>                              org.apache.lucene.util.fst.FST:findTargetArc:1186 (method time = 0 ms, total time = 4098 ms)
>>>>>                               org.apache.lucene.util.fst.FST:findTargetArc:1270 (method time = 0 ms, total time = 4098 ms)
>>>>>                                org.apache.lucene.util.fst.FST:readFirstRealTargetArc:992 (method time = 0 ms, total time = 4098 ms)
>>>>>                                 org.apache.lucene.util.fst.FST:readNextRealArc:1085 (method time = 0 ms, total time = 4098 ms)
>>>>>                                  org.apache.lucene.util.fst.FST:readLabel:636 (method time = 0 ms, total time = 4098 ms)
>>>>>                                   org.apache.lucene.store.DataInput:readVInt:125 (method time = 4098 ms, total time = 4098 ms)
>>>>>                          org.apache.lucene.index.DefaultIndexingChain:getOrAddField:484 (method time = 0 ms, total time = 4453 ms)
>>>>>                           org.apache.lucene.index.FieldInfos$Builder:getOrAdd:317 (method time = 0 ms, total time = 4453 ms)
>>>>>                            org.apache.lucene.index.FieldInfos$FieldNumbers:addOrGet:218 (method time = 4453 ms, total time = 4453 ms)
>>>>>                     org.apache.solr.update.UpdateLog:add:412 (method time = 0 ms, total time = 14875 ms)
>>>>>                      org.apache.solr.update.UpdateLog:add:421 (method time = 14875 ms, total time = 14875 ms)
>>>>>              org.apache.solr.update.SolrCmdDistributor:distribAdd:207 (method time = 0 ms, total time = 260 ms)
>>>>>               org.apache.solr.update.SolrCmdDistributor:submit:289 (method time = 0 ms, total time = 260 ms)
>>>>>                org.apache.solr.update.SolrCmdDistributor:doRequest:296 (method time = 0 ms, total time = 260 ms)
>>>>>                 org.apache.solr.client.solrj.SolrClient:request:1220 (method time = 0 ms, total time = 260 ms)
>>>>>                  org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:request:382 (method time = 0 ms, total time = 260 ms)
>>>>>                   org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:addRunner:324 (method time = 0 ms, total time = 260 ms)
>>>>>                    org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor:execute:215 (method time = 0 ms, total time = 260 ms)
>>>>>                     org.apache.solr.common.util.SolrjNamedThreadFactory:newThread:40 (method time = 260 ms, total time = 260 ms)
>>>>>         org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:finish:183 (method time = 0 ms, total time = 7030 ms)
>>>>>          org.apache.solr.update.processor.DistributedUpdateProcessor:finish:1626 (method time = 0 ms, total time = 7030 ms)
>>>>>           org.apache.solr.update.processor.DistributedUpdateProcessor:doFinish:778 (method time = 0 ms, total time = 7030 ms)
>>>>>            org.apache.solr.update.SolrCmdDistributor:finish:90 (method time = 0 ms, total time = 7030 ms)
>>>>>             org.apache.solr.update.SolrCmdDistributor:blockAndDoRetries:232 (method time = 0 ms, total time = 7030 ms)
>>>>>              org.apache.solr.update.StreamingSolrClients:blockUntilFinished:107 (method time = 0 ms, total time = 7030 ms)
>>>>>               org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:blockUntilFinished:429 (method time = 0 ms, total time = 7030 ms)
>>>>>                java.lang.Object:wait (method time = 7030 ms, total time = 7030 ms)
>>>>>
>>>>> https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.18.52%20AM.png
>>>>>
>>>>> It appears threads could be waiting, but I am not sure how this would
>>>>> impact searching.
>>>>>
>>>>> Thanks
>>>>>
>>>>> On 9/16/16 8:42 AM, Erick Erickson wrote:
>>>>>>
>>>>>> First thing I'd look at is whether you're _also_ seeing stop-the-world
>>>>>> GC pauses. In that case there are a number of JVM options that can be
>>>>>> tuned....
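[As a concrete starting point for the GC check Erick suggests: on a Java 7/8-era JVM (Solr 5.4.1 vintage), pause logging can be turned on with flags along these lines. A sketch only; the flag set and log path are assumptions, not recommendations from the thread.]

```
-XX:+PrintGCDetails -XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime -Xloggc:/var/solr/logs/gc.log
```

PrintGCApplicationStoppedTime records the total time application threads were stopped, which can then be lined up against the slow-update timestamps.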
>>>>>>
>>>>>> Best,
>>>>>> Erick
>>>>>>
>>>>>> On Fri, Sep 16, 2016 at 8:40 AM, Rallavagu <rallav...@gmail.com> wrote:
>>>>>>>
>>>>>>> Solr 5.4.1 with embedded Jetty, single shard - NRT
>>>>>>>
>>>>>>> Looking in the logs, I noticed that there are high QTimes for queries
>>>>>>> and, around the same time, high response times for updates. These are
>>>>>>> not during "commit" or "softCommit" but when the client application is
>>>>>>> sending updates. Wondering how updates could impact query performance.
>>>>>>> What are the options for tuning? Thanks.
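[One way to confirm the blocking suspicion from the profiler trace earlier in the thread — update runners parked in Object.wait inside ConcurrentUpdateSolrClient.blockUntilFinished for ~7 s — is a programmatic thread dump. A sketch using the standard ThreadMXBean API; run it inside the Solr JVM, or take an external dump with `jstack <pid>` instead.]

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Sketch: list threads sitting in WAITING/BLOCKED state, the condition the
// profiler trace shows for the update runner threads. Repeated dumps showing
// the same threads parked on the same lock point at genuine blocking rather
// than CPU starvation.
public class WaitingThreads {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (ThreadInfo ti : mx.dumpAllThreads(false, false)) {
            Thread.State state = ti.getThreadState();
            if (state == Thread.State.WAITING || state == Thread.State.BLOCKED) {
                // getLockName() is null when the thread isn't waiting on a monitor
                System.out.println(ti.getThreadName() + " -> " + state
                        + " on " + ti.getLockName());
            }
        }
    }
}
```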