Hmmm, not sure, and also not sure what to suggest next. QTimes
measure only the search time, not, say, time waiting for the request to get
serviced.

I'm afraid the next suggestion is to throw a profiler at it, 'cause nothing
jumps out at me...

Best,
Erick

On Fri, Sep 16, 2016 at 10:23 AM, Rallavagu <rallav...@gmail.com> wrote:
> Comments in line...
>
> On 9/16/16 10:15 AM, Erick Erickson wrote:
>>
>> Well, the next thing I'd look at is CPU activity. If you're flooding the
>> system with updates there'll be CPU contention.
>
>
> Monitoring does not suggest any high CPU, but as you can see from the
> vmstat output, "user" CPU is a bit high during the updates that are taking
> time (34 user, 65 idle).
>
>>
>> And there are a number of things that can make updates in particular much
>> less efficient, such as committing very frequently (sometimes combined
>> with excessive autowarm parameters) and the like.
>
>
> softCommit is set to 10 minutes, autowarm count is set to 0, and hard
> commit is set to 15 sec for NRT.
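For reference, the commit policy described above would correspond roughly to a solrconfig.xml block like the following — a sketch reconstructed from the description (the actual config was not posted), with openSearcher=false assumed for the hard commit, as is typical for NRT setups:

```xml
<!-- Sketch of the commit settings described in the thread; intervals are
     taken from the description above, openSearcher=false is an assumption. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit every 15 s: flushes the tlog/segments, no new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit every 10 min: opens a new searcher for visibility -->
  <autoSoftCommit>
    <maxTime>600000</maxTime>
  </autoSoftCommit>
</updateHandler>
```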
>
>>
>> There are a series of ideas that might trigger an "aha" moment:
>> https://wiki.apache.org/solr/SolrPerformanceFactors
>
>
> Reviewed this document and made a few changes accordingly a while ago.
>>
>>
>> But the crude measure is just to look at CPU usage when updates happen, or
>> just before. Are you running hot with queries alone and then adding an
>> update burden?
>
>
> Essentially, it was high QTimes for queries that got me looking into the
> logs, the system, etc., and I could correlate update slowness with search
> slowness. The other time QTimes go high is right after a softCommit, which
> is expected.
>
> Wondering what causes the update threads to wait and whether that has any
> impact on search at all. I had a couple more CPUs added, but I still see
> similar behavior.
>
> Thanks.
>
>
>>
>> Best,
>> Erick
>>
>> On Fri, Sep 16, 2016 at 9:19 AM, Rallavagu <rallav...@gmail.com> wrote:
>>>
>>> Erick,
>>>
>>> Was monitoring GC activity and couldn't align GC pauses to this behavior.
>>> Also, vmstat shows no swapping or CPU I/O wait. However, whenever I see
>>> high update response times (and corresponding high QTimes for searches),
>>> vmstat shows a series of spikes in the number of "waiting to runnable"
>>> processes in the "r" column of the "procs" section.
>>>
>>>
>>> https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png
>>>
>>> procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
>>>  r  b   swpd      free    inact    active  si  so  bi    bo     in     cs  us  sy  id  wa  st         CDT
>>>  2  0  71068  18688496  2526604  24204440   0   0   0     0   1433    462  27   1  73   0   0  2016-09-16 11:02:32
>>>  1  0  71068  18688180  2526600  24204568   0   0   0     0   1388    404  26   1  74   0   0  2016-09-16 11:02:33
>>>  1  0  71068  18687928  2526600  24204568   0   0   0     0   1354    401  25   0  75   0   0  2016-09-16 11:02:34
>>>  1  0  71068  18687800  2526600  24204572   0   0   0     0   1311    397  25   0  74   0   0  2016-09-16 11:02:35
>>>  1  0  71068  18687164  2527116  24204844   0   0   0     0   1770    702  31   1  69   0   0  2016-09-16 11:02:36
>>>  1  0  71068  18686944  2527108  24204908   0   0   0    52   1266    421  26   0  74   0   0  2016-09-16 11:02:37
>>> 12  1  71068  18682676  2528560  24207116   0   0   0   280   2388    934  34   1  65   0   0  2016-09-16 11:02:38
>>>  2  1  71068  18651340  2530820  24233368   0   0   0  1052  10258   5696  82   5  13   0   0  2016-09-16 11:02:39
>>>  5  0  71068  18648600  2530112  24235060   0   0   0  1988   7261   3644  84   2  13   1   0  2016-09-16 11:02:40
>>>  9  1  71068  18647804  2530580  24236076   0   0   0  1688   7031   3575  84   2  13   1   0  2016-09-16 11:02:41
>>>  1  0  71068  18647628  2530364  24236256   0   0   0   680   7065   4463  61   3  35   1   0  2016-09-16 11:02:42
>>>  1  0  71068  18646344  2531204  24236536   0   0   0    44   6422   4922  35   3  63   0   0  2016-09-16 11:02:43
>>>  2  0  71068  18644460  2532196  24237440   0   0   0     0   6561   5056  25   3  72   0   0  2016-09-16 11:02:44
>>>  0  0  71068  18661900  2531724  24218764   0   0   0     0   7312  10050  11   3  86   0   0  2016-09-16 11:02:45
>>>  2  0  71068  18649400  2532228  24229800   0   0   0     0   7211   6222  34   3  63   0   0  2016-09-16 11:02:46
>>>  0  0  71068  18648280  2533440  24230300   0   0   0   108   3936   3381  20   1  79   0   0  2016-09-16 11:02:47
>>>  0  0  71068  18648156  2533212  24230684   0   0   0    12   1279   1681   2   0  97   0   0  2016-09-16 11:02:48
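A quick way to scan output like this for run-queue spikes mechanically — a rough sketch (not from the thread), assuming vmstat's stock column order where "r" is the first field and the timestamp occupies the trailing two columns:

```python
# Sketch: flag vmstat samples where the run queue ("r", first column)
# jumps -- the "waiting to runnable" spikes described above.
# Assumes the stock vmstat column layout with a trailing "date time" pair.

def runqueue_spikes(vmstat_rows, threshold=4):
    """Return (timestamp, r) pairs where the run queue meets the threshold."""
    spikes = []
    for row in vmstat_rows:
        fields = row.split()
        if not fields or not fields[0].isdigit():
            continue  # skip header/decoration lines
        r = int(fields[0])
        timestamp = " ".join(fields[-2:])  # trailing "date time" columns
        if r >= threshold:
            spikes.append((timestamp, r))
    return spikes

# Three sample rows taken from the vmstat output above
sample = [
    " 1  0 71068 18687800 2526600 24204572 0 0 0    0 1311  397 25 0 74 0 0 2016-09-16 11:02:35",
    "12  1 71068 18682676 2528560 24207116 0 0 0  280 2388  934 34 1 65 0 0 2016-09-16 11:02:38",
    " 9  1 71068 18647804 2530580 24236076 0 0 0 1688 7031 3575 84 2 13 1 0 2016-09-16 11:02:41",
]
print(runqueue_spikes(sample))
```

Here the 11:02:38 sample (r jumping to 12 with interrupts and context switches doubling) is exactly the pattern being described.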
>>>
>>>
>>> Captured stack trace including timing for one of the update threads.
>>>
>>>
>>> org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15 ms, total time = 30782 ms)
>>>  Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time = 30767 ms)
>>>   Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time = 30767 ms)
>>>    org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total time = 30767 ms)
>>>     org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms, total time = 30767 ms)
>>>      org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total time = 30767 ms)
>>>       org.apache.solr.handler.RequestHandlerBase:handleRequest:156 (method time = 0 ms, total time = 30767 ms)
>>>        org.apache.solr.handler.ContentStreamHandlerBase:handleRequestBody:70 (method time = 0 ms, total time = 30767 ms)
>>>         org.apache.solr.handler.UpdateRequestHandler$1:load:95 (method time = 0 ms, total time = 23737 ms)
>>>          org.apache.solr.handler.loader.XMLLoader:load:178 (method time = 0 ms, total time = 23737 ms)
>>>           org.apache.solr.handler.loader.XMLLoader:processUpdate:251 (method time = 0 ms, total time = 23737 ms)
>>>            org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:processAdd:104 (method time = 0 ms, total time = 23737 ms)
>>>             org.apache.solr.update.processor.DistributedUpdateProcessor:processAdd:702 (method time = 0 ms, total time = 23737 ms)
>>>              org.apache.solr.update.processor.DistributedUpdateProcessor:versionAdd:1011 (method time = 0 ms, total time = 23477 ms)
>>>               org.apache.solr.update.processor.DistributedUpdateProcessor:getUpdatedDocument:1114 (method time = 0 ms, total time = 51 ms)
>>>                org.apache.solr.update.processor.AtomicUpdateDocumentMerger:merge:110 (method time = 51 ms, total time = 51 ms)
>>>               org.apache.solr.update.processor.DistributedUpdateProcessor:doLocalAdd:924 (method time = 0 ms, total time = 23426 ms)
>>>                org.apache.solr.update.processor.UpdateRequestProcessor:processAdd:49 (method time = 0 ms, total time = 23426 ms)
>>>                 org.apache.solr.update.processor.RunUpdateProcessor:processAdd:69 (method time = 0 ms, total time = 23426 ms)
>>>                  org.apache.solr.update.DirectUpdateHandler2:addDoc:169 (method time = 0 ms, total time = 23426 ms)
>>>                   org.apache.solr.update.DirectUpdateHandler2:addDoc0:207 (method time = 0 ms, total time = 23426 ms)
>>>                    org.apache.solr.update.DirectUpdateHandler2:doNormalUpdate:275 (method time = 0 ms, total time = 23426 ms)
>>>                     org.apache.lucene.index.IndexWriter:updateDocument:1477 (method time = 0 ms, total time = 8551 ms)
>>>                      org.apache.lucene.index.DocumentsWriter:updateDocument:450 (method time = 0 ms, total time = 8551 ms)
>>>                       org.apache.lucene.index.DocumentsWriterPerThread:updateDocument:234 (method time = 0 ms, total time = 8551 ms)
>>>                        org.apache.lucene.index.DefaultIndexingChain:processDocument:300 (method time = 0 ms, total time = 8551 ms)
>>>                         org.apache.lucene.index.DefaultIndexingChain:processField:344 (method time = 0 ms, total time = 8551 ms)
>>>                          org.apache.lucene.index.DefaultIndexingChain$PerField:invert:613 (method time = 0 ms, total time = 4098 ms)
>>>                           org.apache.lucene.analysis.util.FilteringTokenFilter:incrementToken:51 (method time = 0 ms, total time = 4098 ms)
>>>                            org.apache.lucene.analysis.synonym.SynonymFilter:incrementToken:627 (method time = 0 ms, total time = 4098 ms)
>>>                             org.apache.lucene.analysis.synonym.SynonymFilter:parse:396 (method time = 0 ms, total time = 4098 ms)
>>>                              org.apache.lucene.util.fst.FST:findTargetArc:1186 (method time = 0 ms, total time = 4098 ms)
>>>                               org.apache.lucene.util.fst.FST:findTargetArc:1270 (method time = 0 ms, total time = 4098 ms)
>>>                                org.apache.lucene.util.fst.FST:readFirstRealTargetArc:992 (method time = 0 ms, total time = 4098 ms)
>>>                                 org.apache.lucene.util.fst.FST:readNextRealArc:1085 (method time = 0 ms, total time = 4098 ms)
>>>                                  org.apache.lucene.util.fst.FST:readLabel:636 (method time = 0 ms, total time = 4098 ms)
>>>                                   org.apache.lucene.store.DataInput:readVInt:125 (method time = 4098 ms, total time = 4098 ms)
>>>                          org.apache.lucene.index.DefaultIndexingChain:getOrAddField:484 (method time = 0 ms, total time = 4453 ms)
>>>                           org.apache.lucene.index.FieldInfos$Builder:getOrAdd:317 (method time = 0 ms, total time = 4453 ms)
>>>                            org.apache.lucene.index.FieldInfos$FieldNumbers:addOrGet:218 (method time = 4453 ms, total time = 4453 ms)
>>>                     org.apache.solr.update.UpdateLog:add:412 (method time = 0 ms, total time = 14875 ms)
>>>                      org.apache.solr.update.UpdateLog:add:421 (method time = 14875 ms, total time = 14875 ms)
>>>              org.apache.solr.update.SolrCmdDistributor:distribAdd:207 (method time = 0 ms, total time = 260 ms)
>>>               org.apache.solr.update.SolrCmdDistributor:submit:289 (method time = 0 ms, total time = 260 ms)
>>>                org.apache.solr.update.SolrCmdDistributor:doRequest:296 (method time = 0 ms, total time = 260 ms)
>>>                 org.apache.solr.client.solrj.SolrClient:request:1220 (method time = 0 ms, total time = 260 ms)
>>>                  org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:request:382 (method time = 0 ms, total time = 260 ms)
>>>                   org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:addRunner:324 (method time = 0 ms, total time = 260 ms)
>>>                    org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor:execute:215 (method time = 0 ms, total time = 260 ms)
>>>                     org.apache.solr.common.util.SolrjNamedThreadFactory:newThread:40 (method time = 260 ms, total time = 260 ms)
>>>         org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:finish:183 (method time = 0 ms, total time = 7030 ms)
>>>          org.apache.solr.update.processor.DistributedUpdateProcessor:finish:1626 (method time = 0 ms, total time = 7030 ms)
>>>           org.apache.solr.update.processor.DistributedUpdateProcessor:doFinish:778 (method time = 0 ms, total time = 7030 ms)
>>>            org.apache.solr.update.SolrCmdDistributor:finish:90 (method time = 0 ms, total time = 7030 ms)
>>>             org.apache.solr.update.SolrCmdDistributor:blockAndDoRetries:232 (method time = 0 ms, total time = 7030 ms)
>>>              org.apache.solr.update.StreamingSolrClients:blockUntilFinished:107 (method time = 0 ms, total time = 7030 ms)
>>>               org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:blockUntilFinished:429 (method time = 0 ms, total time = 7030 ms)
>>>                java.lang.Object:wait (method time = 7030 ms, total time = 7030 ms)
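When reading a trace like this, the signal is in the "method time" (self-time) entries rather than the mostly-zero frames above them; here nearly all the wall time lands in UpdateLog:add and Object.wait. A small sketch (a hypothetical helper, not from the thread) that ranks frames by self time, assuming the "(method time = X ms, total time = Y ms)" format used above:

```python
import re

# Sketch: rank profiler frames by self ("method") time, assuming the
# "(method time = X ms, total time = Y ms)" format in the trace above.
FRAME_RE = re.compile(r"(\S+) \(method time = (\d+) ms, total time = (\d+) ms\)")

def top_self_time(trace_lines, n=3):
    """Return the n frames with the largest self (method) time, in ms."""
    frames = []
    for line in trace_lines:
        m = FRAME_RE.search(line)
        if m:
            frames.append((int(m.group(2)), m.group(1)))
    frames.sort(reverse=True)  # biggest self time first
    return [(name, ms) for ms, name in frames[:n]]

# Four frames taken from the trace above
sample = [
    "org.apache.solr.update.UpdateLog:add:421 (method time = 14875 ms, total time = 14875 ms)",
    "java.lang.Object:wait (method time = 7030 ms, total time = 7030 ms)",
    "org.apache.lucene.store.DataInput:readVInt:125 (method time = 4098 ms, total time = 4098 ms)",
    "org.apache.solr.update.DirectUpdateHandler2:addDoc:169 (method time = 0 ms, total time = 23426 ms)",
]
print(top_self_time(sample))
```

On the full trace this would surface the UpdateLog append and the ConcurrentUpdateSolrClient wait as the two dominant costs of the 30 s request.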
>>>
>>>
>>> https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.18.52%20AM.png
>>>
>>> It appears there could be threads waiting, but I am not sure how this
>>> would impact searching.
>>>
>>> Thanks
>>>
>>> On 9/16/16 8:42 AM, Erick Erickson wrote:
>>>>
>>>>
>>>> First thing I'd look is whether you're _also_ seeing stop-the-world GC
>>>> pauses.
>>>> In that case there are a number of JVM options that can be tuned....
>>>>
>>>> Best,
>>>> Erick
>>>>
>>>> On Fri, Sep 16, 2016 at 8:40 AM, Rallavagu <rallav...@gmail.com> wrote:
>>>>>
>>>>>
>>>>> Solr 5.4.1 with embedded jetty single shard - NRT
>>>>>
>>>>> Looking in the logs, I noticed high QTimes for queries and, around the
>>>>> same time, high response times for updates. These occur not during
>>>>> "commit" or "softCommit" but while the client application is sending
>>>>> updates. Wondering how updates could impact query performance. What are
>>>>> the options for tuning? Thanks.
