Hi Erick,

Would increasing (or adjusting) the number of update threads help, as per this JIRA ("Allow the number of threads ConcurrentUpdateSolrClient StreamingSolrClients configurable by a system property")?

https://issues.apache.org/jira/browse/SOLR-8500
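
In case it's useful, below is a minimal SolrJ 5.x sketch of what a higher
client-side thread count looks like. The URL and the queueSize/threadCount
values are placeholders, not our actual settings; the JIRA above is about
making the corresponding server-internal (StreamingSolrClients) thread count
configurable.

import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class UpdateThreadsSketch {
    public static void main(String[] args) throws Exception {
        // Placeholder URL; queueSize = 100 buffered docs, threadCount = 4
        // runner threads draining the queue concurrently.
        try (ConcurrentUpdateSolrClient client =
                new ConcurrentUpdateSolrClient("http://localhost:8983/solr/collection1", 100, 4)) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "example-1");
            client.add(doc);
            client.blockUntilFinished(); // wait for queued updates to drain
        }
    }
}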

Thanks


On 9/19/16 8:30 AM, Erick Erickson wrote:
Hmmm, not sure, and also not sure what to suggest next. QTimes
measure only the search time, not, say, time waiting for the request to get
serviced.

I'm afraid the next suggestion is to throw a profiler at it, 'cause nothing
jumps out at me.

Best,
Erick

On Fri, Sep 16, 2016 at 10:23 AM, Rallavagu <rallav...@gmail.com> wrote:
Comments in line...

On 9/16/16 10:15 AM, Erick Erickson wrote:

Well, the next thing I'd look at is CPU activity. If you're flooding the
system with updates there'll be CPU contention.


Monitoring does not suggest consistently high CPU, but as you can see from
the vmstat output, "user" CPU is a bit high during the updates that are
taking time (34 user, 65 idle).


And there are a number of things that can make updates in particular much
less efficient, such as committing very frequently (sometimes combined with
excessive autowarm parameters) and the like.


softCommit is set to 10 minutes, the autowarm count is set to 0, and hard
commit is set to 15 sec for NRT.
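
For reference, that corresponds to roughly the following in solrconfig.xml
(a sketch reconstructed from the numbers above; openSearcher=false on the
hard commit is an assumption, though it is the usual pairing with soft
commits for NRT):

<autoCommit>
  <maxTime>15000</maxTime>           <!-- hard commit every 15 sec -->
  <openSearcher>false</openSearcher> <!-- assumed: flush only, no new searcher -->
</autoCommit>
<autoSoftCommit>
  <maxTime>600000</maxTime>          <!-- soft commit every 10 minutes -->
</autoSoftCommit>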


There are a series of ideas that might trigger an "aha" moment:
https://wiki.apache.org/solr/SolrPerformanceFactors


I reviewed this document and made a few changes accordingly a while ago.


But the crude measure is just to look at CPU usage when updates happen, or
just before. Are you running hot with queries alone, and then adding an
update burden on top?


Essentially, it was high QTimes for queries that got me looking into the
logs, the system, etc., and I could correlate update slowness with search
slowness. The other time QTimes go high is right after a softCommit, which
is expected.

I am wondering what causes update threads to wait and whether it has any
impact on search at all. I had a couple more CPUs added but I still see
similar behavior.

Thanks.



Best,
Erick

On Fri, Sep 16, 2016 at 9:19 AM, Rallavagu <rallav...@gmail.com> wrote:

Erick,

I was monitoring GC activity and couldn't align GC pauses with this
behavior. Also, vmstat shows no swapping or CPU I/O wait. However, whenever
I see high update response times (and correspondingly high QTimes for
searches), vmstat shows a spike in the "waiting to runnable" process count
in the "r" column of the "procs" section (it jumps from 1-2 to 12 at
11:02:38 below).


https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.05.51%20AM.png

procs -----------------------memory---------------------- ---swap-- -----io---- -system-- --------cpu-------- -----timestamp-----
 r  b         swpd         free        inact       active   si   so   bi    bo    in    cs  us  sy  id  wa  st                 CDT
 2  0        71068     18688496      2526604     24204440    0    0    0     0  1433   462  27   1  73   0   0 2016-09-16 11:02:32
 1  0        71068     18688180      2526600     24204568    0    0    0     0  1388   404  26   1  74   0   0 2016-09-16 11:02:33
 1  0        71068     18687928      2526600     24204568    0    0    0     0  1354   401  25   0  75   0   0 2016-09-16 11:02:34
 1  0        71068     18687800      2526600     24204572    0    0    0     0  1311   397  25   0  74   0   0 2016-09-16 11:02:35
 1  0        71068     18687164      2527116     24204844    0    0    0     0  1770   702  31   1  69   0   0 2016-09-16 11:02:36
 1  0        71068     18686944      2527108     24204908    0    0    0    52  1266   421  26   0  74   0   0 2016-09-16 11:02:37
12  1        71068     18682676      2528560     24207116    0    0    0   280  2388   934  34   1  65   0   0 2016-09-16 11:02:38
 2  1        71068     18651340      2530820     24233368    0    0    0  1052 10258  5696  82   5  13   0   0 2016-09-16 11:02:39
 5  0        71068     18648600      2530112     24235060    0    0    0  1988  7261  3644  84   2  13   1   0 2016-09-16 11:02:40
 9  1        71068     18647804      2530580     24236076    0    0    0  1688  7031  3575  84   2  13   1   0 2016-09-16 11:02:41
 1  0        71068     18647628      2530364     24236256    0    0    0   680  7065  4463  61   3  35   1   0 2016-09-16 11:02:42
 1  0        71068     18646344      2531204     24236536    0    0    0    44  6422  4922  35   3  63   0   0 2016-09-16 11:02:43
 2  0        71068     18644460      2532196     24237440    0    0    0     0  6561  5056  25   3  72   0   0 2016-09-16 11:02:44
 0  0        71068     18661900      2531724     24218764    0    0    0     0  7312 10050  11   3  86   0   0 2016-09-16 11:02:45
 2  0        71068     18649400      2532228     24229800    0    0    0     0  7211  6222  34   3  63   0   0 2016-09-16 11:02:46
 0  0        71068     18648280      2533440     24230300    0    0    0   108  3936  3381  20   1  79   0   0 2016-09-16 11:02:47
 0  0        71068     18648156      2533212     24230684    0    0    0    12  1279  1681   2   0  97   0   0 2016-09-16 11:02:48


I captured a stack trace, including timings, for one of the update threads:


org.eclipse.jetty.server.handler.ContextHandler:doHandle (method time = 15 ms, total time = 30782 ms)
 Filter - SolrDispatchFilter:doFilter:181 (method time = 0 ms, total time = 30767 ms)
  Filter - SolrDispatchFilter:doFilter:223 (method time = 0 ms, total time = 30767 ms)
   org.apache.solr.servlet.HttpSolrCall:call:457 (method time = 0 ms, total time = 30767 ms)
    org.apache.solr.servlet.HttpSolrCall:execute:658 (method time = 0 ms, total time = 30767 ms)
     org.apache.solr.core.SolrCore:execute:2073 (method time = 0 ms, total time = 30767 ms)
      org.apache.solr.handler.RequestHandlerBase:handleRequest:156 (method time = 0 ms, total time = 30767 ms)
       org.apache.solr.handler.ContentStreamHandlerBase:handleRequestBody:70 (method time = 0 ms, total time = 30767 ms)
        org.apache.solr.handler.UpdateRequestHandler$1:load:95 (method time = 0 ms, total time = 23737 ms)
         org.apache.solr.handler.loader.XMLLoader:load:178 (method time = 0 ms, total time = 23737 ms)
          org.apache.solr.handler.loader.XMLLoader:processUpdate:251 (method time = 0 ms, total time = 23737 ms)
           org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:processAdd:104 (method time = 0 ms, total time = 23737 ms)
            org.apache.solr.update.processor.DistributedUpdateProcessor:processAdd:702 (method time = 0 ms, total time = 23737 ms)
             org.apache.solr.update.processor.DistributedUpdateProcessor:versionAdd:1011 (method time = 0 ms, total time = 23477 ms)
              org.apache.solr.update.processor.DistributedUpdateProcessor:getUpdatedDocument:1114 (method time = 0 ms, total time = 51 ms)
               org.apache.solr.update.processor.AtomicUpdateDocumentMerger:merge:110 (method time = 51 ms, total time = 51 ms)
              org.apache.solr.update.processor.DistributedUpdateProcessor:doLocalAdd:924 (method time = 0 ms, total time = 23426 ms)
               org.apache.solr.update.processor.UpdateRequestProcessor:processAdd:49 (method time = 0 ms, total time = 23426 ms)
                org.apache.solr.update.processor.RunUpdateProcessor:processAdd:69 (method time = 0 ms, total time = 23426 ms)
                 org.apache.solr.update.DirectUpdateHandler2:addDoc:169 (method time = 0 ms, total time = 23426 ms)
                  org.apache.solr.update.DirectUpdateHandler2:addDoc0:207 (method time = 0 ms, total time = 23426 ms)
                   org.apache.solr.update.DirectUpdateHandler2:doNormalUpdate:275 (method time = 0 ms, total time = 23426 ms)
                    org.apache.lucene.index.IndexWriter:updateDocument:1477 (method time = 0 ms, total time = 8551 ms)
                     org.apache.lucene.index.DocumentsWriter:updateDocument:450 (method time = 0 ms, total time = 8551 ms)
                      org.apache.lucene.index.DocumentsWriterPerThread:updateDocument:234 (method time = 0 ms, total time = 8551 ms)
                       org.apache.lucene.index.DefaultIndexingChain:processDocument:300 (method time = 0 ms, total time = 8551 ms)
                        org.apache.lucene.index.DefaultIndexingChain:processField:344 (method time = 0 ms, total time = 8551 ms)
                         org.apache.lucene.index.DefaultIndexingChain$PerField:invert:613 (method time = 0 ms, total time = 4098 ms)
                          org.apache.lucene.analysis.util.FilteringTokenFilter:incrementToken:51 (method time = 0 ms, total time = 4098 ms)
                           org.apache.lucene.analysis.synonym.SynonymFilter:incrementToken:627 (method time = 0 ms, total time = 4098 ms)
                            org.apache.lucene.analysis.synonym.SynonymFilter:parse:396 (method time = 0 ms, total time = 4098 ms)
                             org.apache.lucene.util.fst.FST:findTargetArc:1186 (method time = 0 ms, total time = 4098 ms)
                              org.apache.lucene.util.fst.FST:findTargetArc:1270 (method time = 0 ms, total time = 4098 ms)
                               org.apache.lucene.util.fst.FST:readFirstRealTargetArc:992 (method time = 0 ms, total time = 4098 ms)
                                org.apache.lucene.util.fst.FST:readNextRealArc:1085 (method time = 0 ms, total time = 4098 ms)
                                 org.apache.lucene.util.fst.FST:readLabel:636 (method time = 0 ms, total time = 4098 ms)
                                  org.apache.lucene.store.DataInput:readVInt:125 (method time = 4098 ms, total time = 4098 ms)
                         org.apache.lucene.index.DefaultIndexingChain:getOrAddField:484 (method time = 0 ms, total time = 4453 ms)
                          org.apache.lucene.index.FieldInfos$Builder:getOrAdd:317 (method time = 0 ms, total time = 4453 ms)
                           org.apache.lucene.index.FieldInfos$FieldNumbers:addOrGet:218 (method time = 4453 ms, total time = 4453 ms)
                    org.apache.solr.update.UpdateLog:add:412 (method time = 0 ms, total time = 14875 ms)
                     org.apache.solr.update.UpdateLog:add:421 (method time = 14875 ms, total time = 14875 ms)
             org.apache.solr.update.SolrCmdDistributor:distribAdd:207 (method time = 0 ms, total time = 260 ms)
              org.apache.solr.update.SolrCmdDistributor:submit:289 (method time = 0 ms, total time = 260 ms)
               org.apache.solr.update.SolrCmdDistributor:doRequest:296 (method time = 0 ms, total time = 260 ms)
                org.apache.solr.client.solrj.SolrClient:request:1220 (method time = 0 ms, total time = 260 ms)
                 org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:request:382 (method time = 0 ms, total time = 260 ms)
                  org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:addRunner:324 (method time = 0 ms, total time = 260 ms)
                   org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor:execute:215 (method time = 0 ms, total time = 260 ms)
                    org.apache.solr.common.util.SolrjNamedThreadFactory:newThread:40 (method time = 260 ms, total time = 260 ms)
        org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor:finish:183 (method time = 0 ms, total time = 7030 ms)
         org.apache.solr.update.processor.DistributedUpdateProcessor:finish:1626 (method time = 0 ms, total time = 7030 ms)
          org.apache.solr.update.processor.DistributedUpdateProcessor:doFinish:778 (method time = 0 ms, total time = 7030 ms)
           org.apache.solr.update.SolrCmdDistributor:finish:90 (method time = 0 ms, total time = 7030 ms)
            org.apache.solr.update.SolrCmdDistributor:blockAndDoRetries:232 (method time = 0 ms, total time = 7030 ms)
             org.apache.solr.update.StreamingSolrClients:blockUntilFinished:107 (method time = 0 ms, total time = 7030 ms)
              org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient:blockUntilFinished:429 (method time = 0 ms, total time = 7030 ms)
               java.lang.Object:wait (method time = 7030 ms, total time = 7030 ms)


https://dl.dropboxusercontent.com/u/39813705/Screen%20Shot%202016-09-16%20at%209.18.52%20AM.png

It appears there could be threads waiting, but I am not sure how this would
impact searching.
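
One way to confirm that from the JVM side is to sample thread states around
a slow spell. Below is a rough sketch using the standard java.lang.management
API (the one-second interval and the printed counts are arbitrary choices,
nothing Solr-specific); running jstack against the Solr pid a few times
during a slow period gives the same information externally.

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadStateSampler {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        while (true) {
            int waiting = 0, blocked = 0;
            // dumpAllThreads(false, false): skip monitor/synchronizer details,
            // we only need each thread's state.
            for (ThreadInfo ti : mx.dumpAllThreads(false, false)) {
                if (ti == null) continue;
                Thread.State s = ti.getThreadState();
                if (s == Thread.State.WAITING || s == Thread.State.TIMED_WAITING) {
                    waiting++;
                } else if (s == Thread.State.BLOCKED) {
                    blocked++;
                }
            }
            System.out.printf("%tT waiting=%d blocked=%d%n",
                    System.currentTimeMillis(), waiting, blocked);
            Thread.sleep(1000L);
        }
    }
}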

Thanks

On 9/16/16 8:42 AM, Erick Erickson wrote:


The first thing I'd look at is whether you're _also_ seeing stop-the-world
GC pauses. In that case there are a number of JVM options that can be tuned.
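
For example, GC logging flags along these lines (a sketch for the Java 7/8
JVMs contemporary with Solr 5.4; the log path is a placeholder) will show
whether application-stopped time lines up with the slow QTimes:

-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintGCApplicationStoppedTime
-Xloggc:/var/log/solr/gc.log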

Best,
Erick

On Fri, Sep 16, 2016 at 8:40 AM, Rallavagu <rallav...@gmail.com> wrote:


Solr 5.4.1 with embedded Jetty, single shard - NRT

Looking in the logs, I noticed that there are high QTimes for queries and,
around the same time, high response times for updates. These occur not
during "commit" or "softCommit" but when the client application is sending
updates. I am wondering how updates could impact query performance. What
are the options for tuning? Thanks.
