[ 
https://issues.apache.org/jira/browse/SOLR-10130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15986859#comment-15986859
 ] 

Shawn Heisey commented on SOLR-10130:
-------------------------------------

I have a question related to this issue.  Somebody on the IRC channel running 
6.4.2 is seeing continued performance degradation compared to 4.x.  They were 
running an earlier 6.4.x release until they were advised about this issue.

Looking at per-thread utilization, the top threads on 6.4.2 all have names 
starting with qtp, which I believe means they are Jetty threads.

https://gist.github.com/msporleder-work/7313ebedbdab2e178ca0aa2e889d006b

If I'm not mistaken, we enabled container-level metrics with the changes that 
went into 6.4.0.  If that's true, do we perhaps have those metrics dialed up to 
11?

> Serious performance degradation in Solr 6.4.1 due to the new metrics 
> collection
> -------------------------------------------------------------------------------
>
>                 Key: SOLR-10130
>                 URL: https://issues.apache.org/jira/browse/SOLR-10130
>             Project: Solr
>          Issue Type: Bug
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: metrics
>    Affects Versions: 6.4, 6.4.1
>         Environment: Centos 7, OpenJDK 1.8.0 update 111
>            Reporter: Ere Maijala
>            Assignee: Andrzej Bialecki 
>            Priority: Blocker
>              Labels: perfomance
>             Fix For: 6.4.2, master (7.0)
>
>         Attachments: SOLR-10130.patch, SOLR-10130.patch, 
> solr-8983-console-f1.log
>
>
> We've stumbled on serious performance issues after upgrading to Solr 6.4.1. 
> Looks like the new metrics collection system in MetricsDirectoryFactory is 
> causing a major slowdown. This happens with an index configuration that, as 
> far as I can see, has no metrics-specific configuration and uses 
> luceneMatchVersion 5.5.0. In practice a moderate load will completely bog 
> down the server, with Solr threads constantly using up all CPU capacity (600% 
> on a 6-core machine) under a load where we normally see an average load of 
> < 50%.
> I took stack traces (I'll attach them) and noticed that the threads are 
> spending time in com.codahale.metrics.Meter.mark. I tested building Solr 
> 6.4.1 with the metrics collection disabled in MetricsDirectoryFactory getByte 
> and getBytes methods and was unable to reproduce the issue.
> As far as I can see there are several issues:
> 1. Collecting metrics on every single byte read is slow.
> 2. Having it enabled by default is not a good idea.
> 3. The comment "enable coarse-grained metrics by default" at 
> https://github.com/apache/lucene-solr/blob/branch_6x/solr/core/src/java/org/apache/solr/update/SolrIndexConfig.java#L104
>  implies that only coarse-grained metrics should be enabled by default, and 
> this contradicts collecting metrics on every single byte read.
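
To illustrate point 1 of the quoted description, here is a minimal sketch of 
why updating a shared metric on every byte read costs more than updating it 
once per bulk read. This is not Solr's MetricsDirectoryFactory code: the class 
ByteCountingSketch and its SharedCounter are hypothetical stand-ins, and a 
plain AtomicLong is assumed in place of the real com.codahale.metrics.Meter.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical illustration only; not taken from the Solr code base. */
public class ByteCountingSketch {

    /** Assumed stand-in for a shared metric: one atomic update per mark() call. */
    public static final class SharedCounter {
        private final AtomicLong total = new AtomicLong();

        public void mark(long n) {
            total.addAndGet(n);
        }

        public long count() {
            return total.get();
        }
    }

    /** Consumes the bytes one at a time and updates the counter for each byte. */
    public static long readPerByte(byte[] data, SharedCounter counter) {
        long checksum = 0;
        for (byte b : data) {
            checksum += b;
            counter.mark(1); // data.length atomic updates in total
        }
        return checksum;
    }

    /** Consumes the same bytes but updates the counter once for the whole batch. */
    public static long readBatched(byte[] data, SharedCounter counter) {
        long checksum = 0;
        for (byte b : data) {
            checksum += b;
        }
        counter.mark(data.length); // a single atomic update
        return checksum;
    }

    public static void main(String[] args) {
        byte[] data = new byte[1 << 20];
        SharedCounter perByte = new SharedCounter();
        SharedCounter batched = new SharedCounter();
        readPerByte(data, perByte);
        readBatched(data, batched);
        // Both counters end at the same total; only the number of updates differs.
        System.out.println(perByte.count() + " " + batched.count());
    }
}
```

Both variants record the same total, so the per-byte version adds no 
information at this granularity. It only multiplies the number of updates to 
shared state, which is the kind of overhead the stack traces in 
com.codahale.metrics.Meter.mark point to.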


