Additional info on GC selection http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#available_collectors
> If response time is more important than overall throughput and garbage
> collection pauses must be kept shorter than approximately one second, then
> select the concurrent collector with -XX:+UseConcMarkSweepGC. If only one or
> two processors are available, consider using incremental mode, described
> below.

I'm not entirely certain of the implications of GC tuning for SolrCloud. I
imagine distributed searching is going to be as slow as the slowest core
being queried.

I'd also be curious as to the root cause of any excess GC churn. It sounds
like you're doing a ton of random queries. This probably causes a lot of
evictions from your caches. There's nothing really worth caching, so the
caches fill up and empty frequently, causing a lot of heap activity. If you
expect to have high load and a ton of turnover in queries, then tuning down
cache size might help minimize GC churn.

SolrMeter is another great tool for your perf testing that can help get at
some of these caching issues. It gives you some higher-level stats about
cache eviction, etc.
https://code.google.com/p/solrmeter/

-Doug

On Thu, Nov 21, 2013 at 10:24 PM, Doug Turnbull <
dturnb...@opensourceconnections.com> wrote:

> Dave, you might want to connect JVisualVM and see if there's any pattern
> with latency and garbage collection. That's a frequent culprit for
> periodic hits in latency.
>
> More info here
>
> http://docs.oracle.com/javase/6/docs/technotes/guides/visualvm/jmx_connections.html
>
> There are a couple of GC implementations in Java that can be tuned as needed
>
> With JVisualVM you can also add the mbeans plugin to get a ton of
> performance stats out of Solr that might help debug latency issues.
>
> Doug
>
> Sent from my Windows Phone
> From: Dave Seltzer
> Sent: 11/21/2013 8:42 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Periodic Slowness on Solr Cloud
>
> Lots of questions. Okay.
>
> In digging a little deeper and looking at the config I see that
> <nrtMode>true</nrtMode> is commented out. I believe this is the default
> setting. So I don't know if NRT is enabled or not. Maybe just a red
> herring.
>
> I don't know what Garbage Collector we're using. In this test I'm running
> Solr 4.5.1 using Jetty from the example directory.
>
> The CPU on the 8 nodes all stay around 70% use during the test. The nodes
> have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
> cache.
>
> To perform the test we're running 200 concurrent threads in JMeter. The
> threads hit HAProxy which loadbalances the requests among the nodes. Each
> query is for a random word out of a list of about 10,000 words. Some of the
> queries have faceting turned on.
>
> Because we're heavily loading the system the queries are returning quite
> slowly. For a simple search, the average response time was 300ms. The peak
> response time was 11,000ms. The spikes in latency seem to occur about every
> 2.5 minutes.
>
> I haven't spent that much time messing with SolrConfig, so most of the
> settings are the out-of-the-box defaults.
>
> Where should I start to look?
>
> Thanks so much!
>
> -Dave
>
> On Thu, Nov 21, 2013 at 6:53 PM, Mark Miller <markrmil...@gmail.com>
> wrote:
>
> > Yes, more details…
> >
> > Solr version, which garbage collector, how does heap usage look, cpu,
> > etc.
> >
> > - Mark
> >
> > On Nov 21, 2013, at 6:46 PM, Erick Erickson <erickerick...@gmail.com>
> > wrote:
> >
> > > How real time is NRT? In particular, what are your commit settings?
> > >
> > > And can you characterize "periodic slowness"? Queries that usually
> > > take 500ms now take 10s? Or 1s? How often? How are you measuring?
> > >
> > > Details matter, a lot...
> > >
> > > Best,
> > > Erick
> > >
> > > On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer <dselt...@tveyes.com>
> > > wrote:
> > >
> > >> I'm doing some performance testing against an 8-node Solr cloud
> > >> cluster, and I'm noticing some periodic slowness.
> > >>
> > >> http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
> > >>
> > >> I'm doing random test searches against an Alias Collection made up of
> > >> four smaller (monthly) collections. Like this:
> > >>
> > >> MasterCollection
> > >> |- Collection201308
> > >> |- Collection201309
> > >> |- Collection201310
> > >> |- Collection201311
> > >>
> > >> The last collection is constantly updated. New documents are being
> > >> added at the rate of about 3 documents per second.
> > >>
> > >> I believe the slowness may be due to NRT, but I'm not sure. How
> > >> should I investigate this?
> > >>
> > >> If the slowness is related to NRT, how can I alleviate the issue
> > >> without disabling NRT?
> > >>
> > >> Thanks Much!
> > >>
> > >> -Dave

--
Doug Turnbull
Search & Big Data Architect
OpenSource Connections <http://o19s.com>
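
P.S. On the "which garbage collector are we using" question: here is a minimal sketch, assuming Java 7 or later, of reading the collector names and cumulative collection counts and times from the standard java.lang.management MXBeans. The class name GcReport and the output format are made up for illustration. Note that it reports on the JVM it runs in, not on Solr's; the Solr JVM exposes the same MXBeans over JMX, which is what JVisualVM reads once connected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Solr: lists the collectors active in
// the current JVM together with their cumulative counts and times.
class GcReport {

    // One line per collector, e.g. the young- and old-generation collectors.
    // Count and time are totals since JVM start (-1 if the JVM does not
    // report them), so two snapshots taken a few minutes apart show how
    // much GC work happened in between.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(String.format("%s: count=%d timeMs=%d",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

If the old-generation collector's count ticks up roughly every 2.5 minutes, in step with the latency spikes, that would point toward GC pauses rather than NRT.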