Thanks Doug! One thing I'm not clear on is how I can tell whether this is in fact related to garbage collection. If you're right, and the cluster is only as slow as its slowest link, how do I determine that this is GC? Do I have to run the profiler on all eight nodes?
Or is it a matter of turning on the correct logging and then watching and waiting?

Thanks!

-D

On Thu, Nov 21, 2013 at 11:20 PM, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:

> Additional info on GC selection
>
> http://www.oracle.com/technetwork/java/javase/gc-tuning-6-140523.html#available_collectors
>
> If response time is more important than overall throughput and garbage
> collection pauses must be kept shorter than approximately one second, then
> select the concurrent collector with -XX:+UseConcMarkSweepGC. If only one
> or two processors are available, consider using incremental mode, described
> below.
>
> I'm not entirely certain of the implications of GC tuning for SolrCloud. I
> imagine distributed searching is going to be as slow as the slowest core
> being queried.
>
> I'd also be curious as to the root cause of any excess GC churn. It sounds
> like you're doing a ton of random queries. This probably creates a lot of
> evictions in your caches. There's nothing really worth caching, so the caches
> fill up and empty frequently, causing a lot of heap activity. If you expect
> to have high load and a ton of turnover in queries, then tuning down cache
> size might help minimize GC churn.
>
> Solr Meter is another great tool for your perf testing that can help get
> at some of these caching issues. It gives you some higher-level stats about
> cache eviction, etc.
> https://code.google.com/p/solrmeter/
>
> -Doug
>
> On Thu, Nov 21, 2013 at 10:24 PM, Doug Turnbull <dturnb...@opensourceconnections.com> wrote:
>
>> Dave you might want to connect JVisualVM and see if there's any pattern
>> with latency and garbage collection. That's a frequent culprit for
>> periodic hits in latency.
>>
>> More info here
>>
>> http://docs.oracle.com/javase/6/docs/technotes/guides/visualvm/jmx_connections.html
>>
>> There's a couple GC implementations in Java that can be tuned as needed
>>
>> With JVisualVM you can also add the mbeans plugin to get a ton of
>> performance stats out of Solr that might help debug latency issues.
>>
>> Doug
>>
>> Sent from my Windows Phone
>> From: Dave Seltzer
>> Sent: 11/21/2013 8:42 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Periodic Slowness on Solr Cloud
>>
>> Lots of questions. Okay.
>>
>> In digging a little deeper and looking at the config I see that
>> <nrtMode>true</nrtMode> is commented out. I believe this is the default
>> setting. So I don't know if NRT is enabled or not. Maybe just a red
>> herring.
>>
>> I don't know what Garbage Collector we're using. In this test I'm running
>> Solr 4.5.1 using Jetty from the example directory.
>>
>> The CPU on the 8 nodes all stay around 70% use during the test. The nodes
>> have 28GB of RAM. Java is using about 6GB and the rest is being used by OS
>> cache.
>>
>> To perform the test we're running 200 concurrent threads in JMeter. The
>> threads hit HAProxy which loadbalances the requests among the nodes. Each
>> query is for a random word out of a list of about 10,000 words. Some of
>> the queries have faceting turned on.
>>
>> Because we're heavily loading the system the queries are returning quite
>> slowly. For a simple search, the average response time was 300ms. The peak
>> response time was 11,000ms. The spikes in latency seem to occur about
>> every 2.5 minutes.
>>
>> I haven't spent that much time messing with SolrConfig, so most of the
>> settings are the out-of-the-box defaults.
>>
>> Where should I start to look?
>>
>> Thanks so much!
>>
>> -Dave
>>
>> On Thu, Nov 21, 2013 at 6:53 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>
>> > Yes, more details…
>> >
>> > Solr version, which garbage collector, how does heap usage look, cpu, etc.
>> >
>> > - Mark
>> >
>> > On Nov 21, 2013, at 6:46 PM, Erick Erickson <erickerick...@gmail.com> wrote:
>> >
>> > > How real time is NRT? In particular, what are your commit settings?
>> > >
>> > > And can you characterize "periodic slowness"? Queries that usually
>> > > take 500ms now take 10s? Or 1s? How often? How are you measuring?
>> > >
>> > > Details matter, a lot...
>> > >
>> > > Best,
>> > > Erick
>> > >
>> > > On Thu, Nov 21, 2013 at 6:03 PM, Dave Seltzer <dselt...@tveyes.com> wrote:
>> > >
>> > >> I'm doing some performance testing against an 8-node Solr cloud
>> > >> cluster, and I'm noticing some periodic slowness.
>> > >>
>> > >> http://farm4.staticflickr.com/3668/10985410633_23e26c7681_o.png
>> > >>
>> > >> I'm doing random test searches against an Alias Collection made up of
>> > >> four smaller (monthly) collections. Like this:
>> > >>
>> > >> MasterCollection
>> > >> |- Collection201308
>> > >> |- Collection201309
>> > >> |- Collection201310
>> > >> |- Collection201311
>> > >>
>> > >> The last collection is constantly updated. New documents are being
>> > >> added at the rate of about 3 documents per second.
>> > >>
>> > >> I believe the slowness may be due to NRT, but I'm not sure. How
>> > >> should I investigate this?
>> > >>
>> > >> If the slowness is related to NRT, how can I alleviate the issue
>> > >> without disabling NRT?
>> > >>
>> > >> Thanks Much!
>> > >>
>> > >> -Dave
>
> --
> Doug Turnbull
> Search & Big Data Architect
> OpenSource Connections <http://o19s.com>
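
Doug's remark that distributed searching is going to be as slow as the slowest core being queried can be made concrete with a little arithmetic. The sketch below is illustrative only. It assumes every query fans out to all eight nodes and that pauses on different nodes are independent, neither of which the thread confirms. The function names and the 1% pause figure are made up for the example; the 300 ms and 11,000 ms values are the average and peak reported in the thread.

```python
def fanout_latency_ms(per_node_ms):
    """Latency of a request that must wait for every node it queried."""
    return max(per_node_ms)


def fraction_of_queries_hit(pause_fraction, nodes_queried):
    """Probability that at least one queried node is paused.

    Assumes pauses on different nodes are independent (an assumption,
    not something measured in the thread).
    """
    return 1.0 - (1.0 - pause_fraction) ** nodes_queried


if __name__ == "__main__":
    # Seven nodes answer in 300 ms (the reported average);
    # one is stalled for 11,000 ms (the reported peak).
    print(fanout_latency_ms([300] * 7 + [11000]))  # 11000

    # Hypothetical figure: each node spends 1% of its time paused.
    print(round(fraction_of_queries_hit(0.01, 8), 3))  # 0.077
```

Under these assumptions, a node that is paused only 1% of the time would still slow down roughly 8% of queries, which is why pauses on a single node can look like cluster-wide slowness from the client side.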