The working set is larger than the heap. This is our largest collection; all
shards combined are probably around 60 GB in total. There are also a few
other, much smaller collections.

During normal operation, JVM memory utilization hovers between 17 GB and
22 GB when we aren't indexing any data.

Either way, this wasn't a problem before. I suspect it is because we are
now on Java 8, so I wanted to reach out to the community to see whether
there are any new best practices around GC tuning, since the current
recommendation seems to be aimed at Java 7.
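For reference, here is roughly what I was planning to experiment with next.
To be clear, this is just my guess at a Java 8 / G1 starting point; the
specific values are assumptions on my part, not something I've validated:

```shell
# Candidate Java 8 GC flags to experiment with (my assumption of a G1
# starting point; values are untested guesses, tune for your own heap).
GC_TUNE="-XX:+UseG1GC \
  -XX:+ParallelRefProcEnabled \
  -XX:G1HeapRegionSize=8m \
  -XX:MaxGCPauseMillis=250 \
  -XX:InitiatingHeapOccupancyPercent=75"
```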


On Thu, Apr 28, 2016 at 11:54 AM, Walter Underwood <wun...@wunderwood.org> wrote:

> 32 GB is a pretty big heap. If the working set is really smaller than
> that, the extra heap just makes a full GC take longer.
>
> How much heap is used after a full GC? Take the largest value you see
> there, then add a bit more, maybe 25% more or 2 GB more.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>
> > On Apr 28, 2016, at 8:50 AM, Nick Vasilyev <nick.vasily...@gmail.com> wrote:
> >
> > mmfr_exact is a string field. key_phrases is a multivalued string field.
> >
> > On Thu, Apr 28, 2016 at 11:47 AM, Yonik Seeley <ysee...@gmail.com> wrote:
> >
> >> What about the field types though... are they single-valued or
> >> multi-valued, string, text, numeric?
> >>
> >> -Yonik
> >>
> >>
> >> On Thu, Apr 28, 2016 at 11:43 AM, Nick Vasilyev <nick.vasily...@gmail.com> wrote:
> >>> Hi Yonik,
> >>>
> >>> I forgot to mention that the index is approximately 50 million docs
> >>> split across 4 shards (replication factor 2) on 2 Solr replicas.
> >>>
> >>> This particular script will filter items based on a category
> >>> (10 to ~1,000,000 items in each) and run facets on the top X terms
> >>> for particular fields. The query looks like this:
> >>>
> >>> {
> >>>   q => "cat:$code",
> >>>   rows => 0,
> >>>   facet => 'true',
> >>>   'facet.field' => [ 'key_phrases', 'mmfr_exact' ],
> >>>   'f.key_phrases.facet.limit' => 100,
> >>>   'f.mmfr_exact.facet.limit' => 20,
> >>>   'facet.mincount' => 5,
> >>>   distrib => 'false',
> >>> }
> >>>
> >>> I know it can be reworked some, especially considering there are
> >>> thousands of similar requests going out. However, we didn't have this
> >>> issue before, and I am worried that it may be a symptom of a larger
> >>> underlying problem.
> >>>
> >>> On Thu, Apr 28, 2016 at 11:34 AM, Yonik Seeley <ysee...@gmail.com> wrote:
> >>>
> >>>> On Thu, Apr 28, 2016 at 11:29 AM, Nick Vasilyev <nick.vasily...@gmail.com> wrote:
> >>>>> Hello,
> >>>>>
> >>>>> We recently upgraded to Solr 5.2.1 with jre1.8.0_74 and are seeing
> >>>>> long GC pauses when running jobs that do some hairy faceting. The
> >>>>> same jobs worked fine with our previous Solr 4.6.
> >>>>
> >>>> What does a typical request look like, and what are the field types
> >>>> that faceting is done on?
> >>>>
> >>>> -Yonik
> >>>>
> >>>>
> >>>>> The JVM is configured with a 32 GB heap and default GC settings;
> >>>>> however, I've been tweaking the GC settings to no avail. The latest
> >>>>> version had the following differences from the default config:
> >>>>>
> >>>>> -XX:ConcGCThreads and -XX:ParallelGCThreads increased from 4 to 7
> >>>>>
> >>>>> -XX:CMSInitiatingOccupancyFraction increased from 50 to 70
> >>>>>
> >>>>>
> >>>>> Here is a sample output from the gc_log:
> >>>>>
> >>>>> 2016-04-28T04:36:47.240-0400: 27905.535: Total time for which application threads were stopped: 0.1667520 seconds, Stopping threads took: 0.0171900 seconds
> >>>>> {Heap before GC invocations=2051 (full 59):
> >>>>>  par new generation   total 6990528K, used 2626705K [0x00002b16c0000000, 0x00002b18c0000000, 0x00002b18c0000000)
> >>>>>   eden space 5592448K,  44% used [0x00002b16c0000000, 0x00002b17571b9948, 0x00002b1815560000)
> >>>>>   from space 1398080K,  10% used [0x00002b1815560000, 0x00002b181e8cac28, 0x00002b186aab0000)
> >>>>>   to   space 1398080K,   0% used [0x00002b186aab0000, 0x00002b186aab0000, 0x00002b18c0000000)
> >>>>>  concurrent mark-sweep generation total 25165824K, used 25122205K [0x00002b18c0000000, 0x00002b1ec0000000, 0x00002b1ec0000000)
> >>>>>  Metaspace       used 41840K, capacity 42284K, committed 42680K, reserved 43008K
> >>>>> 2016-04-28T04:36:49.828-0400: 27908.123: [GC (Allocation Failure) 2016-04-28T04:36:49.828-0400: 27908.124: [CMS2016-04-28T04:36:49.912-0400: 27908.207: [CMS-concurrent-abortable-preclean: 5.615/5.862 secs] [Times: user=17.70 sys=2.77, real=5.86 secs]
> >>>>>  (concurrent mode failure): 25122205K->15103706K(25165824K), 8.5567560 secs] 27748910K->15103706K(32156352K), [Metaspace: 41840K->41840K(43008K)], 8.5657830 secs] [Times: user=8.56 sys=0.01, real=8.57 secs]
> >>>>> Heap after GC invocations=2052 (full 60):
> >>>>>  par new generation   total 6990528K, used 0K [0x00002b16c0000000, 0x00002b18c0000000, 0x00002b18c0000000)
> >>>>>   eden space 5592448K,   0% used [0x00002b16c0000000, 0x00002b16c0000000, 0x00002b1815560000)
> >>>>>   from space 1398080K,   0% used [0x00002b1815560000, 0x00002b1815560000, 0x00002b186aab0000)
> >>>>>   to   space 1398080K,   0% used [0x00002b186aab0000, 0x00002b186aab0000, 0x00002b18c0000000)
> >>>>>  concurrent mark-sweep generation total 25165824K, used 15103706K [0x00002b18c0000000, 0x00002b1ec0000000, 0x00002b1ec0000000)
> >>>>>  Metaspace       used 41840K, capacity 42284K, committed 42680K, reserved 43008K
> >>>>> }
> >>>>> 2016-04-28T04:36:58.395-0400: 27916.690: Total time for which application threads were stopped: 8.5676090 seconds, Stopping threads took: 0.0003930 seconds
> >>>>>
> >>>>> I read the instructions here,
> >>>>> https://wiki.apache.org/solr/ShawnHeisey, but they seem to be
> >>>>> specific to Java 7. Are there any updated recommendations for
> >>>>> Java 8?
> >>>>
> >>
>
>
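Following Walter's sizing rule with the numbers from the GC log above (the
CMS generation dropped from 25122205K to 15103706K after the full
collection), a quick back-of-the-envelope check:

```shell
# Post-full-GC old-gen usage from the quoted gc_log, plus ~25% headroom
# per Walter's rule of thumb. Only the 15103706K figure comes from the
# thread; the rest is arithmetic.
used_kb=15103706                          # old gen after the full GC
target_kb=$(( used_kb + used_kb / 4 ))    # add 25% headroom
echo "$(( target_kb / 1024 / 1024 )) GB"  # prints "18 GB"
```

So by that rule, something like an 18 GB heap (rather than 32 GB) would be
the place to start.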

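In case it helps anyone reproduce this outside our Perl client, the facet
request quoted above corresponds to a plain HTTP call along these lines
(the host, core name "collection1", and category code are placeholders,
not our real values):

```shell
# Sketch of the quoted facet request as a raw Solr call; "collection1"
# and SOME_CODE are placeholders. Sent as POST form params via curl.
curl "http://localhost:8983/solr/collection1/select" \
  --data-urlencode 'q=cat:SOME_CODE' \
  --data-urlencode 'rows=0' \
  --data-urlencode 'facet=true' \
  --data-urlencode 'facet.field=key_phrases' \
  --data-urlencode 'facet.field=mmfr_exact' \
  --data-urlencode 'f.key_phrases.facet.limit=100' \
  --data-urlencode 'f.mmfr_exact.facet.limit=20' \
  --data-urlencode 'facet.mincount=5' \
  --data-urlencode 'distrib=false'
```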