But it is silly to base non-heap RAM on the size of the heap. Get the
RAM needed for the non-heap usage. That has nothing to do with the
size of the Java heap.

Non-heap RAM is mostly used for two things: other programs and
file buffers for the Solr indexes. Base the RAM needs on those.
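
As a rough illustration of that sizing rule, here is a minimal Python sketch. It is not a Solr tool: the function names are hypothetical, and the 2 GB default for the OS and daemons is only an assumption taken from the "couple of gigabytes" figure quoted below. It adds up the heap, OS, other-process, and index sizes, then reports how much of the index the OS file cache could hold on a given machine.

```python
def estimate_ram_gb(heap_gb, index_gb, os_gb=2.0, other_gb=0.0):
    """Return (ideal_total_gb, non_heap_gb) for a single Solr host.

    The ideal total is heap + OS/daemons + other processes + the whole index.
    os_gb=2.0 is an assumed default, not a measured value.
    """
    if min(heap_gb, index_gb, os_gb, other_gb) < 0:
        raise ValueError("sizes must be non-negative")
    non_heap_gb = os_gb + other_gb + index_gb
    return heap_gb + non_heap_gb, non_heap_gb


def page_cache_coverage(ram_gb, heap_gb, index_gb, os_gb=2.0, other_gb=0.0):
    """Return the fraction of the index (0.0 to 1.0) that fits in the RAM
    left over for OS file buffers after the heap and other usage."""
    free_gb = max(0.0, ram_gb - heap_gb - os_gb - other_gb)
    if index_gb == 0:
        return 1.0
    return min(1.0, free_gb / index_gb)


if __name__ == "__main__":
    # Figures from this thread: 555 GB index, 24 GB heap, 30 GB machine.
    total, non_heap = estimate_ram_gb(heap_gb=24, index_gb=555)
    print(f"ideal RAM: {total} GB, of which non-heap: {non_heap} GB")
    coverage = page_cache_coverage(ram_gb=30, heap_gb=24, index_gb=555)
    print(f"index cached on a 30 GB machine: {coverage:.1%}")
```

Note that the non-heap figure does not change when the heap does, which is the point: a multiplier on heap size tells you nothing about how much file cache the index needs.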

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Dec 5, 2018, at 10:01 AM, Gus Heck <gus.h...@gmail.com> wrote:
> 
> 3x heap is larger than usual, but significant RAM beyond heap is a good
> idea if you can't fit the whole index in 31 GB of memory, since the OS will
> cache files in RAM. Note also that heap settings from 32 GB through about
> 45 GB give you LESS usable heap than 31 GB, due to the increase in pointer
> size needed to address large memory spaces. Typically 64 GB RAM with a
> 31 GB heap is a good start for decent-sized indexes; add more machines to
> get more RAM/heap/CPU relative to your data on disk and query load. Of
> course, test and tune from there to find the ideal spec for your
> installation. Also, larger RAM means longer GC pauses.
> 
> That said, none of the ram beyond heap is likely to have much effect on
> crashing once the OS and other processes on the box are happy.
> 
> On Wed, Dec 5, 2018, 11:11 AM Walter Underwood <wun...@wunderwood.org> wrote:
> 
>> I’ve never heard a recommendation to have three times as much RAM as the
>> heap. That doesn’t make sense to me.
>> 
>> You might need 3X as much disk space as the index size.
>> 
>> For RAM, it is best to have the sum of:
>> 
>> * JVM heap
>> * A couple of gigabytes for OS and daemons
>> * RAM for other processes needed on the host (keep to a minimum)
>> * Enough RAM to hold the entire index
>> 
>> Clearly, you are not going to have enough RAM for a 555 gigabyte index.
>> Well, Amazon does have a dozen instance types that can do that, but they
>> are expensive.
>> 
>> A 24 GB heap on a 30 GB machine will be pretty tight.
>> 
>> Always set Xms (starting heap) to the same as Xmx (maximum heap). If you
>> set it smaller, the JVM will keep increasing the heap until it hits the max
>> before doing a full GC. It will always end up with the max setting, but it
>> will have to do more work to get there. The setting for initial heap size
>> is about the most useless thing in Java.
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Dec 4, 2018, at 6:06 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
>>> 
>>> Hi Danilo,
>>> 
>>> Full GC points out that you need more heap, which also implies that you
>>> need more RAM.
>>> Raise your heap to 24GB and your physical RAM to about 75GB or better
>>> 96GB.
>>> RAM should be about 3 to 4 times heap size.
>>> 
>>> Regards, Bernd
>>> 
>>> 
>>> Am 04.12.18 um 13:37 schrieb Danilo Tomasoni:
>>>> Hello Bernd,
>>>> Here I list the extra info you requested:
>>>> - currently the virtual machine has 22GB of RAM and 16GB of heap
>>>> - my 40 million raw documents take about 1364GB on the filesystem (in
>>>> XML format)
>>>> - my optimized index (1 segment, 0 deleted docs) takes about 555GB
>>>> - solr 7.3, openjdk 1.8.0_181
>>>> - GC logs are like
>>>> 2018-12-03T07:40:22.302+0100: 28752.505: [Full GC (Allocation Failure) 2018-12-03T07:40:22.302+0100: 28752.505: [CMS: 12287999K->12287999K(12288000K), 13.6470083 secs] 15701375K->15701373K(15701376K), [Metaspace: 37438K->37438K(1083392K)], 13.6470726 secs] [Times: user=13.66 sys=0.00, real=13.64 secs]
>>>> Heap after GC invocations=2108 (full 1501):
>>>> par new generation   total 3413376K, used 3413373K [0x00000003d8000000, 0x00000004d2000000, 0x00000004d2000000)
>>>>  eden space 2730752K,  99% used [0x00000003d8000000, 0x000000047eabfdc0, 0x000000047eac0000)
>>>>  from space 682624K,  99% used [0x000000047eac0000, 0x00000004a855f8a0, 0x00000004a8560000)
>>>>  to   space 682624K,   0% used [0x00000004a8560000, 0x00000004a8560000, 0x00000004d2000000)
>>>> concurrent mark-sweep generation total 12288000K, used 12287999K [0x00000004d2000000, 0x00000007c0000000, 0x00000007c0000000)
>>>> Metaspace       used 37438K, capacity 38438K, committed 38676K, reserved 1083392K
>>>>  class space    used 4257K, capacity 4521K, committed 4628K, reserved 1048576K
>>>> }
>>>> Thank you for your help
>>>> Danilo
>>>> On 03/12/18 10:36, Bernd Fehling wrote:
>>>>> Hi Danilo,
>>>>> 
>>>>> You need to give more information about your system and the config.
>>>>> 
>>>>> - 30 GB RAM (physical RAM?): how much heap do you give to Java?
>>>>> - how large (in GB) is the raw data for the 40 million documents?
>>>>> - how large (in GB) is your index with 40 million docs indexed?
>>>>> - which versions of Solr and Java?
>>>>> - do you have Java garbage collection logs and, if so, what are they
>>>>> reporting?
>>>>> - any Full GC in the GC logs?
>>>>> 
>>>>> Regards, Bernd
>>>>> 
>>>>> 
>>>>> Am 03.12.18 um 10:09 schrieb Danilo Tomasoni:
>>>>>> Hello all,
>>>>>> 
>>>>>> We have a configuration with a single node with 30 GB of RAM.
>>>>>> 
>>>>>> We use it to index ~40 million documents.
>>>>>> 
>>>>>> We perform queries with the edismax parser that often contain edismax
>>>>>> parser subqueries with the syntax
>>>>>> 
>>>>>> '_query_:{!edismax mm=X v=$subqueryN}'
>>>>>> 
>>>>>> Often X == 1.
>>>>>> 
>>>>>> This solves the "too many boolean clauses" error we got when expanding
>>>>>> the query terms (often phrase queries) directly in the main query.
>>>>>> 
>>>>>> Unfortunately, in this scenario Solr often crashes while performing a
>>>>>> query, even with a single query and no other source of system load.
>>>>>> 
>>>>>> 
>>>>>> Do you have any idea of what's going on here?
>>>>>> 
>>>>>> Otherwise,
>>>>>> 
>>>>>> Which Solr configuration parameters do you think I need to
>>>>>> investigate first?
>>>>>> 
>>>>>> What kind of log lines should I search for to understand what's going
>>>>>> on?
>>>>>> 
>>>>>> 
>>>>>> Thank you
>>>>>> 
>>>>>> Danilo
>>>>>> 
>>> 
>>> --
>>> *************************************************************
>>> Bernd Fehling                    Bielefeld University Library
>>> Dipl.-Inform. (FH)                LibTec - Library Technology
>>> Universitätsstr. 25                  and Knowledge Management
>>> 33615 Bielefeld
>>> Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de
>>>         https://www.ub.uni-bielefeld.de/~befehl/
>>> 
>>> BASE - Bielefeld Academic Search Engine - www.base-search.net
>>> *************************************************************
>> 
>> 
