We use Munin with the JMX plugin to monitor all servers and Solr installations.
(http://munin-monitoring.org/)

For short-term monitoring only, we also use jvisualvm, which ships with the Java SE JDK.
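
The jmx plugin needs the JVM to expose JMX. A minimal sketch of the kind of
startup flags involved (the port is an arbitrary example; a production setup
should enable authentication):

  -Dcom.sun.management.jmxremote
  -Dcom.sun.management.jmxremote.port=9199
  -Dcom.sun.management.jmxremote.authenticate=false
  -Dcom.sun.management.jmxremote.ssl=false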

Regards
Bernd

On 25.03.2013 14:45, Arkadi Colson wrote:
> Thanks for the info!
> I just upgraded Java from 6 to 7...
> How exactly do you monitor the memory usage and the effect of the garbage
> collector?
> 
> 
> On 03/25/2013 01:18 PM, Bernd Fehling wrote:
>> The use of UseG1GC, yes,
>> but with Solr 4.x, Jetty 8.1.8 and Java HotSpot(TM) 64-Bit Server VM 
>> (1.7.0_07).
>> os.arch: amd64
>> os.name: Linux
>> os.version: 2.6.32.13-0.5-xen
>>
>> The only args are "-XX:+UseG1GC -Xms16g -Xmx16g".
>> Monitoring shows that 16g is a bit high; I might reduce it to 10g or 12g for
>> the slaves.
>> Start is at 5g, runtime is between 6 and 8g with some peaks to 9.5g.
>> Single index, 130 GB, 43.5 million documents.
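>>
>> To see what G1 is actually doing, GC logging can be added next to those args.
>> A sketch for Java 7 HotSpot (the log path is just an example):
>>
>>   -XX:+UseG1GC -Xms16g -Xmx16g \
>>     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
>>     -Xloggc:/var/log/solr/gc.log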
>>
>> Regards,
>> Bernd
>>
>>
>> On 25.03.2013 11:55, Arkadi Colson wrote:
>>> Is somebody using the UseG1GC garbage collector with Solr and Tomcat 7? Any 
>>> extra options needed?
>>>
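>>> I assume the place to put such options with Tomcat 7 is
>>> CATALINA_BASE/bin/setenv.sh, roughly like this (heap sizes just as examples):
>>>
>>>   CATALINA_OPTS="$CATALINA_OPTS -XX:+UseG1GC -Xms2g -Xmx8g"
>>>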
>>> Thanks...
>>>
>>> On 03/25/2013 08:34 AM, Arkadi Colson wrote:
>>>> I changed my system memory to 12GB. Solr now gets -Xms2048m -Xmx8192m as 
>>>> parameters. I also added -XX:+UseG1GC to the java process. But now
>>>> the whole machine crashes! Any idea why?
>>>>
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.077809] java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.077962] java cpuset=/ mems_allowed=0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078019] Pid: 29339, comm: java Not tainted 2.6.32-5-amd64 #1
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078095] Call Trace:
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078155] [<ffffffff810b6324>] ? oom_kill_process+0x7f/0x23f
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078233] [<ffffffff810b6848>] ? __out_of_memory+0x12a/0x141
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078309] [<ffffffff810b699f>] ? out_of_memory+0x140/0x172
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078385] [<ffffffff810ba704>] ? __alloc_pages_nodemask+0x4ec/0x5fc
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078469] [<ffffffff812fb47a>] ? io_schedule+0x93/0xb7
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078541] [<ffffffff810bbc69>] ? __do_page_cache_readahead+0x9b/0x1b4
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078626] [<ffffffff81064fc0>] ? wake_bit_function+0x0/0x23
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078702] [<ffffffff810bbd9e>] ? ra_submit+0x1c/0x20
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078773] [<ffffffff810b4a72>] ? filemap_fault+0x17d/0x2f6
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078849] [<ffffffff810ca9e2>] ? __do_fault+0x54/0x3c3
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078921] [<ffffffff810ccd36>] ? handle_mm_fault+0x3b8/0x80f
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.078999] [<ffffffff8101166e>] ? apic_timer_interrupt+0xe/0x20
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079078] [<ffffffff812febf6>] ? do_page_fault+0x2e0/0x2fc
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079153] [<ffffffff812fca95>] ? page_fault+0x25/0x30
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079222] Mem-Info:
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079261] Node 0 DMA per-cpu:
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079310] CPU    0: hi: 0, btch:   1 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079374] CPU    1: hi: 0, btch:   1 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079439] CPU    2: hi: 0, btch:   1 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079527] CPU    3: hi: 0, btch:   1 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079591] Node 0 DMA32 per-cpu:
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079642] CPU    0: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079706] CPU    1: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079770] CPU    2: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079834] CPU    3: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079899] Node 0 Normal per-cpu:
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.079951] CPU    0: hi: 186, btch:  31 usd:  17
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080015] CPU    1: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080079] CPU    2: hi: 186, btch:  31 usd:   2
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080142] CPU    3: hi: 186, btch:  31 usd:   0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080209] active_anon:2638016 inactive_anon:388557 isolated_anon:0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080209]  active_file:68 inactive_file:236 isolated_file:0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080210]  unevictable:0 dirty:5 writeback:5 unstable:0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080211]  free:16573 slab_reclaimable:2398 slab_unreclaimable:2335
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080212]  mapped:36 shmem:0 pagetables:24750 bounce:0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.080575] Node 0 DMA free:15796kB min:16kB low:20kB high:24kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15244kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.081041] lowmem_reserve[]: 0 3000 12090 12090
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.081110] Node 0 DMA32 free:39824kB min:3488kB low:4360kB high:5232kB active_anon:2285240kB inactive_anon:520624kB active_file:0kB inactive_file:188kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:4152kB slab_unreclaimable:1640kB kernel_stack:1104kB pagetables:31100kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:89 all_unreclaimable? no
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.081600] lowmem_reserve[]: 0 0 9090 9090
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.081664] Node 0 Normal free:10672kB min:10572kB low:13212kB high:15856kB active_anon:8266824kB inactive_anon:1033604kB active_file:292kB inactive_file:756kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:9308160kB mlocked:0kB dirty:20kB writeback:20kB mapped:156kB shmem:0kB slab_reclaimable:5440kB slab_unreclaimable:7692kB kernel_stack:1280kB pagetables:67900kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:256 all_unreclaimable? no
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082171] lowmem_reserve[]: 0 0 0 0
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082240] Node 0 DMA: 1*4kB 2*8kB 2*16kB 0*32kB 2*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15796kB
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082394] Node 0 DMA32: 4578*4kB 2434*8kB 4*16kB 1*32kB 1*64kB 2*128kB 0*256kB 2*512kB 1*1024kB 0*2048kB 0*4096kB = 40248kB
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082555] Node 0 Normal: 1020*4kB 332*8kB 7*16kB 3*32kB 6*64kB 6*128kB 2*256kB 2*512kB 1*1024kB 0*2048kB 0*4096kB = 10656kB
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082715] 7069 total pagecache pages
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.082769] 6768 pages in swap cache
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.114216] 203 pages shared
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.114261] 3067047 pages non-shared
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.114314] Out of memory: kill process 29301 (java) score 37654 or a child
>>>> Mar 22 20:30:01 solr01-gs kernel: [716098.114401] Killed process 29301 (java)
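>>>>
>>>> A quick way I can watch how close the kernel gets to picking the JVM again
>>>> (a sketch, assuming a single java process on the box):
>>>>
>>>>   cat /proc/$(pgrep -o java)/oom_score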
>>>>
>>>> Thanks!
>>>>
>>>> On 03/14/2013 04:00 PM, Arkadi Colson wrote:
>>>>> On 03/14/2013 03:11 PM, Toke Eskildsen wrote:
>>>>>> On Thu, 2013-03-14 at 13:10 +0100, Arkadi Colson wrote:
>>>>>>> When I shut down Tomcat, free -m and top keep showing the same values.
>>>>>>> Almost no free memory...
>>>>>>>
>>>>>>> Any idea?
>>>>>> Are you reading top & free right? It is standard behaviour for most
>>>>>> modern operating systems to have very little free memory. As long as the
>>>>>> sum of free memory and cache is high, everything is fine.
>>>>>>
>>>>>> Looking at the stats you gave previously we have
>>>>>>
>>>>>>>> *top*
>>>>>>>>     PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
>>>>>>>> 13666 root      20   0 86.8g 4.7g 248m S  101 39.7 478:37.45
>>>>>> 4.7GB physical memory used and ~80GB used for memory mapping the index.
>>>>>>
>>>>>>>> *free -m*
>>>>>>>>              total       used       free     shared    buffers     cached
>>>>>>>> Mem:         12047      11942        105          0        180       6363
>>>>>>>> -/+ buffers/cache:       5399       6648
>>>>>>>> Swap:          956         75        881
>>>>>> So 6648MB is used for either general disk cache or the memory-mapped index.
>>>>>> This really translates to 6648MB (plus the 105MB above) of available memory,
>>>>>> as any application asking for memory will get it immediately from that
>>>>>> pool (sorry if this is basic stuff for you).
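>>>>>>
>>>>>> A one-liner to pull that effectively-available figure out directly (a
>>>>>> sketch for the classic free layout above; newer free versions print an
>>>>>> "available" column instead):
>>>>>>
>>>>>>   free -m | awk '/buffers\/cache/ {print "effectively free (MB): " $4}'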
>>>>>>
>>>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>>>           at java.lang.Thread.run(Thread.java:662)
>>>>>>>> Caused by: java.lang.OutOfMemoryError
>>>>>>>>           at java.util.zip.ZipFile.open(Native Method)
>>>>>>>>           at java.util.zip.ZipFile.<init>(ZipFile.java:127)
>>>>>>>>           at java.util.zip.ZipFile.<init>(ZipFile.java:144)
>>>>>>>>           at
>>>>>>>> org.apache.poi.openxml4j.opc.internal.ZipHelper.openZipFile(ZipHelper.java:157)
>>>>>> [...]
>>>>>>
>>>>>>>> Java HotSpot(TM) 64-Bit Server VM warning: Attempt to allocate stack
>>>>>>>> guard pages failed.
>>>>>>>> mmap failed for CEN and END part of zip file
>>>>>> A quick search shows that other people have had problems with ZipFile in
>>>>>> at least some sub-versions of Java 1.7. However, another very common
>>>>>> cause for OOM with memory mapping is that the limit for allocating
>>>>>> virtual memory is too low.
>>>>> We do not index zip files, so that could not be causing the problem.
>>>>>
>>>>>> Try doing a
>>>>>>    ulimit -v
>>>>>> on the machine. If the number is somewhere around 100000000 (100GB),
>>>>>> Lucene's memory mapping of your index (the 80GB) plus the ZipFile's
>>>>>> memory mapping plus other processes might hit the ceiling. If that is
>>>>>> the case, simply raise the limit.
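>>>>>>
>>>>>> For example (a sketch; the limit has to be raised in the shell or init
>>>>>> script that actually starts Tomcat):
>>>>>>
>>>>>>   ulimit -v              # prints the current limit in kB, or "unlimited"
>>>>>>   ulimit -v unlimited    # lifts the cap for this shell and its children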
>>>>>>
>>>>>> - Toke
>>>>>>
>>>>> ulimit -v shows me unlimited
>>>>> I decreased the hard commit time to 10 seconds and set ramBufferSizeMB to 
>>>>> 250. Hope this helps...
>>>>> Will keep you informed!
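>>>>>
>>>>> In solrconfig.xml that should correspond roughly to this (a sketch; the
>>>>> openSearcher setting is my assumption, and ramBufferSizeMB sits under
>>>>> indexConfig in Solr 4.x):
>>>>>
>>>>>   <autoCommit>
>>>>>     <maxTime>10000</maxTime>            <!-- hard commit every 10 seconds -->
>>>>>     <openSearcher>false</openSearcher>  <!-- keep hard commits cheap -->
>>>>>   </autoCommit>
>>>>>   <ramBufferSizeMB>250</ramBufferSizeMB>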
>>>>>
>>>>> Thanks for the explanation!
>>>>>
>>>>>
> 

-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************
