vsz is the virtual memory size, not the resident set size. In your output, vsz
is 2922460k (which is about 2.9g).

You should sort on rss instead, like:

ps axo pid,rss,vsz,comm=|sort -n -k 2

In my test with a JVM with -Xmx300m, I see output like this when the JVM
heap is full (but hasn't thrown an OOME):

20251 414228 2852312 java

top shows the same info:

20251 boglesby  20   0 2785m 404m  11m S 95.5  5.1   4:48.42 java

When I look at Geode memory stats, I see 300m usedMemory. Also, I see 100%
cpu and pretty much continuous CMS GCs.

Are you collecting Geode stats (with statistic-sampling-enabled=true and
statistic-archive-file=cacheserver.gfs)? If so, post them and I can take a
look.
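
In case it helps, here's a minimal sketch of enabling those two properties
programmatically (the archive file name is just an example, and you can set
the same properties in gemfire.properties instead of in code):

Cache cache = new CacheFactory()
    .set("statistic-sampling-enabled", "true")
    .set("statistic-archive-file", "cacheserver.gfs")
    // keep your existing locator / bind-address settings here
    .create();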


Thanks,
Barry Oglesby


On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]>
wrote:

> And when I check memory usage per process, it looks normal: java took only
> 300MB as it's supposed to, but free -m still shows no free memory:
>
> # ps axo pid,vsz,comm=|sort -n -k 2
>   PID    VSZ
>   465  26396 systemd-logind
>   444  26724 dbus-daemon
>   454  27984 avahi-daemon
>   443  28108 avahi-daemon
>   344  32720 systemd-journal
>     1  41212 systemd
>   364  43132 systemd-udevd
> 27138  52688 sftp-server
>   511  53056 wpa_supplicant
>   769  82548 sshd
> 30734  83972 sshd
>  1068  91128 master
> 28534  91232 pickup
>  1073  91300 qmgr
>   519 110032 agetty
> 27029 115380 bash
> 27145 115380 bash
> 30736 116440 sort
>   385 116720 auditd
>   489 126332 crond
> 30733 139624 sshd
> 27027 140840 sshd
> 27136 140840 sshd
> 27143 140840 sshd
> 30735 148904 ps
>   438 242360 rsyslogd
>   466 447932 NetworkManager
>   510 527448 polkitd
>   770 553060 tuned
> 30074 2922460 java
>
> # free -m
>               total        used        free      shared  buff/cache   available
> Mem:            489         424           5           0          58          41
> Swap:           255          57         198
>
>
> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]>
> wrote:
>
>> Thanks for your help, but I'm still struggling with the system OOM killer
>> issue.
>> I've been doing more digging and still couldn't find the problem.
>> All settings are normal: overcommit_memory=0, overcommit_ratio=50.
>> free -m before the process starts:
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489          25         399           1          63         440
>> Swap:           255          57         198
>>
>> I start my process like this:
>>
>> java -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar /opt/ccio-image.jar
>>
>> So I should still have about 99MB of free memory (399MB free before minus
>> the 300MB heap), but:
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489         409           6           1          73          55
>> Swap:           255          54         201
>>
>> And I haven't even made a single call to the process yet. It joined the
>> cluster and loaded data from the overflow files, and all my free memory is
>> gone, even though I've set a 300MB max heap for Java.
>> As I mentioned before, I've set the off-heap setting to false:
>>
>> Cache cache = new CacheFactory()
>>     .set("locators", LOCATORS.get())
>>     .set("start-locator", LOCATOR_IP.get()+"["+LOCATOR_PORT.get()+"]")
>>     .set("bind-address", LOCATOR_IP.get())
>>     .create();
>>
>> cache.createDiskStoreFactory()
>>     .setMaxOplogSize(500)
>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") }, new int[] { 18000 })
>>     .setCompactionThreshold(95)
>>     .create("-ccio-store");
>>
>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>
>> Region<String, byte[]> region = regionFactory
>>     .setDiskStoreName("-ccio-store")
>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>     .setOffHeap(false)
>>     .setMulticastEnabled(false)
>>     .setCacheLoader(new AwsS3CacheLoader())
>>     .create("ccio-images");
>>
>> I don't understand how the memory is getting overcommitted.
>>
>> Eugene
>>
>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]>
>> wrote:
>>
>>> The OOM killer uses the overcommit_memory and overcommit_ratio
>>> parameters to determine if / when to kill a process.
>>>
>>> What are the settings for these parameters in your environment?
>>>
>>> The defaults are 0 and 50.
>>>
>>> cat /proc/sys/vm/overcommit_memory
>>> 0
>>>
>>> cat /proc/sys/vm/overcommit_ratio
>>> 50
>>>
>>> How much free memory is available before you start the JVM?
>>>
>>> How much free memory is available when your process is killed?
>>>
>>> You can monitor free memory using either free or vmstat before and
>>> during your test.
>>>
>>> Run free -m in a loop (the -s2 below makes it repeat every 2 seconds) to
>>> monitor free memory:
>>>
>>> free -ms2
>>>              total       used       free     shared    buffers     cached
>>> Mem:        290639      35021     255617          0       9215      21396
>>> -/+ buffers/cache:       4408     286230
>>> Swap:        20473          0      20473
>>>
>>> Run vmstat in a loop to monitor memory like:
>>>
>>> vmstat -SM 2
>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>>  0  0      0 255619   9215  21396    0    0     0    23    0    0  2  0 98  0  0
>>>  0  0      0 255619   9215  21396    0    0     0     0  121  198  0  0 100  0  0
>>>  0  0      0 255619   9215  21396    0    0     0     0  102  189  0  0 100  0  0
>>>  0  0      0 255619   9215  21396    0    0     0     0  110  195  0  0 100  0  0
>>>  0  0      0 255619   9215  21396    0    0     0     0  117  205  0  0 100  0  0
>>>
>>>
>>> Thanks,
>>> Barry Oglesby
>>>
>>>
>>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>>>
>>>> The Java metaspace will also take up memory. Maybe try setting
>>>> -XX:MaxMetaspaceSize.
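>>>>
>>>> For example (the 64m value and jar name below are just placeholders to
>>>> show the syntax):
>>>>
>>>> java -Xmx400m -XX:MaxMetaspaceSize=64m -jar your-app.jar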
>>>>
>>>> -Dan
>>>>
>>>>
>>>> -------- Original message --------
>>>> From: Eugene Strokin <[email protected]>
>>>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>>>> To: [email protected]
>>>> Subject: Re: System Out of Memory
>>>>
>>>> The machine is small: it has only 512MB RAM, plus 256MB swap.
>>>> But Java's max heap size is set to 400MB. I've tried less, no help. And
>>>> the most interesting part is that I don't see Java OOM exceptions at all.
>>>> I even included code with a memory leak, and in that case I saw Java OOM
>>>> exceptions before the java process got killed.
>>>> I've browsed the internet, and some people have actually noticed the same
>>>> problem with other frameworks, not Geode. So I suspect this might not be
>>>> Geode, but Geode was the first suspect because it has an off-heap storage
>>>> feature. They say there was a memory leak, but for some reason the OS was
>>>> killing the process even before Java hit an OOM.
>>>> I'll connect with JProbe and monitor the system with the console. I'll
>>>> let you know if I find something interesting.
>>>>
>>>> Thanks,
>>>> Eugene
>>>>
>>>>
>>>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>>>>
>>>>> What's your -Xmx for your JVM set to, and how much memory does your
>>>>> droplet have? Does it have any swap space? My guess is you need to
>>>>> reduce the heap size of your JVM and the OS is killing your process
>>>>> because there is not enough memory left.
>>>>>
>>>>> -Dan
>>>>>
>>>>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <
>>>>> [email protected]> wrote:
>>>>> > I don't know why your OS would be killing your process, which seems
>>>>> > like your main problem.
>>>>> >
>>>>> > But I did want you to know that if you don't have any regions with
>>>>> > off-heap=true, then there is no reason to set off-heap-memory-size to
>>>>> > anything other than 0.
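>>>>> >
>>>>> > For example (just a sketch, using the standard property name): either
>>>>> > leave off-heap-memory-size unset, or set it explicitly with
>>>>> > new CacheFactory().set("off-heap-memory-size", "0").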
>>>>> >
>>>>> > On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <[email protected]>
>>>>> > wrote:
>>>>> >>
>>>>> >> I'm running load tests on the Geode cluster I've built.
>>>>> >> The OS is killing my process occasionally, complaining that the
>>>>> >> process takes too much memory:
>>>>> >>
>>>>> >> # dmesg
>>>>> >> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>>>>> >> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>>>>> >>
>>>>> >> Java doesn't have any problems; I don't see an OOM exception.
>>>>> >> It looks like Geode is using off-heap memory, but I set offHeap to
>>>>> >> false for my region, and I do have only one region:
>>>>> >>
>>>>> >> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>> >> regionFactory
>>>>> >>     .setDiskStoreName("-ccio-store")
>>>>> >>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>> >>     .setOffHeap(false)
>>>>> >>     .setCacheLoader(new AwsS3CacheLoader());
>>>>> >>
>>>>> >> Also, I've played with the off-heap-memory-size setting, setting it
>>>>> >> to a small number like 20M to prevent Geode from taking too much
>>>>> >> off-heap memory, but the result is the same.
>>>>> >>
>>>>> >> Do you have any other ideas about what I could do here? I'm stuck at
>>>>> >> this point.
>>>>> >>
>>>>> >> Thank you,
>>>>> >> Eugene
>>>>> >
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
