I think you're looking at the wrong info in ps.

What you're showing is the virtual size (vsz) of the process. That is how
much address space the process has requested, not how much it is actually
using. In fact, your output says that Java has reserved about 3GB of memory,
not 300MB! Look instead at the Resident Set Size (the rss option), as that
gives a much more accurate picture of how much real memory is actually in use.
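
For example (a sketch of the same listing sorted by RSS instead; note that ps
reports RSS in KiB on Linux):

ps axo pid,rss,comm= | sort -n -k 2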

Remember, too, that the JVM needs memory beyond the heap: loaded code (jars
and classes), JIT-compiled code, thread stacks, metaspace, etc., so take that
into account when sizing your heap.
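
If you want a breakdown of that non-heap usage, Native Memory Tracking is one
option (assuming JDK 8+; NMT adds a small overhead):

java -XX:NativeMemoryTracking=summary -Xmx300m -jar /opt/ccio-image.jar
jcmd <pid> VM.native_memory summary

That prints committed sizes for heap, metaspace, thread stacks, the JIT code
cache, etc., which you can compare against the RSS figure above.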

Finally, especially on virtualized hardware and doubly so on small configs,
make sure you *never, ever* end up swapping because that will really kill
your performance.
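
A simple way to catch swapping early (the 2-second interval is arbitrary):

vmstat 2

If the si and so columns are consistently non-zero, pages are moving to and
from swap, and on a box this small that usually means the heap is too big.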

--Jens

On Mon, Apr 25, 2016 at 12:32 PM, Anilkumar Gingade <[email protected]>
wrote:

> >> It joined the cluster, and loaded data from overflow files.
> Not sure if this causes the OS file system (disk buffer/cache) to consume
> memory...
> When you say overflow, I am assuming you are initializing the data/regions
> from persistence files; if so, can you try without persistence...
>
> -Anil.
>
> On Mon, Apr 25, 2016 at 12:18 PM, Eugene Strokin <[email protected]>
> wrote:
>
>> And when I check memory usage per process, it looks normal: Java took only
>> 300MB, as it was supposed to, but free -m still shows no free memory:
>>
>> # ps axo pid,vsz,comm=|sort -n -k 2
>>   PID    VSZ
>>   465  26396 systemd-logind
>>   444  26724 dbus-daemon
>>   454  27984 avahi-daemon
>>   443  28108 avahi-daemon
>>   344  32720 systemd-journal
>>     1  41212 systemd
>>   364  43132 systemd-udevd
>> 27138  52688 sftp-server
>>   511  53056 wpa_supplicant
>>   769  82548 sshd
>> 30734  83972 sshd
>>  1068  91128 master
>> 28534  91232 pickup
>>  1073  91300 qmgr
>>   519 110032 agetty
>> 27029 115380 bash
>> 27145 115380 bash
>> 30736 116440 sort
>>   385 116720 auditd
>>   489 126332 crond
>> 30733 139624 sshd
>> 27027 140840 sshd
>> 27136 140840 sshd
>> 27143 140840 sshd
>> 30735 148904 ps
>>   438 242360 rsyslogd
>>   466 447932 NetworkManager
>>   510 527448 polkitd
>>   770 553060 tuned
>> 30074 2922460 java
>>
>> # free -m
>>               total        used        free      shared  buff/cache   available
>> Mem:            489         424           5           0          58          41
>> Swap:           255          57         198
>>
>>
>> On Mon, Apr 25, 2016 at 2:52 PM, Eugene Strokin <[email protected]>
>> wrote:
>>
>>> Thanks for your help, but I'm still struggling with the system OOM-killer
>>> issue. I've done more digging and still couldn't find the problem.
>>> All settings are normal: overcommit_memory=0, overcommit_ratio=50.
>>> Here is free -m before the process starts:
>>>
>>> # free -m
>>>               total        used        free      shared  buff/cache   available
>>> Mem:            489          25         399           1          63         440
>>> Swap:           255          57         198
>>>
>>> I start my process like this:
>>>
>>> *java* -server -Xmx300m -Xms300m -XX:+HeapDumpOnOutOfMemoryError
>>> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=55 -jar
>>> /opt/ccio-image.jar
>>>
>>> So I should still have about 99MB of free memory, but:
>>>
>>> # free -m
>>>               total        used        free      shared  buff/cache   available
>>> Mem:            489         409           6           1          73          55
>>> Swap:           255          54         201
>>>
>>> And I haven't even made a single call to the process yet. It joined the
>>> cluster and loaded data from overflow files, and all my free memory is
>>> gone, even though I've set a 300MB max for Java.
>>> As I mentioned before, I've set the off-heap setting to false:
>>>
>>> // Connect to the cluster (note: nothing here enables off-heap memory).
>>> Cache cache = new CacheFactory()
>>>     .set("locators", LOCATORS.get())
>>>     .set("start-locator", LOCATOR_IP.get() + "[" + LOCATOR_PORT.get() + "]")
>>>     .set("bind-address", LOCATOR_IP.get())
>>>     .create();
>>>
>>> // Disk store that backs the persistent region below.
>>> cache.createDiskStoreFactory()
>>>     .setMaxOplogSize(500)
>>>     .setDiskDirsAndSizes(new File[] { new File("/opt/ccio/geode/store") },
>>>                          new int[] { 18000 })
>>>     .setCompactionThreshold(95)
>>>     .create("-ccio-store");
>>>
>>> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>
>>> // Persistent partitioned region with off-heap explicitly disabled.
>>> Region<String, byte[]> region = regionFactory
>>>     .setDiskStoreName("-ccio-store")
>>>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>     .setOffHeap(false)
>>>     .setMulticastEnabled(false)
>>>     .setCacheLoader(new AwsS3CacheLoader())
>>>     .create("ccio-images");
>>>
>>> I don't understand how the memory is getting overcommitted.
>>>
>>> Eugene
>>>
>>> On Fri, Apr 22, 2016 at 8:03 PM, Barry Oglesby <[email protected]>
>>> wrote:
>>>
>>>> The OOM killer uses the overcommit_memory and overcommit_ratio
>>>> parameters to determine if and when to kill a process.
>>>>
>>>> What are the settings for these parameters in your environment?
>>>>
>>>> The defaults are 0 and 50.
>>>>
>>>> cat /proc/sys/vm/overcommit_memory
>>>> 0
>>>>
>>>> cat /proc/sys/vm/overcommit_ratio
>>>> 50
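>>>>
>>>> With these defaults, CommitLimit works out to swap + 50% of RAM
>>>> (strictly enforced only when overcommit_memory=2). To compare that limit
>>>> with what is currently committed, you can run, for example:
>>>>
>>>> grep -i commit /proc/meminfo
>>>>
>>>> This prints CommitLimit and Committed_AS in kB.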
>>>>
>>>> How much free memory is available before you start the JVM?
>>>>
>>>> How much free memory is available when your process is killed?
>>>>
>>>> You can monitor free memory using either free or vmstat before and
>>>> during your test.
>>>>
>>>> Run free -m in a loop to monitor free memory like:
>>>>
>>>> free -ms2
>>>>              total       used       free     shared    buffers     cached
>>>> Mem:        290639      35021     255617          0       9215      21396
>>>> -/+ buffers/cache:       4408     286230
>>>> Swap:        20473          0      20473
>>>>
>>>> Run vmstat in a loop to monitor memory like:
>>>>
>>>> vmstat -SM 2
>>>> procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu-----
>>>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
>>>>  0  0      0 255619   9215  21396    0    0     0    23    0    0  2  0 98  0  0
>>>>  0  0      0 255619   9215  21396    0    0     0     0  121  198  0  0 100 0  0
>>>>  0  0      0 255619   9215  21396    0    0     0     0  102  189  0  0 100 0  0
>>>>  0  0      0 255619   9215  21396    0    0     0     0  110  195  0  0 100 0  0
>>>>  0  0      0 255619   9215  21396    0    0     0     0  117  205  0  0 100 0  0
>>>>
>>>>
>>>> Thanks,
>>>> Barry Oglesby
>>>>
>>>>
>>>> On Fri, Apr 22, 2016 at 4:44 PM, Dan Smith <[email protected]> wrote:
>>>>
>>>>> The Java metaspace will also take up memory. Maybe try setting
>>>>> -XX:MaxMetaspaceSize.
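>>>>>
>>>>> For example (a sketch based on the launch command earlier in this
>>>>> thread; the 64m cap is illustrative, not a recommendation):
>>>>>
>>>>> java -Xmx300m -Xms300m -XX:MaxMetaspaceSize=64m -jar /opt/ccio-image.jar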
>>>>>
>>>>> -Dan
>>>>>
>>>>>
>>>>> -------- Original message --------
>>>>> From: Eugene Strokin <[email protected]>
>>>>> Date: 4/22/2016 4:34 PM (GMT-08:00)
>>>>> To: [email protected]
>>>>> Subject: Re: System Out of Memory
>>>>>
>>>>> The machine is small: it has only 512MB RAM, plus 256MB swap.
>>>>> But Java's max heap size is set to 400MB. I've tried less; it didn't
>>>>> help. The most interesting part is that I don't see Java OOM exceptions
>>>>> at all. I even included code with a memory leak once, and in that case I
>>>>> did see Java OOM exceptions before the java process got killed.
>>>>> I've browsed the internet, and some people have noticed the same problem
>>>>> with other frameworks, not Geode. So I suspect it might not be Geode;
>>>>> Geode was just the first suspect because it has an off-heap storage
>>>>> feature. They say there was a memory leak, but for some reason the OS
>>>>> killed the process even before Java hit OOM.
>>>>> I'll connect with JProbe and monitor the system with the console. I'll
>>>>> let you know if I find something interesting.
>>>>>
>>>>> Thanks,
>>>>> Eugene
>>>>>
>>>>>
>>>>> On Fri, Apr 22, 2016 at 5:55 PM, Dan Smith <[email protected]> wrote:
>>>>>
>>>>>> What's your -Xmx for your JVM set to, and how much memory does your
>>>>>> droplet have? Does it have any swap space? My guess is you need to
>>>>>> reduce the heap size of your JVM and the OS is killing your process
>>>>>> because there is not enough memory left.
>>>>>>
>>>>>> -Dan
>>>>>>
>>>>>> On Fri, Apr 22, 2016 at 1:55 PM, Darrel Schneider <
>>>>>> [email protected]> wrote:
>>>>>> > I don't know why your OS would be killing your process, which seems
>>>>>> > to be your main problem.
>>>>>> >
>>>>>> > But I did want you to know that if you don't have any regions with
>>>>>> > off-heap=true, then there is no reason to set off-heap-memory-size to
>>>>>> > anything other than 0.
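>>>>>> >
>>>>>> > For example, in gemfire.properties (a sketch; leaving the property
>>>>>> > unset also means no off-heap memory is reserved):
>>>>>> >
>>>>>> > off-heap-memory-size=0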
>>>>>> >
>>>>>> > On Fri, Apr 22, 2016 at 12:48 PM, Eugene Strokin <
>>>>>> [email protected]>
>>>>>> > wrote:
>>>>>> >>
>>>>>> >> I'm running load tests on the Geode cluster I've built.
>>>>>> >> The OS is killing my process occasionally, complaining that the
>>>>>> process
>>>>>> >> takes too much memory:
>>>>>> >>
>>>>>> >> # dmesg
>>>>>> >> [ 2544.932226] Out of memory: Kill process 5382 (java) score 780 or sacrifice child
>>>>>> >> [ 2544.933591] Killed process 5382 (java) total-vm:3102804kB, anon-rss:335780kB, file-rss:0kB
>>>>>> >>
>>>>>> >> Java itself doesn't have any problems; I don't see an OOM exception.
>>>>>> >> It looks like Geode is using off-heap memory. But I set offHeap to
>>>>>> >> false for my region, and I have only one region:
>>>>>> >>
>>>>>> >> RegionFactory<String, byte[]> regionFactory = cache.createRegionFactory();
>>>>>> >> regionFactory
>>>>>> >>     .setDiskStoreName("-ccio-store")
>>>>>> >>     .setDataPolicy(DataPolicy.PERSISTENT_PARTITION)
>>>>>> >>     .setOffHeap(false)
>>>>>> >>     .setCacheLoader(new AwsS3CacheLoader());
>>>>>> >>
>>>>>> >> Also, I've played with the off-heap-memory-size setting, setting it
>>>>>> >> to a small number like 20M to prevent Geode from taking too much
>>>>>> >> off-heap memory, but the result is the same.
>>>>>> >>
>>>>>> >> Do you have any other ideas about what I could do here? I'm stuck
>>>>>> >> at this point.
>>>>>> >>
>>>>>> >> Thank you,
>>>>>> >> Eugene
>>>>>> >
>>>>>> >
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
