Hi!


The docker cgroups limit (set via --memory) is set to 8G. Here's the
docker stats output of that container right now (it has been running for
around 16 hours now):

Okay, so you have about 2 gig to use outside the java heap. The docker image itself will use some space. The JVM will use memory (off the java heap) for compiler caches, threads, class metadata, mmapped files (_mmap is used by elastic_) and other things.
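If you want to see where some of that non-heap memory goes, a small sketch like the one below (purely illustrative, standard java.lang.management API, nothing elastic-specific) lists the JVM-managed non-heap pools from inside the process. Thread stacks and mmapped files will not show up there, but metaspace, the code cache and the compressed class space will:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;

public class NonHeapPools {
    public static void main(String[] args) {
        // Print the JVM-managed non-heap pools (Metaspace, Code Cache,
        // Compressed Class Space on HotSpot 8). Thread stacks and mmapped
        // files are outside these pools and will not appear here.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.NON_HEAP) {
                System.out.printf("%-25s used=%,d committed=%,d%n",
                        pool.getName(),
                        pool.getUsage().getUsed(),
                        pool.getUsage().getCommitted());
            }
        }
    }
}

Running the JVM with -XX:NativeMemoryTracking=summary and then asking it with "jcmd <pid> VM.native_memory summary" should give a fuller picture of the non-heap side (threads, GC structures, internal allocations), though still not of mmapped files.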

I cannot give you good guidance on how much this is, but a quick look at https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html suggests that you should give at most half of the memory to the java heap. If you give 6 gig to the heap, they seem to suggest 12 gig of memory (instead of 8); put the other way around, an 8 gig limit would point to a 4 gig heap.

Now, I do not know enough about elastic search to say whether this is sufficient, and that guide does not seem to mention docker. I would suggest that you try to find a configuration that also _limits_ elastic's use of the file cache. If elastic does not understand that it is running under docker, it _might_ use huge file caches. Try to limit the file caches and give the docker container x gig for the java heap, y gig for caches and some extra for slack.

Hope this helps.
Thanks, Leo



CONTAINER ID    NAME            CPU %     MEM USAGE / LIMIT   MEM %     NET I/O    BLOCK I/O          PIDS
da5ee21b1d      elasticsearch   94.62%    3.657GiB / 8GiB     45.71%    0B / 0B    1.21MB / 18.2GB    227
You will need _more_ than 6 gig of memory; exactly how much is hard to
say. You will probably find the limit faster if you run with Xms == Xmx
(e.g. -Xms6g -Xmx6g), so you may not need days of running the application.
That's interesting and useful - I'll give that a try. That will
certainly help speed up the investigation.



Notice the vast difference between the live and non-live instances of
the same class. This isn't just in one "snapshot". I have been
monitoring this for more than a day and this pattern continues. Taking
heap dumps and inspecting them with tools like VisualVM also shows that
these instances have "no GC root", and I have checked the GC log files
to confirm that the CMS collector does occasionally run. However, these
objects never seem to get collected.

What makes you believe they never get collected?
I have GC logs enabled on this setup. I notice that a Full GC gets
triggered once in a while. However, even after that Full GC completes, I
still see a vast amount of "non-live" objects staying around in the
heap (in the jmap -histo output as well as in a heap dump taken with
VisualVM). This is the "total" in the heap via jmap -histo:

            #instances   #bytes
Total      22880189     1644503392 (around 1.5 G)

and this is the total from -histo:live:

Total       1292631      102790184 (around 98M, not even 100M)


Some of these non-live objects hold on to ByteBuffer(s) which keep
filling up the non-heap memory too (right now the non-heap "mapped"
ByteBuffer memory, as shown in the JMX MBean, is around 2.5G). The Full GC
log message looks like this:

2018-11-24T00:57:00.665+0000: 59655.295: [Full GC (Heap Inspection Initiated GC)
  2018-11-24T00:57:00.665+0000: 59655.295: [CMS: 711842K->101527K(989632K), 0.53220752 secs] 1016723K->101527K(1986432K), [Metaspace: 48054K->48054K(1093632K)], 0.5325692 secs] [Times: user=0.53 sys=0.00, real=0.53 secs]
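As an aside on that "mapped" figure: it comes from the java.nio:type=BufferPool MBeans, and a small sketch like the one below (purely illustrative, standard BufferPoolMXBean API available since Java 7) reads the same "direct" and "mapped" pools from inside the JVM:

import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;

public class BufferPools {
    public static void main(String[] args) {
        // The "direct" and "mapped" pools reported here are the same
        // numbers exposed by the java.nio:type=BufferPool JMX MBeans.
        for (BufferPoolMXBean pool :
                ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            System.out.printf("%-8s count=%d used=%,d capacity=%,d%n",
                    pool.getName(), pool.getCount(),
                    pool.getMemoryUsed(), pool.getTotalCapacity());
        }
    }
}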

The jmap -heap output is this:

Server compiler detected.
JVM version is 25.172-b11

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
    MinHeapFreeRatio         = 40
    MaxHeapFreeRatio         = 70
    MaxHeapSize              = 6442450944 (6144.0MB)
    NewSize                  = 1134100480 (1081.5625MB)
    MaxNewSize               = 1134100480 (1081.5625MB)
    OldSize                  = 1013383168 (966.4375MB)
    NewRatio                 = 2
    SurvivorRatio            = 8
    MetaspaceSize            = 21807104 (20.796875MB)
    CompressedClassSpaceSize = 1073741824 (1024.0MB)
    MaxMetaspaceSize         = 17592186044415 MB
    G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
    capacity = 1020723200 (973.4375MB)
    used     = 900988112 (859.2492218017578MB)
    free     = 119735088 (114.18827819824219MB)
    88.26958297802969% used
Eden Space:
    capacity = 907345920 (865.3125MB)
    used     = 879277776 (838.5446319580078MB)
    free     = 28068144 (26.767868041992188MB)
    96.9065663512324% used
From Space:
    capacity = 113377280 (108.125MB)
    used     = 21710336 (20.70458984375MB)
    free     = 91666944 (87.42041015625MB)
    19.148753612716764% used
To Space:
    capacity = 113377280 (108.125MB)
    used     = 0 (0.0MB)
    free     = 113377280 (108.125MB)
    0.0% used
concurrent mark-sweep generation:
    capacity = 1013383168 (966.4375MB)
    used     = 338488264 (322.8075637817383MB)
    free     = 674894904 (643.6299362182617MB)
    33.40180443968061% used

14231 interned Strings occupying 2258280 bytes.

So based on these various logs and data, I conclude that they aren't being
collected. Even at the point when the process gets chosen to be killed, I
still see these vast numbers of non-live objects holding on. I don't have
much knowledge in the GC area, so a genuine question: is it just a wrong
expectation that whenever a Full GC runs it is supposed to clear out
the non-live objects? Or is it usual for the collector not to choose them
for collection for some reason?


I realize this data may not be enough to narrow down the issue, but what
I am looking for is some help/input/hints/suggestions on what I should be
trying in order to figure out why these instances aren't GCed. Is this
something that's expected in certain situations?

Being killed by the oom-killer suggests that your limit is too low
and/or your -Xmx is too large. If an increasing number of objects really
did not get collected, you would eventually get an OutOfMemoryError (not
a kill by the oom-killer). What is likely happening is that your java
heap slowly grows (not unusual with CMS, which does not compact the old
generation) and that the total memory consumed by your docker container
then exceeds your limit.
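If you want to watch that slow growth from inside the JVM, a minimal sketch like the one below (not from this thread, just the standard MemoryMXBean API) logs used versus committed heap. With CMS the heap is rarely shrunk once it has grown, so the committed figure is much closer to what the cgroup is actually paying for than the live-object count is:

import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapWatch {
    public static void main(String[] args) throws InterruptedException {
        // Log used vs. committed heap once a minute. "committed" is what
        // the JVM has actually claimed from the OS for the heap; once those
        // pages have been touched they generally stay resident, no matter
        // how few live objects are left after a Full GC.
        while (true) {
            MemoryUsage heap =
                    ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
            System.out.printf("heap used=%,d committed=%,d max=%,d%n",
                    heap.getUsed(), heap.getCommitted(), heap.getMax());
            Thread.sleep(60_000);
        }
    }
}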
You are right. The heap usage (taking into consideration the
uncollected non-live objects) does indeed grow very slowly. There aren't,
however, too many (live) objects on the heap really (even after days,
when the process is about to be killed). I'll have to read up on the CMS
compaction behaviour you mentioned.

Here's one example output from dmesg when this process was killed
previously:

[Wed Aug  8 03:21:13 2018] java invoked oom-killer: gfp_mask=0xd0, order=0, oom_score_adj=-600
[Wed Aug  8 03:21:13 2018] java cpuset=390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5 mems_allowed=0
[Wed Aug  8 03:21:13 2018] CPU: 7 PID: 15827 Comm: java Not tainted 3.10.0-693.5.2.el7.x86_64 #1
...
[Wed Aug  8 03:21:13 2018]  ffff88081472dee0 00000000fe941297 ffff8808062cfcb8 ffffffff816a3e51
[Wed Aug  8 03:21:13 2018]  ffff8808062cfd48 ffffffff8169f246 ffff88080d6d3500 0000000000000001
[Wed Aug  8 03:21:13 2018]  0000000000000000 0000000000000000 ffff8808062cfcf8 0000000000000046
[Wed Aug  8 03:21:13 2018] Call Trace:
[Wed Aug  8 03:21:13 2018]  [<ffffffff816a3e51>] dump_stack+0x19/0x1b
[Wed Aug  8 03:21:13 2018]  [<ffffffff8169f246>] dump_header+0x90/0x229
[Wed Aug  8 03:21:13 2018]  [<ffffffff81185ef6>] ? find_lock_task_mm+0x56/0xc0
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f14f8>] ? try_get_mem_cgroup_from_mm+0x28/0x60
[Wed Aug  8 03:21:13 2018]  [<ffffffff811863a4>] oom_kill_process+0x254/0x3d0
[Wed Aug  8 03:21:13 2018]  [<ffffffff812b7eec>] ? selinux_capable+0x1c/0x40
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f5216>] mem_cgroup_oom_synchronize+0x546/0x570
[Wed Aug  8 03:21:13 2018]  [<ffffffff811f4690>] ? mem_cgroup_charge_common+0xc0/0xc0
[Wed Aug  8 03:21:13 2018]  [<ffffffff81186c34>] pagefault_out_of_memory+0x14/0x90
[Wed Aug  8 03:21:13 2018]  [<ffffffff8169d60e>] mm_fault_error+0x68/0x12b
[Wed Aug  8 03:21:13 2018]  [<ffffffff816b02b1>] __do_page_fault+0x391/0x450
[Wed Aug  8 03:21:13 2018]  [<ffffffff816b03a5>] do_page_fault+0x35/0x90
[Wed Aug  8 03:21:13 2018]  [<ffffffff816ac5c8>] page_fault+0x28/0x30
[Wed Aug  8 03:21:13 2018] Task in /docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5 killed as a result of limit of /docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5
[Wed Aug  8 03:21:13 2018] memory: usage 8388608kB, limit 8388608kB, failcnt 426669
[Wed Aug  8 03:21:13 2018] memory+swap: usage 8388608kB, limit 8388608kB, failcnt 152
[Wed Aug  8 03:21:13 2018] kmem: usage 5773512kB, limit 9007199254740988kB, failcnt 0
[Wed Aug  8 03:21:13 2018] Memory cgroup stats for /docker/390dfe142d30d73b43a35996ae11b9647fc0598651c8959e95788edbbc9916b5: cache:14056KB rss:2601040KB rss_huge:2484224KB mapped_file:6428KB swap:0KB inactive_anon:0KB active_anon:142640KB inactive_file:4288KB active_file:3324KB unevictable:2464692KB
[Wed Aug  8 03:21:13 2018] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
[Wed Aug  8 03:21:13 2018] [15582]  1000 15582  3049827   654465    1446        0          -600 java
[Wed Aug  8 03:21:13 2018] Memory cgroup out of memory: Kill process 12625 (java) score 0 or sacrifice child
[Wed Aug  8 03:21:13 2018] Killed process 15582 (java) total-vm:12199308kB, anon-rss:2599320kB, file-rss:18540kB, shmem-rss:0kB


How big the limit should be is hard to tell, but it _must_ be larger
than your "-Xmx" (the JVM is using more memory than the java heap, so
this would be true even without the addition of docker).
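As one concrete source of that extra memory: the mapped ByteBuffers you mentioned earlier typically come from a pattern like the sketch below (illustrative only; the file path is made up and this is not elastic's actual code, though Lucene's mmap-based directory does something similar). The mapped region lives entirely outside the java heap, and on Java 8 it is only released when the buffer object itself is garbage collected, which is why buffers that linger on the heap keep off-heap memory alive.

import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MappedExample {
    public static void main(String[] args) throws Exception {
        // Illustrative only: "/tmp/example.dat" is a made-up path.
        try (RandomAccessFile file = new RandomAccessFile("/tmp/example.dat", "r");
             FileChannel channel = file.getChannel()) {
            // The mapping is native memory outside the java heap; the
            // MappedByteBuffer object on the heap is only a small handle.
            MappedByteBuffer buf =
                    channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
            System.out.println("mapped " + buf.capacity() + " bytes");
        }
        // Closing the channel does NOT unmap the region; on Java 8 the
        // mapping is released only when the buffer is garbage collected.
    }
}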
Agreed, and that's where the ByteBuffer(s) come into the picture
(around 2.5 GB right now and growing steadily).

-Jaikiran
_______________________________________________
hotspot-gc-use mailing list
hotspot-gc-use@openjdk.java.net
http://mail.openjdk.java.net/mailman/listinfo/hotspot-gc-use
