Hello Dev Team,

First, apologies if this is not the correct forum for this query; if so, please point me to the right one. We are facing the following issue, and I think it will be of interest to the dev team as well.
We have a Java application (HBase) that writes data into an in-heap data structure and, once a threshold is reached, flushes the data to files and releases the memory. We have capped the heap memory for this data structure at ~2 GB. We have allocated another ~2 GB to a second data structure that serves as a cache for data read from disk before it is sent back to clients. We also have ~1 GB for the young generation (-Xmn). So the JVM settings are -Xmn1024m -Xms5120m -Xmx5120m.

What we see is that when the number of clients writing to HBase is high, the HBase JVM dies (or is killed by the OS) on one particular Linux kernel version (3.2.0-70-generic). There are no entries in the OS logs (syslog, kern.log or dmesg), and the application has code to handle OOM, but that is not logging anything either. Consistently, though, the last activity we see is GC running, and from the GC log we can see that no memory is being released.

The following is a sample of the GC log at the time the application dies. Since we are exercising only writes and not reads, only about half of the old gen (~2 GB) should be in use at any time, because the application releases the memory after flushing. The GC log from the same workload on a different Linux version confirms that memory usage stays below 2 GB there. Any pointers on how to debug this further, or details about any known issues relevant to this problem, would be much appreciated.

2015-07-24T12:21:56.861-0400: 2011.823: [Full GC2015-07-24T12:21:56.861-0400: 2011.823: [CMS2015-07-24T12:21:57.682-0400: 2012.643: [CMS-concurrent-mark: 0.818/0.825 secs] [Times: user=3.30 sys=0.00, real=0.82 secs]
 (concurrent mode failure): 4194256K->4194228K(4194304K), 4.4017900 secs] 4258987K->4257955K(5138048K), [CMS Perm : 42597K->42597K(71012K)], 4.4020310 secs] [Times: user=6.86 sys=0.00, real=4.40 secs]
Heap after GC invocations=418 (full 227):
 par new generation   total 943744K, used 63727K [0x00000006b5a00000, 0x00000006f5a00000, 0x00000006f5a00000)
  eden space 838912K,   7% used [0x00000006b5a00000, 0x00000006b983bc58, 0x00000006e8d40000)
  from space 104832K,   0% used [0x00000006e8d40000, 0x00000006e8d40000, 0x00000006ef3a0000)
  to   space 104832K,   0% used [0x00000006ef3a0000, 0x00000006ef3a0000, 0x00000006f5a00000)
 concurrent mark-sweep generation total 4194304K, used 4194228K [0x00000006f5a00000, 0x00000007f5a00000, 0x00000007f5a00000)
 concurrent-mark-sweep perm gen total 71012K, used 42597K [0x00000007f5a00000, 0x00000007f9f59000, 0x0000000800000000)
}
{Heap before GC invocations=418 (full 227):
 par new generation   total 943744K, used 63727K [0x00000006b5a00000, 0x00000006f5a00000, 0x00000006f5a00000)
  eden space 838912K,   7% used [0x00000006b5a00000, 0x00000006b983bc58, 0x00000006e8d40000)
  from space 104832K,   0% used [0x00000006e8d40000, 0x00000006e8d40000, 0x00000006ef3a0000)
  to   space 104832K,   0% used [0x00000006ef3a0000, 0x00000006ef3a0000, 0x00000006f5a00000)
 concurrent mark-sweep generation total 4194304K, used 4194228K [0x00000006f5a00000, 0x00000007f5a00000, 0x00000007f5a00000)
 concurrent-mark-sweep perm gen total 71012K, used 42597K [0x00000007f5a00000, 0x00000007f9f59000, 0x0000000800000000)

Thanks,
Biju
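P.S. In case it helps anyone reproduce this: the heap settings above are passed to the HBase JVM via HBASE_OPTS in hbase-env.sh, roughly as below. The GC logging flags are the ones that produce the "Heap before/after GC" output shown; the log path is only a placeholder, and the exact line in our environment may differ slightly.

    export HBASE_OPTS="$HBASE_OPTS -Xmn1024m -Xms5120m -Xmx5120m \
        -XX:+UseConcMarkSweepGC \
        -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintHeapAtGC \
        -Xloggc:/path/to/gc-hbase.log"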