Hello. First of all, this is my stack:
- Ubuntu 22.04.3 on x86/64 with 2 GB of physical RAM, which has been enough for years.
- Java 11.0.20.1+1-post-Ubuntu-0ubuntu122.04 / openjdk 11.0.20.1 2023-08-24
- Tomcat 9.0.58 (JAVA_OPTS="-Djava.awt.headless=true -Xmx900m -Xms16m ......")
- My app, which I developed myself and which has been running without any OOM crashes for years.

A couple of weeks ago my website started crashing about every 5-7 days. Between crashes the RAM usage is fine and very steady (as it has been for years), and it uses only about 50% of the "Max memory" (according to the Tomcat Manager server status page). The 3 types of G1 heap are steady and low, and there are no leaks as far as I can tell. I haven't made any significant changes to my app in the last few months.

When my website crashes, I can see in the Ubuntu log that some process has invoked the "oom-killer", that the killer checks which process is using the most RAM, and that, since this is Tomcat/Java, it kills it. This is what I see in the log from an occasion when it was Nginx that invoked the OOM killer:

Nov 15 15:23:54 ip-172-31-89-211 kernel: [366008.597771] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=nginx.service,mems_allowed=0,global_oom,task_memcg=/system.slice/tomcat9.service,task=java,pid=470,uid=998
Nov 15 15:23:54 ip-172-31-89-211 kernel: [366008.597932] Out of memory: Killed process 470 (java) total-vm:4553056kB, anon-rss:1527944kB, file-rss:2872kB, shmem-rss:0kB, UID:998 pgtables:3628kB oom_score_adj:0

I would like to know what was happening inside the JVM when it was using so much RAM that it got killed:

- Was it a problem in Java not associated with Tomcat or my app?
- Was it Tomcat itself that ate too much RAM? I doubt it.
- Was it my application? If it was (and I have to assume it was), how and why was it using all that RAM?
- Which objects, threads, etc. were involved in the crash?
- Which part of the heap memory was using all that RAM?
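One thing I can already check from the numbers in the log is how the resident memory of the killed process compares with the heap limit. The snippet below is only a rough sanity check, not a diagnosis: it parses the "Killed process" line quoted above and subtracts the -Xmx900m cap from anon-rss. The function names are my own, and I am assuming the kernel's "kB" values are multiples of 1024 bytes. The result is just a lower bound on memory that cannot be Java heap (metaspace, thread stacks, direct buffers, native allocations, etc.); it does not say which of those is responsible.

```python
import re

# Kernel log line copied verbatim from the post above.
OOM_LINE = (
    "Out of memory: Killed process 470 (java) total-vm:4553056kB, "
    "anon-rss:1527944kB, file-rss:2872kB, shmem-rss:0kB, UID:998 "
    "pgtables:3628kB oom_score_adj:0"
)

XMX_MIB = 900  # from -Xmx900m in JAVA_OPTS


def parse_kb_fields(line):
    """Return every `name:<number>kB` field of the log line as {name: kB}."""
    return {
        name: int(value)
        for name, value in re.findall(r"([\w-]+):(\d+)kB", line)
    }


def rss_beyond_heap_mib(line, xmx_mib):
    """Lower bound, in MiB, on anonymous resident memory outside the Java heap.

    The heap can never exceed -Xmx, so anything above that cap must be
    non-heap memory of the JVM process.
    """
    anon_rss_mib = parse_kb_fields(line)["anon-rss"] / 1024
    return anon_rss_mib - xmx_mib


if __name__ == "__main__":
    fields = parse_kb_fields(OOM_LINE)
    print(f"anon-rss:    {fields['anon-rss'] / 1024:.0f} MiB")
    print(f"beyond -Xmx: {rss_beyond_heap_mib(OOM_LINE, XMX_MIB):.0f} MiB")
```

With the values from my log, anon-rss comes out at roughly 1492 MiB, so at least about 592 MiB of what the process held at the time of the kill was outside the 900 MiB heap limit.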
This can happen at any time, for example at 4 am, so I cannot run to the computer to see what is going on at that moment. I need some way to get a detailed log of what was going on when the crash took place.

So my question is: what tool should I use to investigate these crashes? I have started trying to get "New Relic" working, since it seems this service could help me, but I am having some problems setting it up, and I still don't know whether it would be a solution in the first place. So, while I struggle with New Relic, I would appreciate your suggestions.

Thanks in advance!