Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
On 2/11/23 08:17, Thad Humphries wrote:
> Finally I profiled our Java utility with VisualVM, all on my Mac Mini,
> and quickly found a leak from java.util.zip. This was a surprise
> because we were not using java.util.zip anywhere, nor could I find any
> reference to java.util.zip when I looked at the code of the few open
> source libraries we were using.

The jar format is a zip file with a particular directory structure. Which means that unless the application has no jars and exists as ONLY extracted .class files, java.util.zip is going to be used behind the scenes. Interesting that it would create a continual memory leak, though.

I guess your evidence means that it is the jar file, not the files included in the jar, which are mmapped by Java.

Thanks,
Shawn

-
To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
For additional commands, e-mail: users-h...@tomcat.apache.org
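Shawn's point, that reading a jar is just reading a zip archive, can be shown in a few lines. This is a hypothetical illustration (the class and entry names are invented), not code from anyone's application:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class JarIsZip {
    // Build a tiny in-memory "jar": a zip archive holding a class-file path.
    static byte[] makeArchive() throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(buf)) {
            zos.putNextEntry(new ZipEntry("com/example/Hello.class"));
            zos.write(new byte[] {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE});
            zos.closeEntry();
        }
        return buf.toByteArray();
    }

    // Read it back with java.util.zip -- the same (partly native) machinery
    // the JVM exercises whenever it loads classes out of a jar on the
    // classpath, whether or not the application ever imports java.util.zip.
    static String firstEntryName(byte[] archive) throws IOException {
        try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            return zis.getNextEntry().getName();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(firstEntryName(makeArchive())); // prints com/example/Hello.class
    }
}
```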
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
A profiler has its place. VisualVM was vital in helping us solve a Java 8 memory leak last November (although it didn't involve Tomcat).

We were moving a large, long-running system from old Solaris machines with InformixDB to Intel hardware with SELinux and OracleDB. Part of the system's workflow involved a small Java utility that processed images from one step to another. That process had run on Oracle Java 7 and was now running on a Red Hat build of OpenJDK 8. In the new environment we observed a memory leak: the Java process grew in memory used (resident set size increased continually), and the overall memory available on the machines got lower and lower until the OS decided to kill one or more processes to free up some memory.

We pored over our code. Small though it was, I tried Java programs that were smaller yet--little more than connect and disconnect. We still saw a memory problem. Finally I profiled our Java utility with VisualVM, all on my Mac Mini, and quickly found a leak from java.util.zip. This was a surprise because we were not using java.util.zip anywhere, nor could I find any reference to java.util.zip when I looked at the code of the few open source libraries we were using.

What we did find was two online references to a native memory leak in java.util.zip in Java 8 (one from OpenJDK and one from Red Hat). Based on this, we took Red Hat's recommendation to upgrade to OpenJDK 11. Memory stabilized immediately. We might still be restarting servers at 1am without VisualVM helping us identify this leak. (We are also moving our other Java apps and our Tomcat installations to Java 11.)

On Fri, Feb 10, 2023 at 4:31 PM John Dale (DB2DOM) wrote:
> I've tried profilers in the past, but I've never had much luck since
> you need a supercomputer to run them. Human intelligence rules ...
> read the code carefully, review it, step it with a debugger, and look
> for memory leak patterns. Mine have mostly been static and non-static
> collections and HashMaps that keep growing, or rampant string creation
> wherein the GC couldn't keep up under load.
>
> Review the code ... become its master and empath. Memory leaks cause
> it pain.
>
> [...]

--
"Hell hath no limits, nor is circumscrib'd
In one self-place; but where we are is hell,
And where hell is, there must we ever be"
--Christopher Marlowe, *Doctor Faustus* (v. 111-13)
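For anyone chasing the same class of problem Thad describes: the usual shape of a java.util.zip native leak is a Deflater or Inflater that never has end() called, so its zlib buffers stay allocated outside the Java heap where no heap profiler will see them. A hedged sketch of the leak-prone pattern and its fix (illustrative only, not the code from Thad's utility):

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class ZipNativeMemory {
    // Each Deflater/Inflater wraps a native zlib stream. Those native buffers
    // are released only by end(); relying on finalization instead lets the
    // process RSS grow even while the Java heap itself looks perfectly healthy.
    static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] out = new byte[data.length * 2 + 64];
            int n = deflater.deflate(out);
            return Arrays.copyOf(out, n);
        } finally {
            deflater.end(); // release native memory deterministically
        }
    }

    static byte[] decompress(byte[] data, int originalLength) throws DataFormatException {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(data);
            byte[] out = new byte[originalLength];
            int n = inflater.inflate(out);
            return Arrays.copyOf(out, n);
        } finally {
            inflater.end(); // omitting this (or the one above) is the leak pattern
        }
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] input = "hello zlib hello zlib hello zlib".getBytes();
        byte[] roundTrip = decompress(compress(input), input.length);
        System.out.println(new String(roundTrip)); // prints hello zlib hello zlib hello zlib
    }
}
```

In Thad's case the leak was inside the JDK itself, so no application-side fix like this would have helped; upgrading the JDK was the right call.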
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
I've tried profilers in the past, but I've never had much luck since you need a supercomputer to run them. Human intelligence rules ... read the code carefully, review it, step it with a debugger, and look for memory leak patterns. Mine have mostly been static and non-static collections and HashMaps that keep growing, or rampant string creation wherein the GC couldn't keep up under load.

Review the code ... become its master and empath. Memory leaks cause it pain.

On 2/9/23, James H. H. Lampert wrote:
> I've obtained some heap and CPU numbers, taking data at 15-minute
> intervals, heap from WRKJVMJOB and CPU from WRKACTJOB. In two days of
> this, I didn't witness any crashes; I did witness a near-miss, in which
> heap-in-use hit 5011.938M (out of 5120).
>
> In discussion with our webapp developer (to whom we sent a catalina.out
> excerpt), he observed that they were running Tomcat on a six-year-old
> JVM (it identifies in WRKJVMJOB as "1.8.0_151"; on the Manager page,
> it identifies as "8.0.5.5 - pap6480sr5fp5-20171114_01(SR5 FP5)") with a
> known issue (on GitHub, it's listed as 11493). He suggested that the
> customer ought to try updating to a more recent Java.
>
> I've also asked on the IBM Midrange Java List whether we can go any
> higher on the heap parameters (currently set at -Xms 4096 -Xmx 5120 for
> that particular installation).
>
> --
> JHHL
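The "collections that keep growing" pattern John mentions can be made concrete. The class and field names below are invented for illustration; the bounded variant is one common fix, not the only one:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class LeakPatterns {
    // The leak pattern: a static map used as a cache that only ever grows,
    // pinning every key and value for the life of the JVM. The GC can never
    // reclaim any of it because the map itself stays strongly reachable.
    static final Map<String, byte[]> UNBOUNDED_CACHE = new HashMap<>();

    static byte[] lookup(String key) {
        return UNBOUNDED_CACHE.computeIfAbsent(key, k -> new byte[1024]); // grows forever
    }

    // One common fix: bound the cache with LRU eviction via LinkedHashMap's
    // access-order mode and removeEldestEntry hook.
    static final int MAX_ENTRIES = 1000;
    static final Map<String, byte[]> BOUNDED =
        new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > MAX_ENTRIES; // evict least-recently-used entry
            }
        };

    public static void main(String[] args) {
        for (int i = 0; i < 2000; i++) {
            BOUNDED.put("key-" + i, new byte[16]);
        }
        System.out.println(BOUNDED.size()); // prints 1000
    }
}
```

In a heap dump or profiler, this pattern shows up as one map retaining an ever-larger share of the heap, which is exactly what makes it findable even without a supercomputer.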
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Naturally, I thought about this about 5 seconds after I clicked "Send":

It doesn't happen very often, and it usually happens *after* a substantial portion of the heap has been idle for some time. Maybe there's something in there that works somewhat like a disk defragmenter: when it finds a big chunk of heap that's been idle for long enough, and that can be returned to the OS with the heap remaining contiguous, it gets returned.

--
JHHL
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
> It would be unusual for the OS to reclaim any of that memory from the
> JVM process. Are you looking at OS heap usage, or "JVM heap" usage?
> From your description above, it's tough to tell. The tool is called
> WRKJVMJOB, so presumably it knows what the heck a JVM is, so maybe you
> were getting the exact numbers I was recommending.

> TL;DR: I have seen Java return memory to the OS.

I was frankly a little shocked myself when I first saw it yesterday. I was always taught that Java heap, once allocated, remained allocated. But there are a lot of things extraordinary about IBM Midrange boxes.

--
JHHL
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Shawn,

On 2/9/23 17:18, Shawn Heisey wrote:
> On 2/9/23 12:54, Christopher Schultz wrote:
>> It would be unusual for the OS to reclaim any of that memory from the
>> JVM process. Are you looking at OS heap usage, or "JVM heap" usage?
>> From your description above, it's tough to tell. The tool is called
>> WRKJVMJOB, so presumably it knows what the heck a JVM is, so maybe
>> you were getting the exact numbers I was recommending.
>
> TL;DR: I have seen Java return memory to the OS.

Oh, that's interesting. I'm somewhat happy to see that happening, honestly. On the other hand, if the JVM allocates heap, it should probably hold on to it; it's likely to need it later.

> On the Oracle JVM, I think that happens with G1GC when -Xms is less
> than -Xmx.

That would make sense. I typically run server processes where Xms == Xmx.

> I think G1 is the only collector in Java 8 that does this, but I can't
> say that for sure. I don't see anything in the thread about precise
> JVM options. Someone mentioned the J9 JVM, and I don't know if that
> behaves the same as Oracle.

I'm sure in this case, it's some wily IBM JVM version, since this is AS/400.

-chris
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
On 2/9/23 12:54, Christopher Schultz wrote:
> It would be unusual for the OS to reclaim any of that memory from the
> JVM process. Are you looking at OS heap usage, or "JVM heap" usage?
> From your description above, it's tough to tell. The tool is called
> WRKJVMJOB, so presumably it knows what the heck a JVM is, so maybe you
> were getting the exact numbers I was recommending.

TL;DR: I have seen Java return memory to the OS. On the Oracle JVM, I think that happens with G1GC when -Xms is less than -Xmx. I think G1 is the only collector in Java 8 that does this, but I can't say that for sure. I don't see anything in the thread about precise JVM options. Someone mentioned the J9 JVM, and I don't know if that behaves the same as Oracle.

Thanks,
Shawn
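One way to watch this behavior from inside the process (rather than from the OS) is the standard MemoryMXBean, which distinguishes "used" (live objects plus garbage inside the heap) from "committed" (what the JVM has actually reserved from the OS). A shrinking "committed" value is the JVM handing memory back. A small sketch; it works on HotSpot, and I believe the same API is available on J9, but that part is an assumption:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapCommitted {
    public static void main(String[] args) {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = mem.getHeapMemoryUsage();
        // used <= committed <= max always holds. A collector that returns
        // memory to the OS (e.g. G1 with -Xms < -Xmx) shrinks "committed"
        // over time; a collector that doesn't will keep it at its high-water mark.
        System.out.printf("used=%dM committed=%dM max=%dM%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
    }
}
```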
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
James,

On 2/7/23 20:35, James H. H. Lampert wrote:
> Monitored the thing all day, taking the CPU usage (via WRKACTJOB) and
> the current heap size and heap-in-use (via option 5 of WRKJVMJOB)
> every 15 minutes. Heap size was 4925.375M (out of a maximum of 5120M)
> at 08:45, and the OS took heap away over the course of the day, until
> it was down to 4352.0M at 17:30. Heap in use fluctuated; from 11:45 to
> 13:00 it steadily climbed from 2575.382M to 3668.079M, then by 13:30
> it had dropped to 2169.834M. Job CPU hit 35.5% at 09:00; that was the
> highest I saw it all day.

It would be unusual for the OS to reclaim any of that memory from the JVM process. Are you looking at OS heap usage, or "JVM heap" usage? From your description above, it's tough to tell. The tool is called WRKJVMJOB, so presumably it knows what the heck a JVM is, so maybe you were getting the exact numbers I was recommending.

Are you able to graph them and eyeball the steady-state? What you describe briefly seems healthy. Not sure about the 35% CPU usage; you might just have witnessed a spike. Usually if the GC is thrashing, you'll see a CPU pegged for a while.

> That climb to a 3668.079M peak (and I think it went higher, but not on
> the quarter-hour) had me wondering if I was going to witness a crash.

... and did it crash? :)

-chris
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
I've obtained some heap and CPU numbers, taking data at 15-minute intervals, heap from WRKJVMJOB and CPU from WRKACTJOB. In two days of this, I didn't witness any crashes; I did witness a near-miss, in which heap-in-use hit 5011.938M (out of 5120).

In discussion with our webapp developer (to whom we sent a catalina.out excerpt), he observed that they were running Tomcat on a six-year-old JVM (it identifies in WRKJVMJOB as "1.8.0_151"; on the Manager page, it identifies as "8.0.5.5 - pap6480sr5fp5-20171114_01(SR5 FP5)") with a known issue (on GitHub, it's listed as 11493). He suggested that the customer ought to try updating to a more recent Java.

I've also asked on the IBM Midrange Java List whether we can go any higher on the heap parameters (currently set at -Xms 4096 -Xmx 5120 for that particular installation).

--
JHHL
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Monitored the thing all day, taking the CPU usage (via WRKACTJOB) and the current heap size and heap-in-use (via option 5 of WRKJVMJOB) every 15 minutes.

Heap size was 4925.375M (out of a maximum of 5120M) at 08:45, and the OS took heap away over the course of the day, until it was down to 4352.0M at 17:30. Heap in use fluctuated; from 11:45 to 13:00 it steadily climbed from 2575.382M to 3668.079M, then by 13:30 it had dropped to 2169.834M. Job CPU hit 35.5% at 09:00; that was the highest I saw it all day.

That climb to a 3668.079M peak (and I think it went higher, but not on the quarter-hour) had me wondering if I was going to witness a crash.

--
JHHL
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Thomas, James,

On 2/6/23 17:00, Thomas Hoffmann (Speed4Trade GmbH) wrote:
> Hello James,
>
>> -----Original Message-----
>> From: James H. H. Lampert
>> Sent: Monday, February 6, 2023 18:18
>> To: Tomcat Users List
>> Subject: Re: AW: Having trouble with Tomcat crashes. Interesting
>> memory numbers in Manager
>>
>> Thanks, Herr Hoffmann. Your questions were most helpful in
>> determining what information to gather and share. And thanks in
>> advance to anybody else who has any insights.
>>
>> First, I will note that the seemingly non-sequitur nursery-survivor
>> numbers aren't just what we see during a crash; they're what we see
>> when it's running normally.
>>
>> On 2/4/23 6:13 AM, Thomas Hoffmann (Speed4Trade GmbH) wrote:
>>> Could you describe "crash" in a bit more detail?
>>
>> Typically, the signed-on users start to get degraded response times,
>> before it becomes completely unresponsive.
>>
>>> - does the tomcat / java process run but is unresponsive?
>>
>> Yes. Exactly. And shutting it down (and therefore freeing up the port
>> for a restart) takes a fairly sizeable amount of time, and leaves a
>> core dump of approximately 6G, a Javacore dump of approximately 4M,
>> and a JIT dump of approximately 20M.
>>
>>> - does the java process crash itself (then there should be a logfile
>>> written)?
>>
>> The job does not generally terminate itself, or even respond to a
>> shutdown request; it has to be forcibly terminated (given that it's
>> running on an AS/400, this would typically be either from WRKACTJOB,
>> or from an ENDJOB command, or from their GUI console equivalents).
>>
>> This may be relevant: even when it is not in this state, the Tomcat
>> server, when being shut down, tends not to respond readily to
>> shutdown requests.
>>
>>> - Is there any OOM message in the logfiles?
>>
>> Not out-of-memory, but there are chronic problems with contacting
>> outside web services (many of them involving OAuth2), and with BIRT
>> reporting.
>>
>> Around the time of the shutdown, I typically see stuff like:
>> Unhandled exception
>> Type=Segmentation error vmState=0x
>> J9Generic_Signal_Number=0004 Signal_Number=000b
>> Error_Value= Signal_Code=0032
>>
>> I am not sure whether this is going into catalina.out before or after
>> the job is forcibly terminated.
>>
>>> - Is the process still alive but CPU at 100% ?
>>
>> Yes.
>>
>> We just had a near-miss as I was typing this: CPU pushing up into the
>> high 80s, and the JVM job for Tomcat eating up most of it, but it
>> backed down to something more normal without my having to intervene,
>> and without any sign of anybody else intervening.
>>
>> One of my colleagues managed to get into Manager during the
>> near-miss, and took a screen-shot. The "nursery-allocate" Used was at
>> 400.97M (34%), "nursery-survivor" as I described last week,
>> "tenured-LOA" Used was at zero, and "tenured-SOA" was showing Initial
>> 2918.40M, Total 3648.00M, Maximum 4864.00M, and Used 1997.72M (41%).
>>
>> --
>> JHHL
>
> The observations look like Java is running out of memory and the
> garbage collector can't keep up with making memory free again. Either
> the GC uses 100% or the application has some CPU-intensive procedures.
> I would guess that it's the GC.
>
> One option would be to open a JMX port on Tomcat and use VisualVM to
> connect to the Java process and inspect the memory and GC usage. When
> the process is eating 100% CPU, you might also consider generating a
> thread dump (kill -3) and checking if there are any suspicious threads
> running.
>
> Also setting the Java options HeapDumpOnOutOfMemoryError and
> HeapDumpPath might help, if the process stops because of OOM. If the
> GC can always free some bytes again, which the application is
> instantly eating again, an OOM might not occur.
>
> You can also add parameters to log some GC statistics, but I never
> used that: https://sematext.com/blog/java-garbage-collection-logs/

I'm not ready to blame this on GC /yet/, but Thomas's recommendations align with mine: enable the memory-error-related options and take thread dumps. They may be long, but several of them over time (even a few seconds or minutes apart) can really shed light on what's going on.

If you can also get some heap numbers from the running JVM over time, that would be very helpful. There are various ways of doing this; I'm not sure what's most convenient in your environment. One way is to run the 'jmap' command. This is how it's done on *NIX and Windows systems:

$ jmap -heap [pid]

That will spit out a human-readable set of heap-size/usage numbers. You may be able to copy/paste that into a spreadsheet or script the parsing of those dumps over time. The idea is to get a heap-space graph so you can see what's happening to the heaps over time. The ideal graph looks like a nice saw-tooth shape where the usage goes up steadily and then abruptly drops back down each time the GC runs.
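If jmap isn't convenient on the AS/400, a similar time series can be collected from inside the JVM itself with java.lang.management. This is a sketch, not a drop-in tool: the sample count and interval below are placeholders (the thread was sampling every 15 minutes), and you'd run it in, or attach it to, the server process:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class HeapSampler {
    public static void main(String[] args) throws InterruptedException {
        MemoryMXBean mem = ManagementFactory.getMemoryMXBean();
        // CSV output, ready to paste into a spreadsheet and graph. A healthy
        // heap shows a saw-tooth in used_mb; a leak shows the troughs of the
        // saw-tooth creeping steadily upward over hours or days.
        System.out.println("timestamp,used_mb,committed_mb");
        for (int i = 0; i < 3; i++) {       // placeholder: run far longer in practice
            MemoryUsage heap = mem.getHeapMemoryUsage();
            System.out.printf("%d,%d,%d%n", System.currentTimeMillis(),
                    heap.getUsed() >> 20, heap.getCommitted() >> 20);
            Thread.sleep(1000);             // placeholder interval
        }
    }
}
```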
Re: AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Hi,

> Hello James,
>
>> -----Original Message-----
>> From: James H. H. Lampert
>> Sent: Monday, February 6, 2023 18:18
>> To: Tomcat Users List
>> Subject: Re: AW: Having trouble with Tomcat crashes. Interesting
>> memory numbers in Manager
>>
>> [...]
>
> The observations look like Java is running out of memory and the
> garbage collector can't keep up with making memory free again. Either
> the GC uses 100% or the application has some CPU-intensive procedures.
> I would guess that it's the GC.
>
> One option would be to open a JMX port on Tomcat and use VisualVM to
> connect to the Java process and inspect the memory and GC usage. When
> the process is eating 100% CPU, you might also consider generating a
> thread dump (kill -3) and checking if there are any suspicious threads
> running.
>
> Also setting the Java options HeapDumpOnOutOfMemoryError and
> HeapDumpPath might help, if the process stops because of OOM. If the
> GC can always free some bytes again, which the application is
> instantly eating again, an OOM might not occur.
>
> You can also add parameters to log some GC statistics, but I never
> used that: https://sematext.com/blog/java-garbage-collection-logs/
>
> Greetings,
> Thomas

I agree here, just one little thing to also keep in mind: I guess the Java VM here is the IBM J9 flavor, which can be a little bit different from the Sun flavor, so configuration options may differ in their details.

Regards,
Simon
AW: AW: Having trouble with Tomcat crashes. Interesting memory numbers in Manager
Hello James,

> -----Original Message-----
> From: James H. H. Lampert
> Sent: Monday, February 6, 2023 18:18
> To: Tomcat Users List
> Subject: Re: AW: Having trouble with Tomcat crashes. Interesting
> memory numbers in Manager
>
> Thanks, Herr Hoffmann. Your questions were most helpful in determining
> what information to gather and share. And thanks in advance to anybody
> else who has any insights.
>
> [...]

The observations look like Java is running out of memory and the garbage collector can't keep up with making memory free again. Either the GC uses 100% or the application has some CPU-intensive procedures. I would guess that it's the GC.

One option would be to open a JMX port on Tomcat and use VisualVM to connect to the Java process and inspect the memory and GC usage. When the process is eating 100% CPU, you might also consider generating a thread dump (kill -3) and checking if there are any suspicious threads running.

Also setting the Java options HeapDumpOnOutOfMemoryError and HeapDumpPath might help, if the process stops because of OOM. If the GC can always free some bytes again, which the application is instantly eating again, an OOM might not occur.

You can also add parameters to log some GC statistics, but I never used that: https://sematext.com/blog/java-garbage-collection-logs/

Greetings,
Thomas
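Besides GC log files, much of the same information is exposed at runtime through GarbageCollectorMXBean, which is also reachable remotely over JMX and so fits the VisualVM suggestion above. A minimal sketch; note that the collector bean names differ between HotSpot and IBM J9, so don't hard-code them:

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcStats {
    public static void main(String[] args) {
        // Per-collector collection counts and cumulative pause time, queried
        // from inside the JVM -- a quick check when full GC logging is not
        // enabled. A rapidly climbing time_ms suggests the GC is thrashing.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d time_ms=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```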