We've been running a WebSphere application for a few years on 4 WAS V5.0 application servers, each on a separate zLinux SLES8 31-bit z/VM guest. The virtual memory size of each guest was 1GB and the maximum heap size in the JVM was set to 768MB. These systems do little or no swapping and can run for days and days without being bounced.
We upgraded one of the 4 servers to WAS V6.0.2.17 and zLinux SLES9 SP3 64-bit. After running for 1 day we saw the virtual memory used by WAS increase to over 1.5GB, and the system began swapping heavily. Since the max JVM heap size was unchanged at 768MB, the memory being consumed appears to be native memory above and beyond that used by the Java application. The JVM heap itself seems fine; garbage collection is occurring at an acceptable frequency and duration.

Below is a memory comparison between the WAS5 and WAS6 systems as shown by the "top" command.

WAS5 TOP command display:
  PID   USER PR NI VIRT  RES  SHR S %CPU %MEM TIME+    COMMAND
  21610 root 17  0 540m  540m 72m S  9.6 53.6  0:02.78 java

WAS6 TOP command display:
  PID   USER PR NI VIRT  RES  SHR S %CPU %MEM TIME+    COMMAND
  4917  root 22  0 1595m 1.2g 13m S 45.3 84.9 36:44.80 java

Another thing that happened: eventually Java threads began to fail because of "too many open files". Our "ulimit -n" setting is at the default of 1024. Issuing the "lsof -p <was-pid>" command showed an accumulation of apparently unused sockets. There were over 500 of these by the time we rebooted the system:

java 4917 root 218u sock 0,4 31192 can't identify protocol
java 4917 root 220u sock 0,4 33379 can't identify protocol
java 4917 root 223u sock 0,4 10689 can't identify protocol
java 4917 root 273u sock 0,4 26577 can't identify protocol
java 4917 root 274u sock 0,4 62545 can't identify protocol
java 4917 root 276u sock 0,4 66073 can't identify protocol
java 4917 root 284u sock 0,4 69555 can't identify protocol

These sockets don't seem to be associated with TCP/IP, because the "netstat" command shows a normal list of open sockets. I don't know whether these sockets can account for such a huge memory increase, but they may help identify the cause. As a stopgap measure, until we figure out what's going on, we've increased the virtual memory size of the guest to 1.5GB and set "ulimit -n 8000" for WAS.
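One way to chase sockets like these is to cross-reference the inode numbers lsof reports (31192, 33379, ...) against the /proc/net tables: a socket fd whose inode appears in none of them is exactly what lsof prints as "can't identify protocol" (often a socket that was shut down but never close()d). The sketch below assumes a standard Linux /proc layout; PID 4917 is just the java PID from the top output above, substitute the current one.

```shell
#!/bin/sh
# Sketch: for a given PID, report the open-fd count and, for every
# socket fd, print its inode and which /proc/net table (tcp, udp,
# unix, raw) still claims it. Sockets matching no table ("NONE")
# correspond to lsof's "can't identify protocol" entries.
PID=${1:-4917}    # hypothetical default: the WAS java PID from top
echo "open fds: $(ls /proc/"$PID"/fd 2>/dev/null | wc -l)"
for FD in /proc/"$PID"/fd/*; do
    LINK=$(readlink "$FD" 2>/dev/null)
    case "$LINK" in
    socket:*)
        # links look like "socket:[31192]"; keep only the inode digits
        INODE=$(echo "$LINK" | tr -dc '0-9')
        TABLE=$(grep -l " $INODE " /proc/net/tcp /proc/net/tcp6 \
                /proc/net/udp /proc/net/unix /proc/net/raw 2>/dev/null)
        echo "fd=${FD##*/} inode=$INODE table=${TABLE:-NONE}"
        ;;
    esac
done
```

Run from cron every few minutes and diff the output; if the fd count and the NONE-table sockets grow in step, that would point at whatever code path is orphaning them.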
This way we can at least make it through a day and reboot the guest at night. We have a PMR open with IBM and need to collect heap dumps and malloc traces, but any assistance on how to go about determining what's using these sockets would be much appreciated.

Thank You,
Hank Calzaretta
Acxiom Corp, (630)944-0392

***************************************************************************
The information contained in this communication is confidential, is intended only for the use of the recipient named above, and may be legally privileged. If the reader of this message is not the intended recipient, you are hereby notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please resend this communication to the sender and delete the original message or any copy of it from your computer system. Thank You.
****************************************************************************

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
