00:
00: CP Q SRM
00: IABIAS : INTENSITY=90%; DURATION=2
00: LDUBUF : Q1=100% Q2=75% Q3=60%
00: STORBUF: Q1=300% Q2=250% Q3=200%
00: DSPBUF : Q1=32767 Q2=32767 Q3=32767
00: DISPATCHING MINOR TIMESLICE = 5 MS
00: MAXWSS : LIMIT=9999%
00: ...... : PAGES=999999
00: XSTORE : 0%
00:
00: CP Q ALLOC PAGE
                EXTENT     EXTENT  TOTAL  PAGES   HIGH    %
VOLID  RDEV      START        END  PAGES IN USE   PAGE USED
------ ---- ---------- ---------- ------ ------ ------ ----
VM6PG1 9F86          1      10016  1761K      0      0   0%
VM6PG2 9F87          0          0    180     12     12   6%
                                  ------ ------        ----
SUMMARY                            1761K     12          1%
USABLE                             1761K     12          1%
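
For anyone reading the Q ALLOC PAGE numbers above, here is a rough sketch of how the TOTAL PAGES column relates to the cylinder extents. It assumes 3390-type ECKD DASD at 180 4K pages per cylinder, which is consistent with the one-cylinder VM6PG2 extent showing 180 pages; the function names are made up for illustration.

```python
# Assumption: 3390 ECKD geometry, 180 4K pages per cylinder.
PAGES_PER_CYLINDER = 180
PAGE_BYTES = 4096


def extent_pages(start_cyl: int, end_cyl: int) -> int:
    """Pages in a paging extent given inclusive start/end cylinders."""
    return (end_cyl - start_cyl + 1) * PAGES_PER_CYLINDER


def pages_to_gib(pages: int) -> float:
    """Convert a count of 4K pages to GiB."""
    return pages * PAGE_BYTES / 2**30


# Extents taken from the Q ALLOC PAGE output above.
extents = {"VM6PG1": (1, 10016), "VM6PG2": (0, 0)}

for volid, (start, end) in extents.items():
    pages = extent_pages(start, end)
    print(f"{volid}: {pages} pages, {pages_to_gib(pages):.2f} GiB")
```

On these figures VM6PG1 works out to a little under 7 GiB of paging space, against the 15G of storage shown in Q ALL.
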
On Mon, Jul 26, 2010 at 11:13 AM, Marcy Cortes <[email protected]> wrote:
> Can you post the results of
>
> Q SRM
>
> Q ALLOC PAGE
>
>
>
>
> Marcy
>
> ________________________________
>
> From: The IBM z/VM Operating System [mailto:[email protected]] On
> Behalf Of Daniel Tate
> Sent: Monday, July 26, 2010 8:31 AM
> To: [email protected]
> Subject: [IBMVM] Linux on Z/VM running WAS problems - anyone got any tips?
>
>
> I apologize for this not being a "direct" z/VM question.
>
> I've posted to the Linux-390 group to get the Linux POV, but to explore all
> angles I am also trying to find out whether there is anything I can set or do
> from z/VM that would help the situation. I'd like it to "finish" the
> scroll, but I'm not sure how to do that short of taping down Control-C (I'm
> using c3270). A CP Q ALL is at the bottom of all this mess.
>
> The e-mail sent to Linux-s390 for reference:
>
> We're running WebSphere on a z9 under z/VM; 4 systems out of 8 are live.
> It is running apps that consume around 16GB of memory on a Windows machine.
> On this system we have allocated 10G of real storage (RAM) and around 35GB of
> swap. When WebSphere starts, it eventually consumes all the memory and
> halts the system, but does not panic it. We are running 64-bit. I'm a z/VM
> novice, so I don't know what to do.
>
> Here is some information from our WAS Admin:
> "We are running WebSphere 6.1.0.25 with FP EJB3.0, Webservices and Web 2.0
> installed. There are two nodes running 14 application servers each. There
> are currently 32 applications installed but not currently running. No
> security has been enabled for WebSphere at this time."
>
>
> At this point I see two problems:
>
> 1) Why is the OOM killer not functioning properly?
> 2) Why is WebSphere performance so awful?
>
> and I have two questions:
>
> 1) Does anyone have any PRACTICAL experience/tips for optimizing SLES11 on
> z/VM? So far we've been using dated case studies and Redbooks that seem to
> be filled with inaccuracies or outdated information.
> 2) Is there any way to force a core dump via CP, like you can with the
> magic SysRq?
>
> All systems are running the same release and patch level:
>
> [root] bwzld001:~# lsb_release -a
> LSB Version:
>
> core-2.0-noarch:core-3.2-noarch:core-4.0-noarch:core-2.0-s390x:core-3.2-s390x:core-4.0-s390x:desktop-4.0-noarch:desktop-4.0-s390:desktop-4.0-s390x:graphics-2.0-noarch:graphics-2.0-s390:graphics-2.0-s390x:graphics-3.2-noarch:graphics-3.2-s390:graphics-3.2-s390x:graphics-4.0-noarch:graphics-4.0-s390:graphics-4.0-s390x
> Distributor ID: SUSE LINUX
> Description: SUSE Linux Enterprise Server 11 (s390x)
> Release: 11
> Codename: n/a
>
>
> Here is a partial top shortly before system death:
>
> top - 08:13:14 up 2 days, 16:08, 2 users, load average: 51.47, 22.20, 10.25
> Tasks: 129 total, 4 running, 125 sleeping, 0 stopped, 0 zombie
> Cpu(s): 16.7%us, 81.5%sy, 0.0%ni, 0.0%id, 0.0%wa, 0.3%hi, 0.3%si, 1.2%st
> Mem: 10268344k total, 10220568k used, 47776k free, 548k buffers
> Swap: 35764956k total, 35764956k used, 0k free, 56340k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
>
> 26850 wasadmin 20 0 1506m 253m 2860 S 18 2.5 16:06.28 java
> 29870 wasadmin 20 0 1497m 279m 2560 S 15 2.8 15:41.13 java
> 24607 wasadmin 20 0 1502m 223m 2760 S 13 2.2 16:15.14 java
> 24641 wasadmin 20 0 7229m 1.3g 3172 S 13 13.1 196:35.52 java
> 26606 wasadmin 20 0 1438m 272m 6212 S 12 2.7 16:02.77 java
> 27600 wasadmin 20 0 1553m 258m 2920 S 12 2.6 15:46.57 java
> 24638 wasadmin 20 0 7368m 1.3g 24m S 10 13.7 206:02.05 java
> 25609 wasadmin 20 0 1528m 219m 2540 S 9 2.2 16:07.33 java
> 30258 wasadmin 20 0 1515m 249m 2592 S 7 2.5 15:49.79 java
> 25780 wasadmin 20 0 1604m 277m 2332 S 6 2.8 16:31.41 java
> 27106 wasadmin 20 0 1458m 273m 2472 S 6 2.7 15:59.13 java
> 27336 wasadmin 20 0 1528m 238m 2540 S 5 2.4 15:38.82 java
> 29164 wasadmin 20 0 1527m 224m 2608 S 5 2.2 16:02.56 java
> 31400 wasadmin 20 0 1509m 259m 2468 S 5 2.6 15:26.38 java
> 25244 wasadmin 20 0 1509m 290m 2624 S 5 2.9 16:16.07 java
> 24769 wasadmin 20 0 1409m 259m 2308 S 5 2.6 16:08.12 java
> 28796 wasadmin 20 0 1338m 263m 3076 S 4 2.6 15:47.72 java
> 26185 wasadmin 20 0 1493m 274m 2304 S 2 2.7 16:01.97 java
> 25968 wasadmin 20 0 1427m 257m 2532 S 1 2.6 15:51.50 java
> 29495 wasadmin 20 0 1466m 259m 2260 S 1 2.6 15:31.82 java
> 25080 wasadmin 20 0 1445m 236m 2472 S 0 2.4 15:53.19 java
> 26410 wasadmin 20 0 1475m 271m 2540 S 0 2.7 15:52.48 java
> 31027 wasadmin 20 0 1413m 238m 2492 S 0 2.4 15:29.78 java
> 3695 wasadmin 20 0 9968 1352 1352 S 0 0.0 0:00.13 bash
> 24474 wasadmin 20 0 1468m 205m 2472 S 0 2.0 16:03.63 java
> 24920 wasadmin 20 0 1522m 263m 2616 S 0 2.6 16:06.29 java
> 25422 wasadmin 20 0 1584m 229m 2284 S 0 2.3 16:02.18 java
> 27892 wasadmin 20 0 1414m 263m 2648 S 0 2.6 15:45.96 java
> 28184 wasadmin 20 0 1523m 241m 2320 S 0 2.4 15:42.21 java
> 28486 wasadmin 20 0 1450m 231m 2288 S 0 2.3 15:46.53 java
> 30625 wasadmin 20 0 1477m 251m 3024 S 0 2.5 15:44.80 java
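
To compare the java processes against the Mem total, the RES column of a top listing like the one above can be summed with a small script. This is only a sketch: it assumes the default column order shown (PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND) and top's usual suffixes ("m", "g", bare KiB); the helper names are hypothetical and the three sample rows are copied from the listing.

```python
def to_mib(field: str) -> float:
    """Convert a top memory field ('253m', '1.3g', or bare KiB) to MiB."""
    units = {"m": 1.0, "g": 1024.0}
    suffix = field[-1].lower()
    if suffix in units:
        return float(field[:-1]) * units[suffix]
    return float(field) / 1024.0  # bare numbers in top are KiB


def java_resident_mib(top_rows):
    """Sum the RES column (index 5) over rows whose COMMAND is java."""
    total = 0.0
    for row in top_rows:
        cols = row.split()
        if len(cols) >= 12 and cols[11] == "java":
            total += to_mib(cols[5])
    return total


# Three rows copied from the listing above; paste the rest to get a full total.
rows = [
    "26850 wasadmin 20 0 1506m 253m 2860 S 18 2.5 16:06.28 java",
    "24641 wasadmin 20 0 7229m 1.3g 3172 S 13 13.1 196:35.52 java",
    "3695 wasadmin 20 0 9968 1352 1352 S 0 0.0 0:00.13 bash",
]
print(f"java RES total: {java_resident_mib(rows):.1f} MiB")
```

The same loop over column 4 (VIRT) gives a rough idea of how much address space the JVMs have reserved compared with the 10G of RAM plus 35GB of swap.
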
>
> -----------------
>
>
> Here are a few screen grabs from the 3270 console session:
>
> Unless you get a _continuous_flood_ of these messages it means
> everything is working fine. Allocations from irqs cannot be
> perfectly reliable and the kernel is designed to handle that.
> java: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400040
> CPU: 1 Not tainted 2.6.27.45-0.1-default #1
> Process java (pid: 28831, task: 00000001ab64c638, ksp: 0000000215bbb5e0)
> 0000000000000000 000000027fbcf7b0 0000000000000002 0000000000000000
> 000000027fbcf850 000000027fbcf7c8 000000027fbcf7c8 00000000003b6696
> 00000000014a4e88 0000000000000007 0000000000634e00 0000000000000000
> 000000000000000d 0000000000000000 000000027fbcf818 000000000000000e
> 00000000003cdc00 000000000010521a 000000027fbcf7b0 000000027fbcf7f8
> Call Trace:
> ( 0000000000105174> show_trace+0x130/0x134)
> 000000000019890a> __alloc_pages_internal+0x406/0x55c
> 00000000001c7056> cache_grow+0x382/0x458
> 00000000001c7440> cache_alloc_refill+0x314/0x36c
> 00000000001c6c12> kmem_cache_alloc+0x82/0x144
> 00000000003228f2> __alloc_skb+0x82/0x208
> 000000000032378e> dev_alloc_skb+0x36/0x64
> 000003e0001a030e> qeth_core_get_next_skb+0x31e/0x704 eth
> 000003e0000d5f8c> qeth_l3_process_inbound_buffer+0x9c/0x598 eth_l3
> 000003e0000d6574> qeth_l3_qdio_input_handler+0xec/0x268 eth_l3
> 000003e0000ebc44> qdio_kick_inbound_handler+0xbc/0x178 dio
> 000003e0000ee58c> __tiqdio_inbound_processing+0x394/0xdf4 dio
> 000000000013a800> tasklet_action+0x10c/0x1e4
> 000000000013b908> __do_softirq+0xe0/0x1c8
> 0000000000110252> do_softirq+0xaa/0xb0
> 000000000013b772> irq_exit+0xc2/0xcc
> 00000000002f6586> do_IRQ+0x132/0x1c8
> 0000000000114148> io_return+0x0/0x8
> 00000000002b850e> _raw_spin_lock_wait+0x86/0xa4
> ( 000003e047d6fa00> 0x3e047d6fa00)
> 000000000019eb9c> shrink_page_list+0x1a0/0x584
> 000000000019f184> shrink_inactive_list+0x204/0x5b0
> 000000000019f620> shrink_zone+0xf0/0x1d0
> 000000000019f882> shrink_zones+0xae/0x184
> 00000000001a02be> do_try_to_free_pages+0x96/0x3fc
> 00000000001a072c> try_to_free_pages+0x74/0x7c
> 0000000000198730> __alloc_pages_internal+0x22c/0x55c
> 000000000019b5a2> __do_page_cache_readahead+0x10a/0x2ac
> 000000000019b7cc> do_page_cache_readahead+0x88/0xa8
> 000000000019170e> filemap_fault+0x33a/0x448
> 00000000001a55bc> __do_fault+0x78/0x580
> 00000000001a962e> handle_mm_fault+0x1e6/0x4c0
> 00000000003b9e1e> do_dat_exception+0x29e/0x388
> 0000000000113c0c> sysc_return+0x0/0x8
> 0000020000214bde> 0x20000214bde
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Active:1355277 inactive:1132712 dirty:0 writeback:0 unstable:0
> free:9269 slab:17875 mapped:765 pagetables:24402 bounce:0
> DMA free:33220kB min:2568kB low:3208kB high:3852kB active:1092112kB inactive:926924kB present:2064384kB pages_scanned:21132286 all_unreclaimable? no
> lowmem_reserve[]: 0 8064 8064
> Normal free:3856kB min:10276kB low:12844kB high:15412kB active:4328996kB inactive:3603924kB present:8257536kB pages_scanned:44557906 all_unreclaimable? yes
> lowmem_reserve[]: 0 0 0
> DMA: 101*4kB 32*8kB 473*16kB 195*32kB 49*64kB 30*128kB 8*256kB 3*512kB 8*1024kB = 33220kB
> Normal: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB 3*1024kB = 3856kB
> 9283 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 34513958, delete 34513958, find 6612011/8393146
> Free swap = 0kB
> Total swap = 35764956kB
> 2621440 pages RAM
> 54354 pages reserved
> 22356 pages shared
> 2538214 pages non-shared
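
The per-zone free lists in a Mem-Info dump like the one above can be checked against the zone watermarks with a few lines of script. A minimal sketch, assuming the "count*sizekB" layout shown in the dump; the function name is made up, and the min values are copied from the "DMA free:" and "Normal free:" lines.

```python
import re


def buddy_free_kb(line: str) -> int:
    """Sum the 'count*sizekB' terms of one buddy free-list line."""
    return sum(int(n) * int(size) for n, size in re.findall(r"(\d+)\*(\d+)kB", line))


# Free-list lines and min watermarks copied from the dump above.
zones = {
    "DMA": ("101*4kB 32*8kB 473*16kB 195*32kB 49*64kB 30*128kB "
            "8*256kB 3*512kB 8*1024kB", 2568),
    "Normal": ("0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB "
               "1*256kB 1*512kB 3*1024kB", 10276),
}

for name, (free_list, min_kb) in zones.items():
    free_kb = buddy_free_kb(free_list)
    state = "below min" if free_kb < min_kb else "above min"
    print(f"{name}: free {free_kb} kB, min {min_kb} kB -> {state}")
```

The sums match the totals printed in the dump (33220kB and 3856kB). The Normal zone is sitting below its min watermark with swap completely full, which fits the allocation failures in the network receive path shown in the call trace.
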
> The following is only an harmless informational message.
> Unless you get a _continuous_flood_ of these messages it means
> everything is working fine. Allocations from irqs cannot be
> perfectly reliable and the kernel is designed to handle that.
> java: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400040
> CPU: 1 Not tainted 2.6.27.45-0.1-default #1
> Process java (pid: 28831, task: 00000001ab64c638, ksp: 0000000215bbb5e0)
> 0000000000000000 000000027fbcf7b0 0000000000000002 0000000000000000
> 000000027fbcf850 000000027fbcf7c8 000000027fbcf7c8 00000000003b6696
> 00000000014a5dd3 0000000000000007 0000000000634e00 0000000000000000
> 000000000000000d 0000000000000000 000000027fbcf818 000000000000000e
> 00000000003cdc00 000000000010521a 000000027fbcf7b0 000000027fbcf7f8
> Call Trace:
> ( 0000000000105174> show_trace+0x130/0x134)
> 000000000019890a> __alloc_pages_internal+0x406/0x55c
> 00000000001c7056> cache_grow+0x382/0x458
> 00000000001c7440> cache_alloc_refill+0x314/0x36c
> 00000000001c6c12> kmem_cache_alloc+0x82/0x144
> 00000000003228f2> __alloc_skb+0x82/0x208
> 000000000032378e> dev_alloc_skb+0x36/0x64
> 000003e0001a030e> qeth_core_get_next_skb+0x31e/0x704 eth
> 000003e0000d5f8c> qeth_l3_process_inbound_buffer+0x9c/0x598 eth_l3
> 000003e0000d6574> qeth_l3_qdio_input_handler+0xec/0x268 eth_l3
> 000003e0000ebc44> qdio_kick_inbound_handler+0xbc/0x178 dio
> 000003e0000ee58c> __tiqdio_inbound_processing+0x394/0xdf4 dio
> 000000000013a800> tasklet_action+0x10c/0x1e4
> 000000000013b908> __do_softirq+0xe0/0x1c8
> 0000000000110252> do_softirq+0xaa/0xb0
> 000000000013b772> irq_exit+0xc2/0xcc
> 00000000002f6586> do_IRQ+0x132/0x1c8
> 0000000000114148> io_return+0x0/0x8
> 00000000002b850e> _raw_spin_lock_wait+0x86/0xa4
> ( 000003e047d6fa00> 0x3e047d6fa00)
> 000000000019eb9c> shrink_page_list+0x1a0/0x584
> 000000000019f184> shrink_inactive_list+0x204/0x5b0
> 000000000019f620> shrink_zone+0xf0/0x1d0
> 000000000019f882> shrink_zones+0xae/0x184
> 00000000001a02be> do_try_to_free_pages+0x96/0x3fc
> 00000000001a072c> try_to_free_pages+0x74/0x7c
> 0000000000198730> __alloc_pages_internal+0x22c/0x55c
> 000000000019b5a2> __do_page_cache_readahead+0x10a/0x2ac
> 000000000019b7cc> do_page_cache_readahead+0x88/0xa8
> 000000000019170e> filemap_fault+0x33a/0x448
> 00000000001a55bc> __do_fault+0x78/0x580
> 00000000001a962e> handle_mm_fault+0x1e6/0x4c0
> 00000000003b9e1e> do_dat_exception+0x29e/0x388
> 0000000000113c0c> sysc_return+0x0/0x8
> 0000020000214bde> 0x20000214bde
> Mem-Info:
> DMA per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Normal per-cpu:
> CPU 0: hi: 186, btch: 31 usd: 0
> CPU 1: hi: 186, btch: 31 usd: 0
> Active:1355277 inactive:1132712 dirty:0 writeback:0 unstable:0
> free:9269 slab:17875 mapped:765 pagetables:24402 bounce:0
> DMA free:33220kB min:2568kB low:3208kB high:3852kB active:1092112kB inactive:926924kB present:2064384kB pages_scanned:21132286 all_unreclaimable? no
> lowmem_reserve[]: 0 8064 8064
> Normal free:3856kB min:10276kB low:12844kB high:15412kB active:4328996kB inactive:3603924kB present:8257536kB pages_scanned:44557906 all_unreclaimable? yes
> lowmem_reserve[]: 0 0 0
> DMA: 101*4kB 32*8kB 473*16kB 195*32kB 49*64kB 30*128kB 8*256kB 3*512kB 8*1024kB = 33220kB
> Normal: 0*4kB 0*8kB 1*16kB 0*32kB 0*64kB 0*128kB 1*256kB 1*512kB 3*1024kB = 3856kB
> 9283 total pagecache pages
> 0 pages in swap cache
> Swap cache stats: add 34513958, delete 34513958, find 6612011/8393146
> Free swap = 0kB
> Total swap = 35764956kB
> 2621440 pages RAM
> 54354 pages reserved
> 22356 pages shared
> 2538214 pages non-shared
> __ratelimit: 4 callbacks suppressed
> The following is only an harmless informational message.
> Unless you get a _continuous_flood_ of these messages it means
> everything is working fine. Allocations from irqs cannot be
> perfectly reliable and the kernel is designed to handle that.
> java: page allocation failure. order:0, mode:0x20, alloc_flags:0x7, pflags:0x400040
> CPU: 1 Not tainted 2.6.27.45-0.1-default #1
> Process java (pid: 28831, task: 00000001ab64c638, ksp: 0000000215bbb5e0)
> 0000000000000000 000000027fbcf7b0 0000000000000002 0000000000000000
> 000000027fbcf850 000000027fbcf7c8 000000027fbcf7c8 00000000003b6696
>
> etc, etc for HUNDREDS of pages.. perhaps infinite.
>
> 00:
> 00: CP Q ALL
> 00: STORAGE = 15G CONFIGURED = 15G INC = 64M STANDBY = 0 RESERVED = 0
> 00: OSA 039C ATTACHED TO TCPIP 039C DEVTYPE OSA CHPID 01 OSD
> 00: OSA 039D ATTACHED TO TCPIP 039D DEVTYPE OSA CHPID 01 OSD
> 00: OSA 039E ATTACHED TO TCPIP 039E DEVTYPE OSA CHPID 01 OSD
> 00: OSA 03A0 ATTACHED TO DTCVSW2 03A0 DEVTYPE OSA CHPID 01 OSD
> 00: OSA 03A1 ATTACHED TO DTCVSW2 03A1 DEVTYPE OSA CHPID 01 OSD
> 00: OSA 03A2 ATTACHED TO DTCVSW2 03A2 DEVTYPE OSA CHPID 01 OSD
> 00: OSA 03C0 ATTACHED TO DTCVSW1 03C0 DEVTYPE OSA CHPID 02 OSD
> 00: OSA 03C1 ATTACHED TO DTCVSW1 03C1 DEVTYPE OSA CHPID 02 OSD
> 00: OSA 03C2 ATTACHED TO DTCVSW1 03C2 DEVTYPE OSA CHPID 02 OSD
> 00: FCP 5000 ATTACHED TO LINXDEV 5000 CHPID 46
> 00: WWPN C05076FAE3000400
> 00: FCP 5001 ATTACHED TO LINXD001 5001 CHPID 46
> 00: WWPN C05076FAE3000404
> 00: FCP 5002 ATTACHED TO LINXD002 5002 CHPID 46
> 00: WWPN C05076FAE3000408
> 00: FCP 5003 ATTACHED TO LINXD003 5003 CHPID 46
> 00: WWPN C05076FAE300040C
> 00: FCP 5100 ATTACHED TO LINXDEV 5100 CHPID 47
> 00: WWPN C05076FAE3000900
> 00: FCP 5101 ATTACHED TO LINXD001 5101 CHPID 47
> 00: WWPN C05076FAE3000904
> 00: FCP 5102 ATTACHED TO LINXD002 5102 CHPID 47
> 00: WWPN C05076FAE3000908
> 00: FCP 5103 ATTACHED TO LINXD003 5103 CHPID 47
> 00: WWPN C05076FAE300090C
> 00: DASD 9F7D CP SYSTEM VM6LXD 0
> 00: DASD 9F7E CP SYSTEM VM6LXE 0
> 00: DASD 9F80 CP SYSTEM VM6LX9 2
> 00: DASD 9F81 CP SYSTEM VM6LXA 2
> 00: DASD 9F82 CP SYSTEM VM6LXB 0
> 00: DASD 9F83 CP SYSTEM VM6LXC 0
> 00: DASD 9F84 CP OWNED VM6RES 135
> 00: DASD 9F85 CP OWNED VM6SPL 0
> 00: DASD 9F86 CP OWNED VM6PG1 0
> 00: DASD 9F87 CP OWNED VM6PG2 0
> 00: DASD 9F88 CP OWNED VM6LX1 4
> 00: DASD 9F89 CP SYSTEM VM6LX2 0
> 00: DASD 9F8A CP SYSTEM VM6LX3 0
> 00: DASD 9F8B CP SYSTEM VM6LX4 0
> 00: DASD 9F8C CP SYSTEM VM6LX5 2
> 00: DASD 9F8D CP SYSTEM VM6LX6 0
> 00: DASD 9F8E CP SYSTEM VM6LX7 0
> 00: DASD 9F8F CP SYSTEM VM6LX8 2
> 00: DASD 9FC7 CP SYSTEM VM6LX6 0
> 00: DASD 9FC8 CP SYSTEM VM6LX5 2
> 00: DASD 9FC9 CP SYSTEM VM6LX2 0
> 00: DASD 9FCA CP SYSTEM VM6LX4 0
> 00: DASD 9FCB CP SYSTEM VM6LX3 0
> 00: DASD 9FCE CP SYSTEM VM6LX1 4
> 00: DASD 9FCF CP SYSTEM VM6PG2 0
> 00: DASD 9FD0 CP SYSTEM VM6PG1 0
> 00: DASD 9FD1 CP SYSTEM VM6SPL 0
> 00: DASD 9FD2 CP SYSTEM VM6RES 135
>
>
>
>
>