We run *many* SLES 10 SP2 guests with Oracle 10.2.0.4.

We have seen no issues with storage creep. We run with external Grid Control servers.

Recommendations:
----------------

> Start with a clean initialization of the Linux guest to reset your memory 
> footprint.
> In addition to Rob's tools, use sar -r to illustrate the storage growth over 
> time.
> Run ps aux to profile each process and find which ORCL process (or 
> other) is creeping.

E.g.:

USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
oracle    5034  0.0  0.5 644064  16728 ?        Ss    2009   2:11 ora_pmon_onbasedb
oracle    5036  0.0  0.4 644448  15240 ?        Ss    2009   0:21 ora_psp0_onbasedb
oracle    5038  0.0 14.3 642888 442672 ?        Ss    2009   0:58 ora_mman_onbasedb
oracle    5040  0.0 10.5 650116 323884 ?        Ss    2009   2:44 ora_dbw0_onbasedb
oracle    5042  0.1  0.8 666088  26380 ?        Ss    2009  12:54 ora_lgwr_onbasedb
oracle    5044  0.0  1.6 648512  50844 ?        Ss    2009   3:51 ora_ckpt_onbasedb
oracle    5046  0.0 14.2 647708 438380 ?        Ss    2009   1:01 ora_smon_onbasedb
oracle    5048  0.0  4.1 645612 127252 ?        Ss    2009   0:00 ora_reco_onbasedb
oracle    5050  0.0 11.7 653932 360628 ?        Ss    2009   3:05 ora_cjq0_onbasedb
oracle    5052  0.0 14.3 647816 442300 ?        Ss    2009   1:31 ora_mmon_onbasedb
oracle    5054  0.0  2.0 645604  62060 ?        Ss    2009   4:05 ora_mmnl_onbasedb
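
A single ps aux listing only shows the current footprint. To see which process is creeping, compare two listings taken some hours apart. The following is a minimal Python sketch of that comparison, not a finished tool. The function names (parse_ps, rss_growth) are made up for illustration, the BEFORE rows are copied from the listing above, and the AFTER rows are invented sample values, not real measurements.

```python
def parse_ps(text):
    """Map (pid, command) -> RSS in kB from saved `ps aux` output."""
    rss = {}
    for line in text.strip().splitlines()[1:]:  # skip the header row
        # ps aux prints 11 columns; COMMAND is last and may contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # ignore truncated or wrapped lines
        pid, rss_kb, command = int(fields[1]), int(fields[5]), fields[10]
        rss[(pid, command)] = rss_kb
    return rss


def rss_growth(before, after):
    """Return (growth_kb, pid, command) tuples, largest growth first.

    Only processes present in both snapshots are compared.
    """
    growth = [
        (after[key] - before[key], key[0], key[1])
        for key in after
        if key in before
    ]
    return sorted(growth, reverse=True)


# Rows taken from the listing in this post.
BEFORE = """\
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
oracle    5038  0.0 14.3 642888 442672 ?        Ss    2009   0:58 ora_mman_onbasedb
oracle    5042  0.1  0.8 666088  26380 ?        Ss    2009  12:54 ora_lgwr_onbasedb
"""

# Hypothetical later snapshot (invented numbers, for illustration only).
AFTER = """\
USER       PID %CPU %MEM    VSZ    RSS TTY      STAT START   TIME COMMAND
oracle    5038  0.0 14.3 642888 442672 ?        Ss    2009   0:58 ora_mman_onbasedb
oracle    5042  0.1  1.2 666088  39380 ?        Ss    2009  12:55 ora_lgwr_onbasedb
"""

report = rss_growth(parse_ps(BEFORE), parse_ps(AFTER))
for growth_kb, pid, command in report:
    print(f"{growth_kb:>10} kB  {pid:>6}  {command}")
```

Keep in mind that RSS for Oracle background processes includes the shared SGA pages each process has touched, so some growth after startup is normal; a process that keeps growing long after the instance has warmed up is the one to look at.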


Gerard


-----Original Message-----
From: Linux on 390 Port [mailto:[email protected]] On Behalf Of Rodery, Floyd A Mr CIV US DISA CDB12
Sent: Wednesday, December 30, 2009 5:48 PM
To: [email protected]
Subject: SLES10 - Oracle/Memory Issues (oom-killer)

We've noticed a repeated error message regarding a memory
issue on one of our SLES 10 SP2 guests.  We see the error below every
morning around the same time, with multiple oom-kills of oracle, perl,
etc.  Does anyone have any thoughts as to why this might be happening
given the amount of memory this guest has?  If you have any thoughts or
need any more information, I would certainly appreciate hearing from you.

#MEMINFO (For a reference)

MemTotal:      5138052 kB
MemFree:        144308 kB
Buffers:        121768 kB
Cached:        3717360 kB
SwapCached:      46352 kB
Active:        2849880 kB
Inactive:      1590368 kB
HighTotal:           0 kB
HighFree:            0 kB
LowTotal:      5138052 kB
LowFree:        144308 kB
SwapTotal:     2352736 kB
SwapFree:      2116048 kB
Dirty:             376 kB
Writeback:           0 kB
AnonPages:      584892 kB
Mapped:        2159592 kB
Slab:           172056 kB
CommitLimit:   4921760 kB
Committed_AS:  5506584 kB
PageTables:     334304 kB
VmallocTotal: 4289716224 kB
VmallocUsed:      5024 kB
VmallocChunk: 4289711032 kB
HugePages_Total:     0
HugePages_Free:      0
HugePages_Rsvd:      0
Hugepagesize:     2048 kB
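
As a side calculation on the figures above: the kernel documentation gives CommitLimit as (RAM pages minus huge pages) * overcommit_ratio / 100 plus swap pages. The Python sketch below reproduces the reported value and shows how far Committed_AS sits above it. It assumes a 4 kB page size and the default vm.overcommit_ratio of 50, neither of which is stated in this post; the function names are made up for illustration. The limit is only enforced when vm.overcommit_memory is set to 2, so exceeding it is a hint about memory pressure rather than an error in itself.

```python
# Subset of the /proc/meminfo values quoted in this post.
MEMINFO = """\
MemTotal:      5138052 kB
SwapTotal:     2352736 kB
CommitLimit:   4921760 kB
Committed_AS:  5506584 kB
PageTables:     334304 kB
HugePages_Total:     0
"""


def parse_meminfo(text):
    """Map field name -> integer value (kB for most fields)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            values[name.strip()] = int(parts[0])
    return values


def commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50, page_kb=4):
    """Recompute CommitLimit in whole pages, ignoring huge pages (none here).

    overcommit_ratio=50 and page_kb=4 are assumptions, not values from the post.
    """
    ram_pages = mem_total_kb // page_kb
    swap_pages = swap_total_kb // page_kb
    return (ram_pages * overcommit_ratio // 100 + swap_pages) * page_kb


info = parse_meminfo(MEMINFO)
limit = commit_limit_kb(info["MemTotal"], info["SwapTotal"])
over_kb = info["Committed_AS"] - limit
pagetable_pct = 100.0 * info["PageTables"] / info["MemTotal"]

print(f"computed CommitLimit : {limit} kB (reported {info['CommitLimit']} kB)")
print(f"Committed_AS - limit : {over_kb} kB")
print(f"PageTables / MemTotal: {pagetable_pct:.1f} %")
```

With these inputs the computed limit matches the reported 4921760 kB, and Committed_AS exceeds it by 584824 kB. Page tables take roughly 6.5 % of memory while HugePages_Total is 0.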


#ERROR MESSAGE  (excerpt from /var/log/warn, this error is repeated over
and over, killing several different processes within a couple minutes)

Dec 27 04:37:33 *SERVER NAME* kernel: oracle invoked oom-killer:
gfp_mask=0x201d2, order=0, oomkilladj=0
Dec 27 04:37:34 *SERVER NAME*  kernel: 00000000ccd13918 c54c2cf000000001
000000008142d938 000000000061a350 
Dec 27 04:37:34 *SERVER NAME*  kernel:        0000000061104620
0000000000000000 0000000000000000 0000000000105b52 
Dec 27 04:37:34 *SERVER NAME*  kernel:        0000000000000000
0000000000000000 00000000006009c0 0000000000000418 
Dec 27 04:37:34 *SERVER NAME*  kernel:        0000000000000001
0000000000000008 000000000000000e 00000000ccd139c0 
Dec 27 04:37:34 *SERVER NAME*  kernel:        00000000004ad158
0000000000105b52 00000000ccd13948 00000000ccd13988 
Dec 27 04:37:34 *SERVER NAME*  kernel: Call Trace:
Dec 27 04:37:34 *SERVER NAME*  kernel: ([<0000000000105b6e>]
dump_stack+0x2aa/0x364)
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<00000000001c3994>]
out_of_memory+0x3b8/0x934
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<00000000001c7216>]
__alloc_pages+0x29a/0x370
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<00000000001ce582>]
do_page_cache_readahead+0x156/0x3a8
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<00000000001bd55c>]
filemap_nopage+0x210/0xa4c
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<00000000001e18be>]
__handle_mm_fault+0x282/0x114c
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<0000000000102080>]
do_dat_exception+0x584/0x858
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<0000000000115d16>]
sysc_return+0x0/0x10
Dec 27 04:37:34 *SERVER NAME*  kernel:  [<000000008061041c>] 0x8061041c
Dec 27 04:37:34 *SERVER NAME*  kernel: 
Dec 27 04:37:34 *SERVER NAME*  kernel: Mem-info:
Dec 27 04:37:34 *SERVER NAME*  kernel: DMA per-cpu:
Dec 27 04:37:34 *SERVER NAME*  kernel: CPU    0: Hot: hi:  186, btch:
31 usd: 159   Cold: hi:   62, btch:  15 usd:  58
Dec 27 04:37:34 *SERVER NAME*  kernel: CPU    1: Hot: hi:  186, btch:
31 usd: 156   Cold: hi:   62, btch:  15 usd:  27
Dec 27 04:37:34 *SERVER NAME*  kernel: Normal per-cpu:
Dec 27 04:37:34 *SERVER NAME*  kernel: CPU    0: Hot: hi:  186, btch:
31 usd: 151   Cold: hi:   62, btch:  15 usd:  56
Dec 27 04:37:34 *SERVER NAME*  kernel: CPU    1: Hot: hi:  186, btch:
31 usd: 165   Cold: hi:   62, btch:  15 usd:  14
Dec 27 04:37:34 *SERVER NAME*  kernel: Free pages:       21488kB (0kB
HighMem)
Dec 27 04:37:34 *SERVER NAME*  kernel: Active:832717 inactive:260329
dirty:0 writeback:0 unstable:0 free:5372 slab:14827 mapped:650
pagetables:146722
Dec 27 04:37:34 *SERVER NAME*  kernel: DMA free:16016kB min:3660kB
low:4572kB high:5488kB active:807232kB inactive:896400kB
present:2097152kB pages_scanned:3937815 all_unreclaimable? yes
Dec 27 04:37:34 *SERVER NAME*  kernel: lowmem_reserve[]: 0 0 3072 3072
Dec 27 04:37:34 *SERVER NAME*  kernel: Normal free:5472kB min:5492kB
low:6864kB high:8236kB active:2523636kB inactive:144916kB
present:3145728kB pages_scanned:7066898 all_unreclaimable? yes
Dec 27 04:37:34 *SERVER NAME*  kernel: lowmem_reserve[]: 0 0 0 0
Dec 27 04:37:34 *SERVER NAME*  kernel: DMA: 1260*4kB 1258*8kB 1*16kB
0*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB =
16016kB
Dec 27 04:37:34 *SERVER NAME*  kernel: Normal: 10*4kB 507*8kB 0*16kB
1*32kB 1*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB =
5472kB
Dec 27 04:37:34 *SERVER NAME*  kernel: Swap cache: add 31963696, delete
31964356, find 17660088/21209122, race 8+295
Dec 27 04:37:34 *SERVER NAME*  kernel: Free swap  = 0kB
Dec 27 04:37:35 *SERVER NAME*  kernel: Total swap = 2352736kB
Dec 27 04:37:35 *SERVER NAME*  kernel: Free swap:            0kB
Dec 27 04:37:35 *SERVER NAME*  kernel: 1310720 pages of RAM
Dec 27 04:37:35 *SERVER NAME*  kernel: 26207 reserved pages
Dec 27 04:37:35 *SERVER NAME*  kernel: 66904 pages shared
Dec 27 04:37:35 *SERVER NAME*  kernel: 352 pages swap cached
Dec 27 04:37:35 *SERVER NAME*  kernel: Out of Memory: Kill process 26578
(oracle) score 48132 and children.
Dec 27 04:37:35 *SERVER NAME*  kernel: Out of memory: Killed process
26578 (oracle).

#Floyd Rodery
DoD - DISA - DECC MECH
CDB 12 Linux on System z
717.605.8639 | 430.8639
[email protected]



----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [email protected] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
