OK, will do. Just a little background: we are doing reads of up to 220 MB/s for 20 minutes (aggregated across all 10 nodes), and towards the end of the 20 minutes we write ~45 x 2k files to the OCFS2 volume. During the read, I notice that the buffer cache on all the nodes is exhausted.
This oops currently happens on only one of the nodes. I am reluctant to force a reboot on oops. Is this a must?

Thanks
Laurence

On Thu, Sep 24, 2009 at 8:06 PM, Sunil Mushran <sunil.mush...@oracle.com> wrote:
> So a read of some file on an xfs volume triggered a mem alloc, which in turn
> triggered the kernel to free up some memory. The oops happens when it is
> trying to free up an ocfs2 inode.
>
> Do:
> # cat /proc/sys/kernel/panic_on_oops
>
> If this returns 0, do:
> # echo 1 > /proc/sys/kernel/panic_on_oops
> This is documented in the user's guide.
>
> File a bugzilla in oss.oracle.com/bugzilla. _Attach_ this oops report.
> Do not cut-and-paste; it is hard to read. Also _attach_ the objdump output:
> # objdump -DSl /lib/modules/`uname -r`/kernel/fs/ocfs2/ocfs2.ko > /tmp/ocfs2.out
>
> Bottom line: the fact that it is working just means that you will encounter the
> problem later. The problem in that case will most likely be another oops, or a
> hang.
>
> Upload the outputs. I'll try to see if we have already addressed this issue.
> This kernel is fairly old, btw.
>
> Sunil
>
> Laurence Mayer wrote:
>> OS: Ubuntu 8.04 x64
>> Kernel: Linux n1 2.6.24-24-server #1 SMP Tue Jul 7 19:39:36 UTC 2009 x86_64 GNU/Linux
>> 10-node cluster
>> OCFS2 version:
>>   ocfs2-tools            1.3.9-0ubuntu1
>>   ocfs2-tools-static-dev 1.3.9-0ubuntu1
>>   ocfs2console           1.3.9-0ubuntu1
>>
>> r...@n1:~# cat /proc/meminfo
>> MemTotal:     16533296 kB
>> MemFree:         47992 kB
>> Buffers:        179240 kB
>> Cached:       13185084 kB
>> SwapCached:         72 kB
>> Active:        4079712 kB
>> Inactive:     12088860 kB
>> SwapTotal:    31246416 kB
>> SwapFree:     31246344 kB
>> Dirty:            2772 kB
>> Writeback:           4 kB
>> AnonPages:     2804460 kB
>> Mapped:          51556 kB
>> Slab:           223976 kB
>> SReclaimable:    61192 kB
>> SUnreclaim:     162784 kB
>> PageTables:      12148 kB
>> NFS_Unstable:        8 kB
>> Bounce:              0 kB
>> CommitLimit:  39513064 kB
>> Committed_AS:  3698728 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed:     53888 kB
>> VmallocChunk: 34359684419 kB
>> HugePages_Total:     0
>> HugePages_Free:      0
>> HugePages_Rsvd:      0
>> HugePages_Surp:      0
>> Hugepagesize:     2048 kB
>>
>> I have started seeing the oops below on one of the nodes. The node does not
>> reboot; it continues to function "normally".
>>
>> Is this a memory issue?
>>
>> Please can you provide direction.
>>
>> Sep 24 16:31:46 n1 kernel: [75206.689992] CPU 0
>> Sep 24 16:31:46 n1 kernel: [75206.690018] Modules linked in: ocfs2 crc32c libcrc32c nfsd auth_rpcgss exportfs ipmi_devintf ipmi_si ipmi_msghandler ipv6 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs iptable_filter ip_tables x_tables xfs ib_iser rdma_cm ib_cm iw_cm ib_sa ib_mad ib_core ib_addr iscsi_tcp libiscsi scsi_transport_iscsi nfs lockd nfs_acl sunrpc parport_pc lp parport loop serio_raw psmouse i2c_piix4 i2c_core dcdbas evdev button k8temp shpchp pci_hotplug pcspkr ext3 jbd mbcache sg sr_mod cdrom sd_mod ata_generic pata_acpi usbhid hid ehci_hcd tg3 sata_svw pata_serverworks ohci_hcd libata scsi_mod usbcore thermal processor fan fbcon tileblit font bitblit softcursor fuse
>> Sep 24 16:31:46 n1 kernel: [75206.690455] Pid: 15931, comm: read_query Tainted: G D 2.6.24-24-server #1
>> Sep 24 16:31:46 n1 kernel: [75206.690509] RIP: 0010:[<ffffffff8856c404>] [<ffffffff8856c404>] :ocfs2:ocfs2_meta_lock_full+0x6a4/0xec0
>> Sep 24 16:31:46 n1 kernel: [75206.690591] RSP: 0018:ffff8101c64c9848 EFLAGS: 00010292
>> Sep 24 16:31:46 n1 kernel: [75206.690623] RAX: 0000000000000092 RBX: ffff81034ba74000 RCX: 00000000ffffffff
>> Sep 24 16:31:46 n1 kernel: [75206.690659] RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffffffff8058ffa4
>> Sep 24 16:31:46 n1 kernel: [75206.690695] RBP: 0000000100080000 R08: 0000000000000000 R09: 00000000ffffffff
>> Sep 24 16:31:46 n1 kernel: [75206.690730] R10: 0000000000000000 R11: 0000000000000000 R12: ffff81033fca4e00
>> Sep 24 16:31:46 n1 kernel: [75206.690766] R13: ffff81033fca4f08 R14: ffff81033fca52b8 R15: ffff81033fca4f08
>> Sep 24 16:31:46 n1 kernel: [75206.690802] FS: 00002b312f0119f0(0000) GS:ffffffff805c5000(0000) knlGS:00000000f546bb90
>> Sep 24 16:31:46 n1 kernel: [75206.690857] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> Sep 24 16:31:46 n1 kernel: [75206.690890] CR2: 00002b89f1e81000 CR3: 0000000168971000 CR4: 00000000000006e0
>> Sep 24 16:31:46 n1 kernel: [75206.690925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> Sep 24 16:31:46 n1 kernel: [75206.690961] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> Sep 24 16:31:46 n1 kernel: [75206.690998] Process read_query (pid: 15931, threadinfo ffff8101c64c8000, task ffff81021543f7d0)
>> Sep 24 16:31:46 n1 kernel: [75206.691054] Stack: ffff810243c402af ffff810243c40299 ffff81021b462408 000000011b462440
>> Sep 24 16:31:46 n1 kernel: [75206.691116]  ffff8101c64c9910 0000000100000000 ffff810217564e00 ffffffff8029018a
>> Sep 24 16:31:46 n1 kernel: [75206.691176]  0000000000000296 0000000000000001 ffffffffffffffff ffff81004c052f70
>> Sep 24 16:31:46 n1 kernel: [75206.691217] Call Trace:
>> Sep 24 16:31:46 n1 kernel: [75206.691273] [isolate_lru_pages+0x8a/0x210] isolate_lru_pages+0x8a/0x210
>> Sep 24 16:31:46 n1 kernel: [75206.691323] [<ffffffff8857d4db>] :ocfs2:ocfs2_delete_inode+0x16b/0x7e0
>> Sep 24 16:31:46 n1 kernel: [75206.691362] [shrink_inactive_list+0x202/0x3c0] shrink_inactive_list+0x202/0x3c0
>> Sep 24 16:31:46 n1 kernel: [75206.691409] [<ffffffff8857d370>] :ocfs2:ocfs2_delete_inode+0x0/0x7e0
>> Sep 24 16:31:46 n1 kernel: [75206.691449] [fuse:generic_delete_inode+0xa8/0x450] generic_delete_inode+0xa8/0x140
>> Sep 24 16:31:46 n1 kernel: [75206.691495] [<ffffffff8857cd6d>] :ocfs2:ocfs2_drop_inode+0x7d/0x160
>> Sep 24 16:31:46 n1 kernel: [75206.691533] [d_kill+0x3c/0x70] d_kill+0x3c/0x70
>> Sep 24 16:31:46 n1 kernel: [75206.691566] [prune_one_dentry+0xc1/0xe0] prune_one_dentry+0xc1/0xe0
>> Sep 24 16:31:46 n1 kernel: [75206.691600] [prune_dcache+0x166/0x1c0] prune_dcache+0x166/0x1c0
>> Sep 24 16:31:46 n1 kernel: [75206.691635] [shrink_dcache_memory+0x3e/0x50] shrink_dcache_memory+0x3e/0x50
>> Sep 24 16:31:46 n1 kernel: [75206.691670] [shrink_slab+0x124/0x180] shrink_slab+0x124/0x180
>> Sep 24 16:31:46 n1 kernel: [75206.691707] [try_to_free_pages+0x1e4/0x2f0] try_to_free_pages+0x1e4/0x2f0
>> Sep 24 16:31:46 n1 kernel: [75206.691749] [__alloc_pages+0x196/0x3d0] __alloc_pages+0x196/0x3d0
>> Sep 24 16:31:46 n1 kernel: [75206.691790] [__do_page_cache_readahead+0xe0/0x210] __do_page_cache_readahead+0xe0/0x210
>> Sep 24 16:31:46 n1 kernel: [75206.691834] [ondemand_readahead+0x117/0x1c0] ondemand_readahead+0x117/0x1c0
>> Sep 24 16:31:46 n1 kernel: [75206.691871] [do_generic_mapping_read+0x13d/0x3c0] do_generic_mapping_read+0x13d/0x3c0
>> Sep 24 16:31:46 n1 kernel: [75206.691908] [file_read_actor+0x0/0x160] file_read_actor+0x0/0x160
>> Sep 24 16:31:46 n1 kernel: [75206.691949] [xfs:generic_file_aio_read+0xff/0x1b0] generic_file_aio_read+0xff/0x1b0
>> Sep 24 16:31:46 n1 kernel: [75206.692026] [xfs:xfs_read+0x11c/0x250] :xfs:xfs_read+0x11c/0x250
>> Sep 24 16:31:46 n1 kernel: [75206.692067] [xfs:do_sync_read+0xd9/0xbb0] do_sync_read+0xd9/0x120
>> Sep 24 16:31:46 n1 kernel: [75206.692101] [getname+0x1a9/0x220] getname+0x1a9/0x220
>> Sep 24 16:31:46 n1 kernel: [75206.692140] [<ffffffff80254530>] autoremove_wake_function+0x0/0x30
>> Sep 24 16:31:46 n1 kernel: [75206.692185] [vfs_read+0xed/0x190] vfs_read+0xed/0x190
>> Sep 24 16:31:46 n1 kernel: [75206.692220] [sys_read+0x53/0x90] sys_read+0x53/0x90
>> Sep 24 16:31:46 n1 kernel: [75206.692256] [system_call+0x7e/0x83] system_call+0x7e/0x83
>> Sep 24 16:31:46 n1 kernel: [75206.692293]
>> Sep 24 16:31:46 n1 kernel: [75206.692316]
>> Sep 24 16:31:46 n1 kernel: [75206.692317] Code: 0f 0b eb fe 83 fd fe 0f 84 73 fc ff ff 81 fd 00 fe ff ff 0f
>> Sep 24 16:31:46 n1 kernel: [75206.692483] RSP <ffff8101c64c9848>
>>
>> Thanks
>> Laurence
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users@oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users@oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
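[Editor's note] Put together, the triage steps quoted above amount to a short shell session. The sketch below assumes Ubuntu's stock module path (`/lib/modules/$(uname -r)/kernel/fs/ocfs2/ocfs2.ko`); the `panic_on_oops` write needs root, so it is shown only as a comment:

```shell
#!/bin/sh
# Sketch of the oops-triage steps from the thread above.

# 1. Check whether the node is set to panic (and so reboot/fence) on an oops.
#    Prints 0 (disabled) or 1 (enabled).
cat /proc/sys/kernel/panic_on_oops

# 2. If that printed 0, enable it as root so a wounded node gets fenced
#    instead of limping along:
#      echo 1 > /proc/sys/kernel/panic_on_oops

# 3. Disassemble the ocfs2 module with source/line annotations for the
#    bugzilla attachment. Adjust the path if your distro lays modules out
#    differently.
ko=/lib/modules/$(uname -r)/kernel/fs/ocfs2/ocfs2.ko
if [ -f "$ko" ]; then
    objdump -DSl "$ko" > /tmp/ocfs2.out
else
    echo "ocfs2.ko not found at $ko" >&2
fi
```

Attach `/tmp/ocfs2.out` and the raw oops text to the bugzilla entry rather than pasting them inline, as requested in the thread.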