* Tao Ma <[EMAIL PROTECTED]> [01.08.08 10:58]

Hi,

thanks for your quick reply.
Here are the details:

xxx:/ # SPident
CONCLUSION: System is up-to-date!
  found SLE-10-i386-SP1 + "online updates"
xxx:/ # uname -r
2.6.16.46-0.12-bigsmp
xxx:/ # cat /proc/fs/ocfs2/version
OCFS2 1.2.5-SLES-r2997 Tue Mar 27 16:33:19 EDT 2007 (build sles)
xxx:/ # debugfs.ocfs2 -V
debugfs.ocfs2 1.2.3

We have 6 nodes in the cluster, and the described behavior (processes
freezing in a certain directory) was observed on all 6 nodes.

Thanks.

> Hi,
> Please provide the detailed ocfs2 version info, which may be helpful
> for diagnosis.
>
> Peter Selzner wrote:
> >Hi,
> >we had these entries in /var/log/messages a few days ago:
> >Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: bug expression: i_size_read(inode) != (le64_to_cpu(fe->i_size) - *bytes_extended)
> >Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: Inode 8323098 i_size = 1572864, dinode i_size = 1568768, bytes_extended = 0, new_i_size = 1576960
> >Jul 28 23:30:47 xxx kernel: klogd 1.4.1, ---------- state change ----------
> >Jul 28 23:30:47 xxx kernel: ------------[ cut here ]------------
> >Jul 28 23:30:47 xxx kernel: kernel BUG at fs/ocfs2/file.c:790!
> >Jul 28 23:30:47 xxx kernel: invalid opcode: 0000 [#1]
> >Jul 28 23:30:47 xxx kernel: SMP
> >Jul 28 23:30:47 xxx kernel: last sysfs file: /class/infiniband/mthca1/board_id
> >Jul 28 23:30:47 xxx kernel: Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_dlm
> >ocfs2_nodemanager configfs cpqci mptctl mptbase ipmi_si ipmi_devintf
> >ipmi_msghandler rdma_ucm rds ib_ucm ib_sdp rdma_cm iw_cm ib_addr
> >ib_local_sa ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad bonding ib_mthca
> >ib_mad ib_core button battery ac raw loop dm_round_robin dm_multipath
> >dm_mod usbhid hw_random ide_cd uhci_hcd e1000 cdrom ehci_hcd bnx2 usbcore
> >ext3 jbd ata_piix ahci libata edd fan thermal processor cciss sg qla2400
> >qla2300 qla2xxx firmware_class qla2xxx_conf intermodule piix sd_mod
> >scsi_mod ide_disk ide_core
> >Jul 28 23:30:47 xxx kernel: CPU: 2
> >Jul 28 23:30:47 xxx kernel: EIP: 0060:[<f9de8173>] Tainted: P U VLI
> >Jul 28 23:30:47 xxx kernel: EFLAGS: 00210292 (2.6.16.46-0.12-bigsmp #1)
> >Jul 28 23:30:47 xxx kernel: EIP is at ocfs2_extend_file+0x3cd/0xf9b [ocfs2]
> >Jul 28 23:30:47 xxx kernel: eax: 0000008c ebx: 00000000 ecx: ffffff00 edx: 00200286
> >Jul 28 23:30:47 xxx kernel: esi: 00000000 edi: 00000000 ebp: df05f000 esp: e398de70
> >Jul 28 23:30:47 xxx kernel: ds: 007b es: 007b ss: 0068
> >Jul 28 23:30:47 xxx kernel: Process mv (pid: 12268, threadinfo=e398c000 task=f7f80660)
> >Jul 28 23:30:47 xxx kernel: Stack: <0>00000000 dd4f9d88 ce48c000 00000000 00000000 00000001 cf253280 dd4f9b80
> >Jul 28 23:30:47 xxx kernel: dd4f9ee4 0017f000 00000000 00000000 f9ddf432 e398dea8 dd4f9b80 00000000
> >Jul 28 23:30:47 xxx kernel: 00000001 e398deb4 e398deb4 ce48c000 00000000 00000000 ece0bc00 00000000
> >Jul 28 23:30:47 xxx kernel: Call Trace:
> >Jul 28 23:30:47 xxx kernel: [<f9ddf432>] ocfs2_status_completion_cb+0x0/0xa [ocfs2]
> >Jul 28 23:30:47 xxx kernel: [<f9df72f2>] ocfs2_write_lock_maybe_extend+0xb2f/0xde3 [ocfs2]
> >Jul 28 23:30:47 xxx kernel: [<f9dea85d>] ocfs2_file_write+0x125/0x24d [ocfs2]
> >Jul 28 23:30:47 xxx kernel: [<f9dea738>] ocfs2_file_write+0x0/0x24d [ocfs2]
> >Jul 28 23:30:47 xxx kernel: [<c0164714>] vfs_write+0xaa/0x152
> >Jul 28 23:30:47 xxx kernel: [<c0164d1f>] sys_write+0x3c/0x63
> >Jul 28 23:30:47 xxx kernel: [<c0103cab>] sysenter_past_esp+0x54/0x79
> >Jul 28 23:30:47 xxx kernel: Code: 8b 4c 24 3c ff 71 04 ff 31 68 16 03 00 00
> >68 2b b5 e0 f9 ff 70 10 8b 00 ff b0 c0 00 00 00 68 b1 fd e0 f9 e8 ca a8 33
> >c6 83 c4 3c <0f> 0b 16 03 db fb e0 f9 8b 5c 24 20 8b 03 0f ae e8 89 f6 8b 74
> >
> >It was impossible to do "ls -al" in a certain directory: each process that
> >"touched" files in this directory ended up in D state (uninterruptible
> >sleep).
> >Any suggestions? Thanks.
>
> How did this happen, and could you please explain it in more detail? E.g.,
> how many nodes are in your cluster? You hang on one node; how about the
> other nodes, and what are you doing on them?
>
> Regards,
> Tao

Kind regards,
Peter Selzner

--
| Peter Selzner                     mail: [EMAIL PROTECTED]      |
| Kommunales Rechenzentrum (KRZ)    tel:  +49 (0)5261-252-273   |
| Minden-Ravensberg / Lippe         fax:  +49 (0)5261-932-273   |
_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
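
For anyone reading the trace later, the numbers in the second ERROR line can be checked against the logged bug expression by hand. The following is only a minimal sketch in Python, not ocfs2 code: the regular expression, the helper names (`parse_extend_error`, `bug_expression_holds`) and the dictionary keys are made up for illustration, and the only inputs are the values copied from the log line in this thread. It assumes, going by the expression text, that "i_size" is the value from `i_size_read(inode)` and "dinode i_size" is `fe->i_size`.

```python
import re

# Values copied verbatim from the second ERROR line in the kernel log.
LOG_LINE = (
    "Inode 8323098 i_size = 1572864, dinode i_size = 1568768, "
    "bytes_extended = 0, new_i_size = 1576960"
)

# Hypothetical pattern for this one message format (an assumption, not
# something taken from the ocfs2 sources).
PATTERN = re.compile(
    r"Inode (?P<inode>\d+) i_size = (?P<i_size>\d+), "
    r"dinode i_size = (?P<dinode_i_size>\d+), "
    r"bytes_extended = (?P<bytes_extended>\d+), "
    r"new_i_size = (?P<new_i_size>\d+)"
)


def parse_extend_error(line):
    """Extract the numeric fields of the size report as a dict of ints."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the ocfs2_extend_file size report")
    return {name: int(value) for name, value in match.groupdict().items()}


def bug_expression_holds(fields):
    """Evaluate the logged expression:
    i_size_read(inode) != (le64_to_cpu(fe->i_size) - *bytes_extended)
    """
    return fields["i_size"] != fields["dinode_i_size"] - fields["bytes_extended"]


fields = parse_extend_error(LOG_LINE)
print("bug expression true:", bug_expression_holds(fields))
print("i_size - dinode i_size:", fields["i_size"] - fields["dinode_i_size"])
print("new_i_size - i_size:", fields["new_i_size"] - fields["i_size"])
```

With the values from the log, the expression evaluates to true, and both differences come out to 4096 bytes. That only restates what the BUG line reports (the two size values disagreed at the time of the write); it does not say why they diverged.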
