No, the kernel is old: more than a year old. Refer to the announcement below.
http://oss.oracle.com/pipermail/ocfs2-announce/2008-July/000026.html

From the stack, it looks like you are encountering the rename/extend race that was fixed a long time ago.
http://oss.oracle.com/projects/ocfs2/news/article_14.html

Peter Selzner wrote:
> * Tao Ma <[EMAIL PROTECTED]> [01.08.08 10:58]
> Hi,
>
> Thanks for your quick reply.
>
> Here are the details:
>
> xxx:/ # SPident
>
> CONCLUSION: System is up-to-date!
>   found SLE-10-i386-SP1 + "online updates"
>
> xxx:/ # uname -r
> 2.6.16.46-0.12-bigsmp
>
> xxx:/ # cat /proc/fs/ocfs2/version
> OCFS2 1.2.5-SLES-r2997 Tue Mar 27 16:33:19 EDT 2007 (build sles)
>
> xxx:/ # debugfs.ocfs2 -V
> debugfs.ocfs2 1.2.3
>
> We have 6 nodes in the cluster and the described behavior (freeze of
> processes in a certain directory) was observed on all 6 nodes. Thanks.
>
>> Hi,
>> Please provide the detailed ocfs2 version info, which may be helpful
>> for diagnosis.
>>
>> Peter Selzner wrote:
>>> Hi,
>>> we had these entries in /var/log/messages a few days ago:
>>>
>>> Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: bug expression: i_size_read(inode) != (le64_to_cpu(fe->i_size) - *bytes_extended)
>>> Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: Inode 8323098 i_size = 1572864, dinode i_size = 1568768, bytes_extended = 0, new_i_size = 1576960
>>> Jul 28 23:30:47 xxx kernel: klogd 1.4.1, ---------- state change ----------
>>> Jul 28 23:30:47 xxx kernel: ------------[ cut here ]------------
>>> Jul 28 23:30:47 xxx kernel: kernel BUG at fs/ocfs2/file.c:790!
>>> Jul 28 23:30:47 xxx kernel: invalid opcode: 0000 [#1]
>>> Jul 28 23:30:47 xxx kernel: SMP
>>> Jul 28 23:30:47 xxx kernel: last sysfs file: /class/infiniband/mthca1/board_id
>>> Jul 28 23:30:47 xxx kernel: Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs cpqci mptctl mptbase ipmi_si ipmi_devintf ipmi_msghandler rdma_ucm rds ib_ucm ib_sdp rdma_cm iw_cm ib_addr ib_local_sa ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad bonding ib_mthca ib_mad ib_core button battery ac raw loop dm_round_robin dm_multipath dm_mod usbhid hw_random ide_cd uhci_hcd e1000 cdrom ehci_hcd bnx2 usbcore ext3 jbd ata_piix ahci libata edd fan thermal processor cciss sg qla2400 qla2300 qla2xxx firmware_class qla2xxx_conf intermodule piix sd_mod scsi_mod ide_disk ide_core
>>> Jul 28 23:30:47 xxx kernel: CPU: 2
>>> Jul 28 23:30:47 xxx kernel: EIP: 0060:[<f9de8173>] Tainted: P U VLI
>>> Jul 28 23:30:47 xxx kernel: EFLAGS: 00210292 (2.6.16.46-0.12-bigsmp #1)
>>> Jul 28 23:30:47 xxx kernel: EIP is at ocfs2_extend_file+0x3cd/0xf9b [ocfs2]
>>> Jul 28 23:30:47 xxx kernel: eax: 0000008c ebx: 00000000 ecx: ffffff00 edx: 00200286
>>> Jul 28 23:30:47 xxx kernel: esi: 00000000 edi: 00000000 ebp: df05f000 esp: e398de70
>>> Jul 28 23:30:47 xxx kernel: ds: 007b es: 007b ss: 0068
>>> Jul 28 23:30:47 xxx kernel: Process mv (pid: 12268, threadinfo=e398c000 task=f7f80660)
>>> Jul 28 23:30:47 xxx kernel: Stack: <0>00000000 dd4f9d88 ce48c000 00000000 00000000 00000001 cf253280 dd4f9b80
>>> Jul 28 23:30:47 xxx kernel: dd4f9ee4 0017f000 00000000 00000000 f9ddf432 e398dea8 dd4f9b80 00000000
>>> Jul 28 23:30:47 xxx kernel: 00000001 e398deb4 e398deb4 ce48c000 00000000 00000000 ece0bc00 00000000
>>> Jul 28 23:30:47 xxx kernel: Call Trace:
>>> Jul 28 23:30:47 xxx kernel: [<f9ddf432>] ocfs2_status_completion_cb+0x0/0xa [ocfs2]
>>> Jul 28 23:30:47 xxx kernel: [<f9df72f2>] ocfs2_write_lock_maybe_extend+0xb2f/0xde3 [ocfs2]
>>> Jul 28 23:30:47 xxx kernel: [<f9dea85d>] ocfs2_file_write+0x125/0x24d [ocfs2]
>>> Jul 28 23:30:47 xxx kernel: [<f9dea738>] ocfs2_file_write+0x0/0x24d [ocfs2]
>>> Jul 28 23:30:47 xxx kernel: [<c0164714>] vfs_write+0xaa/0x152
>>> Jul 28 23:30:47 xxx kernel: [<c0164d1f>] sys_write+0x3c/0x63
>>> Jul 28 23:30:47 xxx kernel: [<c0103cab>] sysenter_past_esp+0x54/0x79
>>> Jul 28 23:30:47 xxx kernel: Code: 8b 4c 24 3c ff 71 04 ff 31 68 16 03 00 00 68 2b b5 e0 f9 ff 70 10 8b 00 ff b0 c0 00 00 00 68 b1 fd e0 f9 e8 ca a8 33 c6 83 c4 3c <0f> 0b 16 03 db fb e0 f9 8b 5c 24 20 8b 03 0f ae e8 89 f6 8b 74
>>>
>>> It was impossible to do "ls -al" in a certain directory: each process
>>> that "touched" files in this directory ended up in DEAD state
>>> (uninterruptible sleep).
>>> Any suggestions? Thanks.
>>>
>> How does this happen, and could you please explain it in more detail?
>> E.g., how many nodes are in your cluster? You hang on one node; how
>> about the other nodes, and what are you doing on them?
>>
>> Regards,
>> Tao
>>
>
> Best regards,
> Peter Selzner
>
> --
> | Peter Selzner                   mail: [EMAIL PROTECTED]      |
> | Kommunales Rechenzentrum (KRZ)  tel: +49 (0)5261-252-273    |
> | Minden-Ravensberg / Lippe       fax: +49 (0)5261-932-273    |

_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
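
For anyone reading the log in this thread, the "bug expression" line can be checked by hand against the numbers printed right after it. The following is only a minimal sketch in Python, not part of ocfs2: the helper names parse_sizes and bug_expression are made up for illustration, and the only input is the error line quoted in the report. It assumes that "i_size" is the size held in the in-memory inode and "dinode i_size" the size recorded in the on-disk inode, which is what the logged expression suggests.

```python
import re

# Error line copied from the report in this thread.
LOG_LINE = (
    "Inode 8323098 i_size = 1572864, dinode i_size = 1568768, "
    "bytes_extended = 0, new_i_size = 1576960"
)


def parse_sizes(line):
    """Extract the numeric fields of the ocfs2_extend_file error line.

    Hypothetical helper for illustration; not an ocfs2 tool.
    """
    pattern = (
        r"Inode (?P<inode>\d+) i_size = (?P<i_size>\d+), "
        r"dinode i_size = (?P<dinode_i_size>\d+), "
        r"bytes_extended = (?P<bytes_extended>\d+), "
        r"new_i_size = (?P<new_i_size>\d+)"
    )
    match = re.search(pattern, line)
    if match is None:
        raise ValueError("line does not look like the ocfs2_extend_file error")
    return {name: int(value) for name, value in match.groupdict().items()}


def bug_expression(fields):
    """Mirror the logged check: i_size != dinode i_size - bytes_extended."""
    return fields["i_size"] != fields["dinode_i_size"] - fields["bytes_extended"]


fields = parse_sizes(LOG_LINE)
print(bug_expression(fields))                      # True, so the BUG fires
print(fields["i_size"] - fields["dinode_i_size"])  # 4096
```

With these values the two sizes differ by 4096 bytes while bytes_extended is 0, so the expression is true and the kernel hits the BUG at fs/ocfs2/file.c:790.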
