The ocfs2 version should be the same on all nodes. Mixing nodes
running 1.1.8 and 1.2.1 will cause problems; we fixed a lot of issues
in 1.2.1. I'll write more once I have reread your previous email.
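
A quick way to check for a version mix is to collect the modinfo ocfs2
output from each node and compare the version fields. The following is
only a rough sketch in Python, not an Oracle or SUSE tool: the function
names and the node labels (node1 to node5) are made up for illustration,
and it assumes the "version:" line has the form seen in the output quoted
below (for example "1.2.1-SLES <checksum>").

```python
import re
from collections import defaultdict


def ocfs2_version(modinfo_text):
    """Return the numeric version from the 'version:' line, or None if absent."""
    # Anchored at line start, so the 'vermagic:' line is not matched.
    match = re.search(r"^version:\s+(\S+)", modinfo_text, re.MULTILINE)
    if match is None:
        return None
    # Assumed form "1.2.1-SLES": keep only the part before the first hyphen.
    return match.group(1).split("-")[0]


def group_nodes_by_version(modinfo_by_node):
    """Map each version string to the list of node labels reporting it."""
    groups = defaultdict(list)
    for node, text in modinfo_by_node.items():
        groups[ocfs2_version(text)].append(node)
    return dict(groups)


def versions_consistent(modinfo_by_node):
    """True when every node reports the same ocfs2 module version."""
    return len(group_nodes_by_version(modinfo_by_node)) <= 1


# Hypothetical node labels; the version lines are copied from the
# modinfo output quoted in this thread.
NEW = "author:         Oracle\nversion:        1.2.1-SLES AC2C92855997647E2A862F0\n"
OLD = "author:         Oracle\nversion:        1.1.8-SLES E9BF6AA66857FAE88EF441B\n"
SAMPLE = {"node1": NEW, "node2": NEW, "node3": NEW, "node4": OLD, "node5": OLD}

if not versions_consistent(SAMPLE):
    for version, nodes in group_nodes_by_version(SAMPLE).items():
        print(version, "->", ", ".join(nodes))
```

With the five outputs quoted below, this groups three nodes under 1.2.1
and two under 1.1.8, which is the mix that should be removed.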

Vladan Gunjic wrote:
I'm using ocfs2 and all modules as shipped by SUSE (SLES9); nothing is self-compiled.
Here are the details:

* 32-bit machine (writes to the ocfs2 partition/LUN; this is where the corruption was reported):
Kernel: 2.6.5-7.257-bigsmp #1 SMP  i686 i386 GNU/Linux
OCFS2 rpms:     ocfs2console-1.2.1-4.2
                        ocfs2-tools-1.2.1-4.2
o2cb_ctl -V:    o2cb_ctl version 1.2.1
/etc/init.d/o2cb status:
                        Module "configfs": Loaded
                        Filesystem "configfs": Mounted
                        Module "ocfs2_nodemanager": Loaded
                        Module "ocfs2_dlm": Loaded
                        Module "ocfs2_dlmfs": Loaded
                        Filesystem "ocfs2_dlmfs": Mounted
                        Checking cluster dbrac: Online
                        Checking heartbeat: Active
/etc/init.d/ocfs2 status:
                        Configured OCFS2 mountpoints:  /mnt/emcpowera1 mnt/emcpowere1
                        Active OCFS2 mountpoints:  /mnt/emcpowera1 /mnt/emcpowere1

* 2 identical 64-bit machines (which are supposed to use the data after the 32-bit to 64-bit conversion):
Kernel: 2.6.5-7.257-smp #1 SMP x86_64 GNU/Linux
OCFS2 rpms:     ocfs2console-1.2.1-4.2
                        ocfs2-tools-1.2.1-4.2
o2cb_ctl -V:    o2cb_ctl version 1.2.1
/etc/init.d/o2cb status:
                        Module "configfs": Loaded
                        Filesystem "configfs": Mounted
                        Module "ocfs2_nodemanager": Loaded
                        Module "ocfs2_dlm": Loaded
                        Module "ocfs2_dlmfs": Loaded
                        Filesystem "ocfs2_dlmfs": Mounted
                        Checking cluster dbrac: Online
                        Checking heartbeat: Active
/etc/init.d/ocfs2 status:
                        Configured OCFS2 mountpoints:  /mnt/emcpowerd1
                        Active OCFS2 mountpoints:  /mnt/emcpowerd1
(The other 2 64-bit machines have the other LUN from the 32-bit machine mounted.)

modinfo on all 5 machines:

1. (32-bit)
license:        GPL
author:         Oracle
version:        1.2.1-SLES AC2C92855997647E2A862F0
description:    OCFS2 1.2.1-SLES Thu Apr 20 18:03:18 PDT 2006 (build sles)
depends:        ocfs2_nodemanager,ocfs2_dlm,jbd
supported:      yes
vermagic:       2.6.5-7.257-bigsmp SMP PENTIUMII REGPARM gcc-3.3


========== The next 2 machines mount the LUN that was corrupted (they will form one Oracle RAC):
2. (64-bit)
license:        GPL
author:         Oracle
version:        1.2.1-SLES AC2C92855997647E2A862F0
description:    OCFS2 1.2.1-SLES Thu Apr 20 18:03:18 PDT 2006 (build sles)
depends:        ocfs2_nodemanager,ocfs2_dlm,jbd
supported:      yes
vermagic:       2.6.5-7.257-smp SMP gcc-3.3

3. (64-bit)
license:        GPL
author:         Oracle
version:        1.2.1-SLES AC2C92855997647E2A862F0
description:    OCFS2 1.2.1-SLES Thu Apr 20 18:03:18 PDT 2006 (build sles)
depends:        ocfs2_nodemanager,ocfs2_dlm,jbd
supported:      yes
vermagic:       2.6.5-7.257-smp SMP gcc-3.3

========== The next 2 machines mount the LUN that was NOT corrupted (they will form another Oracle RAC):
4. (64-bit)
license:        GPL
author:         Oracle
version:        1.1.8-SLES E9BF6AA66857FAE88EF441B
description:    OCFS2 1.1.8-SLES Tue Dec 13 18:20:37 PST 2005 (build sles)
depends:        ocfs2_nodemanager,ocfs2_dlm,jbd
supported:      yes
vermagic:       2.6.5-7.252-smp SMP gcc-3.3

5. (64-bit)
license:        GPL
author:         Oracle
version:        1.1.8-SLES E9BF6AA66857FAE88EF441B
description:    OCFS2 1.1.8-SLES Tue Dec 13 18:20:37 PST 2005 (build sles)
depends:        ocfs2_nodemanager,ocfs2_dlm,jbd
supported:      yes
vermagic:       2.6.5-7.252-smp SMP gcc-3.3

Additionally, I noticed last night that when I briefly disabled the entire
network for all of those machines, the last two machines (with the older ocfs2
version) were confused after the network was restored and did not rejoin the
cluster until they were rebooted.

So I guess the first step is to update the last two machines to ocfs2 version 1.2.1?
Although they were not directly involved in the corruption, could they have caused it indirectly?

Thanks,
Vladan


-----Original Message-----
From: Sunil Mushran [mailto:[EMAIL PROTECTED]
Sent: Tuesday, 1 August 2006 04:29
To: Vladan Gunjic
Cc: ocfs2-users@oss.oracle.com
Subject: Re: [Ocfs2-users] ocfs2_search_chain: Group Descriptor has bad signature

What version of ocfs2 is on the nodes? Run modinfo ocfs2 on all nodes.

The version of OCFS2 shipped with SLES9 SP3 varies with the kernel.
Are you using the modules shipped by SUSE or building them yourself?


