All,

 

We are having problems getting OCFS2 to run. We are using kernel 2.6.15 with OCFS2 1.2.1. Compiling the OCFS2 sources went fine, and all modules load without errors.

 

However, we can only mount the OCFS2 volume on one machine at a time. When we try to mount the volume on the other two machines, we get an error stating that another node is heartbeating in our slot. During those mount attempts, nothing new appears in the dmesg output of the first machine, which has the volume mounted; there is not even a message about the other nodes joining the cluster.

 

The cluster.conf is the same on all 3 nodes:

cluster:
      node_count = 3
      name = ocfs2

node:
      ip_port = 7777
      ip_address = 172.28.100.27
      number = 1
      name = tilmysql1
      cluster = ocfs2

node:
      ip_port = 7777
      ip_address = 172.28.100.28
      number = 2
      name = tilmysql2
      cluster = ocfs2

node:
      ip_port = 7777
      ip_address = 172.28.100.29
      number = 3
      name = tilmysql3
      cluster = ocfs2
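As a rough cross-check of the configuration above, the following Python sketch parses the stanzas and looks for obvious inconsistencies: a node_count that disagrees with the number of node stanzas, and duplicate node numbers, names, or IP addresses. The function names `parse_cluster_conf` and `check_cluster_conf` are made up for this illustration and are not part of ocfs2-tools. The optional hostname comparison assumes that O2CB expects each node's name to match that machine's hostname, which is worth verifying against the OCFS2 documentation.

```python
def parse_cluster_conf(text):
    """Parse "cluster:" / "node:" stanzas made of "key = value" lines."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.endswith(":") and "=" not in line:
            # A stanza header such as "node:" starts a new section.
            stanzas.append((line[:-1], {}))
        elif "=" in line and stanzas:
            key, value = line.split("=", 1)
            stanzas[-1][1][key.strip()] = value.strip()
    return stanzas


def check_cluster_conf(text, local_hostname=None):
    """Return a list of human-readable problems (empty if none were found)."""
    stanzas = parse_cluster_conf(text)
    clusters = [fields for kind, fields in stanzas if kind == "cluster"]
    nodes = [fields for kind, fields in stanzas if kind == "node"]
    problems = []

    if len(clusters) != 1:
        problems.append(f"expected one cluster stanza, found {len(clusters)}")
    else:
        declared = int(clusters[0].get("node_count", -1))
        if declared != len(nodes):
            problems.append(
                f"node_count = {declared} but {len(nodes)} node stanzas found"
            )

    # Each node needs its own number, name, and address.
    for field in ("number", "name", "ip_address"):
        values = [node.get(field) for node in nodes]
        if len(set(values)) != len(values):
            problems.append(f"duplicate {field} values: {values}")

    # Assumption: the node name should equal the machine's hostname.
    if local_hostname is not None:
        if local_hostname not in [node.get("name") for node in nodes]:
            problems.append(f"hostname {local_hostname!r} is not a node name")

    return problems
```

Run on each machine with the file contents and the output of `socket.gethostname()` as `local_hostname`, an empty list would suggest the file itself is consistent and that the cause lies elsewhere.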

 

 

Dmesg output:

Mounting FS on node1 succeeds:

OCFS2 1.2.1 Fri May 26 11:27:14 CEST 2006 (build bd2f25ba0af9677db3572e3ccd92f739)
ocfs2_dlm: Nodes in domain ("38F7643CACA64C0A932E3B03419BBC62"): 1
kjournald starting.  Commit interval 5 seconds
ocfs2: Mounting device (8,17) on (node 1, slot 0)

 

Mounting FS on node2 fails when node1 has FS mounted:

(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4159,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(3257,0):o2net_connect_expired:1444 ERROR: no connection established with node 1 after 10 seconds, giving up and returning errors.
(4157,1):dlm_request_join:786 ERROR: status = -107
(4157,1):dlm_try_to_join_domain:934 ERROR: status = -107
(4157,1):dlm_join_domain:1186 ERROR: status = -107
(4157,1):dlm_register_domain:1379 ERROR: status = -107
(4157,1):ocfs2_dlm_init:1996 ERROR: status = -107
(4157,1):ocfs2_mount_volume:1062 ERROR: status = -107
ocfs2: Unmounting device (8,17) on (node 2)
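The "status = -107" values in the log above look like negated kernel errno codes. On Linux, errno 107 is ENOTCONN ("Transport endpoint is not connected"), which would fit the preceding o2net message about no connection being established with node 1. The Python sketch below pulls the status lines out of a dmesg excerpt and maps them to errno names. The helper name `decode_status_lines` and the regular expression are assumptions made for this example, based only on the line format shown here.

```python
import errno
import re

# Matches e.g. "(4157,1):dlm_request_join:786 ERROR: status = -107"
STATUS_RE = re.compile(r"\):(\w+):(\d+) ERROR: status = (-?\d+)")


def decode_status_lines(log_text):
    """Return (function, status, errno_name) for each "status = N" line."""
    results = []
    for match in STATUS_RE.finditer(log_text):
        function, _source_line, status = match.groups()
        code = abs(int(status))
        # errno.errorcode reflects the platform running this script,
        # so the mapping is only meaningful when run on Linux.
        name = errno.errorcode.get(code, "unknown")
        results.append((function, int(status), name))
    return results
```

Lines that do not carry a status value, such as the heartbeat errors, are skipped by the pattern.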

 

Mounting FS on node3 fails when node1 has FS mounted:

(4340,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(4340,0):o2hb_do_disk_heartbeat:962 ERROR: Device "sdb1": another node is heartbeating in our slot!
(3363,0):o2net_connect_expired:1444 ERROR: no connection established with node 1 after 10 seconds, giving up and returning errors.
(4338,0):dlm_request_join:786 ERROR: status = -107
(4338,0):dlm_try_to_join_domain:934 ERROR: status = -107
(4338,0):dlm_join_domain:1186 ERROR: status = -107
(4338,1):dlm_register_domain:1379 ERROR: status = -107
(4338,1):ocfs2_dlm_init:1996 ERROR: status = -107
(4338,1):ocfs2_mount_volume:1062 ERROR: status = -107
ocfs2: Unmounting device (8,17) on (node 3)
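Since both failing nodes report that no connection could be established with node 1, one thing that could be tested from each machine is whether the ip_port from cluster.conf is reachable over TCP at all. The Python sketch below is only an illustration: `can_connect` and `probe_nodes` are hypothetical helper names, the `NODES` table is copied from the cluster.conf in this message, and a successful connect shows only that something accepts connections on that port, not that the cluster stack is healthy.

```python
import socket

# Node name -> (ip_address, ip_port), copied from the cluster.conf above.
NODES = {
    "tilmysql1": ("172.28.100.27", 7777),
    "tilmysql2": ("172.28.100.28", 7777),
    "tilmysql3": ("172.28.100.29", 7777),
}


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def probe_nodes(nodes, timeout=3.0):
    """Try every node and return a mapping of node name -> reachable?"""
    return {
        name: can_connect(address, port, timeout)
        for name, (address, port) in nodes.items()
    }
```

Calling `probe_nodes(NODES)` on node 2 or node 3 while node 1 has the volume mounted would show whether 172.28.100.27:7777 accepts connections from that machine. A False result there would point toward a firewall, routing, or interface-binding problem rather than the on-disk heartbeat.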

 

In addition, ocfs2-tools 1.2.1 fails to build on Debian Sarge, which we are using. We checked the build dependencies and have these in place:

libglib2.0-dev (>= 2.2.3), libreadline5-dev, comerr-dev, uuid-dev, libblkid-dev (>= 1.36), debhelper (>= 3.0.5)

 

The ocfs2-tools build fails with a linker error while building fsck.ocfs2:

/usr/lib/libc_nonshared.a(elf-init.oS)(.gnu.linkonce.t.__i686.get_pc_thunk.bx+0x0): In function `__i686.get_pc_thunk.bx':
: multiple definition of `__i686.get_pc_thunk.bx'
../libocfs2/libocfs2.a(alloc.o)(.gnu.linkonce.t.__i686.get_pc_thunk.bx+0x0): first defined here
collect2: ld returned 1 exit status
make[2]: *** [fsck.ocfs2] Error 1
make[2]: Leaving directory `/usr/src/ocfs2-tools-1.2.1/fsck.ocfs2'
make[1]: *** [fsck.ocfs2] Error 2
make[1]: Leaving directory `/usr/src/ocfs2-tools-1.2.1'
make: *** [build-stamp] Error 2
debuild: fatal error at line 1219:
debian/rules build failed

 

Because we cannot build ocfs2-tools 1.2.1, we are currently using the Debian packages of ocfs2-tools, which are version 1.1.5. Could the outdated ocfs2-tools be causing the 'another node is heartbeating in our slot' errors?




HCN, Hét Callcenter Netwerk B.V

Stationsstraat 19

Postbus 680, 5000 AR Tilburg

Sjon Stigter
System Engineer

Tel: 013 - 464 99 99

Mobile: 06 53 86 16 87

Fax: 013 - 464 99 90

Email: [EMAIL PROTECTED]

Internet: http://www.hcn.nl


This message may contain information which is privileged or confidential. If you are not the named addressee of this message please destroy it without reading, using, copying or disclosing its contents to any other person.

_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users
