I'm trying to upgrade the kernel on my OCFS2 cluster from 2.6.18 to 2.6.22.
I've downloaded and compiled the source for kernel 2.6.22.14 on each machine
and rebooted them all into the new kernel. Two of the machines are x86_64
(Intel Core 2); the third is an older Xeon/i686 machine. The cluster worked
fine under 2.6.18.8. With the new kernel, the first machine mounts the
filesystem correctly, but when the second machine tries to mount it, I get
the following error in /var/log/messages and in the dmesg output on the
first machine:
Dec 4 15:19:51 vmdevel2 kernel: o2net: accepted connection from node vmdevel0 (num 2) at 192.168.30.20:7777
Dec 4 15:19:55 vmdevel2 kernel: ocfs2_dlm: Node 2 joins domain 9F27F6D72A954780B76A4FD9CD97A0AF
Dec 4 15:19:55 vmdevel2 kernel: ocfs2_dlm: Nodes in domain ("9F27F6D72A954780B76A4FD9CD97A0AF"): 1 2
Dec 4 15:19:55 vmdevel2 kernel: (31952,0):ocfs2_process_vote:206 ERROR: message to node 2 fails with error -75!
On the second machine, the one trying to join the cluster and mount the
filesystem, this is the output:
Dec 4 15:18:11 vmdevel0 kernel: o2net: connected to node vmdevel2 (num 1) at 192.168.30.22:7777
Dec 4 15:18:15 vmdevel0 kernel: OCFS2 1.3.3
Dec 4 15:18:15 vmdevel0 kernel: ocfs2_dlm: Nodes in domain ("9F27F6D72A954780B76A4FD9CD97A0AF"): 1 2
Dec 4 15:18:15 vmdevel0 kernel: kjournald starting. Commit interval 5 seconds
The second machine seems to believe it has joined successfully, but then it
just hangs. On one attempt it actually started to heartbeat, but then locked
up hard and I had to power-cycle it to get it back. Does anyone have any clues
on where I should start looking? Again, the kernel is vanilla 2.6.22.14 and
the tools version is 1.2.2.
Thanks,
Nick