Re: [Ocfs2-users] Unstable Cluster

2011-12-21 Thread Tony Rios
So just as an update on this issue, it turns out that I had the Ethernet interface used for iSCSI traffic as the same interface for handling the OCFS2 cluster and it just couldn't keep up. Raising the timeouts and moving to separate GigE ports helped tremendously. The next part that seems to ha

Re: [Ocfs2-users] Unstable Cluster

2011-12-09 Thread Tony Rios
Thanks for the info Werner. I already have the heartbeat increased to 61, that helped tremendously with reboot issues in the beginning. I'll go ahead and increase the others to see if that helps. I can't imagine why it should be so troublesome to keep this cluster stable. I am seeing output drops
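A rough sketch of what a heartbeat value of 61 would mean, assuming it refers to the O2CB_HEARTBEAT_THRESHOLD setting and assuming the commonly documented relation timeout = (threshold - 1) * 2 seconds. The function names here are hypothetical and only illustrate the arithmetic; they are not part of any OCFS2 tool.

```python
# Sketch only: assumes the heartbeat "61" above is O2CB_HEARTBEAT_THRESHOLD
# and that one heartbeat iteration takes 2 seconds (an assumption taken
# from the commonly documented OCFS2 relation, not from this thread).

def heartbeat_timeout_seconds(threshold: int, interval_s: int = 2) -> int:
    """Seconds a node may miss disk heartbeats before it is fenced."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * interval_s


def threshold_for_timeout(timeout_s: int, interval_s: int = 2) -> int:
    """Inverse of the above: threshold needed for a desired timeout."""
    if timeout_s <= 0:
        raise ValueError("timeout must be positive")
    return timeout_s // interval_s + 1


if __name__ == "__main__":
    # Under these assumptions, a threshold of 61 gives a 120 second timeout.
    print(heartbeat_timeout_seconds(61))
    print(threshold_for_timeout(120))
```

Under these assumptions, raising the threshold only buys more time before a slow node is fenced; it does not remove the underlying I/O or network delay.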

Re: [Ocfs2-users] Unstable Cluster

2011-12-09 Thread Werner Flamme
Tony Rios [09.12.2011 09:38]: > To add to my previous message, > > After some time waiting, I try to bring the mount point up again, I > get these messages, appearing like it is going to work. > > [ 127.520176] (mount.ocfs2,1388,0):dlm_join_doma

Re: [Ocfs2-users] Unstable Cluster

2011-12-09 Thread Brian Kroth
Sérgio Surkamp 2011-12-09 11:17: Hi. Why are you using OCFS2 version 1.5.0 in production? As far as I know, the 1.5 series is for developers only. I think that's just the version tag line they give on the mainline kernel. It's not just for developers; it just may not be as well supported by

Re: [Ocfs2-users] Unstable Cluster

2011-12-09 Thread Sérgio Surkamp
Hi. Why are you using OCFS2 version 1.5.0 in production? As far as I know, the 1.5 series is for developers only. Regards, Sérgio On Fri, 9 Dec 2011 00:42:25 -0800 Tony Rios wrote: > I managed to get ahold of the kernel panic message because it's > happening on any new machines I try to intr

[Ocfs2-users] Unstable Cluster

2011-12-09 Thread Tony Rios
I managed to get ahold of the kernel panic message because it's happening on any new machines I try to introduce to the cluster: [ 66.276054] OCFS2 1.5.0 [ 66.337531] o2dlm: Nodes in domain A3AA504BE42E4D3D8A15248D8FCD49BB: 3 5 [ 66.380092] ocfs2: Mounting device (8,16) on (node 5, slot 2)

[Ocfs2-users] Unstable Cluster

2011-12-09 Thread Tony Rios
To add to my previous message, After some time waiting, I try to bring the mount point up again, I get these messages, appearing like it is going to work. [ 127.520176] (mount.ocfs2,1388,0):dlm_join_domain:1857 Timed out joining dlm domain A3AA504BE42E4D3D8A15248D8FCD49BB after 94000 msecs

[Ocfs2-users] Unstable Cluster

2011-12-09 Thread Tony Rios
Having major OCFS2 blues here... Still having issues maintaining a stable cluster. I've tried isolating the issue by getting an entirely dedicated ethernet switch for the OCFS2 cluster. I've tried shutting off all machines, and slowly bringing them back online. This sort of works. So far I ha

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-05 Thread Sunil Mushran
As I had mentioned earlier, 2.6.18 is missing a lot of fixes. We provide patch fixes for all bugs for all kernels starting from 2.6.20. That's the current cut-off. If you are on 2.6.18, you should look into switching to (rh)el5. That is 2.6.18 based. You could then just use ocfs2 1.2.7-1 packages

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-05 Thread rain c
Hi, as I wrote yesterday, I applied all the patches. Unfortunately, they did not bring the desired results. The same node crashed again today with very similar messages. I also attached the messages of the other node that stayed alive. Not to forget to mention that in the meantime I switched the

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-04 Thread Mark Fasheh
On Tue, Dec 04, 2007 at 09:56:23AM -0500, Randy Ramsdell wrote: > I thought about sending this directly to you, but it is related to the > ocfs2-users group, because I regularly have to compile the ocfs2 drivers > and we do not upgrade our kernels for this. Each time I only have to patch > a ver

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-04 Thread rain c
Hi, first of all thank you very much for providing the patches to me so fast! On Monday, December 3, 2007 7:18:12 PM Mark Fasheh wrote: > Attached is a pair of patches which applied more cleanly. Basically it > includes another tcp.c fix which the -EAGAIN fix built on top of. Both would > be goo

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-04 Thread Randy Ramsdell
Mark Fasheh wrote: On Mon, Dec 03, 2007 at 02:28:42PM -0500, Randy Ramsdell wrote: And this is the MAIN reason driver upgrades should never be tied to a kernel upgrade. I will never understand why vendors and developers force a complete kernel OS upgrade for a simple driver. It makes absolu

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-03 Thread Mark Fasheh
On Mon, Dec 03, 2007 at 02:28:42PM -0500, Randy Ramsdell wrote: > And this is the MAIN reason driver upgrades should never be tied to a > kernel upgrade. I will never understand why vendors and developers force a > complete kernel OS upgrade for a simple driver. It makes absolutely zero > sense

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-03 Thread Randy Ramsdell
rain c wrote: Hi, thanks very much for your answer. My problem is that I cannot really use kernel 2.6.22, because I also need the openVZ patch which is not available in a stable version for 2.6.22. Is there a way to backport ocfs2-Retry-if-it-returns-EAGAIN to 2.6.18? Further I wonder why on

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-03 Thread Mark Fasheh
On Mon, Dec 03, 2007 at 04:45:01AM -0800, rain c wrote: > thanks very much for your answer. > My problem is that I cannot really use kernel 2.6.22, because I also need > the openVZ patch which is not available in a stable version for 2.6.22. Is > there a way to backport ocfs2-Retry-if-it-returns-E

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-03 Thread Sunil Mushran
It's not that the patch will not work with 2.6.18; it's that it may not apply cleanly. Play around with it. You have access to the patch as well as the kernel. rain c wrote: Hi, thanks very much for your answer. My problem is that I cannot really use kernel 2.6.22, because I also need the openVZ p

Re: [Ocfs2-users] Unstable Cluster Node

2007-12-03 Thread rain c
Hi, thanks very much for your answer. My problem is that I cannot really use kernel 2.6.22, because I also need the openVZ patch which is not available in a stable version for 2.6.22. Is there a way to backport ocfs2-Retry-if-it-returns-EAGAIN to 2.6.18? Further I wonder why only one (and alwa

Re: [Ocfs2-users] Unstable Cluster Node

2007-11-30 Thread Mark Fasheh
On Fri, Nov 30, 2007 at 03:25:27AM -0800, rain c wrote: > uname -a > Linux webhost1 2.6.18-028stab039 #2 SMP Tue Aug 21 17:49:05 UTC 2007 i686 > GNU/Linux > > Both nodes are in the same bladecenter and directly connected with 1Gbit/s by > the bladecenter's internal ethernet switch. > > One of the

[Ocfs2-users] Unstable Cluster Node

2007-11-30 Thread rain c
Hi, I have a 2-Node OCFS2 Cluster on top of DRBD 8.0.4. The kernel version I use is: uname -a Linux webhost1 2.6.18-028stab039 #2 SMP Tue Aug 21 17:49:05 UTC 2007 i686 GNU/Linux Both nodes are in the same bladecenter and directly connected with 1Gbit/s by the bladecenter's internal ethernet swi