Re: [Linux-cluster] Instability troubles

2008-01-11 Thread James Chamberlain
I have since switched all the nodes to use truly static addressing, and have not had a problem in the intervening week. I have not yet tried the "" trick that Lon mentioned, but I'm keeping that handy should problems crop up again. I take it back - just had another problem. I'm adding the

Re: [Linux-cluster] Fence don't work with HP iLo2 (fw.1.42

2008-01-11 Thread Lon Hohberger
On Fri, 2008-01-11 at 22:56 +0200, m.. mm.. wrote: > Hi > We are having some problems to get fencing working with this hardware. > > Proliant 360 G5.. iLO firmware 1.42 > and the OS is RedHat 5 -> 2.6.18-8.el5 Cluster/GFS > > This script: fence_ilo > Fence shutdown works fine, but it doesn't restart it.

[Linux-cluster] Fence don't work with HP iLo2 (fw.1.42

2008-01-11 Thread m.. mm..
Hi, we are having some problems getting fencing working with this hardware. Proliant 360 G5.. iLO firmware 1.42, and the OS is RedHat 5 -> 2.6.18-8.el5 Cluster/GFS. This script: fence_ilo. Fence shutdown works fine, but it doesn't restart it. The agent "fence_ilo" reports: failed to turn on. Have someone working
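When an agent powers a node off but not back on, it can help to run it by hand, outside of fenced, to isolate the power-on step. A minimal sketch of the relevant cluster.conf fragment; the device name, hostname and credentials below are placeholders, not taken from the original report, and attribute names can vary between agent versions:

```xml
<!-- Hypothetical fence device wiring for fence_ilo; adjust names/credentials. -->
<fencedevices>
  <fencedevice agent="fence_ilo" name="ilo-node1"
               hostname="ilo-node1.example.com"
               login="Administrator" passwd="secret"/>
</fencedevices>
```

Then invoking the agent manually with an explicit action (e.g. status, then reboot) shows whether the power-on itself fails independently of the cluster stack; check the agent's man page for the exact option names on your version.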

[Linux-cluster] Graceful recover after connectivity failure

2008-01-11 Thread Cliff Hones
I am using CentOS 5.1 with GNBD and GNBD fencing. Following the failure of a cluster member - e.g. a temporary loss of connectivity - which results in the node being fenced, is there a clean way to re-join the cluster without having to reboot the affected node? I am finding that it is impossible to
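For reference, a rough rejoin sequence on the fenced node might look like the following. This is an untested sketch: the server and mount names are placeholders, the exact service list depends on the configuration, and whether a rebootless rejoin is safe depends on how the node was fenced:

```sh
# On the fenced node, after connectivity is restored (hypothetical sketch):
service cman start                   # rejoin cluster membership
service qdiskd start                 # only if a quorum disk is in use
gnbd_import -i gnbd-server           # re-import GNBD devices from the server
service clvmd start                  # re-activate clustered LVM, if used
mount /dev/mapper/vg0-gfs /mnt/gfs   # re-mount GFS filesystems
```

If the GNBD server still holds the fence against the node, the imports will fail until the fence is lifted on the server side.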

Re: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread Paul n McDowell
Well, in my case that was fairly easy. We have hardware-mirrored system disks (/, /usr, /var, /root, /opt), so prior to performing my upgrades, I migrated any services that were running on that node, quiesced the system and then broke the mirror. I then brought the system back up with al

Re: [Linux-cluster] CS5 / something weird during tests

2008-01-11 Thread Lon Hohberger
On Fri, 2008-01-11 at 15:19 +0100, Alain Moulle wrote: > Hi > > On my two-nodes cluster with qdiskd : > when testing CS5 via a ifdown eth0 on node2(where is the heart-beat) > I have a strange behavior : the node2 is rebooted and service > is failovered by node1, fine. But after the reboot of node2

Re: [Linux-cluster] RE: scsi reservation

2008-01-11 Thread Ryan O'Hara
Did you pull the source from CVS or did you grab one of the tar.gz files? Ryan Alexandre Racine wrote: You are right, that package was not installed. So now I installed the package and recompiled "fence", but "fence_scsi" is still not there in /sbin/. Any more ideas? (Thanks for the first hi

Re: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread Paul n McDowell
I did a rolling upgrade of a 5-node GFS environment from 4.5 to 4.6 over a week and had no interoperability issues. I made sure that I had a solid roll-back plan before I upgraded each node, just in case.

Re: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread gordan
I'm curious - please, do tell about the solid rollback plan. Something like the stackable wayback file system for root? On Fri, 11 Jan 2008, Paul n McDowell wrote: I did a rolling upgrade of a 5 node GFS environment from 4.5 to 4.6 over a week and had no interoperability issues. I made sure t

Re: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread chris barry
On Fri, 2008-01-11 at 10:36 -0500, [EMAIL PROTECTED] wrote: > Has anyone migrated from an existing 4.5 to the newer 4.6 cluster > suite? We would like to roll out 4.6 in our 11-node cluster, but only > one node at a time, over the course of 2 weeks. Do the two versions > intermix well enough to

RE: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread gordan
I have done similar, but again, all nodes at once (due to the fact that they all had all the filesystems including root shared via GFS). My upgrade was 5.0->5.1, though. Gordan On Fri, 11 Jan 2008, Steffen Plotner wrote: We have migrated from 4.5 to 4.6 without any problems, however all node

Re: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread Janne Peltonen
On Fri, Jan 11, 2008 at 04:13:38PM +, [EMAIL PROTECTED] wrote: > I have done similar, but again, all nodes at once (due to the fact that > they all had all the filesystems including root shared via GFS). My > upgrade was 5.0->5.1, though. I've done two successful upgrades 5.0-5.1 as a 'rolli

RE: [Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread Steffen Plotner
We have migrated from 4.5 to 4.6 without any problems, however all nodes at once - I would not recommend different versions on nodes within the same cluster. Others might have other ideas?

[Linux-cluster] RHEL 4.5 -> 4.6 migration

2008-01-11 Thread rhurst
Has anyone migrated from an existing 4.5 to the newer 4.6 cluster suite? We would like to roll out 4.6 in our 11-node cluster, but only one node at a time, over the course of 2 weeks. Do the two versions intermix well enough to do that? Or do we need to take some kind of special care or pre
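The replies above describe rolling, node-at-a-time upgrades. As a rough per-node sketch of what that involves (service, node and mount names here are placeholders, and the package tool depends on whether this is RHEL or CentOS):

```sh
# Hypothetical per-node rolling-upgrade sequence:
clusvcadm -r webby -m node2   # relocate services off the node being upgraded
service rgmanager stop
umount /mnt/gfs               # unmount GFS filesystems
service gfs stop
service clvmd stop
service cman stop             # leave the cluster cleanly
yum update                    # or up2date on RHEL; apply the 4.6 packages
reboot                        # come back up on the new kernel/cluster stack
```

As the thread notes, some posters ran mixed 4.5/4.6 nodes for days without interoperability problems, while others preferred upgrading all nodes at once; a tested roll-back plan per node is the common advice either way.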

Re: [Linux-cluster] cluster down network

2008-01-11 Thread Patrick Caulfeld
sahai srichock wrote: > I have a two-node cluster. > > > When I restart the network on node2: > service network shutdown > service network start. > > and then > > on node2 > I can't start cman. Is cman running when you take the network down? You should shut down cman before removing the netwo
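The advice above is about ordering: stop the cluster stack before taking the interface down, then bring it back afterwards. A minimal sketch of that order (assuming the RHEL5-style init scripts; rgmanager only applies if it is running on the node):

```sh
# Safer order when restarting networking on a cluster node (sketch):
service rgmanager stop    # stop managed services first, if running
service cman stop         # leave the cluster before the interface goes away
service network restart
service cman start
service rgmanager start
```

Dropping the interface while cman is still up looks to the rest of the cluster like a node failure, which is typically why the node then cannot rejoin cleanly.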

[Linux-cluster] CS5 / something weird during tests

2008-01-11 Thread Alain Moulle
Hi. On my two-node cluster with qdiskd: when testing CS5 via an ifdown eth0 on node2 (where the heartbeat is), I see a strange behavior: node2 is rebooted and the service is failed over to node1, fine. But after the reboot of node2 and the re-launch of the CS daemons, I can't see via clustat any informa

RE: [Linux-cluster] Cluster fails after fencing by DRAC

2008-01-11 Thread Lon Hohberger
On Fri, 2008-01-11 at 12:00 +0100, MARY, Mathieu wrote: > hello, > > sorry to ask but is the "none" state a normal state for services? For cman services, yes. > I have issues with cluster services too and I've been told that this state > is not normal and indicates that the nodes didn't join th

[Linux-cluster] cluster down network

2008-01-11 Thread sahai srichock
I have a two-node cluster. /etc/cluster/cluster.conf

RE: [Linux-cluster] Cluster fails after fencing by DRAC

2008-01-11 Thread MARY, Mathieu
Hello, sorry to ask, but is the "none" state a normal state for services? I have issues with cluster services too, and I've been told that this state is not normal and indicates that the nodes didn't join the fence domain, which is causing issues with rgmanager too. What do clustat and cman_to