[Linux-cluster] GFS reports withdraw from cluster

2008-03-28 Thread Ewald Beekman
We have a RHEL4 two-node cluster running on
HP Integrity servers (rx4640, 4 Itanium CPUs, 32GB RAM each).
GFS is running on top of a SAN (EMC CX380s).

We get these kinds of errors:
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: fatal: invalid metadata block
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0:   bh = 16659526 (type: exp=5, found=4)
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0:   function = gfs_get_meta_buffer
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0:   file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/up/src/gfs/dio.c, line = 1223
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0:   time = 1206652995
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: about to withdraw from the cluster
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: waiting for outstanding I/O
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: telling LM to withdraw
Mar 27 22:23:15 idefix kernel: lock_dlm: withdraw abandoned memory
and then the filesystem is gone :-(

We are up2date with current patch levels:
GFS-6.1.15-1
GFS-kernel-2.6.9-75.12
cman-kernel-2.6.9-53.9
cman-1.0.17-0.el4_6.5

Not sure whether this is a SAN issue (that would be bad, because
a lot of systems depend on the SAN), an OS issue, or something
inside GFS?

Any help is greatly appreciated.

Ewald...

-- 
Ewald Beekman, Security Engineer, Academic Medical Center,
dept. ADICT/AD  Server  en  Netwerkbeheer, The Netherlands
## Your mind-mint is:
Blessed is the man who, having nothing to say, abstains from giving
wordy evidence of the fact.
-- George Eliot



Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite

2008-03-28 Thread Jeff Macfarland
Nice overview. Wish I had this a few weeks ago :-)

I am curious as to why LVM2 is required. With a simple modification of the
scsi_reserve script (and maybe fence_scsi), using an msdos-partitioned disk
seems to work fine.

This is only in testing, but I haven't seen any issues yet.

Ryan O'Hara wrote:
 Attached is the latest version of the "Using SCSI Persistent
 Reservations with Red Hat Cluster Suite" document for review.
 
 Feel free to send questions and comments.
 
 -Ryan


-- 
Jeff Macfarland ([EMAIL PROTECTED])
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net



RE: [Linux-cluster] RHCS / DRBD / MYSQL

2008-03-28 Thread Lon Hohberger
On Fri, 2008-03-28 at 17:24 +1300, Arjuna Christensen wrote:

 === Resource Tree ===
 service {
   name = asteriskcluster;
   domain = asterisk;
   autostart = 1;
   ip {
 address = 192.168.111.1;
 monitor_link = 1;
 script {
   name = drbdcontrol;
   file = /etc/init.d/drbdcontrol;
 fs {
 name = mysqlfs;
 mountpoint = /var/lib/mysql;
 device = /dev/drbd0;
 fstype = ext3;
 force_unmount = 0;
 self_fence = 1;
 fsid = 11607;
 force_fsck = 0;
 options = ;
 script {
   name = asterisk;
   file = /etc/init.d/asterisk;
   }
 script {
   name = mysql;
   file = /etc/init.d/mysql;
   }
   }
 }
   }
 }

This looks fine.  The only thing I can think of is:

 * that your version of rg_test doesn't match clurgmgrd for some reason.
I don't know why that would happen.

 * you have an old version of ccs installed with a new version of
rgmanager.  rg_test doesn't use ccs - so the query it uses to find
children works even if ccs breaks.

I'd pull the most recent snapshot of rgmanager for the version you're
running (e.g. from the RHEL4 or RHEL5 branch; Fabio seemed to think
yours was pretty old).
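
For what it's worth, rg_test itself can show both of these locally; a rough
sketch (assuming the default config path /etc/cluster/cluster.conf):

  # print the resource tree as this build of rg_test parses it
  rg_test test /etc/cluster/cluster.conf

  # list the resource rules (agents and attributes) this build understands,
  # useful for spotting a mismatch with the installed rgmanager
  rg_test rules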

-- Lon



Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite

2008-03-28 Thread Tomasz Sucharzewski
Hello,

Ryan O'Hara wrote:
4  - Limitations
...
- Multipath devices are not currently supported.

What is the reason? Using at least two HBAs in a SAN is practically a
requirement, and that is useless when using SCSI reservations if multipath
devices are not supported.

Regards,
Tomasz Sucharzewski

On Fri, 28 Mar 2008 09:20:53 -0500
Ryan O'Hara [EMAIL PROTECTED] wrote:

 4  - Limitations
 
 In addition to these requirements, fencing by way of SCSI persistent
 reservations also has some limitations.
 
 - Multipath devices are not currently supported.

-- 
Tomasz Sucharzewski [EMAIL PROTECTED]



Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite

2008-03-28 Thread Alex Kompel
On Fri, Mar 28, 2008 at 8:15 AM, Ryan O'Hara [EMAIL PROTECTED] wrote:

 The reason for the cluster LVM2 requirement is for device discovery. The
 scripts use LVM commands to find cluster volumes and then get a list of
 devices that make up those volumes. Consider the alternative -- users
 would have to manually define a list of devices that need
 registrations/reservations. This would have to be defined on each node.
 What makes this even more problematic is that each node may have
 different device names for shared storage devices (i.e. what may be
 /dev/sdb on one node may be /dev/sdc on another). Furthermore, those
 device names could change between reboots. The general solution is to
 query clvmd for a list of cluster volumes and get a list of devices for
 those volumes.

You can also use symbolic links under /dev/disk/by-id/ which are
persistent across nodes/reboots.

-Alex
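
To illustrate the two approaches side by side -- LVM-based discovery versus
persistent by-id names -- here is a rough sketch; the volume group name is
hypothetical:

  # devices backing a (hypothetical) clustered volume group, as the
  # scsi_reserve scripts would discover them via LVM
  vgs --noheadings -o pv_name vg_cluster

  # persistent names for the same LUNs; these links resolve to the same
  # disks on every node regardless of sdX ordering
  ls -l /dev/disk/by-id/
  readlink -f /dev/disk/by-id/scsi-*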



Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite

2008-03-28 Thread Jeff Macfarland
True. Any solution for auto-discovery? I've no problem with statically
defining a device, but that would be pretty nice if possible.

Alex Kompel wrote:
 On Fri, Mar 28, 2008 at 8:15 AM, Ryan O'Hara [EMAIL PROTECTED] wrote:
 The reason for the cluster LVM2 requirement is for device discovery. The
 scripts use LVM commands to find cluster volumes and then get a list of
 devices that make up those volumes. Consider the alternative -- users
 would have to manually define a list of devices that need
 registrations/reservations. This would have to be defined on each node.
 What makes this even more problematic is that each node may have
 different device names for shared storage devices (i.e. what may be
 /dev/sdb on one node may be /dev/sdc on another). Furthermore, those
 device names could change between reboots. The general solution is to
 query clvmd for a list of cluster volumes and get a list of devices for
 those volumes.
 
 You can also use symbolic links under /dev/disk/by-id/ which are
 persistent across nodes/reboots.
 
 -Alex
 

-- 
Jeff Macfarland ([EMAIL PROTECTED])
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net



RE: [Linux-cluster] RHCS / DRBD / MYSQL

2008-03-28 Thread Fabio M. Di Nitto

On Fri, 28 Mar 2008, Lon Hohberger wrote:


On Fri, 2008-03-28 at 17:24 +1300, Arjuna Christensen wrote:


=== Resource Tree ===
service {
  name = asteriskcluster;
  domain = asterisk;
  autostart = 1;
  ip {
address = 192.168.111.1;
monitor_link = 1;
script {
  name = drbdcontrol;
  file = /etc/init.d/drbdcontrol;
fs {
name = mysqlfs;
mountpoint = /var/lib/mysql;
device = /dev/drbd0;
fstype = ext3;
force_unmount = 0;
self_fence = 1;
fsid = 11607;
force_fsck = 0;
options = ;
script {
  name = asterisk;
  file = /etc/init.d/asterisk;
  }
script {
  name = mysql;
  file = /etc/init.d/mysql;
  }
  }
}
  }
}


This looks fine.  The only thing I can think of is:

* that your version of rg_test doesn't match clurgmgrd for some reason.
I don't know why that would happen.

* you have an old version of ccs installed with a new version of
rgmanager.  rg_test doesn't use ccs - so the query it uses to find
children works even if ccs breaks.

I'd pull the most recent snapshot of rgmanager for the version you're
running (e.g. from the RHEL4 or RHEL5 branch; Fabio seemed to think
yours was pretty old).


He is using an Ubuntu package built from a CVS snapshot dated 2007-03-15. All
of the binaries come from that same snapshot.


Fabio

--
I'm going to make him an offer he can't refuse.



Re: [Linux-cluster] Unformatting a GFS cluster disk

2008-03-28 Thread christopher barry
On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
 On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
  On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
  ...
Can you point me at any docs that describe how best to implement snaps
against a gfs lun?
   
   FYI, the NetApp snapshot capability is a result of their WAFL 
   filesystem
   http://www.google.com/search?q=netapp+wafl.  Basically, they use a
   copy-on-write mechanism that naturally maintains older versions of disk 
   blocks.
   
   A fun feature is that the multiple snapshots of a file have the identical
   inode value
   
  
  fun as in 'May you live to see interesting times' kinda fun? Or really
  fun?
 
 The former.  POSIX says that two files with the identical st_dev and
 st_ino must be the *identical* file, e.g., hard links.  On a snapshot,
 they could be two *versions* of a file with completely different
 contents.  Google suggests that this contradiction also exists
 elsewhere, such as with the virtual FS provided by ClearCase's VOB.
 

So, I'm trying to understand what to take away from this thread:
* I should not use them?
* I can use them, but having multiple snapshots introduces a risk that a
snap-restore could wipe files completely by potentially putting a
deleted file on top of a new file?
* I should use them - but not use multiples.
* something completely different ;)

Our primary goal here is to use snapshots to enable us to back up to tape
from the snapshot over FC - and not have to pull a massive amount of
data over GbE nfs through our NAT director from one of our cluster nodes
to put it on tape. We have thought about a dedicated GbE backup network,
but would rather use the 4Gb FC fabric we've got.

If anyone can recommend a better way to accomplish that, I would love to
hear about how other people are backing up large-ish (1TB) GFS
filesystems to tape.

Regards,
-C



Re: [Linux-cluster] Unformatting a GFS cluster disk

2008-03-28 Thread Wendy Cheng

christopher barry wrote:

On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
  

On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:


On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
...
  

Can you point me at any docs that describe how best to implement snaps
against a gfs lun?
  

FYI, the NetApp snapshot capability is a result of their WAFL filesystem
http://www.google.com/search?q=netapp+wafl.  Basically, they use a
copy-on-write mechanism that naturally maintains older versions of disk blocks.

A fun feature is that the multiple snapshots of a file have the identical
inode value



fun as in 'May you live to see interesting times' kinda fun? Or really
fun?
  

The former.  POSIX says that two files with the identical st_dev and
st_ino must be the *identical* file, e.g., hard links.  On a snapshot,
they could be two *versions* of a file with completely different
contents.  Google suggests that this contradiction also exists
elsewhere, such as with the virtual FS provided by ClearCase's VOB.




So, I'm trying to understand what to take away from this thread:
* I should not use them?
* I can use them, but having multiple snapshots introduces a risk that a
snap-restore could wipe files completely by potentially putting a
deleted file on top of a new file?
* I should use them - but not use multiples.
* something completely different ;)
  


Wait! First, the interpretation that multiple snapshots share one inode on 
WAFL is not correct. Second, there are plenty of documents about how to do 
snapshots with Linux filesystems (e.g. ext3) on NetApp's NOW web site, where 
its customers can get access. Third, doing snapshots on GFS is easier than on 
ext3 (since the ext3 journal can be on a different volume).


Will do a draft write-up as soon as I'm off my current task (sometime 
over this weekend).
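
(Not the write-up itself, but the usual pattern is roughly as follows -- a
sketch only, with a hypothetical mount point; the array-side snapshot command
depends on the storage and is not shown:)

  # quiesce the GFS filesystem so the snapshot is consistent
  gfs_tool freeze /mnt/gfs

  # ... take the SAN/filer snapshot here (array-specific) ...

  # resume normal operation
  gfs_tool unfreeze /mnt/gfs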


-- Wendy

Our primary goal here is to use snapshots to enable us to back up to tape
from the snapshot over FC - and not have to pull a massive amount of
data over GbE nfs through our NAT director from one of our cluster nodes
to put it on tape. We have thought about a dedicated GbE backup network,
but would rather use the 4Gb FC fabric we've got.

If anyone can recommend a better way to accomplish that, I would love to
hear about how other people are backing up large-ish (1TB) GFS
filesystems to tape.

Regards,
-C






Re: [Linux-cluster] Unformatting a GFS cluster disk

2008-03-28 Thread Lombard, David N
On Fri, Mar 28, 2008 at 03:51:54PM -0400, christopher barry wrote:
 On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
  On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
   
   fun as in 'May you live to see interesting times' kinda fun? Or really
   fun?
  
  The former.  POSIX says that two files with the identical st_dev and
  st_ino must be the *identical* file, e.g., hard links.  On a snapshot,
  they could be two *versions* of a file with completely different
  contents.  Google suggests that this contradiction also exists
  elsewhere, such as with the virtual FS provided by ClearCase's VOB.
  
 
 So, I'm trying to understand what to take away from this thread:
 * I should not use them?

I'm not, in any way, shape, or form, suggesting you avoid snapshots!
It has saved me from misery more than once.

 * I can use them, but having multiple snapshots introduces a risk that a
 snap-restore could wipe files completely by potentially putting a
 deleted file on top of a new file?
 * I should use them - but not use multiples.
 * something completely different ;)

I have no information on this.

-- 
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.



Re: [Linux-cluster] Unformatting a GFS cluster disk

2008-03-28 Thread Lombard, David N
On Fri, Mar 28, 2008 at 04:54:22PM -0500, Wendy Cheng wrote:
 christopher barry wrote:
 On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
   
 On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
 
 On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
 A fun feature is that the multiple snapshots of a file have the identical
 inode value
 
 Wait! First, the interpretation that multiple snapshots share one inode
 on WAFL is not correct.

Same inode value.  I've experienced this multiple times, and, as
I noted, it is a consequence of copy-on-write.

I've also had to help other people understand why various utilities
didn't work as expected, like gnu diff, which immediately reported
identical files as soon as it saw the identical values for st_dev
and st_ino in the two files it was asked to compare.

From the current diffutils (2.8.1) source:

  /* Do struct stat *S, *T describe the same file?  Answer -1 if unknown.  */
  #ifndef same_file
  # define same_file(s, t) \
      ((((s)->st_ino == (t)->st_ino) && ((s)->st_dev == (t)->st_dev)) \
       || same_special_file (s, t))
  #endif
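
The same comparison is easy to see from the shell; on a NetApp mount a file
and its snapshot copy report the same device/inode pair (a sketch -- the
.snapshot path is the filer default, the mount point and filename are
hypothetical):

  # print st_dev and st_ino for the live file and its snapshot copy
  stat -c 'dev=%d ino=%i  %n' /mnt/netapp/data.db \
      /mnt/netapp/.snapshot/hourly.0/data.db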

 Second, there are plenty of documents about how to do snapshots with
 Linux filesystems (e.g. ext3) on NetApp's NOW web site, where its
 customers can get access.

I didn't say snapshots don't work on Linux.  I've used NetApp on
Linux and directly benefitted from snapshots.

-- 
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.



[Linux-cluster] SCSI reservation conflicts after update

2008-03-28 Thread Sajesh Singh
After updating my GFS cluster to the latest packages (as of 3/28/08) on 
an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
am receiving scsi reservation errors whenever the nodes are rebooted. 
The node is then subsequently rebooted at varying intervals without any 
intervention. I have tried to disable the scsi_reserve script from 
startup, but it does not seem to have any effect. I have also tried to 
use the sg_persist command to clear all reservations with the -C option 
to no avail. I first noticed something was wrong when the 2nd node of 
the 2 node cluster was being updated. That was the first sign of the 
scsi reservation errors on the console.
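
For reference, a rough sketch of inspecting and clearing registrations with
sg_persist; the device and key below are placeholders, and the key passed to
--param-rk must be one this node actually registered:

  # read-only: show registered keys and the current reservation
  sg_persist --in --read-keys /dev/sdb
  sg_persist --in --read-reservation /dev/sdb

  # clear the reservation and all registrations (placeholder key, in hex)
  sg_persist --out --clear --param-rk=123abc /dev/sdb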


From my understanding persistent SCSI reservations are only needed if I 
am using the fence_scsi module.


I would appreciate any guidance.

Regards,

Sajesh Singh



Re: [Linux-cluster] SCSI reservation conflicts after update

2008-03-28 Thread christopher barry
On Fri, 2008-03-28 at 21:03 -0400, Sajesh Singh wrote:
 After updating my GFS cluster to the latest packages (as of 3/28/08) on 
 an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
 am receiving scsi reservation errors whenever the nodes are rebooted. 
 The node is then subsequently rebooted at varying intervals without any 
 intervention. I have tried to disable the scsi_reserve script from 
 startup, but it does not seem to have any effect. I have also tried to 
 use the sg_persist command to clear all reservations with the -C option 
 to no avail. I first noticed something was wrong when the 2nd node of 
 the 2 node cluster was being updated. That was the first sign of the 
 scsi reservation errors on the console.
 
  From my understanding persistent SCSI reservations are only needed if I 
 am using the fence_scsi module.
 
 I would appreciate any guidance.
 
 Regards,
 
 Sajesh Singh
 


Sajesh, start here:
http://www.mail-archive.com/linux-cluster@redhat.com/msg01029.html

I went through this too, and Ryan helped me out a lot!

Good Luck!
-C



Re: [Linux-cluster] SCSI reservation conflicts after update

2008-03-28 Thread Sajesh Singh



christopher barry wrote:

On Fri, 2008-03-28 at 21:03 -0400, Sajesh Singh wrote:
  
After updating my GFS cluster to the latest packages (as of 3/28/08) on 
an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
am receiving scsi reservation errors whenever the nodes are rebooted. 
The node is then subsequently rebooted at varying intervals without any 
intervention. I have tried to disable the scsi_reserve script from 
startup, but it does not seem to have any effect. I have also tried to 
use the sg_persist command to clear all reservations with the -C option 
to no avail. I first noticed something was wrong when the 2nd node of 
the 2 node cluster was being updated. That was the first sign of the 
scsi reservation errors on the console.


 From my understanding persistent SCSI reservations are only needed if I 
am using the fence_scsi module.


I would appreciate any guidance.

Regards,

Sajesh Singh





Sajesh, start here:
http://www.mail-archive.com/linux-cluster@redhat.com/msg01029.html

I went through this too, and Ryan helped me out a lot!

Good Luck!
-C
  

Christopher,
  I have read through the entire posting and a bit of 
information seems to be missing. Did you fix it by simply disabling the 
scsi_reserve script and clearing the stale reservations?


Thanks,

Sajesh