[Linux-cluster] GFS reports withdraw from cluster
We have a RHEL4 two-node cluster running on HP Integrity servers (rx4640, 4 Itanium CPUs, 32GB RAM each). GFS is running on top of a SAN (EMC CX380s). We get these kinds of errors:

Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: fatal: invalid metadata block
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: bh = 16659526 (type: exp=5, found=4)
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: function = gfs_get_meta_buffer
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: file = /builddir/build/BUILD/gfs-kernel-2.6.9-75/up/src/gfs/dio.c, line = 1223
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: time = 1206652995
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: about to withdraw from the cluster
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: waiting for outstanding I/O
Mar 27 22:23:15 idefix kernel: GFS: fsid=alpha_cluster:orabck.0: telling LM to withdraw
Mar 27 22:23:15 idefix kernel: lock_dlm: withdraw abandoned memory

...and then the filesystem is gone :-(

We are up2date with current patch levels:

GFS-6.1.15-1
GFS-kernel-2.6.9-75.12
cman-kernel-2.6.9-53.9
cman-1.0.17-0.el4_6.5

Not sure whether this is a SAN issue (that would be bad, because a lot of systems depend on the SAN), an OS issue, or something inside GFS. Any help is greatly appreciated.

Ewald...

--
Ewald Beekman, Security Engineer, Academic Medical Center,
dept. ADICT/AD Server en Netwerkbeheer, The Netherlands
## Your mind-mint is:
Blessed is the man who, having nothing to say, abstains
from giving wordy evidence of the fact.
		-- George Eliot

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
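One common next step after a withdraw like this (a sketch only, not from the original post; the device and mount point below are placeholders, not the poster's actual paths) is to take the filesystem offline on every node and check it with gfs_fsck:

    # Unmount the GFS filesystem on *all* cluster nodes before checking it.
    umount /mnt/orabck

    # Read-only first pass from one node: -n answers "no" to all repairs,
    # -v is verbose.
    gfs_fsck -nv /dev/alpha_vg/orabck

    # If that pass reports problems, run again allowing repairs, then remount.
    gfs_fsck -y /dev/alpha_vg/orabck
    mount -t gfs /dev/alpha_vg/orabck /mnt/orabck

If the corruption keeps coming back after a clean fsck, that tends to point at the storage path rather than GFS itself.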
Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite
Nice overview. Wish I had this a few weeks ago :-)

I am curious as to why LVM2 is required. With a simple modification of the scsi_reserve script (and maybe fence_scsi), using an msdos-partitioned disk seems to work fine. This is only in testing, but I haven't seen any issues as of yet.

Ryan O'Hara wrote:
> Attached is the latest version of the "Using SCSI Persistent Reservations
> with Red Hat Cluster Suite" document for review. Feel free to send
> questions and comments.
>
> -Ryan

--
Jeff Macfarland ([EMAIL PROTECTED])
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
RE: [Linux-cluster] RHCS / DRBD / MYSQL
On Fri, 2008-03-28 at 17:24 +1300, Arjuna Christensen wrote:
> === Resource Tree ===
> service {
>   name = asteriskcluster;
>   domain = asterisk;
>   autostart = 1;
>   ip {
>     address = 192.168.111.1;
>     monitor_link = 1;
>     script {
>       name = drbdcontrol;
>       file = /etc/init.d/drbdcontrol;
>       fs {
>         name = mysqlfs;
>         mountpoint = /var/lib/mysql;
>         device = /dev/drbd0;
>         fstype = ext3;
>         force_unmount = 0;
>         self_fence = 1;
>         fsid = 11607;
>         force_fsck = 0;
>         options = ;
>         script {
>           name = asterisk;
>           file = /etc/init.d/asterisk;
>         }
>         script {
>           name = mysql;
>           file = /etc/init.d/mysql;
>         }
>       }
>     }
>   }
> }

This looks fine. The only thing I can think of is:

* that your version of rg_test doesn't match clurgmgrd for some reason. I
  don't know why that would happen.

* you have an old version of ccs installed with a new version of rgmanager.
  rg_test doesn't use ccs - so the query it uses to find children works even
  if ccs breaks.

I'd pull the most recent snapshot of rgmanager for the version you're running (e.g. from the RHEL4 or RHEL5 branch; Fabio seemed to think yours was pretty old).

-- Lon

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
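For completeness, the tree above is what rg_test prints when it parses cluster.conf. A quick sketch of how it is typically run to compare against what clurgmgrd does (paths assume the stock /etc/cluster/cluster.conf; the service name is taken from the tree above):

    # Print the parsed resource tree -- this is what produced the output above.
    rg_test test /etc/cluster/cluster.conf

    # Dry-run start/stop of the service outside of clurgmgrd, to check the
    # ordering of the ip -> script -> fs -> script children.
    rg_test test /etc/cluster/cluster.conf start service asteriskcluster
    rg_test test /etc/cluster/cluster.conf stop service asteriskcluster

If the dry run orders the children correctly but the running clurgmgrd does not, that supports the version-mismatch theory above.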
Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite
Hello,

Ryan O'Hara wrote:
> 4 - Limitations
> ...
> - Multipath devices are not currently supported.

What is the reason? It is strongly required to use at least two HBAs in a SAN network, which is useless when using SCSI reservations.

Regards,
Tomasz Sucharzewski

On Fri, 28 Mar 2008 09:20:53 -0500 Ryan O'Hara [EMAIL PROTECTED] wrote:
> 4 - Limitations
>
> In addition to these requirements, fencing by way of SCSI persistent
> reservations also has some limitations.
>
> - Multipath devices are not currently supported.

--
Tomasz Sucharzewski
[EMAIL PROTECTED]

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite
On Fri, Mar 28, 2008 at 8:15 AM, Ryan O'Hara [EMAIL PROTECTED] wrote:
> The reason for the cluster LVM2 requirement is for device discovery. The
> scripts use LVM commands to find cluster volumes and then get a list of
> devices that make up those volumes.
>
> Consider the alternative -- users would have to manually define a list of
> devices that need registrations/reservations. This would have to be defined
> on each node. What makes this even more problematic is that each node may
> have different device names for shared storage devices (i.e. what may be
> /dev/sdb on one node may be /dev/sdc on another). Furthermore, those device
> names could change between reboots.
>
> The general solution is to query clvmd for a list of cluster volumes and
> get a list of devices for those volumes.

You can also use symbolic links under /dev/disk/by-id/, which are persistent across nodes/reboots.

-Alex

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
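As a rough illustration of that discovery step (not the actual scsi_reserve code, just the same idea expressed with stock LVM commands): clustered volume groups carry a 'c' in their attribute string, and pvs maps each one back to its underlying devices.

    #!/bin/bash
    # Sketch: list the block devices backing clustered volume groups.

    # Volume groups whose vg_attr string ends in 'c' are clustered.
    clustered_vgs=$(vgs --noheadings -o vg_name,vg_attr | awk '$2 ~ /c$/ {print $1}')

    for vg in $clustered_vgs; do
        # The physical volumes in each clustered VG are the devices that
        # would need registrations/reservations.
        pvs --noheadings -o pv_name,vg_name | awk -v vg="$vg" '$2 == vg {print $1}'
    done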
Re: [Linux-cluster] SCSI Reservations Red Hat Cluster Suite
True. Any solution for auto discovery? I've no problem with statically defining a device, but that would be pretty nice if possible.

Alex Kompel wrote:
> On Fri, Mar 28, 2008 at 8:15 AM, Ryan O'Hara [EMAIL PROTECTED] wrote:
> > The reason for the cluster LVM2 requirement is for device discovery. The
> > scripts use LVM commands to find cluster volumes and then get a list of
> > devices that make up those volumes.
> >
> > Consider the alternative -- users would have to manually define a list of
> > devices that need registrations/reservations. This would have to be
> > defined on each node. What makes this even more problematic is that each
> > node may have different device names for shared storage devices (i.e.
> > what may be /dev/sdb on one node may be /dev/sdc on another). Furthermore,
> > those device names could change between reboots.
> >
> > The general solution is to query clvmd for a list of cluster volumes and
> > get a list of devices for those volumes.
>
> You can also use symbolic links under /dev/disk/by-id/, which are
> persistent across nodes/reboots.
>
> -Alex

___
Inbound Email has been scanned by Nexa Technologies Email Security Systems.
___

--
Jeff Macfarland ([EMAIL PROTECTED])
Nexa Technologies - 972.747.8879
Systems Administrator
GPG Key ID: 0x5F1CA61B
GPG Key Server: hkp://wwwkeys.pgp.net

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
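A small example of the /dev/disk/by-id approach mentioned above (the WWID shown is made up): the persistent name can go into per-node configuration, and each node resolves it to whatever sdX name it has locally.

    # Persistent identifiers for whole disks (partition links filtered out).
    ls -l /dev/disk/by-id/ | grep -v -- -part

    # Resolve one persistent name to this node's kernel device name.
    # The WWID below is a made-up example.
    readlink -f /dev/disk/by-id/scsi-36006016012345678900000000000abcd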
RE: [Linux-cluster] RHCS / DRBD / MYSQL
On Fri, 28 Mar 2008, Lon Hohberger wrote:
> On Fri, 2008-03-28 at 17:24 +1300, Arjuna Christensen wrote:
> > === Resource Tree ===
> > service {
> >   name = asteriskcluster;
> >   domain = asterisk;
> >   autostart = 1;
> >   ip {
> >     address = 192.168.111.1;
> >     monitor_link = 1;
> >     script {
> >       name = drbdcontrol;
> >       file = /etc/init.d/drbdcontrol;
> >       fs {
> >         name = mysqlfs;
> >         mountpoint = /var/lib/mysql;
> >         device = /dev/drbd0;
> >         fstype = ext3;
> >         force_unmount = 0;
> >         self_fence = 1;
> >         fsid = 11607;
> >         force_fsck = 0;
> >         options = ;
> >         script {
> >           name = asterisk;
> >           file = /etc/init.d/asterisk;
> >         }
> >         script {
> >           name = mysql;
> >           file = /etc/init.d/mysql;
> >         }
> >       }
> >     }
> >   }
> > }
>
> This looks fine. The only thing I can think of is:
>
> * that your version of rg_test doesn't match clurgmgrd for some reason. I
>   don't know why that would happen.
>
> * you have an old version of ccs installed with a new version of rgmanager.
>   rg_test doesn't use ccs - so the query it uses to find children works
>   even if ccs breaks.
>
> I'd pull the most recent snapshot of rgmanager for the version you're
> running (e.g. from the RHEL4 or RHEL5 branch; Fabio seemed to think yours
> was pretty old).

He is using an Ubuntu package built from a CVS snapshot from 20070315. All the versions of the binaries come from the same snapshot.

Fabio

--
I'm going to make him an offer he can't refuse.

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] Unformatting a GFS cluster disk
On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
> On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
> > On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
> > > ...
> >
> > Can you point me at any docs that describe how best to implement snaps
> > against a gfs lun?
> >
> > > FYI, the NetApp snapshot capability is a result of their WAFL
> > > filesystem http://www.google.com/search?q=netapp+wafl. Basically, they
> > > use a copy-on-write mechanism that naturally maintains older versions
> > > of disk blocks. A fun feature is that the multiple snapshots of a file
> > > have the identical inode value.
> >
> > fun as in 'May you live to see interesting times' kinda fun? Or really
> > fun?
>
> The former. POSIX says that two files with the identical st_dev and st_ino
> must be the *identical* file, e.g., hard links. On a snapshot, they could
> be two *versions* of a file with completely different contents. Google
> suggests that this contradiction also exists elsewhere, such as with the
> virtual FS provided by ClearCase's VOB.

So, I'm trying to understand what to take away from this thread:

* I should not use them?
* I can use them, but having multiple snapshots introduces a risk that a
  snap-restore could wipe files completely by potentially putting a deleted
  file on top of a new file?
* I should use them - but not use multiples.
* something completely different ;)

Our primary goal here is to use snapshots to enable us to back up to tape from the snapshot over FC - and not have to pull a massive amount of data over GbE NFS through our NAT director from one of our cluster nodes to put it on tape. We have thought about a dedicated GbE backup network, but would rather use the 4Gb FC fabric we've got.

If anyone can recommend a better way to accomplish that, I would love to hear about how other people are backing up large-ish (1TB) GFS filesystems to tape.

Regards,
-C

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] Unformatting a GFS cluster disk
christopher barry wrote:
> On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
> > On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
> > > On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
> > > > ...
> > >
> > > Can you point me at any docs that describe how best to implement snaps
> > > against a gfs lun?
> > >
> > > > FYI, the NetApp snapshot capability is a result of their WAFL
> > > > filesystem http://www.google.com/search?q=netapp+wafl. Basically,
> > > > they use a copy-on-write mechanism that naturally maintains older
> > > > versions of disk blocks. A fun feature is that the multiple snapshots
> > > > of a file have the identical inode value.
> > >
> > > fun as in 'May you live to see interesting times' kinda fun? Or really
> > > fun?
> >
> > The former. POSIX says that two files with the identical st_dev and
> > st_ino must be the *identical* file, e.g., hard links. On a snapshot,
> > they could be two *versions* of a file with completely different
> > contents. Google suggests that this contradiction also exists elsewhere,
> > such as with the virtual FS provided by ClearCase's VOB.
>
> So, I'm trying to understand what to take away from this thread:
>
> * I should not use them?
> * I can use them, but having multiple snapshots introduces a risk that a
>   snap-restore could wipe files completely by potentially putting a deleted
>   file on top of a new file?
> * I should use them - but not use multiples.
> * something completely different ;)

Wait! First, the "multiple snapshots sharing one inode" interpretation of WAFL is not correct. Second, there are plenty of documents talking about how to do snapshots with Linux filesystems (e.g. ext3) on the NetApp NOW web site, where its customers can get access. Third, doing snapshots on GFS is easier than on ext3 (since the ext3 journal can be on a different volume). Will do a draft write-up as soon as I'm off my current task (sometime over this weekend).

-- Wendy

> Our primary goal here is to use snapshots to enable us to back up to tape
> from the snapshot over FC - and not have to pull a massive amount of data
> over GbE NFS through our NAT director from one of our cluster nodes to put
> it on tape. We have thought about a dedicated GbE backup network, but would
> rather use the 4Gb FC fabric we've got.
>
> If anyone can recommend a better way to accomplish that, I would love to
> hear about how other people are backing up large-ish (1TB) GFS filesystems
> to tape.
>
> Regards,
> -C

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
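Until that write-up appears, the usual building block on the GFS side is its freeze support, which quiesces the filesystem so an array- or filer-side snapshot is consistent. A rough sketch (the mount point is a placeholder, and the snapshot trigger itself is storage-specific):

    # Quiesce the GFS filesystem cluster-wide so no metadata is in flight.
    gfs_tool freeze /mnt/gfsvol

    # ... trigger the snapshot on the NetApp/array side here ...

    # Resume normal operation.
    gfs_tool unfreeze /mnt/gfsvol

Keeping the freeze window short matters, since all writers on all nodes block while the filesystem is frozen.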
Re: [Linux-cluster] Unformatting a GFS cluster disk
On Fri, Mar 28, 2008 at 03:51:54PM -0400, christopher barry wrote:
> On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
> > On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
> > > fun as in 'May you live to see interesting times' kinda fun? Or really
> > > fun?
> >
> > The former. POSIX says that two files with the identical st_dev and
> > st_ino must be the *identical* file, e.g., hard links. On a snapshot,
> > they could be two *versions* of a file with completely different
> > contents. Google suggests that this contradiction also exists elsewhere,
> > such as with the virtual FS provided by ClearCase's VOB.
>
> So, I'm trying to understand what to take away from this thread:
>
> * I should not use them?

I'm not, in any way, shape, or form, suggesting you avoid snapshots! It has saved me from misery more than once.

> * I can use them, but having multiple snapshots introduces a risk that a
>   snap-restore could wipe files completely by potentially putting a deleted
>   file on top of a new file?
> * I should use them - but not use multiples.
> * something completely different ;)

I have no information on this.

--
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] Unformatting a GFS cluster disk
On Fri, Mar 28, 2008 at 04:54:22PM -0500, Wendy Cheng wrote:
> christopher barry wrote:
> > On Fri, 2008-03-28 at 07:42 -0700, Lombard, David N wrote:
> > > On Thu, Mar 27, 2008 at 03:26:55PM -0400, christopher barry wrote:
> > > > On Wed, 2008-03-26 at 13:58 -0700, Lombard, David N wrote:
> > > > > A fun feature is that the multiple snapshots of a file have the
> > > > > identical inode value.
>
> Wait! First, the "multiple snapshots sharing one inode" interpretation of
> WAFL is not correct.

Same inode value. I've experienced this multiple times, and, as I noted, it is a consequence of copy-on-write. I've also had to help other people understand why various utilities didn't work as expected, like GNU diff, which immediately reported identical files as soon as it saw the identical values for st_dev and st_ino in the two files it was asked to compare. From the current diffutils (2.8.1) source:

    /* Do struct stat *S, *T describe the same file?  Answer -1 if unknown.  */
    #ifndef same_file
    # define same_file(s, t) \
        ((((s)->st_ino == (t)->st_ino) && ((s)->st_dev == (t)->st_dev)) \
         || same_special_file (s, t))
    #endif

> Second, there are plenty of documents talking about how to do snapshots
> with Linux filesystems (e.g. ext3) on the NetApp NOW web site, where its
> customers can get access.

I didn't say snapshots don't work on Linux. I've used NetApp on Linux and directly benefitted from snapshots.

--
David N. Lombard, Intel, Irvine, CA
I do not speak for Intel Corporation; all comments are strictly my own.

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
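The practical effect of that macro is easy to see with stat(1); the paths below are hypothetical, using NetApp's .snapshot directory convention:

    # Same dev/inode pair on both paths => diff and friends assume "identical
    # file" without reading the data, even if the snapshot copy differs.
    stat -c 'dev=%d ino=%i  %n' \
        /mnt/netapp/data/file.txt \
        /mnt/netapp/.snapshot/hourly.0/data/file.txt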
[Linux-cluster] SCSI reservation conflicts after update
After updating my GFS cluster to the latest packages (as of 3/28/08) on an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp), I am receiving SCSI reservation errors whenever the nodes are rebooted. The node then reboots at varying intervals without any intervention.

I have tried to disable the scsi_reserve script from startup, but it does not seem to have any effect. I have also tried to use the sg_persist command to clear all reservations with the -C option, to no avail.

I first noticed something was wrong when the 2nd node of the 2-node cluster was being updated. That was the first sign of the SCSI reservation errors on the console.

From my understanding, persistent SCSI reservations are only needed if I am using the fence_scsi module.

I would appreciate any guidance.

Regards,
Sajesh Singh

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
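For reference, a sketch of how the registrations and the reservation can be inspected and cleared with sg_persist (the device and key are placeholders; clearing reservations out from under a live cluster will interfere with fence_scsi):

    # Show current registrations (keys) and the active reservation.
    sg_persist --in --read-keys /dev/sdb
    sg_persist --in --read-reservation /dev/sdb

    # Clear all registrations and the reservation, using this node's own
    # registered key (0x1 here is a placeholder).
    sg_persist --out --clear --param-rk=0x1 /dev/sdb

    # Keep the scsi_reserve init script from re-registering at boot while
    # debugging.
    chkconfig scsi_reserve off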
Re: [Linux-cluster] SCSI reservation conflicts after update
On Fri, 2008-03-28 at 21:03 -0400, Sajesh Singh wrote:
> After updating my GFS cluster to the latest packages (as of 3/28/08) on an
> Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp), I am
> receiving SCSI reservation errors whenever the nodes are rebooted. The node
> then reboots at varying intervals without any intervention.
>
> I have tried to disable the scsi_reserve script from startup, but it does
> not seem to have any effect. I have also tried to use the sg_persist
> command to clear all reservations with the -C option, to no avail.
>
> I first noticed something was wrong when the 2nd node of the 2-node cluster
> was being updated. That was the first sign of the SCSI reservation errors
> on the console.
>
> From my understanding, persistent SCSI reservations are only needed if I am
> using the fence_scsi module.
>
> I would appreciate any guidance.
>
> Regards,
> Sajesh Singh

Sajesh, start here:
http://www.mail-archive.com/linux-cluster@redhat.com/msg01029.html

I went through this too, and Ryan helped me out a lot!

Good Luck!
-C

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster
Re: [Linux-cluster] SCSI reservation conflicts after update
christopher barry wrote:
> On Fri, 2008-03-28 at 21:03 -0400, Sajesh Singh wrote:
> > After updating my GFS cluster to the latest packages (as of 3/28/08) on
> > an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp), I am
> > receiving SCSI reservation errors whenever the nodes are rebooted. The
> > node then reboots at varying intervals without any intervention.
> >
> > I have tried to disable the scsi_reserve script from startup, but it does
> > not seem to have any effect. I have also tried to use the sg_persist
> > command to clear all reservations with the -C option, to no avail.
> >
> > I first noticed something was wrong when the 2nd node of the 2-node
> > cluster was being updated. That was the first sign of the SCSI
> > reservation errors on the console.
> >
> > From my understanding, persistent SCSI reservations are only needed if I
> > am using the fence_scsi module.
> >
> > I would appreciate any guidance.
> >
> > Regards,
> > Sajesh Singh
>
> Sajesh, start here:
> http://www.mail-archive.com/linux-cluster@redhat.com/msg01029.html
>
> I went through this too, and Ryan helped me out a lot!
>
> Good Luck!
> -C

Christopher, I have read through the entire posting and a bit of information seems to be missing. Did you fix it by simply disabling the scsi_reserve script and clearing the stale reservations?

Thanks,
Sajesh

--
Linux-cluster mailing list
Linux-cluster@redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster