[Linux-cluster] How to resolve Open Disconnected Pending state?

2009-04-02 Thread Kirby Zhou
The machine that exported the gnbd is powered off. The client machine has fallen into the state 'Open Disconnected Pending'. Any process that accesses the dead gnbd falls into state 'D'. How can I destroy the gnbd block device on the client machine? [r...@xen-727057 ~]# gnbd_import -n -l Device name : 63.131.xvdb
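State 'D' is the kernel's uninterruptible sleep, which is why such processes cannot be killed until the I/O completes or fails. A minimal sketch, assuming a Linux /proc layout, of how the stuck processes could be listed; the helper names `parse_state` and `uninterruptible_processes` are made up for illustration and are not part of any gnbd tooling:

```python
import os


def parse_state(stat_line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the LAST closing parenthesis.
    left, right = stat_line.rsplit(")", 1)
    pid, comm = left.split("(", 1)
    return int(pid), comm, right.split()[0]


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state 'D'."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_state(fh.read())
        except OSError:
            continue  # process exited while we were scanning
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `uninterruptible_processes()` on the client would show which processes are blocked; it only reports them and does nothing to release the device.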

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Kadlecsik Jozsef
On Tue, 31 Mar 2009, Kadlecsik Jozsef wrote: I'll restore the kernel on a not so critical node and will try to find out how to trigger the bug without mailman. If that succeeds then I'll remove the patch in question and re-run the test. It'll need a few days, surely, but I'll report the

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Wendy Cheng
Kadlecsik Jozsef wrote: If you have any idea what to do next, please write it. Do you have your kernel source somewhere (in tar ball format) so people can look into it ? -- Wendy -- Linux-cluster mailing list Linux-cluster@redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster

[Linux-cluster] Unable to mount GFS File System in RHEL5.2 (32bit)

2009-04-02 Thread Stevan Colaco
Dear All, I have set up a 2-node cluster on RHEL5.2 (32bit) + Quorum Partition + GFS Partition. I could make the GFS file system but hit issues while trying to mount it. Is it due to the GFS module not being loaded? I am unable to load the GFS module. Below are the details; if anyone has faced this issue before, please

[Linux-cluster] Network Interface Binding for cman

2009-04-02 Thread Mrugesh Karnik
Hi, How do I tell cman which network interfaces to listen on? I specifically need it to listen on two interfaces. The system has four interfaces in total. I'm on CentOS 5.2. Thanks, Mrugesh

RE: [Linux-cluster] Network Interface Binding for cman

2009-04-02 Thread Jeff Sturm
It binds to a multicast address. That address is bound to one interface normally. If you need two interfaces, look into Ethernet bonding.
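As a generic illustration of why a multicast address ends up tied to one interface: at the socket level, a group membership request names both the group and the local interface address. The sketch below is plain socket code, not cman's configuration; the group `239.192.0.1`, the interface address `192.168.1.10`, and the port passed to `join_group` are placeholders chosen for the example.

```python
import socket
import struct


def membership_request(group, interface_ip):
    """Build the ip_mreq structure: multicast group + local interface address."""
    return struct.pack("4s4s",
                       socket.inet_aton(group),
                       socket.inet_aton(interface_ip))


def join_group(group, interface_ip, port):
    """Open a UDP socket and join `group` on the interface owning `interface_ip`."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    # The membership is per interface: traffic for the group is accepted
    # only on the interface named in the request.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group, interface_ip))
    return sock


# Placeholder addresses; nothing is joined until join_group() is called.
request = membership_request("239.192.0.1", "192.168.1.10")
print(len(request))
```

Listening on two interfaces this way would take one membership per interface, which is part of why bonding the interfaces into a single logical one is the simpler suggestion.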

[Linux-cluster] rgmanager stop just hangs, clurgmgrd never terminates

2009-04-02 Thread Arwin L Tugade
Hey all, I ran into an issue where my cluster was quorate but none of the services were showing up via the clustat command. When I tried to do a /sbin/service rgmanager stop, it hung indefinitely. The SIGTERM is sent but the clurgmgrd processes don't stop. What I ended up doing was

Re: [Linux-cluster] rgmanager stop just hangs, clurgmgrd never terminates

2009-04-02 Thread Fernando Lozano
Hi Arwin, I have the same problem on a two-node cluster (two KVM virtual machines) and on another two-node cluster with real Dell servers. If I flush iptables rules BEFORE starting cman, everything works fine. But if I start cman and rgmanager with iptables rules, I see no services and rgmanager

RE: [Linux-cluster] rgmanager stop just hangs, clurgmgrd never terminates

2009-04-02 Thread Arwin L Tugade
Yup, matter of fact, I disabled iptables altogether. The cluster comes up fine and I have services running once again (this is a test setup btw). Just to let you know I managed to get the cluster in this state when I was doing some failover testing. I'm just wondering why when I do a

Re: [Linux-cluster] Trouble after Openais upgrade to 0.80.3-22.el5

2009-04-02 Thread Steven Dake
Likely you ran into the segfault that happens during the upgrade process from some 5.2 to 5.3 nodes. You can reboot your cluster with all nodes on either 5.2 or 5.3, or wait until the 5.3.z stream release that resolves this problem becomes available. regards -steve On Wed, 2009-04-01

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Kadlecsik Jozsef
Hi, On Thu, 2 Apr 2009, Wendy Cheng wrote: If you have any idea what to do next, please write it. Do you have your kernel source somewhere (in tar ball format) so people can look into it ? I have created the tarballs, you can find them at http://www.kfki.hu/~kadlec/gfs/: - Kernel is

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Kadlecsik Jozsef
On Thu, 2 Apr 2009, Kadlecsik Jozsef wrote: If you have any idea what to do next, please write it. Spent again some time looking through the git commits and that triggered some wild guessing: - commit ddebb0c3dc7d0b87c402ba17731ad41abdd43f2d ? It is a temporary fix for 2.6.26, which is

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Wendy Cheng
Kadlecsik Jozsef wrote: - commit 82d176ba485f2ef049fd303b9e41868667cebbdb gfs_drop_inode as .drop_inode replacing .put_inode. .put_inode was called without holding a lock, but .drop_inode is called under inode_lock held. Might it be a problem? I was planning to take a look over the

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Kadlecsik Jozsef
On Thu, 2 Apr 2009, Wendy Cheng wrote: Kadlecsik Jozsef wrote: - commit 82d176ba485f2ef049fd303b9e41868667cebbdb gfs_drop_inode as .drop_inode replacing .put_inode. .put_inode was called without holding a lock, but .drop_inode is called under inode_lock held. Might it be a problem?

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Wendy Cheng
Kadlecsik Jozsef wrote: On Thu, 2 Apr 2009, Wendy Cheng wrote: Kadlecsik Jozsef wrote: - commit 82d176ba485f2ef049fd303b9e41868667cebbdb gfs_drop_inode as .drop_inode replacing .put_inode. .put_inode was called without holding a lock, but .drop_inode is called under inode_lock

[Linux-cluster] RHEL5.3 Cluster - backup fencing methods

2009-04-02 Thread Kaerka Phillips
Hi - I've got an issue with a 4-node cluster, and I'm hoping to get some good advice or best practices for this. The 4-node cluster is on Dell hardware, using DRAC cards as the primary fencing device, but I'd like to eliminate the single point of failure introduced with the cabling for this

Re: [Linux-cluster] Unable to mount GFS File System in RHEL5.2 (32bit)

2009-04-02 Thread Jon Erickson
I'm having the same problem... When running the mount command with the '-v' option it says something about errno 19? I don't remember exactly, I can post more info tomorrow. On Thu, Apr 2, 2009 at 11:11 AM, Stevan Colaco stevan.col...@gmail.com wrote: Dear All, I have setup 2 node cluster
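If the number reported really is errno 19, it can be decoded as follows. On Linux, 19 is ENODEV, which mount(2) documents as the filesystem type not being configured in the kernel; whether that is the error seen in this thread is an assumption, since the poster was quoting from memory. `describe_errno` is a helper name made up for this sketch.

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for a numeric errno."""
    return errno.errorcode.get(code, "UNKNOWN"), os.strerror(code)


name, message = describe_errno(19)
print(name, "-", message)
```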

Re: [Linux-cluster] rgmanager stop just hangs, clurgmgrd never terminates

2009-04-02 Thread Fernando Lozano
Arwin, Doesn't your log show one node trying to fence the other? Clean_start prevents that at cluster startup, but on failover the survivor wants to fence the other. You may need to use fence_ack to let one node believe the other was fenced if you do not have a real fence device, for example Dell

Re: [Linux-cluster] Unable to mount GFS File System in RHEL5.2 (32bit)

2009-04-02 Thread Kaerka Phillips
It looks like there is a mix between the gfs and gfs2 filesystem and modules on your system -- your loaded module is GFS2, so perhaps try mounting with -t gfs2, except that you will need to have made the filesystem with GFS2 as well. All of my mounted GFS2 filesystems show gfs2 as the FS type
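One way to check which of the two the running kernel actually knows about is to look at /proc/filesystems, which lists every registered filesystem type. A minimal sketch, assuming that file's usual layout (an optional `nodev` flag followed by the type name); the sample text is invented for illustration and is not output from the poster's system:

```python
def registered_filesystems(text):
    """Parse the contents of /proc/filesystems into a set of type names."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[-1])  # an optional leading "nodev" flag is dropped
    return names


# Invented sample: only gfs2 is registered, gfs is not.
SAMPLE = "nodev\tsysfs\n\text3\n\tgfs2\n"
known = registered_filesystems(SAMPLE)
print("gfs" in known, "gfs2" in known)

# On a real node, read the live file instead:
#     with open("/proc/filesystems") as fh:
#         known = registered_filesystems(fh.read())
```

If only `gfs2` appears, a filesystem created as GFS cannot be mounted until the matching module is loaded.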

Re: [Linux-cluster] Network Interface Binding for cman

2009-04-02 Thread Mrugesh Karnik
On Friday 03 Apr 2009 08:39:07 Mrugesh Karnik wrote: On Thursday 02 Apr 2009 21:34:12 Jeff Sturm wrote: It binds to a multicast address. That address is bound to one interface normally. Well, how do I specify which interface to bind that multicast address to? I see the `bindnetaddr'

Re: [Linux-cluster] Freeze with cluster-2.03.11

2009-04-02 Thread Wendy Cheng
Kadlecsik Jozsef wrote: - commit 82d176ba485f2ef049fd303b9e41868667cebbdb gfs_drop_inode as .drop_inode replacing .put_inode. .put_inode was called without holding a lock, but .drop_inode is called under inode_lock held. Might it be a problem? Based on code reading ... 1.