Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-07 Thread Adrian Ulrich
> I will try the "renice" solution you proposed.
Renicing corosync should not be required, as the process is supposed to run with RT priority anyway.
> I have been thinking that I could increase the "token" timeout value in
> /etc/corosync/corosync.conf, to prevent short "hiccups". Did you
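For reference, the "token" timeout discussed above lives in the totem section of corosync.conf. The following is a minimal sketch, assuming a corosync 1.x-style configuration file; the value 10000 is purely illustrative and is not a figure given anywhere in this thread.

```
totem {
    version: 2
    # Token timeout in milliseconds. Illustrative value only (an assumption,
    # not a recommendation from the thread).
    token: 10000
}
```

The trade-off is that a longer token timeout also delays detection of a node that has genuinely failed, so it may be worth testing failover times after any change.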

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Hall, Shawn
Passerini [mailto:marco.passer...@csc.fi]
Sent: Tuesday, November 06, 2012 7:13 AM
To: lustre-discuss@lists.lustre.org
Cc: Hall, Shawn
Subject: Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

Hi, I'm also setting up a highly available Lustre system. I configured pairs for the OSSes and M

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-11-06 Thread Marco Passerini
in nodes would have low memory issues
> related to networking and would get stonithed due to unresponsiveness.
>
> Thanks,
> Shawn
>
> -----Original Message-----
> From: Charles Taylor [mailto:tay...@hpc.ufl.edu]
> Sent: Wednesday, October 24, 2012 3:33 PM
> To: Hall, Shaw

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-31 Thread Hall, Shawn
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

FWIW, we are running HA Lustre using corosync/pacemaker. We broke our OSSs and MDSs out into individual HA *pairs*. Thought about other configurations but it was our first step into corosync/pa

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-24 Thread Jeff Johnson
Shawn, in my opinion you shouldn't be running corosync on any more than two machines. They should be configured in self-contained pairs (MDS pair, OSS pairs). Anything beyond that would be chaos to manage, even if it worked. Don't forget the stonith portion. Not every block storage implementat

Re: [Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-24 Thread Charles Taylor
FWIW, we are running HA Lustre using corosync/pacemaker. We broke our OSSs and MDSs out into individual HA *pairs*. Thought about other configurations but it was our first step into corosync/pacemaker so we decided to keep it as simple as possible. Seems to work well. I'm not sure I w

[Lustre-discuss] Large Corosync/Pacemaker clusters

2012-10-24 Thread Hall, Shawn
Hi, we're setting up fairly large Lustre 2.1.2 filesystems, each with 18 nodes and 159 resources all in one Corosync/Pacemaker cluster, as suggested by our vendor. We're getting mixed messages from our vendor and others on how large a Corosync/Pacemaker cluster will work well. 1.