FWIW, we are running HA Lustre using corosync/pacemaker. We broke our OSSs and MDSs out into individual HA *pairs*. We thought about other configurations, but since this was our first step into corosync/pacemaker we decided to keep it as simple as possible. It seems to work well. I'm not sure I would attempt what you are doing, though it may be perfectly fine. When HA is a requirement, it probably makes sense to avoid pushing the limits of what works.
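For what it's worth, each pair is just a small Pacemaker config. Below is a stripped-down sketch of one OST resource as it would look in the crm shell (node names and devices here are made up, and you'd still need STONITH plus timeouts appropriate to your hardware):

    # One OST, mounted via the stock Filesystem resource agent
    primitive res-ost0000 ocf:heartbeat:Filesystem \
        params device="/dev/mapper/ost0000" directory="/lustre/ost0000" fstype="lustre" \
        op monitor interval="120s" timeout="60s" \
        op start timeout="300s" \
        op stop timeout="300s"
    # Prefer oss-a, fail over to oss-b
    location loc-ost0000-pri res-ost0000 100: oss-a
    location loc-ost0000-sec res-ost0000 50: oss-b
    # Two-node cluster, so quorum has to be ignored
    property no-quorum-policy="ignore"

With only two nodes and a handful of resources per cluster, the CIB stays small and failover behavior is easy to reason about.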
Doesn't really help you much other than to provide a data point with regard to what other sites are doing. Good luck and report back.

Charlie Taylor
UF HPC Center

On Oct 19, 2012, at 12:52 PM, Hall, Shawn wrote:

> Hi,
>
> We’re setting up fairly large Lustre 2.1.2 filesystems, each with 18 nodes
> and 159 resources all in one Corosync/Pacemaker cluster, as suggested by our
> vendor. We’re getting mixed messages between our vendor and others on how
> large a Corosync/Pacemaker cluster will work well.
>
> 1. Are there Lustre Corosync/Pacemaker clusters out there of this size
>    or larger?
> 2. If so, what tuning needed to be done to get it to work well?
> 3. Should we be looking more seriously into splitting this
>    Corosync/Pacemaker cluster into pairs or sets of 4 nodes?
>
> Right now, our current configuration takes a long time to start/stop all
> resources (~30-45 mins), and failing back OSTs puts a heavy load on the cib
> process on every node in the cluster. Under heavy IO load, many of the
> nodes will show as “unclean/offline” and many OST resources will show as
> inactive in crm status, despite the fact that every single MDT and OST is
> still mounted in the appropriate place. We are running 2 corosync rings,
> each on a private 1 GbE network. We have a bonded 10 GbE network for the
> LNET.
>
> Thanks,
> Shawn

_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
