On Jan 15, 2008 16:03 -0600, Robert Olson wrote:
> Setting up my system that has no OST failover, so would like to set
> for failout. Have the issues in the 1.6 betas been worked out in
> 1.6.4.1?

Very little testing is done on failout mode, because even with a single
OSS node the common behaviour is to just reboot the node and continue
using the OSTs thereon. You can set "sysctl -w lnet.panic_on_lbug=1" and
"sysctl -w kernel.panic_on_oops=1" and the node will reboot if a bug is
hit in Lustre or the kernel. While not 100% covering (it won't reboot on
a deadlock, for example) it is fairly useful.

> On Mar 22, 2007, at 4:55 PM, Nathaniel Rutman wrote:
>
> > Well, your question prompted me to try this out.
> >
> > There are two issues:
> > 1. failout mode cannot be set on a live filesystem, and can't be
> > set with lctl conf_param.
> > The wiki page has instructions for setting failout mode at mkfs time
> > https://mail.clusterfs.com/wikis/lustre/MountConf
> > You can also set failout mode with tunefs and writeconf:
> >
> > tunefs.lustre --writeconf --param="failover.mode=failout" /dev/sda
> >
> > There can be no Lustre servers or clients running when changing the
> > failover mode.
> >
> > 2. failout mode is broken in the 1.6 betas. I have an untested
> > patch in bug 12005
> > https://bugzilla.lustre.org/show_bug.cgi?id=12005
> > Using failout mode in the betas without this patch will probably
> > lead to an LBUG on the OST.
> >
> >
> > swin wang wrote:
> >> In our test, we didn't set the failout mode in mkfs, but set it on
> >> the mdt/mgs with lctl:
> >> lctl conf_param testfs-OST0001.failover.mode=failout
> >> but it didn't seem to work. When OST0001 fails, client operations
> >> are still blocked (with 1.5.97).
> >>
> >> 2007/3/22, Nathaniel Rutman <[EMAIL PROTECTED]>:
> >>
> >>     swin wang wrote:
> >>     > We currently use 1.5.97. We tried to set it to failout mode,
> >>     > but it didn't work in this version. What we want is: reads and
> >>     > writes to the failed OST return IO errors, but we can still
> >>     > create and read/write new files; when the failed OST is OK
> >>     > again, we can read/write the files on it.
> >>     That's what failout mode is. How did you try to set it?
> >>
> >>     > I'm not sure if the 1.4.x version with "failout" mode can
> >>     > provide what we want?

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.

_______________________________________________
Lustre-discuss mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-discuss
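
For anyone wanting to try the panic-and-reboot approach from the reply, here is a minimal sketch, not a tested recipe. It assumes the two keys are set with the standard sysctl utility. The run helper and the DRY_RUN variable are made up for illustration, and the kernel.panic line is an addition that is not in the thread: on Linux that key is the number of seconds to wait before rebooting after a panic, and the 30 is an arbitrary choice. By default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only. Prints the commands by default; run with DRY_RUN=0 as root
# on the OSS node to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Turn a Lustre LBUG or a kernel oops into a panic (the two keys named in the reply).
run sysctl -w lnet.panic_on_lbug=1
run sysctl -w kernel.panic_on_oops=1

# Assumption, not from the thread: reboot 30 seconds after a panic.
run sysctl -w kernel.panic=30
```

Settings made with sysctl -w do not survive a reboot, so they would need to be reapplied at boot (for example from /etc/sysctl.conf) once the Lustre modules that provide the lnet key are loaded.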
