On Mon, 2008-10-06 at 15:47 +0100, Malcolm Cowe wrote:
> Hey Brian,

Hey Malcolm,

> I'll have to re-install the system from scratch in order to be able to
> answer some of your questions, which I'll get started on this evening.

OK.

> What I was hoping for in the first instance was a sanity check of our
> installation methods.

I think I commented on those.  If you are going to build your OFED stack
you don't need to install the one we provide.

> With respect to the OFED stack used, we are using the latest official
> software stack supplied by Voltaire. The reason for this is that there
> is more to OFED than just the kernel modules, including many libraries
> and tools,

None of these should be necessary for Lustre to use I/B.

> plus the latest firmware for the cards.

Hrm.  Can you not upgrade firmware independent of upgrading the whole
OFED stack?  That seems very limiting.

> It's what the customer has asked for, and it is what the card vendor
> expects us to do.

Fair enough.  I was just pointing out that you don't need our OFED stack
if you are going to install your own.

> We may be able to get away with OFED 1.3, but I would still like some
> guidance on how to install the rest of the OFED stack

We don't supply the userspace tools because they are not really
necessary for Lustre.

> do we use the OFED source to rebuild everything, or can we pick the
> Lustre supplied kernel modules and just layer on the other stuff
> separately?

Yes, you should be able to do that.  I say that quite generally as I'm
not entirely clear on your operating environment.

> Finally, when I said that one file system fails versus another passes,
> I mean that the server locks solid, crashes, usually with no debug to
> speak of (nothing in the system logs).

Nothing on the console either?

> Even while the system is up and running the lustre kernel, if we
> attempt a clean shutdown, the kernel panics.

Hrm.  A panic is quite different than locking solid with no messages at
all.  A solid lock with no messages is indicative of hardware problems.

> Since I need to rebuild the systems anyway, I will also try to install
> the packages in the order mentioned by Megan Larko, to see how that
> affects the installation.

I'm not entirely convinced of her process.  You should not need to use
--force and reinstall packages already installed.  I'd be more
interested in knowing exactly your installation steps and the errors you
get from it.  Please try to avoid the use of --force so we can see why
it's necessary.  You will have to use "rpm -U" with e2fsprogs though as
she mentions.

Do all of your work with the "script(1)" tool so you can easily log it.

b.
_______________________________________________
Lustre-discuss mailing list
[email protected]
http://lists.lustre.org/mailman/listinfo/lustre-discuss
