On Wed, Dec 20, 2006 at 09:22:21AM -0600, Pappas, Bill wrote:
> I'm looking for some feedback from luster and pvfs users.

I asked Bill to raise this question on the PVFS list, so I'd better respond to him :>

Since this is "home field" so to speak, I won't feel bad speaking positively about PVFS, but I'll try to be objective with respect to Lustre's strengths and weaknesses. I read their mailing lists, and I'm sure they read ours.

> Specifically ---I'm interested in any thoughts on why one would go
> to luster or pvfs for their hpc file system needs.

Here are a few PVFS strengths:

- (Mostly) userspace design makes porting, supporting, installation, and development much easier. The small kernel module we have lets serial applications access PVFS conveniently.
- Tightly integrated driver in ROMIO (a widely deployed MPI-IO implementation).
- Native support for noncontiguous data patterns (similar to MPI datatypes), which are common in scientific applications.

> What fundamentally makes pvfs different from lustre?

That's a hard question to answer without sounding either like a raving Lustre hater or a zealous PVFS fan, but I'll give it a shot:

Lustre appears to be designed first and foremost to be a POSIX file system which could also handle parallel I/O. PVFS was designed first and foremost to be a fast, scalable file system for parallel I/O, which can also handle serial I/O workloads. It's important to stress that there are workloads that are an excellent fit for Lustre, just as other workloads excel on PVFS.

It seems like many people find the PVFS development and user community fairly open. We don't require assignment of copyright from people contributing patches. We have one CVS tree, and while it might have a lot of active branches at a given time, there is no "commercial" version of PVFS hidden from interested parties. We do our best to discuss development questions on public mailing lists. Several of us hang out on IRC (#pvfs2 on irc.freenode.net) and answer questions when people drop in.

> I realize that one may claim (that for specific requirements) luster
> or pvfs may be more suitable or just plain better. So....I'd like
> to know which requirement(s) led you to luster or pvfs?

If your typical application needs to scale to thousands of clients, or you have applications that make use of MPI-IO (or higher-level libraries built on top of MPI-IO, like parallel HDF5 or Parallel-NetCDF), PVFS would be an excellent choice. If your typical application is serial in nature, or you require strict POSIX semantics, Lustre would be a good choice.

> I would definitely like to know any limitations you've seen in either
> fs. Installation complications? Scalabilty. Reliability. Speed.

I have not set up Lustre myself, but reports from many who have suggest it is somewhat more involved than setting up PVFS. We require no kernel patches for PVFS: we've taken great pains to make our kernel module compile standalone against many different kernel.org and vendor kernels. As already pointed out in this thread, PVFS is pretty portable to many different architectures. You can set up a test installation of PVFS on top of any directory you like: no need for a dedicated device (though one wouldn't hurt performance, of course).

We designed PVFS from the beginning to scale well. We've run on IBM Watson's 16k-node Blue Gene system, and plan on deploying PVFS on some of the large Argonne systems we've got in the works.

PVFS reliability seems to be pretty good in practice. Sure, we definitely get bug reports, which we act on as quickly as possible (again, being mostly userspace, we can debug a lot of issues quickly). Hardware failure is responsible for a lot of the outages we see at sites like Argonne.

Speed is decent and getting better. Earlier this year we completed an examination of metadata performance and came up with some new approaches to speed that up. We're currently working on optimizing our I/O rates (they aren't horrible now, but can be improved).

So, that's the high-level discussion of PVFS. Would it make sense in your environment? Well, the way I see it, it doesn't cost anything to download and install PVFS, and the time commitment in setting it up isn't that high either. If you have hardware available and you can set up a test system, that might give you the best idea of whether PVFS is right for you.

If you have any questions or would like to hear more about any of the points above, feel free to ask.

==rob

-- 
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Lab, IL USA                 B29D F333 664A 4280 315B
_______________________________________________
Pvfs2-users mailing list
[email protected]
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-users
