On Wed, 1 Jul 2009 19:29:43 -0400, Edward Ned Harvey wrote:

Ned> > You seem to have conflicting wants/needs. First you said this:
Ned> >
Ned> > --- On Wed, 7/1/09, Edward Ned Harvey <[email protected]> wrote:
Ned> > > I have a bunch of compute servers. They all have local disks
Ned> > > mounted as /scratch to use for computation scratch space. This ensures
Ned> > > maximum performance on all systems, and no competition for a shared
Ned> > > resource during crunch time. At present, all of their /scratch
Ned> > > directories are local, separate and distinct.
Ned> >
Ned> > Then this:
Ned> > > I think it would be awesome
Ned> > > if /scratch looked the same on all systems.
Ned> >
Ned> > Does "look the same" mean configured the same? You didn't really expand
Ned> > on this statement and clarify the goal, which I'm not sure is
Ned> > uniformity, accessibility, or a combo of both.
Ned> You're right - although it was clear in my mind, I see how that was
Ned> confusing. Let me try again:
Ned>
Ned> If you go into some directory and do "ls" (or whatever), then the results
Ned> should be the same regardless of which machine you're on. I do not want a
Ned> centralized network file server, because of the bandwidth and diskspace
Ned> bottleneck. I want a distributed filesystem, which would provide the
Ned> aforementioned ubiquity of namespace, but also allow you to do heavy
Ned> IO on some machine without necessitating heavy network traffic. A minimal
Ned> amount of traffic is probably required, just so the other machines all
Ned> know some file exists, but the file contents themselves do not need to
Ned> traverse the network until some other machine requests them.

Well, there are two ways to go about it. Either you have a single namespace
backed by a single canonical copy (which can be paired with something like
cachefs to improve performance via local disk caching), or you have a single
namespace backed by a distributed canonical copy.

There is a pile of filesystems that work roughly this way, with a variety of
tradeoffs. PVFS and Lustre do this, as does GPFS (which costs $$ but works
well in our experience).

Distributed filesystems get tricky for two reasons: metadata (for example,
walking the directory hierarchy can involve talking to multiple servers,
depending on how metadata is distributed) and locking, where you need to
start worrying about things like consensus protocols. PVFS doesn't do
locking, so the latter issue isn't a problem for it; it gets improved
performance and robustness in exchange. Any of these will usually stripe
files across multiple servers to improve performance.

A big issue with a filesystem that works as you described above is that the
failure of any participating machine would potentially take a chunk out of
the filesystem.
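The striping idea is easy to picture: a file is cut into fixed-size stripes
dealt round-robin across the I/O servers, so one big read or write spreads
over all of them. A minimal sketch (the server names and stripe size here are
made up for illustration, not any particular filesystem's layout):

```python
# Round-robin striping sketch: map a byte offset within a file to the
# server holding it. Stripe size and server names are hypothetical.

STRIPE_SIZE = 64 * 1024          # 64 KiB stripes
SERVERS = ["io0", "io1", "io2"]  # hypothetical I/O servers

def locate(offset):
    """Return (server, stripe index, offset within stripe) for a byte offset."""
    stripe = offset // STRIPE_SIZE
    server = SERVERS[stripe % len(SERVERS)]
    return server, stripe, offset % STRIPE_SIZE

# A 200 KiB sequential read starting at offset 0 touches every server,
# which is where the aggregate-bandwidth win comes from:
touched = {locate(o)[0] for o in range(0, 200 * 1024, STRIPE_SIZE)}
print(sorted(touched))  # ['io0', 'io1', 'io2']
```

It also makes the failure problem above concrete: lose "io1" and you lose a
stripe out of the middle of every large file, which is why replication starts
to matter as the machine count grows.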
This could include partial or complete files, or worse, whole directories
out of the hierarchy. So once you start ramping up the client count, you
want to start looking at replication management.

You could also look at something like Hadoop's HDFS. I don't know a ton
about it, but it works hard to be robust (for example, by ensuring multiple
copies of each file exist across your network), so I would assume (though I
don't know) that it ends up being slower than the options above.

There are probably a variety of solutions that would give you 70% of what
you want, but I don't know of any filesystem that would give you everything.
This is a surprisingly hard problem to solve well. Tradeoffs abound.

-nld

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
