> You seem to have conflicting wants/needs. First you said this:
>
> --- On Wed, 7/1/09, Edward Ned Harvey <[email protected]> wrote:
> > I have a bunch of compute servers. They all have local disks
> > mounted as /scratch to use for computation scratch space. This ensures
> > maximum performance on all systems, and no competition for a shared
> > resource during crunch time. At present, all of their /scratch
> > directories are local, separate and distinct.
>
> Then this:
> > I think it would be awesome
> > if /scratch looked the same on all systems.
>
> Does "look the same" mean configured the same? You didn't really expand
> on this statement and clarify the goal, which I'm not sure is
> uniformity, accessibility, or a combo of both.
You're right - although it was clear in my mind, I see how that was confusing. Let me try again: if you go into some directory and run "ls" (or whatever), the results should be the same regardless of which machine you're on.

I do not want a centralized network file server, because of the bandwidth and disk-space bottleneck. I want a distributed filesystem, which would provide the aforementioned ubiquity of namespace, but would also allow heavy IO on some machine without necessitating heavy network traffic. A minimal amount of traffic is probably required, just so the other machines are all aware that a file exists, but the file contents themselves need not traverse the network until some other machine requests them.

> You named the storage "/scratch", implying it is just a temporary usage
> space. Are you possibly adding requirements here that are unnecessary?

I did not mean to imply scratch is temporary. You see, we already have an NFS server, which is backed up, so I named the local directory "scratch" so users know it's not backed up.

> We have similar HPC systems that write results to local disk space.
> When the computation is completely done, the results are rsynced to
> separate network accessible storage space; the local space is then
> reclaimed for the next job. The rsync is controlled by LSF scripts, but
> any job management system will have similar capabilities. The network
> available results can then be perused by engineers. If they want to
> keep the results around permanently, they move the results at their
> discretion to longer term storage. Anything that isn't moved by the
> engineers within 7 days is considered unimportant, and deleted.
>
> Would that paradigm work for you?

I have done exactly the same in the past - er - I should say the users have done the same. It's acceptable. In fact, what we have now is also acceptable. I'm just trying to make it better.
(And learn more.) The two downsides of the above are the limited disk space on the NFS server, and the fact that, with a bunch of machines all pushing their results up to the NFS server at once, performance does become an issue.

What we have now is as follows: each machine has a local disk mounted as /scratch, and each machine exports it. Each machine also has an automount directory, /scratches, so you can access any scratch directory from any machine. For example, /scratches/machinename is the NFS mountpoint for machinename:/scratch.

This is almost ideal - except that in order to access the data that was generated last night, the user must know on which machine that data was created. Acceptable, but still could be cooler. :-)

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
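In case it helps anyone replicate the setup above: on Linux the /scratches automount can be expressed as an autofs indirect map with a wildcard key. A sketch, assuming standard autofs file locations; the mount options are my guess, not necessarily what we run:

```
# /etc/auto.master - attach the indirect map at /scratches
/scratches  /etc/auto.scratches

# /etc/auto.scratches - wildcard map: the lookup key ("&") is the
# hostname, so "cd /scratches/machinename" NFS-mounts
# machinename:/scratch on demand
*  -fstype=nfs,rw,hard,intr  &:/scratch
```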
