Re: [Beowulf] Clearing out scratch space

2018-06-13 Thread Matt Wallis
On 12/06/2018 6:06 PM, John Hearns via Beowulf wrote: My personal take is that hierarchical storage is the answer, automatically pushing files to slower and cheaper tiers. This is my preference as well; if manual intervention is required, it won't get done, but you do need to tune it a fair

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Jeff White
We also use a "workspace" mechanism to control our scratch filesystem.  It's a home-grown system which works like so: 0. /scratch filesystem configured with root as its owner and no other write permission allowed. 1. User calls mkworkspace which does a setuid, creates the directory, chowns i
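
Jeff's mkworkspace tool is home-grown and not shown in the message; as a very rough sketch of the idea, substituting sudo for the setuid binary and using hypothetical paths, the helper might look like this:

    #!/bin/bash
    # mkworkspace sketch (hypothetical, not the site's actual tool): invoked
    # via sudo so it can create a directory inside the root-owned /scratch
    # and then hand it over to the calling user.
    set -euo pipefail
    user="${SUDO_USER:?run this through sudo}"
    ws="/scratch/${user}-$(date +%Y%m%d-%H%M%S)"
    mkdir -p "$ws"          # create the workspace directory as root
    chown "$user": "$ws"    # chown it to the requesting user
    chmod 0700 "$ws"        # keep it private to that user
    echo "$ws"              # print the path so job scripts can capture it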

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Ellis H. Wilson III
On 06/12/2018 10:25 AM, John Hearns via Beowulf wrote: > Disclaimer: I work for one such parallel and fast (and often used for scratch) company called Panasas. Ellis, I know Panasas well of course.  You are a great bunch of guys and girls, and have pulled my chestnuts from the fire many times (

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread John Hearns via Beowulf
> Disclaimer: I work for one such parallel and fast (and often used for scratch) company called Panasas. Ellis, I know Panasas well of course. You are a great bunch of guys and girls, and have pulled my chestnuts from the fire many times (such as the plaintive call from the customer - we can't acc

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Ellis H. Wilson III
On 06/12/2018 04:06 AM, John Hearns via Beowulf wrote: In the topic on avoiding fragmentation, Chris Samuel wrote: >Our trick in Slurm is to use the slurmd prolog script to set an XFS project >quota for that job ID on the per-job directory (created by a plugin which >also makes subdirectories t
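
Chris's plugin and prolog are not quoted in full here; a minimal sketch of the XFS project-quota idea (the mount point, the 200g limit, and reusing the job ID as the project ID are my assumptions, not his script) might be:

    #!/bin/bash
    # Sketch of a slurmd prolog fragment: per-job directory on an XFS
    # /scratch mounted with prjquota, capped by a project quota.
    set -euo pipefail
    SCRATCH=/scratch
    JOBDIR="$SCRATCH/job-$SLURM_JOB_ID"
    mkdir -p "$JOBDIR"
    chown "$SLURM_JOB_UID" "$JOBDIR"
    # Reuse the numeric job ID as the XFS project ID, then set a hard limit.
    xfs_quota -x -c "project -s -p $JOBDIR $SLURM_JOB_ID" "$SCRATCH"
    xfs_quota -x -c "limit -p bhard=200g $SLURM_JOB_ID" "$SCRATCH"

A matching epilog can drop the limit and remove the directory when the job ends.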

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Nick Evans
At ${JOB-1} we used local scratch and tmpwatch. This had a wrapper script that would exclude files and folders for any user currently running a job on the node. This way nothing got removed until the user's job had finished, even if they hadn't accessed the files for a while, and you don't have
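
Nick's wrapper script is not included; a sketch along those lines (the squeue query, the 14-day threshold and the /scratch path are assumptions) could be:

    #!/bin/bash
    # Sketch of a tmpwatch wrapper: skip files owned by any user who still
    # has a job running on this node, purge the rest after 14 idle days.
    set -eo pipefail
    excludes=()
    for u in $(squeue -h -w "$(hostname -s)" -o '%u' | sort -u); do
        excludes+=("--exclude-user=$u")
    done
    tmpwatch "${excludes[@]}" 14d /scratch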

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Skylar Thompson
On Tue, Jun 12, 2018 at 10:06:06AM +0200, John Hearns via Beowulf wrote: > What do most sites do for scratch space? We give users access to local disk space on nodes (spinning disk for older nodes, SSD for newer nodes), which (for the most part) GE will address with the $TMPDIR job environment var
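
For completeness, the node-local $TMPDIR pattern in a Grid Engine job script looks roughly like this (the input and solver names are hypothetical):

    #!/bin/bash
    #$ -N tmpdir-example
    #$ -cwd
    # GE creates a per-job directory on node-local disk, exports it as
    # $TMPDIR, and removes it when the job ends, so there is nothing left
    # to clear out afterwards.
    cp input.dat "$TMPDIR/"
    cd "$TMPDIR"
    "$SGE_O_WORKDIR/my_solver" input.dat > result.out
    cp result.out "$SGE_O_WORKDIR/"    # copy results back before the job exits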

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Chris Samuel
On Tuesday, 12 June 2018 6:06:06 PM AEST John Hearns via Beowulf wrote: > What do most sites do for scratch space? At ${JOB-1} we used GPFS and so for the scratch filesystem we used the GPFS policy engine to identify and remove files that had not been read/written for more than the defined numb
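
The policy itself is not quoted; as a sketch of the approach (the 30-day threshold and the /scratch path are assumptions), a purge rule driven by mmapplypolicy could look like:

    #!/bin/bash
    # Sketch: delete /scratch files not accessed for 30 days using the GPFS
    # policy engine. Dry-run with -I test before switching to -I yes.
    cat > /tmp/purge-scratch.pol <<'EOF'
    /* remove scratch files whose access time is older than 30 days */
    RULE 'purge_old_scratch' DELETE
      WHERE PATH_NAME LIKE '/scratch/%'
        AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME)) > 30
    EOF
    mmapplypolicy /scratch -P /tmp/purge-scratch.pol -I test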

Re: [Beowulf] Clearing out scratch space

2018-06-12 Thread Dmitri Chubarov
Hello, John. At HLRS they have what they call a Workspace mechanism (https://wickie.hlrs.de/platforms/index.php/Workspace_mechanism), where each user creates a scratch directory for their project under $SCRATCH_ROOT that has an end-of-life time encoded in the name, and a symlink to this directory in t
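
Day-to-day use of that mechanism looks roughly like the following (command names follow the openly available hpc-workspace tools that implement this scheme; exact options are site-dependent):

    # Allocate a workspace that expires in 30 days; ws_allocate prints its path.
    ws=$(ws_allocate my_project 30)
    cd "$ws"
    # ... stage data and run jobs here ...
    ws_list                    # list workspaces and their remaining lifetime
    ws_extend my_project 30    # push out the expiry if the data is still needed
    ws_release my_project      # hand the workspace back early when done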