On 06/12/2018 04:06 AM, John Hearns via Beowulf wrote:
In the topic on avoiding fragmentation Chris Samuel wrote:
>Our trick in Slurm is to use the slurmd prolog script to set an XFS project
>quota for that job ID on the per-job directory (created by a plugin which
>also makes subdirectories there that it maps to /tmp and /var/tmp for the
>job) on the XFS partition used for local scratch on the node.
I had never thought of that, and it is a very neat thing to do.
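A minimal sketch of what such a prolog might look like (the mount point,
the 100g limit, and re-using the job ID as the XFS project ID are all my
own assumptions, not necessarily what Chris's site does):

  #!/bin/bash
  # Sketch of a slurmd prolog: per-job scratch directory capped by an
  # XFS project quota.  /local_scratch and the limit are placeholders.
  SCRATCH=/local_scratch
  JOBDIR=$SCRATCH/job_$SLURM_JOB_ID
  mkdir -p "$JOBDIR/tmp" "$JOBDIR/var_tmp"
  chown -R "$SLURM_JOB_UID" "$JOBDIR"
  # Re-use the numeric job ID as the XFS project ID, attach it to the
  # per-job directory, then set a hard block limit for that project.
  xfs_quota -x -c "project -s -p $JOBDIR $SLURM_JOB_ID" $SCRATCH
  xfs_quota -x -c "limit -p bhard=100g $SLURM_JOB_ID" $SCRATCH

Mapping those subdirectories onto /tmp and /var/tmp inside the job would
be handled by the separate plugin Chris mentions.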
What I would like to discuss is the more general topic of clearing files
from 'fast' storage.
Many sites I have seen have dedicated fast/parallel storage which is
referred to as scratch space.
The intention is to use this scratch space for the duration of a
project, as it is expensive.
However, I have often seen that the scratch space is used as permanent
storage, contrary to the intentions of whoever sized it and paid for it.
I feel that the simplistic 'run a cron job and delete files older than N
days' approach is outdated.
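(For concreteness, that policy is typically nothing more than a nightly
cron entry along these lines, with the path and the 30-day window being
site-specific placeholders:

  # /etc/cron.d/scratch-sweep (hypothetical): delete scratch files
  # that have not been modified in 30 days.
  0 3 * * * root find /scratch -xdev -type f -mtime +30 -delete
)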
My personal take is that hierarchical storage is the answer,
automatically pushing files to slower and cheaper tiers.
Disclaimer: I work for one such parallel and fast (and often used for
scratch) storage company, Panasas.
I disagree with the notion that hierarchical storage is a silver bullet
in many HPC-oriented cases. In fact, I would argue in some environments
it poses serious risks to being able to keep a lid on your storage cost
and DC footprint, whether that's for scratch, home, or archive storage.
People (including myself in my own systems testing) can generate an
enormous amount of data that has near-zero value past the project
currently being worked on, which may only be measured in weeks or low
numbers of months. In many of my cases it's cheaper to regenerate those
many TBs of data than to hold onto it for a year or more. Auto-tiering
scratch data to cheaper storage as it gets colder seems like an easy
answer as it takes some of this responsibility away from the users, but
you'll still want to /someday/ ditch that data entirely (for
scratch-like data that is). Culling through piles of likely
mechanically named files you haven't looked at in a long time is
difficult as a human exercise, and without sufficiently complex media
asset management it's also difficult from a storage perspective as your
data may take a /long/ time to even list, much less grep through, when
pulling from true archive storage.
For true scratch, I think the approach presented by many of the posters,
automatic deletion policies managed by administrators, which develop and
enforce good data habits, is ultimately the cleanest long-term solution.
Now, tiering within a storage layer based on different types or access
frequencies of data is perfectly reasonable and is something we do in
our systems. Using external software to automatically tier cold but
persistent data (e.g., home directory data) from fast to archive storage
is also reasonable. But there are a lot of pitfalls in trying to
automatically tier data that isn't supposed to be (eternally) persistent.
Ellis H. Wilson III, Ph.D.