I agree with Jonathan.
In my experience, when researchers store many small files, those files are 
usually either the output of data acquisition (high-speed cameras, microscopes 
or, in my case, a wind tunnel) or a sequence of images produced by a 
simulation, which are later post-processed into a movie or an EnSight/ParaView 
format. When questioned, the researchers will always say "but I would like to 
keep this data available just in case". In reality those files are never 
looked at again. And, as has been said, if you have a tape-based archiving 
system you could end up with thousands of small files spread all over your 
tapes. So it is legitimate to make zips / tars of directories like that.
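
As a rough illustration of what I mean by tarring up such a directory, here is 
a minimal Python sketch using only the standard library. The function name 
bundle_directory and the choice to write the archive next to the source 
directory are my own assumptions for the example, not something from any 
particular site's tooling.

```python
import tarfile
from pathlib import Path


def bundle_directory(src_dir, archive_path=None):
    """Pack src_dir into one .tar.gz; return (archive path, number of files).

    Hypothetical helper for illustration. By default the archive is written
    next to the source directory, e.g. run42/ -> run42.tar.gz. Do not point
    archive_path inside src_dir, or the archive would try to contain itself.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(src)

    if archive_path is None:
        archive = src.parent / (src.name + ".tar.gz")
    else:
        archive = Path(archive_path)

    # Store paths relative to the directory name so the archive unpacks
    # into a single top-level folder.
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)

    # Re-open the archive and count regular files as a basic sanity check.
    with tarfile.open(archive, "r:gz") as tar:
        count = sum(1 for member in tar if member.isfile())

    return archive, count
```

The sketch leaves the original files in place; comparing the returned count 
with the number of files in the directory before deleting anything seems a 
sensible precaution.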

I am intrigued to see that GPFS has a policy facility which can call an 
external program. That is useful.

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of Jonathan Buzzard
Sent: Tuesday, July 25, 2017 11:02 AM
To: gpfsug main discussion list <[email protected]>
Subject: Re: [gpfsug-discuss] GPFS, LTFS/EE and data-in-inode?

On Mon, 2017-07-24 at 11:49 -0400, [email protected] wrote:
> On Mon, 24 Jul 2017 12:43:10 +0100, Jonathan Buzzard said:
>
> > For an archive service how about only accepting files in actual
> > "archive" formats and then severely restricting the number of files
> > a user can have?
> >
> > By archive files I am thinking like a .zip, tar.gz, tar.bz or similar.
>
> After having dealt with users who fill up disk storage for almost 4
> decades now, I'm fully aware of those advantages. :)
>
> ( /me ponders when an IBM 2314 disk pack with 27M of space was "a lot"
> in 1978, and when we moved 2 IBM mainframes in 1989, 400G took 2,500+
> square feet, and now 8T drives are all over the place...)
>
> On the flip side, my current project is migrating 5 petabytes of data
> from our old archive system that didn't have such rules (mostly due to
> politics and the fact that the underlying XFS filesystem uses a 4K
> blocksize so it wasn't as big an issue), so I'm stuck with what people put in 
> there years ago.

I would be tempted to zip up the directories and move them zipped ;-)

JAB.

--
Jonathan A. Buzzard                 Email: jonathan (at) buzzard.me.uk
Fife, United Kingdom.

_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss