Hi Jim,
If you have never worked with policy rules before, you may want to start by
building up your confidence with them.
In the /usr/lpp/mmfs/samples/ilm path you will find several template examples
that you can play around with. I would start with the 'list' rules first.
Some of those templates are a bit complex, so here is one script that I use on
a regular basis to detect files larger than 1MB (you can even exclude specific
filesets):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dss-mgt1:/scratch/r/root/mmpolicyRules # cat mmpolicyRules-list-large
/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])
/* Define an external list */
RULE EXTERNAL LIST 'largefiles' EXEC
'/gpfs/fs0/scratch/r/root/mmpolicyRules/mmpolicyExec-list'
/* Generate a list of all files that have more than 1MB of space allocated. */
RULE 'r2' LIST 'largefiles'
SHOW('-u' vc(USER_ID) || ' -s' || vc(FILE_SIZE))
/*FROM POOL 'system'*/
FROM POOL 'data'
/*FOR FILESET('root')*/
WEIGHT(FILE_SIZE)
WHERE KB_ALLOCATED > 1024
/* Files in special filesets, such as mmpolicyRules, are never moved or deleted */
RULE 'ExcSpecialFile' EXCLUDE
FOR FILESET('mmpolicyRules','todelete','tapenode-stuff','toarchive')
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
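In case you have not driven one of these before: I feed the rules file to
mmapplypolicy. A rough sketch of the invocation is below; the device name 'fs0'
and the output prefix are just placeholders for your cluster, and '-I defer'
writes the list files without calling the EXEC script (double-check the options
against the mmapplypolicy man page on your release).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Placeholder invocation, not copied from my shell history:
#   -P        points at the rules file above
#   -f        sets the prefix for the generated list files
#   -I defer  produces the lists without invoking the external EXEC script
mmapplypolicy fs0 \
    -P /gpfs/fs0/scratch/r/root/mmpolicyRules/mmpolicyRules-list-large \
    -f /gpfs/fs0/scratch/r/root/mmpolicyRules/output \
    -I defer
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~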
And here is another one to detect files not looked at for more than 6 months. I
found it more effective to use atime and ctime together. You could combine this
with the one above to detect file size as well (see the combined WHERE clause
sketched after the listing below).
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
dss-mgt1:/scratch/r/root/mmpolicyRules # cat mmpolicyRules-list-atime-ctime-gt-6months
/* A macro to abbreviate VARCHAR */
define([vc],[VARCHAR($1)])
/* Define an external list */
RULE EXTERNAL LIST 'accessedfiles' EXEC
'/gpfs/fs0/scratch/r/root/mmpolicyRules/mmpolicyExec-list'
/* Generate a list of all files, directories, plus all other file system objects,
   like symlinks, named pipes, etc., accessed prior to a certain date AND that are
   not owned by root. Include the owner's id with each object and sort them by
   the owner's id */
/* Files in special filesets, such as mmpolicyRules, are never moved or deleted */
RULE 'ExcSpecialFile' EXCLUDE
FOR FILESET ('scratch-root','todelete','root')
RULE 'r5' LIST 'accessedfiles'
DIRECTORIES_PLUS
FROM POOL 'data'
SHOW('-u' vc(USER_ID) || ' -a' || vc(ACCESS_TIME) || ' -c' ||
vc(CREATION_TIME) || ' -s ' || vc(FILE_SIZE))
WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 183) AND
(DAYS(CURRENT_TIMESTAMP) - DAYS(CREATION_TIME) > 183) AND NOT USER_ID = 0
AND NOT (PATH_NAME LIKE '/gpfs/fs0/scratch/r/root/%')
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
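To combine the two, as mentioned above, you would just add the size test from the
first script to the WHERE clause of rule 'r5'. A sketch only (the 1MB threshold is
the same placeholder as before):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/* Age tests from the rule above plus the size test from the first script */
WHERE KB_ALLOCATED > 1024
  AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 183)
  AND (DAYS(CURRENT_TIMESTAMP) - DAYS(CREATION_TIME) > 183)
  AND NOT USER_ID = 0
  AND NOT (PATH_NAME LIKE '/gpfs/fs0/scratch/r/root/%')
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~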
Note that both these scripts work on a system-wide (or root-fileset) basis, and
will not give you per-directory results unless you run them several times against
specific directories (not very efficient). To produce per-directory lists you
would need to do some post-processing on the output, with 'awk' or some other
scripting language. If you need some samples I can send you a few.
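Just to give you a flavour of that post-processing, here is a rough awk sketch,
not one of my production scripts. It assumes each list record ends with
' -- <path>' and that the SHOW() output carries an '-s' size field, and it totals
bytes per directory prefix; adjust the field handling to whatever your list files
actually contain.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# 'list.largefiles' is a placeholder name for the list file mmapplypolicy produced
awk -F' -- ' '
{
    path = $2                     # everything after " -- " is the path
    n = split($1, f, " ")         # everything before it holds the SHOW() fields
    size = 0
    for (i = 1; i <= n; i++) {
        if (f[i] == "-s")            size = f[i+1]           # "-s 12345" form
        else if (f[i] ~ /^-s[0-9]/)  size = substr(f[i], 3)  # "-s12345" form
    }
    split(path, p, "/")           # aggregate on the first few path components,
    dir = "/" p[2] "/" p[3] "/" p[4] "/" p[5]   # e.g. /gpfs/fs0/scratch/r
    total[dir] += size
}
END { for (d in total) printf "%15d  %s\n", total[d], d }
' list.largefiles | sort -rn
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~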
And finally, you need to be more specific about what you mean by 'archivable'.
Once you produce the lists you can do several things with them, or leverage the
rules to actually execute things, such as move, delete, or HSM stuff. The
/usr/lpp/mmfs/samples/ilm path has some samples of that as well.
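For example, the 'do something' variants look roughly like this (a sketch only,
not from my production rules; the pool names and thresholds are placeholders):
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/* Move files not accessed in a year from the 'data' pool to a slower 'capacity' pool */
RULE 'mig_old' MIGRATE FROM POOL 'data' TO POOL 'capacity'
  WHERE (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 365)

/* Delete scratch leftovers not touched in two years */
RULE 'del_tmp' DELETE FROM POOL 'data'
  WHERE (PATH_NAME LIKE '%/tmp/%')
    AND (DAYS(CURRENT_TIMESTAMP) - DAYS(ACCESS_TIME) > 730)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~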
On 4/3/2020 18:25:33, Jim Kavitsky wrote:
Hello everyone,
I'm managing a low-multi-petabyte Scale filesystem with hundreds of millions of
inodes, and I'm looking for the best way to locate archivable directories. For
example, these might be directories whose contents are greater than 5 or 10 TB
and whose contents have atimes older than two years.
Has anyone found a great way to do this with a policy engine run? If not, is
there another good way that anyone would recommend? Thanks in advance,
Yes, there is another way: the 'mmfind' utility, also in the same samples path.
You have to compile it for your OS (see mmfind.README). It is a very powerful
canned procedure that lets you use the "-exec" option just as in the normal Linux
version of 'find'. I use it very often, and it's just as efficient as the
policy-rules-based alternative above.
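As an illustration only (not copied from my history; check mmfind.README for the
predicates your build actually supports, and treat the path and threshold as
placeholders), an invocation would look something like:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# list files under scratch not accessed in roughly two years, find-style
./mmfind /gpfs/fs0/scratch -atime +730 -exec ls -ld {} \;
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~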
Good luck.
Keep safe and confined.
Jaime
Jim Kavitsky
************************************
TELL US ABOUT YOUR SUCCESS STORIES
http://www.scinethpc.ca/testimonials
************************************
---
Jaime Pinto - Storage Analyst
SciNet HPC Consortium - Compute/Calcul Canada
www.scinet.utoronto.ca - www.computecanada.ca
University of Toronto
661 University Ave. (MaRS), Suite 1140
Toronto, ON, M5G1M1
P: 416-978-2755
C: 416-505-1477
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss