Big vote for cron jobs.

Our snapshots are created by a script installed on each GPFS node. The script
handles naming, removing old snapshots, checking that sufficient disk space
exists before creating a snapshot, etc. We take snapshots every 15 minutes
and thin them out as they age, keeping fewer per interval over longer
periods. For example:

        current hour:           keep 4 snapshots
        hours -2 .. -8          keep 3 snapshots per hour
        hours -8 .. -24         keep 2 snapshots per hour
        days -1 .. -5           keep 1 snapshot per hour
        days -5 .. -15          keep 4 snapshots per day
        days -15 .. -30         keep 1 snapshot per day
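
As a rough sketch of how that table could be encoded (this is not our actual
script), the function below maps a snapshot's age in minutes to a "keep N per
window" rule. The function name and the output format are made up for
illustration. Two things are assumptions because the table does not say:
the hour between the current hour and hour -2 is treated like the 3-per-hour
band, and anything older than 30 days is expired.

```shell
#!/bin/sh
# Map a snapshot's age in minutes to "<keep> <window_minutes>": how many
# snapshots to keep per window of that length. "0 0" means expire.
# Band boundaries follow the example table; the treatment of ages between
# 1 and 2 hours and the expiry after 30 days are assumptions.
retention_rule() {
    age_min=$1
    if   [ "$age_min" -lt 60 ];    then echo "4 60"     # current hour
    elif [ "$age_min" -lt 480 ];   then echo "3 60"     # up to 8 hours
    elif [ "$age_min" -lt 1440 ];  then echo "2 60"     # up to 24 hours
    elif [ "$age_min" -lt 7200 ];  then echo "1 60"     # up to 5 days
    elif [ "$age_min" -lt 21600 ]; then echo "4 1440"   # up to 15 days
    elif [ "$age_min" -lt 43200 ]; then echo "1 1440"   # up to 30 days
    else                                echo "0 0"      # older: expire
    fi
}

retention_rule 45      # prints "4 60"
retention_rule 3000    # prints "1 60"
```

A cleanup pass would then group existing snapshots by window and delete all
but the first N in each group.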

The duration, frequency, and minimum disk space can be adjusted per filesystem.
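
To give an idea of what the free-space check and naming might look like, here
is a minimal sketch, not our production script. The variable names, the
device and mount point, the 1 GiB threshold, and the "auto-" naming scheme
are all hypothetical. It assumes free space is read with POSIX "df -Pk" and
that the snapshot is created with "mmcrsnapshot Device SnapshotName".

```shell
#!/bin/sh
# Hypothetical per-filesystem settings; names and values are illustrative.
FS_DEVICE="gpfs0"           # GPFS device name (assumed)
FS_MOUNT="/gpfs/gpfs0"      # mount point (assumed)
MIN_FREE_KB=1048576         # skip snapshots below 1 GiB free (assumed)

# Succeeds if the filesystem holding path $1 has at least $2 KiB available.
has_min_free_kb() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$2" ]
}

# Timestamped snapshot name such as auto-20240101-1215 (scheme assumed).
snapshot_name() {
    date +auto-%Y%m%d-%H%M
}

# Create a snapshot only when enough space is free; otherwise report and fail.
create_snapshot() {
    if has_min_free_kb "$FS_MOUNT" "$MIN_FREE_KB"; then
        mmcrsnapshot "$FS_DEVICE" "$(snapshot_name)"
    else
        echo "skipping snapshot of $FS_DEVICE: low free space" >&2
        return 1
    fi
}
```

The functions are only defined here; a real script would call
create_snapshot once per configured filesystem.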

The automation is done through a cron job that runs on each GPFS (DSS-G)
server and creates the snapshot only if the node is currently the cluster
manager, as in:

        */15 * * * * root mmlsmgr -Y | grep -q "clusterManager.*:$(hostname --long):" && /path/to/snapshotter

This requires no locking and ensures that only one node creates a snapshot
at each time interval.

We use the same trick to gather GPFS health stats, etc., ensuring that the data 
collection only runs on a single node (the cluster manager).
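
If the same guard is needed for several jobs, it could be factored into a
small wrapper along these lines. This is a sketch under assumptions: the
function names are hypothetical, and it relies only on what the cron entry
above already relies on, namely that "mmlsmgr -Y" prints a line containing
"clusterManager" followed later by ":<fqdn>:". Unlike the grep, it matches
the hostname literally, so the dots in the FQDN are not regex wildcards.

```shell
#!/bin/sh
# Succeeds if the `mmlsmgr -Y` output on stdin names host $1 as the
# cluster manager: a line with "clusterManager" followed by ":<host>:".
names_cluster_manager() {
    awk -v host=":$1:" '
        {
            pos = index($0, "clusterManager")
            if (pos > 0 && index(substr($0, pos), host) > 0) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Run the given command only when this node is the cluster manager;
# do nothing (and exit 0) on every other node.
run_if_cluster_manager() {
    if mmlsmgr -Y | names_cluster_manager "$(hostname --long)"; then
        "$@"
    fi
}
```

With that saved as a script that ends in run_if_cluster_manager "$@", each
cron entry would just pass the job to run as arguments.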


-- 
Mark Bergman                                           voice: 215-746-4061
mark.berg...@pennmedicine.upenn.edu                      fax: 215-614-0266
http://www.med.upenn.edu/cbica/
IT Technical Director, Center for Biomedical Image Computing and Analytics
Department of Radiology                         University of Pennsylvania


_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
