We have all of our GPFS metadata on FlashCache devices (née RamSan) and that
helps a lot. We also have our data going into monotonically increasing buckets 
of about 30TB that we call lockers (e.g. locker100, locker101, locker102), with 
1 primary active at a time.

We have an hourly job that scans the 2 most recent lockers (which takes about 45
seconds each) using the ILM 'LIST' policy to generate a list of all files that
have been modified or created in the last hour. That list of file names then
trickles to a custom backup daemon that has up to 10 threads for rsyncing the
files over to our HSM server (running GPFS/TSM space management). From there
things automatically get backed up and archived. Not all hourlies are
necessarily complete (we can't guarantee that nobody is still hanging on to
$lockernum-2, for instance), so we also have a daily job that scans the entire
3PB to find anything created or updated in the last 24 hours and rsyncs that.
Duplicating the hourlies does no harm from the rsync perspective, because rsync
takes care of that (the file already exists on the destination). The daily job
takes about 45 minutes. Needless to say, it would be impossible without
metadata on a fast flash device.
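
For anyone who wants to picture the hourly flow, here is a minimal sketch in Python. It is not our actual daemon. It assumes the policy scan has already produced a plain list of paths relative to the locker root, and the function names, the batch size, and the example paths are all made up for illustration. The only rsync options used are -a and --files-from (reading the list from stdin).

```python
import re
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Lockers are named locker100, locker101, ... (numbering only ever increases).
LOCKER_RE = re.compile(r"^locker(\d+)$")


def newest_lockers(names, count=2):
    """Return the `count` highest-numbered lockers, newest first."""
    numbered = []
    for name in names:
        match = LOCKER_RE.match(name)
        if match:
            numbered.append((int(match.group(1)), name))
    numbered.sort(reverse=True)
    return [name for _, name in numbered[:count]]


def split_batches(paths, batch_size):
    """Split the file list into chunks, one rsync invocation per chunk."""
    return [paths[i:i + batch_size] for i in range(0, len(paths), batch_size)]


def rsync_batch(batch, src_root, dest):
    """Copy one batch of paths (relative to src_root); return rsync's exit code."""
    cmd = ["rsync", "-a", "--files-from=-", src_root, dest]
    result = subprocess.run(cmd, input="\n".join(batch) + "\n", text=True)
    return result.returncode


def sync_all(paths, copy_fn, batch_size=1000, max_workers=10):
    """Run copy_fn over every batch using at most max_workers threads."""
    batches = split_batches(paths, batch_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(copy_fn, batches))
```

A call might look like sync_all(paths, lambda b: rsync_batch(b, "/gpfs/locker102/", "hsm:/archive/locker102/")), where both paths are placeholders. Re-running a batch is harmless for the same reason the daily job is: rsync leaves files alone when the destination copy is already current.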



Sent from my android device.

-----Original Message-----
From: "Kallback-Rose, Kristy A" <[email protected]>
To: gpfsug main discussion list <[email protected]>
Sent: Sun, 25 Oct 2015 22:39
Subject: [gpfsug-discuss] ILM and Backup Question

Simon wrote recently in the GPFS UG Blog: "We also got into discussion on 
backup and ILM, and I think its amazing how everyone does these things in their 
own slightly different way. I think this might be an interesting area for 
discussion over on the group mailing list. There's a lot of options and 
different ways to do things!"

Yes, please! I’m *very* interested in what others are doing.

We (IU) are currently doing a POC with GHI for DR backups (GHI = GPFS HPSS
Integration; we have had HPSS for a very long time), but I’m interested in what
others are doing with either ILM or other methods to brew their own backup 
solutions, how much they are backing up and with what regularity, what 
resources it takes, etc.

If you have anything going on at your site that’s relevant, can you please 
share?

Thanks,
Kristy

Kristy Kallback-Rose
Manager, Research Storage
Indiana University
_______________________________________________
gpfsug-discuss mailing list
gpfsug-discuss at spectrumscale.org
http://gpfsug.org/mailman/listinfo/gpfsug-discuss
