Very nice insight. Thanks Brian, Cliff, and Phil!
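
For the filesystem-level (ldiskfs) comparison Brian suggests below, here is a rough sketch of how it might be scripted. It is only an illustration under some assumptions: the restored MDT and a quiesced copy of the production MDT (for example an LVM snapshot) are both mounted read-only as ldiskfs, and the mount points in the usage line are hypothetical. The function names are mine, not part of any Lustre tool. It compares entry names, mode, owner, regular-file size, symlink targets and extended attributes; reading the trusted.* attributes generally requires root.

```python
import os
import stat
import sys


def describe(path):
    """Collect the metadata we compare for one entry (symlinks not followed)."""
    st = os.lstat(path)
    info = {"mode": st.st_mode, "uid": st.st_uid, "gid": st.st_gid}
    if stat.S_ISREG(st.st_mode):
        info["size"] = st.st_size
    if stat.S_ISLNK(st.st_mode):
        info["target"] = os.readlink(path)
    # Extended attributes are Linux-only in Python; if they cannot be read
    # for this entry, compare it without them rather than failing.
    xattrs = {}
    try:
        for name in os.listxattr(path, follow_symlinks=False):
            xattrs[name] = os.getxattr(path, name, follow_symlinks=False)
    except (OSError, AttributeError):
        xattrs = {}
    info["xattrs"] = xattrs
    return info


def snapshot(root):
    """Map every path below root (relative to root) to its metadata."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries[os.path.relpath(full, root)] = describe(full)
    return entries


def compare_trees(restored_root, production_root):
    """Return (only_in_restored, only_in_production, differing) path lists."""
    restored = snapshot(restored_root)
    production = snapshot(production_root)
    only_restored = sorted(set(restored) - set(production))
    only_production = sorted(set(production) - set(restored))
    differing = sorted(
        p for p in set(restored) & set(production)
        if restored[p] != production[p]
    )
    return only_restored, only_production, differing


if __name__ == "__main__":
    # Hypothetical usage: compare_mdt.py /mnt/mdt_restored /mnt/mdt_snapshot
    extra, missing, differing = compare_trees(sys.argv[1], sys.argv[2])
    for label, paths in (("only in restored", extra),
                         ("only in production", missing),
                         ("metadata differs", differing)):
        for p in paths:
            print(f"{label}: {p}")
```

If the production side is a live MDT rather than a snapshot, some differences are to be expected from transactions that happened after the backup was taken.
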

On Thu, Aug 7, 2008 at 11:14 AM, Brian J. Murrell <[EMAIL PROTECTED]> wrote:
> On Thu, 2008-08-07 at 10:51 -0700, Cliff White wrote:
>>
>> Just to be clear, there is a potential data loss issue due to the time
>> delta between the backup and the live system. Any transactions in play
>> that miss the snapshot could result in lost data, as the MDS will replay
>> transaction logs and delete orphans on startup. So testing on your live
>> system definitely is for the brave.
>
> Indeed. There are a couple of alternatives to consider. I know your
> production MO will be to take an LVM snapshot of the running MDT and
> back that up, but if the MDT (i.e. filesystem) were shut down prior to
> the backup, what you restore should be an identical MDT which you could
> then start the filesystem against without the risks of in-play
> transactions and orphan deletion. But indeed it is not a 100%
> reproduction of what would happen restoring from an in-production
> backup.
>
> Alternatively, rather than trying to start the OSTs against the restored
> MDT you could simply do a filesystem level (i.e. ldiskfs) comparison of
> the restored MDT against the production MDT.
>
> Indeed, there are other variations that you could use to satisfy
> yourself that the restore worked.
>
> I would highly suggest you do any of this testing either on a testbed
> (which you could build with a VirtualBox virtual cluster) or on your
> production system before you put production data on it. It is good
> system deployment policy to have fully tested backup and restore
> policies before going live anyway.
>
> b.
>
> _______________________________________________
> Lustre-discuss mailing list
> [email protected]
> http://lists.lustre.org/mailman/listinfo/lustre-discuss
