Edward Ned Harvey (lopser) wrote:
> > Thanks to everyone who commented on this. My choices seem to be
> > narrowing down to two categories:
> >
> > #1. NDMP to either a dedup device (Data Domain, etc.) or to ZFS with
> > lots of disk (for data integrity). This works nicely in that it backs
> > up and restores the CIFS and NFS ACLs on my multi-protocol file
> > systems. My question is whether it will scale if I end up with 100 or
> > 200 TB on the VNX. I am assuming 10Gb connections from the VNX to the
> > backup server to the disk target. Is anyone using NDMP to back up
> > this amount of data?
> >
> > #2. Use a continuous incremental approach (CommVault, TiBS, etc.)
> > where I only back up the changes each day. This solves the possible
> > scaling problem, but this approach backs up via an NFS or CIFS share,
> > which means it only sees the ACLs of the protocol used to access the
> > share. Does anyone use an approach like this, and if so, what do you
> > do about multi-protocol file systems?
>
> My experience with scalability and backups suggests that the time to
> back up is the problem to focus on, rather than how you'll get enough
> storage. Even with a NetApp backing up via NDMP, the filer has to walk
> the entire filesystem searching for files that have changed since the
> last backup, and if you have a lot of files, that takes a long time.
> On one modest 4 TB system, the nightly incrementals were up to 10-12
> hours per night by the time we were able to phase out that system in
> favor of ZFS.
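[To make the cost concrete, here's a minimal illustrative sketch of that kind of scan — not NDMP itself, and all names are made up. Every file pays a stat() even when almost nothing changed, which is why incrementals scale with total file count rather than with churn:]

```python
# Illustrative sketch: incremental detection by walking the whole tree.
# The cost is one stat() per file regardless of how few files changed.
import os

def changed_since(root, last_backup_ts):
    """Return paths under root modified after the last backup timestamp."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.stat(path).st_mtime > last_backup_ts:
                    changed.append(path)
            except OSError:
                pass  # file vanished mid-walk; skip it
    return changed
```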
> IMHO, you need to have the ability to do instant block-level
> incremental snapshots. ZFS does this. NetApp does too, if you use
> SnapMirror (extra licensing). And various other vendors have larger,
> more expensive enterprise solutions as well.

If you're running something Linuxy, using inotify would avoid walking
the entire filesystem. You could possibly modify one of the rsync backup
scripts that already uses inotify to detect changes in near real time,
so that it accumulates a list of changed files, then back up only the
changed files once a day. That would get you halfway there, though not
quite as granular as a block-level backup with dedupe. I've used this
one to good effect:

https://mattmccutchen.net/utils/continusync

OTOH, DRBD would provide a block-level HA solution. There's even
commercial support available for Red Hat, and two quite good tutorials:

http://insights.oetiker.ch/linux/drbd/
https://alteeve.ca/w/2-Node_Red_Hat_KVM_Cluster_Tutorial

Perhaps you could coax DRBD into storing a list of changed blocks
instead of updating an image, providing a block-level incremental
backup. Stir in some block-level de-duping (a trivial hash plus a Bloom
filter, backed by a SHA-1 store?) and you'd have fast, small incremental
backups. The advantage would be not having to maintain a separate full
copy of your filesystem. Could be a fun project.

Good luck,

--
Charles

_______________________________________________
Discuss mailing list
[email protected]
https://lists.lopsa.org/cgi-bin/mailman/listinfo/discuss
This list provided by the League of Professional System Administrators
http://lopsa.org/
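[P.S. The "trivial hash plus a Bloom filter, backed by an SHA-1 hash" de-dupe idea sketches out something like the following — a minimal, hedged Python illustration, with fixed-size blocks assumed and every class and method name made up. The Bloom filter answers "definitely new" from cheap hashes, so only probable duplicates pay for a SHA-1 set lookup:]

```python
# Sketch of block-level de-dupe: a Bloom filter screens blocks cheaply,
# and an authoritative SHA-1 set confirms the occasional probable hit.
import hashlib

class BlockDedup:
    def __init__(self, bloom_bits=1 << 20, hashes=4):
        self.bits = bloom_bits
        self.k = hashes
        self.bloom = bytearray(bloom_bits // 8)
        self.seen_sha1 = set()  # authoritative store of block digests

    def _positions(self, block):
        # Derive k cheap bit positions from slices of one MD5 digest.
        h = hashlib.md5(block).digest()
        for i in range(self.k):
            yield int.from_bytes(h[i * 4:i * 4 + 4], "big") % self.bits

    def _maybe_seen(self, block):
        return all(self.bloom[p >> 3] & (1 << (p & 7))
                   for p in self._positions(block))

    def _add_to_bloom(self, block):
        for p in self._positions(block):
            self.bloom[p >> 3] |= 1 << (p & 7)

    def is_duplicate(self, block):
        """True if this block was stored before; records it if new."""
        if self._maybe_seen(block):
            sha = hashlib.sha1(block).digest()
            if sha in self.seen_sha1:
                return True            # confirmed duplicate
            self.seen_sha1.add(sha)    # Bloom false positive: new block
            return False
        # Definitely new: no SHA-1 lookup needed, just record it.
        self._add_to_bloom(block)
        self.seen_sha1.add(hashlib.sha1(block).digest())
        return False
```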
