On Fri, Aug 24, 2018 at 6:09 AM Mick <[email protected]> wrote:
>
> However, you may prefer to use clonezilla instead of dd. The dd command will
> copy each and every bit and byte of the partition whether it has data on it or
> not. It is not particularly efficient. Clonezilla will perform better at
> this task.
>
> Personally, I would only keep a back up of the filesystem contents with e.g.
> rsync, and reformat the partition and restore its contents in the case of a
> disaster recovery scenario.

Just to summarize the sorts of options you have:

dd = bit level copy. Output is the size of the partition, period,
though you could compress the output by piping it into a compression
utility/etc. The restored partition is identical to the original,
including unallocated space, file fragmentation, etc.

clonezilla/partimage/etc = sparse bit level copy. Output is the size
of all blocks that contain useful data, and can be further compressed.
The restored partition will contain zeros in the place of free space,
but will still preserve file fragmentation, special filesystem
features, etc. Basically these tools operate like dd at a block level,
but they first identify which blocks are used/unused. Savings are
minimal for a full filesystem, and substantial for a near-empty one.
These tools will fall back to dd if they can't identify free space,
and can support a wider variety of filesystems quickly because they
don't have to be able to mount/read the filesystem, just figure out
which blocks matter. I'll also note that with clonezilla you get a
fairly nice all-in-one bootable image that can store these images
remotely via ssh/samba/etc, which makes restoring images onto bare
metal very easy.

tar/rsync/etc = file level copy. Output is the logical size of all the
files on the filesystem. The restored partition will only contain file
contents - details like fragmentation, trailing unused space in blocks,
unused space in general, or many filesystem-specific features like
snapshots/etc will NOT be preserved. On the other hand it is trivial
to restore this data to any filesystem of any type and any sufficient
size. The other solutions make resizing or changing filesystems
more-or-less impossible unless you can mount the image files and then
do a subsequent file-level copy (which is no different than doing a
file level copy in the first place).

I'd toss in one other general category:

dump/send/etc - filesystem-specific serializing tools.
The tools are specific to the filesystem, so you can't just point them
at a whole hard drive with varying partition types like you can with
clonezilla. They may or may not reproduce details like fragmentation,
but they will efficiently store the actual data and will reproduce all
filesystem-specific features (snapshots, special attributes, etc).
They may also contain features that make them more efficient
(especially for incremental backups) because they can use an algorithm
suited for the low-level data structures employed by the filesystem,
instead of doing scanning at the file/directory level. For example,
such a tool could just read all the metadata on the disk sequentially
as it is physically stored on the disk, instead of traversing it from
root down to leaf in the directory hierarchy, which could result in
lots of seeks. Filesystems like btrfs/zfs have data structures that
make it VERY efficient to compare two related snapshots and find just
the differences between them, including differences of one block in
the middle of a large file without having to read the whole file.
Restoration usually is flexible with regard to filesystem size, but
not type. That is, if you have a 100GB filesystem with 20GB of data,
you could restore it to a 30GB filesystem of the same type, but not
to one of a different type as you could with tar.

The best solution for you obviously depends on your needs. I try to
go with the last category in general as it is far more efficient.
But, clonezilla is my general tool for replicating whole systems/etc
since it does that so well and works with anything. For partial
backups of high-value data I use duplicity, which is file-level (and
supports various cloud/etc options for storage).

--
Rich
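
P.S. In case the dd vs. clonezilla-style distinction above is easier to
see in code, here is a rough Python sketch of a "sparse" block-level
copy. Big caveat: it treats an all-zero block as "unused", which is
only a stand-in I'm assuming for illustration. The real tools look at
the filesystem's own allocation metadata to decide which blocks are in
use, and a free block on a real disk can easily hold stale non-zero
data. The names (sparse_dump, sparse_restore, BLOCK_SIZE) are made up
for this example and don't come from any of those tools.

```python
import io

BLOCK_SIZE = 4096  # assumed block size, purely illustrative


def sparse_dump(src, block_size=BLOCK_SIZE):
    """Read src block by block and keep only blocks with non-zero data.

    Returns (total_size, kept), where kept maps block index -> bytes.
    "All zeros" stands in for "unused"; real tools consult the
    filesystem's allocation metadata instead.
    """
    kept = {}
    total_size = 0
    index = 0
    while True:
        block = src.read(block_size)
        if not block:
            break
        total_size += len(block)
        if any(block):  # at least one non-zero byte in this block
            kept[index] = block
        index += 1
    return total_size, kept


def sparse_restore(total_size, kept, dst, block_size=BLOCK_SIZE):
    """Rebuild the image, writing zeros wherever no block was stored."""
    offset = 0
    index = 0
    while offset < total_size:
        length = min(block_size, total_size - offset)
        dst.write(kept.get(index, bytes(length)))
        offset += length
        index += 1


if __name__ == "__main__":
    # Fake 8-block "partition" with data in blocks 1 and 5 only.
    image = bytearray(8 * BLOCK_SIZE)
    image[1 * BLOCK_SIZE:1 * BLOCK_SIZE + 5] = b"hello"
    image[5 * BLOCK_SIZE:5 * BLOCK_SIZE + 5] = b"world"

    total, kept = sparse_dump(io.BytesIO(bytes(image)))
    stored = sum(len(b) for b in kept.values())
    print("dd-style size:    ", total)
    print("sparse-style size:", stored)

    restored = io.BytesIO()
    sparse_restore(total, kept, restored)
    print("round trip ok:    ", restored.getvalue() == bytes(image))
```

The block offsets are preserved on restore, which is why fragmentation
and on-disk layout survive this kind of copy, and why the image can't
simply be restored onto a smaller or different filesystem the way a
tar/rsync copy can.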

