On Fri, Aug 24, 2018 at 6:09 AM Mick <[email protected]> wrote:
>
> However, you may prefer to use clonezilla instead of dd.  The dd command will
> copy each and every bit and byte of the partition whether it has data on it or
> not.  It is not particularly efficient.  Clonezilla will perform better at
> this task.
>
> Personally, I would only keep a backup of the filesystem contents with e.g.
> rsync, and reformat the partition and restore its contents in the case of a
> disaster recovery scenario.

Just to summarize the sorts of options you have:

dd = bit level copy.  Output is the size of the partition, period,
though you could compress the output by piping it into a compression
utility/etc.  Restored partition is identical to the original, including
unallocated space, file fragmentation, etc.
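
For example, assuming /dev/sda2 is the partition in question (adjust
the device and file names for your setup), that looks something like:

  # raw image of the whole partition, compressed on the fly
  dd if=/dev/sda2 bs=1M | gzip -c > sda2.img.gz
  # restore; the target must be at least as large as the original
  gunzip -c sda2.img.gz | dd of=/dev/sda2 bs=1M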

clonezilla/partimage/etc = sparse bit level copy.  Output is the size
of all blocks that contain useful data, and can be further compressed.
Restored partition will contain zeros in the place of free space, but
will still preserve file fragmentation, special filesystem features,
etc.  Basically these tools operate like dd at a block level, but they
first identify which blocks are used/unused.  The savings are minimal for a
full filesystem, and substantial for a near-empty one.  These tools
will fall back to dd if they can't identify free space, and can
support a wider variety of filesystems quickly because they don't have
to be able to mount/read the filesystem, just figure out which blocks
matter.  I'll also note that with clonezilla you get a fairly nice
all-in-one bootable image that can store these images remotely via
ssh/samba/etc, which makes restoring images onto bare metal very easy.
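
Under the hood clonezilla mostly drives partclone; done by hand it
would look roughly like this (device and file names are just examples):

  # copy only the used blocks of an ext4 partition, compressed
  partclone.ext4 -c -s /dev/sda2 -o - | gzip -c > sda2.pcl.gz
  # restore onto a partition at least as large as the original
  gunzip -c sda2.pcl.gz | partclone.ext4 -r -s - -o /dev/sda2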

tar/rsync/etc = file level copy.  Output is the logical size of all
the files on the filesystem.  Restored partition will only contain file
contents - details like fragmentation, trailing unused space in
blocks, unused space in general, or many filesystem-specific features
like snapshots/etc will NOT be preserved.  On the other hand it is
trivial to restore this data to any filesystem of any type and any
sufficient size.  The other solutions make resizing or changing
filesystems more-or-less impossible unless you can mount the image
files and then do a subsequent file-level copy (which is no different
than doing a file level copy in the first place).
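
A minimal sketch of that approach with rsync (paths are just examples,
and the trailing slashes matter):

  # archive mode plus hard links, ACLs, and extended attributes
  rsync -aHAX --numeric-ids /mnt/source/ /mnt/backup/
  # restoring is the same command with the paths swapped, onto any
  # freshly formatted filesystem large enough to hold the data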

I'd toss in one other general category:

dump/send/etc - filesystem-specific serializing tools.  The tools are
specific to the filesystem, so you can't just point them at a whole
hard drive with varying partition types like you can with clonezilla.
They may or may not reproduce details like fragmentation, but they
will efficiently store the actual data and will reproduce all
filesystem-specific features (snapshots, special attributes, etc).
They may also contain features that make them more efficient
(especially for incremental backups) because they can use an algorithm
suited for the low-level data structures employed by the filesystem,
instead of scanning at the file/directory level.  For example, such a
tool could read all the metadata sequentially as it is physically laid
out on the disk, instead of traversing the directory hierarchy from
root down to leaf, which could result in lots of
seeks.  Filesystems like btrfs/zfs have data structures that make it
VERY efficient to compare two related snapshots and find just the
differences between them, including differences of one block in the
middle of a large file without having to read the whole file.
Restoration is usually flexible with regard to filesystem size, but
not type.  That is, if you have a 100GB filesystem with 20GB of data,
you could restore it to a 30GB filesystem of the same type, but not to
one of a different type the way you could with tar.
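
As a rough btrfs sketch of that last category (assuming /home is a
btrfs subvolume and /mnt/backup is a btrfs filesystem; the names are
just examples):

  # read-only snapshots on the source
  btrfs subvolume snapshot -r /home /home/.snap-mon
  btrfs subvolume snapshot -r /home /home/.snap-tue
  # send the first snapshot in full, then only the delta for the second
  btrfs send /home/.snap-mon | btrfs receive /mnt/backup
  btrfs send -p /home/.snap-mon /home/.snap-tue | btrfs receive /mnt/backup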

The best solution for you obviously depends on your needs.  I try to
go with the last category in general as it is far more efficient.
But clonezilla is my general tool for replicating whole systems/etc
since it does that so well and works with anything.  For partial
backups of high-value data I use duplicity, which is file-level (and
supports various cloud/etc options for storage).
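
In case it helps, a typical duplicity run looks something like this
(the target URL is just an example):

  # encrypted, incremental backup of a home directory to a remote host
  duplicity /home/rich sftp://user@backuphost//srv/backups/rich
  # restore into an empty directory
  duplicity restore sftp://user@backuphost//srv/backups/rich /tmp/restore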

-- 
Rich
