On Tue, Jan 28, 2020 at 08:06:18PM +1100, russ...@coker.com.au wrote:
> On Monday, 20 January 2020 2:34:09 AM AEDT Craig Sanders via luv-main wrote:
> > > [ paraphrased from memory because I deleted it: Russell said  ]
> > > [ something about using btrfs on small boxes, and zfs only on ]
> > > [ big storage servers                                         ]
> >
> > Unless you need to make regular backups from workstations or small servers
> > to a "big storage" ZFS backup server. In that case, use zfs so you can use
> > 'zfs send'.  Backups will be completed in a very small fraction of the
> > time they'd take with rsync....the time difference is huge - minutes vs
> > hours.  That's fast enough to do them hourly or more frequently if needed,
> > instead of daily.
>
> It really depends on the type of data.

No, it really doesn't.

> Backing up VM images via rsync is slow because they always have relatively
> small changes in the middle of large files.

rsyncing **ANY** large set of data is slow, whether it's huge files like VM
images or millions of small files (e.g. on a mail server).

rsync has to check at least the size and timestamp of every file, and then
the block checksums of anything that looks changed, on every run. On large
sets, this WILL take many hours, no matter how much or how little has
actually changed.
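
To make the cost concrete, here's a rough Python sketch of the kind of size + timestamp pass described above. It is NOT rsync's actual code, just an illustration. The function name quick_check and its return shape are made up for this example.

```python
import os


def quick_check(src_root, dst_root):
    """Walk src_root and compare each file's size and mtime with its copy
    under dst_root.  Returns (files_examined, relative_paths_that_differ).

    Illustrative only: every file is stat()ed on every run, even when
    nothing at all has changed.
    """
    examined = 0
    changed = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            examined += 1
            s = os.stat(src)
            try:
                d = os.stat(dst)
            except FileNotFoundError:
                # No copy on the target yet, so it must be transferred.
                changed.append(rel)
                continue
            if s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime):
                changed.append(rel)
    return examined, sorted(changed)
```

The number of files examined is always the full file count, whether the changed list comes back empty or not. That full walk is the part that hurts on millions of files.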

'zfs send' and 'btrfs send' already know exactly which blocks have changed and
they just send those blocks, no need for checking.  Why? Because a snapshot is
effectively just a list of blocks in use at a particular point in time.  COW
ensures that if a file is created or changed or deleted, the set of blocks in
the next snapshot will be different.
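
As a toy model of that idea (and only a toy: this is not how zfs or btrfs implement send internally, and the block ids are invented), treat each snapshot as a set of block ids and the incremental stream as the blocks that are in the newer set but not the older one:

```python
def blocks_to_send(old_snapshot, new_snapshot):
    """Blocks referenced by the new snapshot that the old one lacks."""
    return new_snapshot - old_snapshot


# Snapshot 1: four blocks in use.
snap1 = frozenset({"b1", "b2", "b3", "b4"})

# One block of one file is modified.  Copy-on-write never overwrites b2 in
# place, it writes a new block b5, so the next snapshot references b5.
snap2 = frozenset({"b1", "b5", "b3", "b4"})

# Only b5 needs to go over the wire.
delta = blocks_to_send(snap1, snap2)
```

No file is scanned and no checksum is compared to arrive at that answer. The difference falls straight out of the two snapshots.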

(a minor benefit of this is that if a file or directory is moved to another
directory in the same dataset, the only blocks that actually changed were the
blocks containing the directory info, so they're the only blocks that need be
sent. rsync, however, would send the entire directory contents because it's
all "new" data. Transparent compression also helps 'zfs send' - compressed
data requires fewer blocks to store it....rsync, though, can't benefit from
transparent compression as it has to compare the source file's *uncompressed*
data with the target copy)
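
The move case can be sketched with a similar toy model. Everything here is hypothetical (the tree layout, the block ids, the helper names), it just shows why a block-level diff sees only the rewritten directory blocks while a path-level comparison sees a brand new file:

```python
def block_ids(tree):
    """Collect every block id referenced by a toy tree.

    tree maps directory name -> (directory_block_id, {file_name: data_block_id}).
    """
    ids = set()
    for dir_block, files in tree.values():
        ids.add(dir_block)
        ids.update(files.values())
    return ids


def paths(tree):
    """All file paths in the toy tree, as 'dir/file' strings."""
    return {f"{d}/{name}" for d, (_blk, files) in tree.items() for name in files}


# Before: vm.img lives in directory "a"; its data is in block f1.
before = {"a": ("d1", {"vm.img": "f1"}), "b": ("d2", {})}

# After moving it to "b": COW rewrites both directory blocks (d3, d4),
# but the data block f1 is untouched.
after = {"a": ("d3", {}), "b": ("d4", {"vm.img": "f1"})}

changed_blocks = block_ids(after) - block_ids(before)  # directory blocks only
new_paths = paths(after) - paths(before)               # looks like a new file
```

Here changed_blocks holds just the two directory blocks, while new_paths contains b/vm.img, which a path-based tool would treat as new data and copy in full.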

rsync is still useful as a tool for moving/copying data from one location to
another (whether on the same machine or to a different machine), but it's no
longer a good choice for backups. It just takes too long - by the time it has
finished, the source data will have changed.  These days rsync is really just
an improved "cp".

I guess it's also still useful for backing up irrelevant machines like those
running MS Windows. But they should be storing important data on the file
server anyway, so they can be blown away and re-imaged whenever required.

> I guess you have to trade off the features of using one filesystem
> everywhere vs the ability to run filesystems independently of what
> applications will run on top.  I like the freedom to use whichever
> filesystem best suits the server.

I prefer to use the filesystem that's best for all machines on the network.

If ZFS is in use on the file-server or backup-server, then that means zfs
on everything else. If it's btrfs on the server, then it should be btrfs on
everything.

send/receive alone are worth putting in the time & effort to standardise, and
both zfs & btrfs also offer many more very useful features.
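
For anyone who hasn't used send/receive, here's a minimal Python sketch that only builds the three commands involved in one incremental backup. It doesn't run anything. The dataset, snapshot, host and target names are placeholders I made up, and the function name is hypothetical too.

```python
def incremental_backup_commands(dataset, prev_snap, new_snap, backup_host, target):
    """Build argv lists for one incremental zfs backup over ssh.

    Returns (snapshot_cmd, send_cmd, receive_cmd).  Nothing is executed;
    the output of send_cmd is meant to be piped into receive_cmd.
    """
    # Take the new snapshot on the source machine.
    snapshot_cmd = ["zfs", "snapshot", f"{dataset}@{new_snap}"]
    # Send only what changed between the previous and the new snapshot.
    send_cmd = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    # Receive the stream into the target dataset on the backup server.
    receive_cmd = ["ssh", backup_host, "zfs", "receive", target]
    return snapshot_cmd, send_cmd, receive_cmd


# Placeholder names, purely for illustration.
snap_cmd, send_cmd, recv_cmd = incremental_backup_commands(
    "tank/home", "hourly-01", "hourly-02", "backuphost", "backup/home"
)
```

In a real script you'd run the snapshot command first, then connect the send command's stdout to the receive command's stdin (e.g. with subprocess.Popen), and check both exit codes before deleting any older snapshot.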

And if neither is currently in use, then that means scheduling appropriate
times & days to convert everything over to ZFS, starting with the server(s).
btrfs is not an option here because it just isn't as good as zfs...if I'm
going to go to all that trouble and hassle, I may as well get the most/best
benefit in exchange.

craig

--
craig sanders <c...@taz.net.au>
_______________________________________________
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main
