On Mon, Nov 23, 2020 at 10:32:18PM -0500, Jon LaBadie wrote:
> I did reply to the original message, but looking back it was addressed
> to you rather than the list.  In case it was overlooked, here were my
> regarding space:

I saw it, just absent-minded about replying sometimes...  Good that you
repeated it for the list, in any case, I figure.

> Did not know what VDO was, so I read a Red Hat description.  It seems
> to consist of 3 components each I question the value of for amanda backup.
> Hopefully someone with VDO experience can share it.

I first heard about VDO from a friend who (surprise, surprise) works
for Red Hat.  He also runs a hosting company on the side and has been
very happy with what VDO has done for his backup server, but his backup
system is pretty rudimentary (sounds like it's a homebrew rsync-based
solution), so I suppose his results may not be entirely indicative of
what to expect with amanda+VDO.

> Only one copy of duplicate blocks:  Were your files being backed up
> individually, as I do in a separate backup of my Home directory using
> rsync, this could provide a worthwhile savings.  But you will likely
> be merging your files into a tarball or a dumpfile.  The original
> disk block alignment will be lost and likely not even match in one
> day's tarball to the next.

Ah, yes.  I hadn't considered the tarball aspect of it.  I figured it
would be able to make good use of the deduplication if I have amanda
write "uncompressed" and use only VDO's compression, but you've got a
good point about merging the files still messing things up in that
scenario.

> LZ4 compression on the fly:  I don't know the cpu load for the server
> compressing 8TB of data daily.

I assume that the CPU load would be comparable (not identical, of
course, but in the same ballpark) for on-the-fly VDO compression
vs. amanda compressing the data at that level.

> There are points where amanda calculates how much space is left on
> the device based on it configuration-specified size and how much it
> has already sent.  Of course there is actually more space available
> because the compression occurs after amanda's involvement.  The
> difference may cause amanda to make less than optimal decisions.
>
> Amanda administrators who use tape drive compression face the same
> problem.  I believe most over specify the size of the storage medium
> to allow more complete tape utilization.

Hrm, yes...  Now that you mention it, I do have foggy memories of seeing
this discussed when I was using amanda previously.

But, yeah, if block deduplication isn't going to be a significant
benefit (and it sounds like it probably won't), then it probably would
be better to skip VDO and have amanda handle the compression itself.

> In the distant past I backed up windows systems by mounting the drives
> on UNIX host.  Most often used NFS.

Any particular reason to prefer NFS over SMB?

> Windows, at least then, does not like a file to be opened by multiple
> processes.  So each backup included several files that did not backup
> because the file was already opened by another Windows process.  And a
> few system files were never backed up.

I'm pretty certain this is still the case.  One of the most annoying
misfeatures of Windows, IMO.

> Regarding backing up a KVM snapshot, would that mean that to recover
> one file you would have to take a new snapshot, restore the entire system
> from the backed up snap, copy the file to somewhere else, restore the
> new snap, copy the file to final location?

Pretty much, yeah.  That's why I haven't even considered snapshot-based
backups of my linux virts.  But, if it makes the Windows admin happy...

--
Dave Sherohman
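
P.S.  The block-alignment point above can be illustrated with a toy
sketch.  This is not a measurement of VDO itself: it assumes 4 KiB
fixed-size deduplication blocks and stands in for two days' archives
with a made-up payload, where the second day has one extra 512-byte
tar-style record near the front.  All names here are hypothetical.

```python
import hashlib

BLOCK = 4096   # assumed dedup block size (4 KiB)
RECORD = 512   # tar pads entries to 512-byte records


def pseudo_random_bytes(seed: bytes, n: int) -> bytes:
    """Deterministic filler data that does not repeat at block scale."""
    out = bytearray()
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:n])


def blocks(data: bytes):
    """Split data into fixed-size blocks (the last one may be short)."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of blocks in `new` that already exist somewhere in `old`."""
    seen = {hashlib.sha256(b).digest() for b in blocks(old)}
    new_blocks = blocks(new)
    hits = sum(1 for b in new_blocks if hashlib.sha256(b).digest() in seen)
    return hits / len(new_blocks)


payload = pseudo_random_bytes(b"unchanged files", 64 * BLOCK)

day1 = payload
day2_aligned = payload                      # identical archive
day2_shifted = b"\0" * RECORD + payload     # one extra record up front

aligned = shared_fraction(day1, day2_aligned)
shifted = shared_fraction(day1, day2_shifted)

print(f"same offsets:       {aligned:.0%} of blocks deduplicate")
print(f"shifted by {RECORD} bytes: {shifted:.0%} of blocks deduplicate")
```

The file contents are byte-for-byte the same on both days, but once
they sit at a different offset inside the archive, the fixed-size
blocks no longer line up.  A shift that happened to be an exact
multiple of the block size would realign them, which is why the loss
is "likely" rather than certain.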
