Hi Jamie,

A thousand apologies for not acknowledging your email from almost four months ago. You did a great job of educating me and of bringing the -q switch to my attention. Thanks.

Craig

On Sat, 2018-11-24 at 16:01 +0000, Jamie Landeg-Jones wrote:
> Craig Hartnett <[email protected]> wrote:
>
> > Yet in both cases, the command does not exit for about 16-21 minutes,
> > which is what was going to lead me to complain. However, the actual
> > restore was done about as quickly as one would expect.
>
> Hi. tarsnap follows the "tar" standard. In fact, it actually uses the
> BSD "libarchive" library to do the tar bits.
>
> The thing with the tar standard is that a file can exist at any place in
> the archive, and also *more than once* in an archive. (I don't know why,
> but if I had to guess, I'd say this is to allow appending to tar archives
> an update to a file - useful for tape backups!)
>
> For the first case, tarsnap therefore needs to scan the whole archive
> when you are restoring directories or wildcards.
>
> For the second case, to ensure you always restore the latest version of a
> file, even when you specify a specific filename directly, it still needs
> to scan through everything even after it has found your file.
>
> To get around this issue, Colin has added a "quick" mode flag that causes
> the restore to stop as soon as the listed files are restored. Note that
> this will only work properly if each specific filename you want to
> restore is specified literally: no wildcards, and no bare directory names.
>
> This is what you want to use.
>
> From the tarsnap manpage:
>
> | -q (--fast-read) (x and t modes only) Extract or list only the first
> | archive entry that matches each pattern or filename operand.
> | Exit as soon as each specified pattern or filename has been matched.
> | By default, the archive is always read to the very end, since there
> | can be multiple entries with the same name and, by convention, later
> | entries overwrite earlier entries. This option is provided as a
> | performance optimization.
>
> Now, this raises 2 other questions, but you'll need a reply from Colin or
> Graham on these!:
>
> 1) We can never append to tarsnap archives. The way they are stored, this
>    is nonsensical.
>
>    Storing the same file more than once in an archive therefore makes no
>    sense.
>
>    Therefore, wouldn't it be better if, on a single tarsnap run, tarsnap
>    refused to back up the same file more than once, and then similarly,
>    on a restore, always assumed "-q" when all the named entries to be
>    retrieved are mentioned literally?
>
>    Obviously, if an entry turns out to be a directory, and not a file,
>    tarsnap will then have to fall back to scanning the full index (as
>    indeed it would if it first happens to find a filename whose path is a
>    subset of the named entry [i.e. the named entry is a directory, but a
>    file within it happens to be in the index before the directory
>    itself...]). Though I guess this latter point would never happen with
>    tarsnap, so it may be moot.
>
> 2) tarsnap archives are not sequential-only-access files. Why isn't the
>    index of an archive's contents held more optimally than it appears?
>    (There will be a good reason for this, I just don't know what it is!)
>
> Cheers,
> Jamie
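
For anyone who finds this thread later, here is a rough sketch of the difference between the default scan and -q as described in the manpage excerpt above. It is only a toy model in Python, not tarsnap's or libarchive's actual code: the extract() function, the fast_read parameter, and the ARCHIVE list of (path, payload) pairs are all made-up names for illustration, and it handles literal filenames only (no wildcards or directories).

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical stand-in for a tar stream: (path, payload) pairs in archive order.
Entry = Tuple[str, str]


def extract(entries: Iterable[Entry], wanted: Set[str],
            fast_read: bool = False) -> Tuple[Dict[str, str], int]:
    """Return (restored files, number of archive entries read)."""
    restored: Dict[str, str] = {}
    entries_read = 0
    for path, payload in entries:
        entries_read += 1
        if path not in wanted:
            continue
        if fast_read and path in restored:
            continue  # fast-read keeps only the first match for each name
        restored[path] = payload  # default mode: a later entry overwrites an earlier one
        if fast_read and len(restored) == len(wanted):
            break  # every literal name has been matched, so stop reading
    return restored, entries_read


# Toy archive in which etc/hosts appears twice, as the tar format allows.
ARCHIVE: List[Entry] = [
    ("etc/hosts", "version 1"),
    ("etc/motd", "hello"),
    ("etc/hosts", "version 2"),
    ("var/log/big.log", "..."),
]

print(extract(ARCHIVE, {"etc/hosts"}))                  # ({'etc/hosts': 'version 2'}, 4)
print(extract(ARCHIVE, {"etc/hosts"}, fast_read=True))  # ({'etc/hosts': 'version 1'}, 1)
```

In the default mode all four entries are read and the later copy wins; with fast_read the loop stops after the first entry. On the command line, I would expect the equivalent to look something like "tarsnap -x -q -f <archive-name> <literal/path/to/file>", where both operands are placeholders.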
