Hi Jamie & tarsnap users,

On 08/16/15 17:28, Jamie Landeg-Jones wrote:
> I tried to restore a file recently, but it failed, as it turned out that it
> was hardlinked to another file. To retrieve it, I had to know which file it
> was linked to, and restore that too.
>
> The 'root' file isn't even necessarily the first alphabetically - just the one
> tarsnap backs up first.
>
> Is this a known issue? If it's difficult to get tarsnap to restore the file
> automatically, I'd at least expect a more informative error message like
> "Unable to restore hard linked file, unless you also restore file xxxxx",
> or similar.

Yes, known issue as of about a year ago; as far as I know you're only the
second person to trip over this.

It's an awkward problem relating to the way the tar format works: Because
tar is a streaming format, when we see data for the first time there's no
way to know if that is hardlinked to a file which we will want to extract
later -- and when we come to the hardlink we want to extract later, trying
to "rewind" the tape is problematic. (Normal tar utilities run into the
same problems, incidentally.)

Right now I'm looking at two ways of attacking this:

1. Include data in every archive entry, including hard links -- this would
   make archives larger, but tarsnap's deduplication should make that mostly
   irrelevant.

2. Make a note of hardlinks where we didn't extract the first copy of the
   data, and then add a second pass through the archive to recover those --
   this would keep archives the same size, but is considerably more
   complicated and potentially bug-prone due to edge cases like extracting
   files into directories which are being created with read-only permissions.

If anyone has comments on these options or suggestions for other approaches,
please comment on the github issue I've opened for tracking this:
https://github.com/Tarsnap/tarsnap/issues/18

--
Colin Percival
Security Officer Emeritus, FreeBSD | The power to serve
Founder, Tarsnap | www.tarsnap.com | Online backups for the truly paranoid
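
For anyone who wants to see the failure mode concretely, here is a rough
sketch using Python's standard tarfile module in streaming mode. It is not
tarsnap's code and makes no claim about how tarsnap will fix this; the
function names (build_archive, single_pass, second_pass) are made up for
illustration, and "extraction" just fills an in-memory dict rather than
writing files. It shows that a hard link entry carries only a name pointing
at an earlier entry, and how a second pass in the spirit of option 2 could
recover the data.

```python
import io
import tarfile


def build_archive():
    """Build an in-memory tar: a.txt holds the data, b.txt is a hard link to it."""
    payload = b"hello hardlinks\n"
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        first = tarfile.TarInfo("a.txt")
        first.size = len(payload)
        tar.addfile(first, io.BytesIO(payload))

        link = tarfile.TarInfo("b.txt")
        link.type = tarfile.LNKTYPE  # hard link entry: header only, no data
        link.linkname = "a.txt"      # points back at the entry with the data
        tar.addfile(link)
    return buf.getvalue()


def single_pass(archive, wanted):
    """Read the archive once, front to back, keeping only names in `wanted`.

    Returns (extracted, pending). `extracted` maps name -> bytes. `pending`
    maps each wanted hard link whose data was skipped to its link target.
    """
    extracted, pending = {}, {}
    # Mode "r|" treats the input as a forward-only stream (no seeking back).
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        for entry in tar:
            if entry.name not in wanted:
                continue  # the data streams past and is gone
            if entry.isreg():
                extracted[entry.name] = tar.extractfile(entry).read()
            elif entry.islnk():
                if entry.linkname in extracted:
                    extracted[entry.name] = extracted[entry.linkname]
                else:
                    pending[entry.name] = entry.linkname
    return extracted, pending


def second_pass(archive, extracted, pending):
    """Re-read the archive from the start to fetch data for pending links."""
    targets = set(pending.values())
    recovered = {}
    with tarfile.open(fileobj=io.BytesIO(archive), mode="r|") as tar:
        for entry in tar:
            if entry.isreg() and entry.name in targets:
                recovered[entry.name] = tar.extractfile(entry).read()
    for link_name, target in pending.items():
        if target in recovered:
            extracted[link_name] = recovered[target]
    return extracted


if __name__ == "__main__":
    archive = build_archive()
    extracted, pending = single_pass(archive, {"b.txt"})
    print("after one pass:", extracted, pending)
    print("after second pass:", second_pass(archive, extracted, pending))
```

Asking for only b.txt leaves it in `pending` after the first pass, because
the data went by under the name a.txt before we knew we would need it. The
sketch ignores the hard parts mentioned above (directory permissions,
restoring metadata, the cost of re-reading a remote archive).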
