On 10/25/14 12:21, Scott Robison wrote:
> Is there a means to determine what is "new" between two "full" archives?

Not directly.

> I do a full archive of an entire partition each day, and am a little
> surprised by how much new data exists. I would like to "diff" the two
> archives, and am hoping there is something built in that is relatively
> more efficient than a brute force approach. I can brute force it if need
> be, but would rather not. Thanks!

Because Tarsnap's deduplication happens after files have been squished
together into a tar stream and that tar stream has been split into blocks,
it's not feasible to track backwards to figure out which file a particular
new block came from. (For that matter, you can get blocks which contain
pieces from several different files.)

The best trick I've found for tracking down what is changing (aside from
the obvious 'find . -mtime -1d') is to run tarsnap with a small value for
--maxbw-rate (e.g., --maxbw-rate 50000) and then send it a SIGINFO (or
SIGUSR1 if your OS doesn't have SIGINFO) every second. This will prompt
tarsnap to repeatedly print its current progress, and when it slows down
dramatically you've found a place where it is finding lots of new data
which it needs to upload.

Colin Percival
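
The signal-every-second trick could be scripted roughly as follows. This is only a sketch, not part of tarsnap: the helper name poll_progress and its parameters are made up for illustration, and it assumes a Unix-like system with Python 3. It starts a command, then sends it SIGINFO (or SIGUSR1 where SIGINFO is not defined) once per interval until the command exits.

```python
import signal
import subprocess


def poll_progress(cmd, interval=1.0, sig=None):
    """Run cmd and send it a signal every `interval` seconds until it exits.

    Returns (exit_code, number_of_signals_sent).
    poll_progress is a hypothetical helper, not a tarsnap feature.
    """
    if sig is None:
        # SIGINFO is only defined on some systems (e.g. the BSDs);
        # fall back to SIGUSR1 elsewhere.
        sig = getattr(signal, "SIGINFO", signal.SIGUSR1)

    # The child inherits our stdout/stderr, so whatever progress it
    # prints in response to the signal shows up directly in the terminal.
    child = subprocess.Popen(cmd)
    sent = 0
    while True:
        try:
            child.wait(timeout=interval)
            break  # the child has exited
        except subprocess.TimeoutExpired:
            # Still running after `interval` seconds: ask for progress.
            child.send_signal(sig)
            sent += 1
    return child.returncode, sent
```

It might be called as poll_progress(["tarsnap", "-c", "-f", "probe-archive", "--maxbw-rate", "50000", "/some/path"]), where the archive name and path are placeholders. Note that this creates a real archive. Watching where the printed progress stalls is still a manual step.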
