Colin Percival wrote on 19/01/2016 21:35:
> Hi Igor,
>
> On 01/19/16 03:14, Igor Ostapenko wrote:
>> Colin Percival wrote on 19/01/2016 11:17:
>>> Looks like the compression on file.tar.xz is getting in the way -- tarsnap
>>> can't find any duplicated blocks, because in the compressed files there
>>> aren't any.
>>
>> Then it looks I don't understand tarsnap deduplication mechanism
>> correctly. That is, in this particular case I didn't expect tarsnap to
>> find duplicates in *.tar.xz file, but I did expect it to respect
>> previously archived file blob to be re-used (referenced) somehow in the
>> next archive with absolutely the same file.
>
> Oh, it's exactly the same file? I assumed it was a new daily file. That's
> strange then.

Yes, this is my case.

>
> Speaking of strange though, and looking more closely...
>> $ # The first run
>> $ tarsnap -cvf .test.daily.20160119104958 .test
>> a .test
>> a .test/file.tar.xz
>>                        Total size  Compressed size
>> All archives               8.3 GB           3.5 GB
>>   (unique data)            1.4 GB           622 MB
>> This archive                10 MB            10 MB
>> New data                    10 MB            10 MB
>>
>> $ # The second run
>> $ tarsnap -cvf .test.daily.20160119105034 .test
>> a .test
>> a .test/file.tar.xz
>>                        Total size  Compressed size
>> All archives               8.3 GB           3.5 GB
>>   (unique data)            1.4 GB           622 MB
>> This archive                10 MB            10 MB
>> New data                    10 MB            10 MB
>
> The unique compressed data is 622 MB in both cases. Are you sure that
> you didn't delete .test.daily.20160119104958 before you ran tarsnap
> again to create .test.daily.20160119105034 ?
>

The second run was invoked right after the first one. There was no
deletion. Actually, a write-only key is used in this situation.

Yep, 'unique data' is still the same. That probably means deduplication
is working fine and the question is really about what '--print-stats'
reports.
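
The expectation described above (an identical file archived twice should add no new data, even if the file is already compressed) can be sketched with a toy content-addressed block store. This is only an illustration under stated assumptions: the fixed 64 KiB block size, the use of SHA-256, and the `BlockStore` class are all hypothetical and are not taken from tarsnap, whose real chunking and hashing differ. It also does not try to explain the 'New data' figure in the stats quoted above.

```python
import hashlib
import os

# Hypothetical fixed block size, chosen only for illustration.
BLOCK_SIZE = 64 * 1024


class BlockStore:
    """Toy content-addressed store: each unique block is kept once."""

    def __init__(self):
        self.blocks = {}  # SHA-256 hex digest -> block bytes

    def add_archive(self, data):
        """Store data as blocks; return (archive_size, new_bytes_stored)."""
        new_bytes = 0
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            digest = hashlib.sha256(block).hexdigest()
            if digest not in self.blocks:
                # First time this block is seen: it counts as new data.
                self.blocks[digest] = block
                new_bytes += len(block)
        return len(data), new_bytes

    def unique_size(self):
        """Total size of all distinct blocks held by the store."""
        return sum(len(b) for b in self.blocks.values())


# Random bytes stand in for an already-compressed file such as file.tar.xz:
# there is nothing to deduplicate *inside* the file.
payload = os.urandom(1024 * 1024)

store = BlockStore()
first = store.add_archive(payload)
second = store.add_archive(payload)

print("first run  (archive size, new data):", first)
print("second run (archive size, new data):", second)
print("unique data:", store.unique_size())
```

In this model the second run stores nothing new and the unique-data total does not change, which is the behaviour expected for "absolutely the same file".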
