Hi,

[EMAIL PROTECTED] wrote on 05.01.2007 at 10:27:41 [[BackupPC-users] tar vs. cpio: Survivability of archives]:
> [...]
> Of course, in researching this further, I can't seem to find a resource 
> that agrees with the fact that a corrupted tar is unrecoverable beyond a 
> bad block.  Am I wrong on this?  What are your thoughts?

Yes, I guess you are: tar can pick up again after a bad block. A quick test
along the lines of

% tar cf test.tar some/path
% dd if=test.tar of=test2.tar bs=512 skip=10
% tar tvf test2.tar

(or even "dd if=test.tar bs=512 skip=10 | tar tvf -")

says

> tar: This does not look like a tar archive
> tar: Skipping to next header

and then continues to list the files in the archive from that point. In
fact, a second test like

% dd if=test.tar of=test3.tar bs=512 skip=20
% tar tvf test3.tar

produces (in my case!) a listing missing just the first file in the test2
listing (which was larger than 5k, obviously :).
What really surprised me were my first two tests with skip=1 and skip=2,
which failed to produce even the warning. Maybe there's room for a boot
loader at the beginning of a tar archive? ;-)
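
For anyone who wants to repeat the truncation test without depending on what
happens to be in some/path, here is a minimal sketch with a known archive
layout. It assumes GNU tar and a shell with mktemp and "head -c"; the file
names f1..f5 and the variable truncated_listing are made up for illustration.

```shell
#!/bin/sh
# Build a small archive with a known layout, then drop its first 10 blocks.
work=$(mktemp -d) && cd "$work" || exit 1

# Five files of 4096 bytes each. The content is non-zero text so that no
# data block can be mistaken for tar's all-zero end-of-archive marker.
for i in 1 2 3 4 5; do
    yes "file $i data" | head -c 4096 > "f$i"
done

# Naming the members explicitly fixes their order. Each member takes
# 1 header block + 8 data blocks, so the headers sit at blocks 0, 9, 18, 27, 36.
tar cf test.tar f1 f2 f3 f4 f5

# skip=10 cuts into f2's data, so the copy no longer starts with a header.
dd if=test.tar of=test2.tar bs=512 skip=10 2>/dev/null

# tar reports errors and exits non-zero here; keep the listing anyway.
truncated_listing=$(tar tf test2.tar 2>/dev/null) || true
printf '%s\n' "$truncated_listing"

cd / && rm -rf "$work"
```

With GNU tar this should list f3, f4 and f5: the first intact header after the
cut is the one for f3 at original block 18. Other tar implementations may
refuse the archive outright.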

Ok, so let's do a more interesting test to find out how tar handles
corruption in the middle of an archive:

% perl -e 'use Fcntl qw(:DEFAULT :seek); sysopen F, "test.tar", O_RDWR or die "boo: $!"; sysseek F, 512 * 40, SEEK_SET; syswrite F, " " x 5120; close F;'
% tar tvf test.tar

You might need to increase the 5120 if you happen to be overwriting only
file content and no tar header information, but at some point you'll read

> tar: Skipping to next header

and at the end of the listing

> tar: Error exit delayed from previous errors

So, tar seems to handle corruption quite gracefully, losing only the files
actually affected by the corruption.
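
The same experiment can be done without perl. The following is only a sketch
under the same assumptions as before (GNU tar, mktemp, "head -c"); the file
names, the seek offset and the variable damaged_listing are chosen for this
example so that the overwritten range is known to cover exactly one header.

```shell
#!/bin/sh
# Overwrite 10 blocks in the middle of an archive and list what survives.
work=$(mktemp -d) && cd "$work" || exit 1

# Five 4096-byte files with non-zero content.
for i in 1 2 3 4 5; do
    yes "file $i data" | head -c 4096 > "f$i"
done

# Headers end up at blocks 0 (f1), 9 (f2), 18 (f3), 27 (f4), 36 (f5).
tar cf test.tar f1 f2 f3 f4 f5
size_before=$(wc -c < test.tar)

# Write 5120 spaces over blocks 16..25 in place (conv=notrunc keeps the rest).
# That hits the tail of f2's data, the header of f3 and most of f3's data.
printf '%5120s' '' | dd of=test.tar bs=512 seek=16 conv=notrunc 2>/dev/null
size_after=$(wc -c < test.tar)

# tar exits non-zero because of the damage; keep the listing anyway.
damaged_listing=$(tar tf test.tar 2>/dev/null) || true
printf '%s\n' "$damaged_listing"

cd / && rm -rf "$work"
```

With GNU tar I would expect f1, f2, f4 and f5 in the listing. Note that f2 is
still listed although its last two data blocks were overwritten: the tar header
carries a checksum for the header itself, not for the file data, so damage that
touches only file content goes unnoticed by a listing.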

As you said, matters are quite different with some compression algorithms.
Maybe the reference you have in mind actually meant "tgz" archives (gzip
compressed tar archives)? As I understand it, gzip has exactly the problem
you describe, whereas bzip2, for instance, compresses in independent blocks
(meaning a block containing corruption will be lost, but the following blocks
can still be recovered, e.g. with bzip2recover). Considering bzip2 blocks hold
100k - 900k of uncompressed data, you will probably lose more than with tar
alone.
I agree fully about not using compression on archives.

All tests done with GNU tar 1.13.25. In fact, GNU tar 1.12 (file date Nov 19
1999 :-) behaves identically except for lacking the 'Error exit delayed ...'
message.

Regards,
Holger
