Re: Decorrupting a .tar file

Jakob Bohm Wed, 11 Nov 2020 12:02:25 -0800

On 2020-11-11 00:39, I. Hope Nothing wrote:

Hello all,
I have a large (183 GB) .tar file that has become corrupted. This is actually the _secondary_ backup of this data. The primary backup (a USB HDD) was lost, so I was disappointed to find that _this_ backup isn't easily accessible.
From inspection and memory, it seems that this .tar file was corrupted by a poorly invoked file transfer operation, e.g., FTP with mixed up ASCII/binary settings. Each line ends with '^M' before the '\n', and because this tarball has a lot of binary data in it `dos2unix -f` is unlikely to restore all occurrences of mangled line endings.
The first line of the .tar file is "Password:", and I can think of several possibilities as to how this could have happened.
I have made a copy of the file to perform surgery on it. Unsurprisingly, the results of `dos2unix -f corrupted_tar_file.tar` crash out after only a couple of dozen entries when listing: `tar tvf corrupted_tar_file_unix_eol.tar`.
There's a lot of binary data I want to keep on here. I am willing and keen to learn how to forensically retrieve my data, and I would greatly appreciate any help pointing me in the right direction. Thank you for reading this far already!!
If you need transcripts of anything please let me know!!

Here are some simple hints for attempting a manual rescue.

1. If possible, obtain a less corrupted copy of the tar file.
  For example, if it was corrupted when extracting it from a tape
  over ssh or rlogin, try extracting it again using a binary-safe
  protocol.  Similarly if it was corrupted after decompressing with
  gzip, bzip2 or any other such tool, try decompressing again.

2. Try to obtain a dos2unix implementation that doesn't try to be
  "smart".  Basically, you need to do a binary search-and-replace
  from \r\n to \n while leaving alone any other bytes with the
  value 13 (\r).  This will still lose any \r\n sequence that was
  in the original data, but there will probably be less corruption
  than in the file that was erroneously subjected to the opposite
  search-and-replace.

3. Look up the tar file format specifications.  It is actually a
  relatively simple file format, and you will need to understand it
  to do the manual data rescue.  In particular, you will need to
  understand the PAX and GNU extensions to the format.

4. Using a binary file viewer, look for the tar header that marks
  the start of a much-wanted file.  Then look for the tar header
  of the next file in the archive.  The bytes between the two
  headers are supposed to be your file contents (followed by zero
  padding up to a multiple of 512 bytes), and the header before
  the contents should give the number of bytes in the
  uncorrupted file.  If you did step 2 above, the actual data
  will probably be slightly too short due to too many removed \r
  characters, or due to the terminal protocol also removing some
  other bytes.

5. Use knowledge of your actual file format to figure out where
  a \r was probably lost and use the correct file length from
  the tar header as a cross check of your efforts.

6. Repeat steps 4 and 5 for each file.
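
To illustrate steps 2 and 4, here is a minimal Python sketch, not a
finished rescue tool.  The function names (crlf_to_lf, parse_octal,
find_headers) are made up for this example.  It assumes the archive
uses ustar-style headers (POSIX or GNU, both of which carry the
"ustar" magic at offset 257 of the header) and searches for that magic
at any byte offset, because after corruption the headers no longer sit
on 512-byte boundaries.  Candidates are accepted only if the header
checksum matches.  Headers whose size field uses the GNU base-256
encoding (members of 8 GiB or more) are skipped, and PAX or GNU
extension headers are reported as ordinary entries without being
interpreted.

```python
BLOCK = 512  # size of one tar header block


def crlf_to_lf(src, dst, chunk_size=1024 * 1024):
    """Copy src to dst, turning each \\r\\n byte pair into \\n.

    Lone \\r bytes (value 13 not followed by \\n) are left untouched.
    """
    pending_cr = False
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        if pending_cr:
            chunk = b"\r" + chunk
        # Hold back a trailing \r: its \n may be in the next chunk.
        pending_cr = chunk.endswith(b"\r")
        if pending_cr:
            chunk = chunk[:-1]
        dst.write(chunk.replace(b"\r\n", b"\n"))
    if pending_cr:
        dst.write(b"\r")


def parse_octal(field):
    """Parse a NUL/space-terminated octal header field (ValueError if bad)."""
    text = field.split(b"\0", 1)[0].strip(b" ")
    return int(text.decode("ascii"), 8) if text else 0


def find_headers(data):
    """Yield (offset, name, size) for each plausible ustar header in data.

    data may be bytes or any object with find() and slicing.
    """
    pos = data.find(b"ustar", 0)
    while pos != -1:
        start = pos - 257  # the magic sits 257 bytes into the header
        header = data[start:start + BLOCK] if start >= 0 else b""
        if len(header) == BLOCK:
            try:
                stored = parse_octal(header[148:156])
                size = parse_octal(header[124:136])
            except ValueError:
                stored = size = -1
            # Checksum: sum of all header bytes, with the checksum
            # field itself (offsets 148..155) counted as 8 spaces.
            computed = sum(header[:148]) + 8 * 0x20 + sum(header[156:])
            if stored == computed:
                name = header[:100].split(b"\0", 1)[0]
                yield start, name, size
        pos = data.find(b"ustar", pos + 1)
```

For a file of 183 GB you would not read everything into memory;
find_headers only uses find() and slicing, so passing it an mmap.mmap
object of the file should work on a 64-bit system.  Comparing the
distance between two consecutive header offsets with the size from the
first header tells you how many bytes that member has gained or lost.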

Good luck, you will need it.

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
