On Tue, Jun 8, 2010 at 8:05 PM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> Not that it's elegant, the first thing that pops to mind is using
> 'split' to chunk the file into many little bits and then md5 each bit
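For reference, the quoted split-and-md5 idea might look roughly like the sketch below. It hashes fixed-size chunks in place rather than writing them out with 'split'. The function names and the 1 MiB default chunk size are assumptions made for illustration; nothing here comes from the original thread.

```python
import hashlib
from typing import List


def chunk_md5s(path: str, chunk_size: int = 1 << 20) -> List[str]:
    """Return the MD5 hex digest of each fixed-size chunk of the file.

    chunk_size is an illustrative default (1 MiB), not a recommendation.
    """
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.md5(chunk).hexdigest())
    return digests


def damaged_chunks(old: List[str], new: List[str]) -> List[int]:
    """Return the indices of chunks whose digests differ.

    A chunk that exists in only one of the two lists counts as damaged.
    """
    longest = max(len(old), len(new))
    return [
        i for i in range(longest)
        if i >= len(old) or i >= len(new) or old[i] != new[i]
    ]
```

Comparing a stored digest list against a freshly computed one narrows the damage down to specific chunks, but it only detects corruption; it cannot repair it.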
While this may let you know that a file has been corrupted, it won't
help you recover that file.

Some archive formats, which can serve as plain storage containers if
you turn compression off, have options to create recovery records. For
instance, in the RAR format (http://en.wikipedia.org/wiki/RAR), you can
choose how much redundant data you want to include in your archive
(whose size will increase accordingly).

Excerpt from Alexander Roshal's rar user's manual:
"""
rr[N]   Add data recovery record. Optionally, redundant information
        (recovery record) may be added to an archive. This will cause
        a small increase of the archive size and helps to recover
        archived files in case of floppy disk failure or data losses
        of any other kind. A recovery record contains up to 524288
        recovery sectors. The number of sectors may be specified
        directly in the 'rr' command (N = 1, 2 .. 524288) or, if it
        is not specified by the user, it will be selected
        automatically according to the archive size: a size of the
        recovery information will be about 1% of the total archive
        size, usually allowing the recovery of up to 0.6% of the
        total archive size of continuously damaged data.

        It is also possible to specify the recovery record size in
        percent to the archive size. Just append the percent
        character to the command parameter. For example:

        rar rr3% arcname

        If data is damaged continuously, then each rr-sector helps
        to recover 512 bytes of damaged information. This value may
        be lower in cases of multiple damage.
"""

Cheers,
-- 
Kilian
_______________________________________________
Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit
http://www.beowulf.org/mailman/listinfo/beowulf
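As a rough illustration of the figures in the manual excerpt, the arithmetic can be sketched as below. The helper names are made up, and the numbers (512 bytes per recovery sector, at most 524288 sectors, about 0.6% recoverable with the automatic setting) are taken only from the excerpt; real results depend on the RAR version and on how the damage is distributed.

```python
SECTOR_BYTES = 512      # per the excerpt: bytes recovered per rr-sector
MAX_SECTORS = 524288    # per the excerpt: upper limit for N in rr[N]


def max_recoverable_bytes(sectors: int) -> int:
    """Best-case bytes of continuously damaged data that N sectors cover.

    The excerpt warns the real value may be lower with multiple damage.
    """
    if not 1 <= sectors <= MAX_SECTORS:
        raise ValueError("sectors must be between 1 and 524288")
    return sectors * SECTOR_BYTES


def default_recoverable_estimate(archive_bytes: int) -> int:
    """Approximate bytes recoverable with the automatic record size.

    Uses the 'up to 0.6% of the total archive size' figure from the
    excerpt, computed with integer arithmetic.
    """
    return archive_bytes * 6 // 1000
```

By this estimate, the largest possible record (524288 sectors) covers at most 256 MiB of continuous damage, and a 1 GB archive with the automatic setting could tolerate on the order of 6 MB.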