Watching the disk struggle to re-read bits of metadata that it has already read 
before gave me an idea for a tool - I just wish I had even a fraction of 
the skills needed to build it.
While "image the whole thing and work from the image" does work in general, it 
doesn't work for disks like these f***ing Seagates with their crappy firmware, 
and it can take a disproportionately long time to get results. Even without the 
Seagate firmware problem, it can take weeks to finish imaging a large drive and 
you can't do anything at all with the copy until it's finished.

So I was thinking ...
Mounting the partition/filesystem needs a certain amount of metadata to be 
readable, and every time I have to restart the process, that metadata has to 
be read all over again. So what if I had a cache that kept everything that 
had been read, so that you would only ever need to read any given block off the 
faulty disk once?

The way I thought it might work is this ...
It would present a fake drive device that you'd use as a proxy for the real 
disk. Every time you read from this fake drive, it would pass the read straight 
through to the real drive, but only if it hasn't already read that block; if it 
has, it would just return the cached copy. It would of course need a chunk of 
disk space large enough for an image of the whole drive (or partition), plus 
the overhead of a bitmap recording which blocks have been cached and which have 
had persistent read errors.
It would need options, perhaps interactive*, to determine whether it retries 
failed blocks or just returns a media error to the calling program. I'm fairly 
certain that many of the blocks I get errors on have been read successfully at 
some point, so over time it could build up a fairly good cache - and thus allow 
mounting a filesystem much more reliably.
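
Roughly what I have in mind, as a Python sketch of just the caching logic and 
not a real block device. All the names here (BlockCache, read_block, retry_bad, 
the separate state file with one byte per block instead of a true bitmap) are 
made up for illustration, and presenting this as a fake drive would still need 
something like NBD or FUSE in front of it.

```python
import errno
import os

UNKNOWN, CACHED, BAD = 0, 1, 2  # per-block state, one byte each for simplicity


def _open_rw(path):
    # Reuse an existing file so the cache survives restarts, else create it.
    return open(path, "r+b" if os.path.exists(path) else "w+b")


class BlockCache:
    """Read each block from the failing disk at most once (hypothetical sketch)."""

    def __init__(self, source, image_path, state_path, block_size=4096):
        self.source = source  # seekable binary file object for the faulty disk
        self.block_size = block_size
        size = source.seek(0, os.SEEK_END)
        self.nblocks = -(-size // block_size)  # round up
        self.image = _open_rw(image_path)
        self.image.truncate(size)  # sparse image the size of the source
        self.state_file = _open_rw(state_path)
        self.state_file.truncate(self.nblocks)
        self.state_file.seek(0)
        self.state = bytearray(self.state_file.read())
        self.source_reads = 0  # how often the real disk was actually touched

    def _mark(self, n, value):
        # Update the in-memory map and persist the single changed byte.
        self.state[n] = value
        self.state_file.seek(n)
        self.state_file.write(bytes([value]))
        self.state_file.flush()

    def read_block(self, n, retry_bad=False):
        offset = n * self.block_size
        if self.state[n] == CACHED:
            # Already read once: never go back to the faulty disk.
            self.image.seek(offset)
            return self.image.read(self.block_size)
        if self.state[n] == BAD and not retry_bad:
            # Known-bad block: report a media error without touching the disk.
            raise OSError(errno.EIO, f"block {n} failed previously")
        try:
            self.source_reads += 1
            self.source.seek(offset)
            data = self.source.read(self.block_size)
        except OSError:
            self._mark(n, BAD)
            raise
        self.image.seek(offset)
        self.image.write(data)
        self.image.flush()
        self._mark(n, CACHED)
        return data

    def close(self):
        self.image.close()
        self.state_file.close()
```

The retry_bad flag is where the "retry or just return a media error" choice 
would plug in; an interactive front end could set it per block.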

* That might also need a helper program that would determine what the block 
contained**. In this case, if it's a media file I wouldn't bother retrying 
until I'd done all the easy stuff. But if it's filesystem metadata (sometimes 
when it fails, mount gives the error "mount: /dev/sdf5: can't read superblock") 
then it needs retrying.

** Supplementary question ...
Is there any tool which, given a block number, will work out what it belongs 
to? Using ddrescue to image a drive, at the end you'll be left with a file 
listing what couldn't be copied - which is pretty useless in terms of working 
out whether any particular file is affected or not.
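
The first half of that at least looks doable: the ddrescue mapfile is plain 
text, so turning its bad regions into filesystem block numbers is simple. A 
sketch, assuming the usual mapfile layout (comment lines starting with "#", one 
status line, then "pos size status" lines in hex). The function name, the 
partition_offset and fs_block_size parameters and their defaults are my own 
invention and would need setting to match the real disk.

```python
def bad_fs_blocks(mapfile_text, partition_offset=0, fs_block_size=4096,
                  bad_statuses="*/-"):
    """Return (first, last) filesystem block ranges that ddrescue failed to read.

    '*', '/' and '-' are ddrescue's failed-region statuses; '?' (not yet
    tried) and '+' (rescued) are ignored by default.
    """
    ranges = []
    seen_status_line = False
    for line in mapfile_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if not seen_status_line:
            # First non-comment line is current_pos / current_status: skip it.
            seen_status_line = True
            continue
        fields = line.split()
        pos, size, status = int(fields[0], 0), int(fields[1], 0), fields[2]
        if status not in bad_statuses:
            continue
        # Convert byte offsets on the imaged device to offsets in the partition.
        start = pos - partition_offset
        end = start + size
        if end <= 0:
            continue  # region lies entirely before the partition
        start = max(start, 0)
        ranges.append((start // fs_block_size, (end - 1) // fs_block_size))
    return ranges
```

If the filesystem happens to be ext2/3/4, debugfs can then take it from there: 
its icheck command maps a block number to an inode and ncheck maps an inode to 
a path. I don't know of an equivalent that works across filesystems.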



As for my progress ...
I've recovered about 25% of the TV recording files, which is better than the 
"nothing" I was half expecting. Sometimes the disk will crap out before the 
filesystem is mounted, sometimes it won't - hence the idea of the caching setup 
above. I am certain that if I tried to image the drive then I'd be at it for 
weeks, and probably get nothing back - worse, I'd have an image but random 
files would have holes in them and I'd have no idea which. Doing it file by 
file I know whether a file is complete or not - and as I say, I have about 25% 
recovered with no errors.

_______________________________________________
Dng mailing list
[email protected]
https://mailinglists.dyne.org/cgi-bin/mailman/listinfo/dng
