Even though I initially slammed the idea of skipping "whitespace", I
have thought more about it, and will offer a possible theory of
operation, should it ever be implemented. I still say it would be
difficult to implement, and would only be feasible in certain situations.
The definition of whitespace would be areas filled completely with
zeros, meaning the entire cluster being read must be processed to see
whether any byte is non-zero. If a non-zero byte is found, the
processing of that cluster stops, and the cluster is considered used.
If it is all zeros, it is considered whitespace. This would add some
overhead to the program, although it is unclear how much it would
affect performance.
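
As a rough illustration, the blank-cluster test could look something
like the sketch below. This is hypothetical code written for this
message, not anything from ddrescue; the name cluster_is_blank is made
up. It bails out at the first non-zero byte, so clusters that actually
hold data cost almost nothing extra to check:

    #include <cstddef>
    #include <cstdint>

    // Return true only if every byte of the cluster buffer is zero.
    // Stops at the first non-zero byte, keeping the overhead low for
    // clusters that contain data.
    bool cluster_is_blank( const std::uint8_t * buf, std::size_t size )
      {
      for( std::size_t i = 0; i < size; ++i )
        if( buf[i] != 0 ) return false;    // data found; cluster is used
      return true;                         // all zeros; possible whitespace
      }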
Once it is determined that a number of zero-filled clusters have been
read in a row, that could trigger a form of skipping. The skipping would
end and be reset once a non-zero cluster was found. How much to skip is
the question; since you are skipping for a different reason than a bad
spot, you don't want to get crazy with the skipping, and it must be
reasonably limited. The skipped data could be read backwards after data
was found, or maybe a reverse pass would be better.
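
To make that concrete, here is one possible shape for the trigger.
Again this is an invented sketch, not ddrescue's algorithm: the helper
read_cluster, the run threshold, and the skip cap are all assumptions
chosen only to show the bounded-skip idea:

    #include <cstddef>
    #include <cstdint>

    // Assumed helpers, declared for the sketch; see the blank test above.
    bool cluster_is_blank( const std::uint8_t * buf, std::size_t size );
    void read_cluster( long long pos, std::uint8_t * buf, long long size );

    const int blank_run_trigger = 16;       // blank clusters in a row
                                            // before skipping starts
    const long long max_skip = 1LL << 20;   // cap each jump at 1 MiB so
                                            // the skipping can't get crazy

    void forward_pass( long long device_size, long long cluster_size )
      {
      std::uint8_t buf[65536];
      int blank_run = 0;
      long long pos = 0;
      while( pos + cluster_size <= device_size )
        {
        read_cluster( pos, buf, cluster_size );
        if( cluster_is_blank( buf, cluster_size ) )
          {
          if( ++blank_run >= blank_run_trigger )
            { pos += max_skip; blank_run = 0; continue; }  // bounded skip
          }
        else blank_run = 0;                 // non-zero cluster; reset run
        pos += cluster_size;
        }
      }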
That all sounds great, until you try to implement it alongside the
normal skipping algorithm for bad blocks. It suddenly gets very
complicated, as you have to figure out what to do when you have both
bad blocks and whitespace. It must also be decided what size dictates
possible whitespace. If you based it on a number of empty clusters,
what happens when the user changes the cluster size to 1? That could
cause premature skipping, so there would need to be a size value
provided to base skipping on. And do you keep separate track of areas
skipped because of bad/slow blocks and areas skipped due to suspected
whitespace? If so, how is that best processed in further passes?
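
One illustrative answer (invented here, and not ddrescue's mapfile
format) would be to record why each area was skipped, so a later pass
could cheaply re-read suspected whitespace before grinding on known-bad
areas, and to express the trigger in bytes rather than clusters so a
cluster size of 1 cannot fire it prematurely:

    // Hypothetical bookkeeping: tag each skipped area with the reason,
    // and base the trigger on a byte count instead of a cluster count.
    const long long min_blank_bytes = 1LL << 16;   // 64 KiB of zeros
                                                   // before skipping starts

    enum class Skip_reason { bad_block, slow_area, whitespace };

    struct Skipped_area
      {
      long long pos, size;
      Skip_reason reason;   // a later pass could retry 'whitespace'
      };                    // areas first; they are probably fast reads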
And all of this is based on the assumption that large chunks of zeros
are actually unused space. While that is most likely true, it cannot
always be assumed. This would most likely work best on large drives
with only a small percentage of space used. With modern drive sizes
growing, I guess that condition is becoming more likely: there could be
large areas of the drive that have never been written to since the
drive was put into use. Filesystems do tend to clump things together,
but there is no guarantee that you would not skip good data. Then
again, the point could be made that you can also skip good data when
skipping due to bad blocks.
So is this a good idea? I don't know. It is like a poor man's version of
processing the filesystem. My initial instinct is that it is not the
best idea, but I guess it could work in some cases if done right.
Regards,
Scott
On 1/27/2017 2:47 PM, Antonio Diaz Diaz wrote:
> Thanks to all for the feedback.
>
> I tend to agree with Scott in that skipping unused space can't
> possibly work with any sort of consistency. Therefore I'll forget
> about it until someone shows with data that it can be useful. For
> example showing a correspondence between unused sectors and sectors
> containing the empty pattern, plus a bitmap showing that the used
> sectors are grouped. If the used sectors are scattered, then finding
> them is, as Scott said, like playing roulette.
>
> Thanks,
> Antonio.
_______________________________________________
Bug-ddrescue mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-ddrescue