The only change I'd like to make is to bias ddrescue toward splitting or
trimming the large blocks first. It is annoying to see it working on small
blocks, which are unlikely to contain whole files, when you know there are
blocks of several hundred megabytes waiting for attention.
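
To make the idea concrete, here is a rough sketch of the ordering I have in
mind. The Area type and function name are only illustrative stand-ins, not
ddrescue's actual internals:

    // Sketch only: order the remaining non-split areas from the logfile so
    // that the largest one is attempted first, instead of walking them in
    // disk order.
    #include <algorithm>
    #include <cstdint>
    #include <vector>

    struct Area {
      int64_t pos;   // byte offset of the non-split area
      int64_t size;  // length of the area in bytes
    };

    std::vector<Area> order_largest_first(std::vector<Area> areas) {
      std::sort(areas.begin(), areas.end(),
                [](const Area &a, const Area &b) { return a.size > b.size; });
      return areas;
    }

The small areas would still be read eventually; they would just not be
touched while multi-hundred-megabyte areas remain.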


On 23 January 2013 06:17, Antonio Diaz Diaz <[email protected]> wrote:

> Hello Caius Severus,
>
>
> kwb78 wrote:
>
>> I think that in a situation where the drive has many bad sectors
>> across the whole disk, as opposed to being contained within a few areas,
>> ddrescue does not approach the problem optimally.
>>
>
> Optimality has more than one dimension. :-)
>
>
>
>> If I have understood the way the splitting phase occurs at present, the
>> drive is read sequentially starting from the first unsplit area forwards
>> until a certain number of bad sectors have been encountered, and then it
>> jumps an arbitrary distance ahead. It then repeats this, gradually breaking
>> down the unsplit areas until it has read every sector and either recovered
>> the data or marked it bad.
>>
>
> Yes. This is point 4 of "Algorithm" in ddrescue's manual[1].
>
> [1] http://www.gnu.org/software/ddrescue/manual/ddrescue_manual.html#Algorithm
>
>
>
>> When there are only a few areas of bad sectors this approach works quite
>> well, but with larger numbers of bad sectors it is painfully slow. The
>> reason for this is that the time penalty for reading a bad sector can be on
>> the order of seconds for each one. When it is attempting to read 8 or more
>> consecutive sectors before skipping, this means that it can spend a minute
>> or more between skips.
>>
>
> Ddrescue 1.14 skipped after a minimum of 2 consecutive errors, but this
> produced logfiles that were too large (see point 4 mentioned above). The
> algorithm of ddrescue is a compromise between splitting the larger areas
> first and keeping the logfile size under control. Of course this compromise
> can be improved.
>
>
>
>> My suggested algorithm is as follows:
>>
>> Following trimming,
>>
>> 1. Examine the log file to locate the largest unsplit area on the disk that
>> is directly adjacent to a known good area.
>>
>
> There are no such areas. After trimming, all non-split areas are flanked
> by bad sectors, or else they would have been fully copied or trimmed out.
>
>
>
>> 3. Upon encountering 2 bad sectors in a row, stop (since the probability is
>> that the next sector will also be bad).
>>
> [...]
>
>> 5. When there are no remaining unsplit areas next to good areas, choose the
>> largest unsplit area and begin reading from the middle, not the edge.
>>
>
> These steps make the logfile grow fast. The only way they can be implemented
> is by alternating them with full reads on the smallest non-split areas so
> that the logfile size is kept reasonable.
>
> Currently ddrescue avoids splitting areas if there are more than 1000
> blocks in the logfile. Maybe this threshold could be set with a command-line
> option (--max-logfile-size?).
>
>
>
>> 6. Keep doing this until the unsplit areas are all below an arbitrary
>> minimum size, at which point go back to reading linearly.
>>
>
> Or until the logfile has grown beyond a given size.
>
> Thank you for sharing your observations. I'll try to optimize splitting a
> little more. :-)
>
>
> Regards,
> Antonio.
>
_______________________________________________
Bug-ddrescue mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-ddrescue
