This message is from the T13 list server.

embedded response to Hale's embedded response:

...Harlan

On 3/12/02, Hale Landis <[EMAIL PROTECTED]> replied:
>
>
>On Tue, 12 Mar 2002 13:28:34 -0800, Harlan Andrews wrote:
>>The more realistic way of implementing ReadLong/WriteLong is as
>>follows:
>
>And this is the algorithm implemented by many drives that still
>support R/W Long (but I have heard of several variations on this
>theme so beware!).
Always beware.  One should ALWAYS find out how the commands are 
implemented.

>
>>1) Report that the drive has only 4 ECC bytes in the ID data
>>(even if it has 200 ECC bytes).
>>2) Host does a ReadLong of LBA N. Drive returns data plus 4 ECC
>>bytes AND SAVES THE REST OF THE ECC BYTES.
>>3) Host alters some bits of the data and does a WriteLong back to
>>LBA N (including the 4 ECC bytes).
>>4) Drive writes the data and the FULL ECC back to the media.
>>5) Host does a Read of LBA N. Drive reads the media and returns
>>an error OR the corrected data.
>
>Ahhh...  Now here is the problem...  What is a host to think if
>it has flipped only one bit of the data but the drive reports an
>uncorrectable error?  That one flipped bit added to the other
>"errors" in the PRML read processing might be all that is needed
>to make an uncorrectable error.  What does this prove?  And this
>scheme cannot test the other 60 ECC bytes a typical modern drive
>has.  Remember that testing ECC means you must corrupt not only
>the data bytes but the ECC bytes too.
>
If I'm really trying to create an uncorrectable error, I change a large
number of data bytes AND also change the 4 ECC bytes.

When I'm testing ECC correction, I change a small number of data bytes
and leave the ECC bytes alone.  Typically, this does NOT create an
error, and I simply compare the returned data bytes with the original
read data to ensure that the correction was correct.  If it does create
an error, I ignore it and try again.  What matters is any data
miscompare.  We HAVE found data miscompares this way, which showed that
the ECC was not robust enough.


>>6A) If an error is returned, the host knows that the data was
>>uncorrectable, BUT the host also must realize that there could
>>have been some OTHER error bits in the data which possibly caused
>>the data to be uncorrectable.
>
>Yep.  Sounds like testing conditions that have no known valid
>test result.
Results are valid.  See above.

>
>>6B) If NO error is returned, then the data should be the same as
>>what the host wrote via the WriteLong command.
[Oops.  I meant to say that the (corrected) data should be the same as
the original READ data.]

>
>Yes.  But what if the drive implemented my simple algorithm?  In
>this case the result is the same and nothing was tested.
As I mentioned above, one should always know how each command is 
implemented.  If the drive used that 'simple algorithm' we wouldn't buy 
it (unless it was fixed).

>
>>1) They allow the Host to force an error thereby testing that the
>>error reporting actually works.  This also has other uses (as
>>reported by Raymond Liu.)
>
>Unless you have talked to the drive manufacturer you do not know
>if your Write Long command is adequate to force a real error.
Again, we DO talk to the drive manufacturers and we ARE able to force 
errors when desired.

>
>>2) They allow the Host to introduce correctable errors and test
>>that the error correction actually works (i.e. does not
>>mis-correct).  If an error is returned then the assumption is
>>that there was some OTHER error bit(s) read back incorrectly.
>>That is also useful information.
>
>Again, sounds like "random" testing of something (for the "fun of
>it") because the test has no single valid result.  Is this really
>a valid test?  I don't think so.  But then I'm from the "old
>school" of testing methodology when a test either Passed or
>Failed.
It has (unfortunately) produced VERY real failures.  The mis-corrections 
which we found very quickly with ReadLong and WriteLong took MUCH longer 
to appear with normal data.  But they DID appear with normal data.


>
>
>
>*** Hale Landis *** www.ata-atapi.com ***
>
>
>

...Harlan



