Dan Williams <[email protected]> writes:

>>> I hit an infinite clear loop when the DSM Clear Uncorrectable Error
>>> function fails.  I haven't looked into the details, but I suspect the
>>> unconditional retry is the cause.
>>
>> Thanks Toshi - that makes sense. I think the right thing to do would be
>> to return an EIO if the DSM fails, yes? Or should we ignore the fact that
>> there was an error, clear ->has_err, and let the write take its course
>> (possibly generating a CMCI)?
>>
>> It will still be in the badblock list, and for reads ->rw_bytes will
>> still check and fail them.
>>
>> I'll send out a new series with a fix, but we really need to get a unit
>> test for BTT error clearing, and I'm working on implementing the new
>> error injection DSMs in libndctl and nfit_test to do that.
>>
>
> I think as much as possible we should try to not fail writes. Leave
> the badblock entry in place so that we get an error on the next read.
> Upper-level software reacts more aggressively to write errors than
> read errors.

I don't think it's wise to lie about data integrity.  If a write cannot
be completed, it *needs* to fail.  You can't make any assumptions about
what applications will do with the result.
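
A minimal sketch of that policy, assuming a bounded number of clear
attempts instead of an unconditional retry. Every name here
(sketch_block, sketch_clear_error, sketch_write, max_attempts) is
hypothetical and stands in for the real BTT/DSM code, which this does
not reproduce:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block with a known media error. */
struct sketch_block {
	bool has_err;        /* mirrors the ->has_err flag discussed above */
	int clear_failures;  /* how many simulated clear attempts will fail */
	int clear_calls;     /* number of clear attempts made so far */
};

/* Hypothetical stand-in for the DSM Clear Uncorrectable Error call. */
static int sketch_clear_error(struct sketch_block *b)
{
	b->clear_calls++;
	if (b->clear_failures > 0) {
		b->clear_failures--;
		return -EIO;
	}
	b->has_err = false;
	return 0;
}

/* Write path: try to clear a bounded number of times, then fail. */
static int sketch_write(struct sketch_block *b, int max_attempts)
{
	int i;

	for (i = 0; b->has_err && i < max_attempts; i++) {
		if (sketch_clear_error(b) == 0)
			break;
	}
	if (b->has_err)
		return -EIO;	/* report the failure, don't claim success */
	return 0;		/* the data write itself is elided here */
}
```

With a bounded loop the clear cannot spin forever, and a write whose
error could not be cleared returns -EIO to the caller.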

-Jeff
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
