Re: [PATCH v6 6/6] libnvdimm, btt: rework error clearing

Dan Williams Thu, 24 Aug 2017 15:11:57 -0700

On Thu, Aug 24, 2017 at 2:40 PM, Kani, Toshimitsu <[email protected]> wrote:
> On Thu, 2017-08-24 at 17:07 -0400, Jeff Moyer wrote:
>> Dan Williams <[email protected]> writes:
>>
>> > > > I hit an infinite clear loop when the DSM Clear Uncorrectable
>> > > > Error function fails.  Haven't looked into the details, but I
>> > > > suspect this unconditional retry is the cause.
>> > >
>> > > Thanks Toshi - that makes sense. I think the right thing to do
>> > > if the DSM fails would be to return an EIO, yes? (Or should we
>> > > ignore the fact that there was an error, clear ->has_err, and let
>> > > the write take its course, possibly generating a CMCI?)
>> > >
>> > > It will still be in the badblock list, and for reads ->rw_bytes
>> > > will still check and fail them.
>> > >
>> > > I'll send out a new series with a fix, but we really need to get
>> > > a unit test for BTT error clearing, and I'm working on
>> > > implementing the new error injection DSMs in libndctl and
>> > > nfit_test to do that.
>> > >
>> >
>> > I think as much as possible we should try to not fail writes. Leave
>> > the badblock entry in place so that we get an error on the next
>> > read. Upper-level software reacts more aggressively to write errors
>> > than read errors.
>>
>> I don't think it's wise to lie about data integrity.  If a write
>> cannot be completed, it *needs* to fail.  You can't make any
>> assumptions about what applications will do with the result.
>
> Agreed.  The pmem driver returns EIO on write in this scenario as
> well.


Ah true, I think we had this discussion before and you convinced me to
go the EIO route then as well. So consider me re-convinced.
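
For reference, a rough userspace sketch of the behavior agreed on above:
attempt the clear once, return -EIO from the write if it fails, and leave
the error marker in place so later reads fail too. Everything here is
hypothetical (the struct, the sketch_* names, the simulated DSM result);
it is a model of the idea, not the actual btt.c code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one block's state; not the kernel's struct. */
struct sketch_block {
	bool has_err;        /* models a badblocks entry covering the block */
	bool dsm_clear_ok;   /* whether the simulated clear-error DSM succeeds */
	int  clear_attempts; /* how many times a clear was attempted */
	int  data;           /* the block's payload */
};

/* Simulated DSM call: returns 0 on success, -EIO on failure. */
int sketch_clear_error(struct sketch_block *blk)
{
	blk->clear_attempts++;
	if (!blk->dsm_clear_ok)
		return -EIO;
	blk->has_err = false;
	return 0;
}

/* Write path: try the clear once; on failure, fail the write, no retry. */
int sketch_write(struct sketch_block *blk, int value)
{
	assert(blk != NULL);
	if (blk->has_err) {
		int rc = sketch_clear_error(blk);

		if (rc)
			return rc; /* has_err stays set, so reads keep failing */
	}
	blk->data = value;
	return 0;
}

/* Read path: refuse to return data from a block still marked bad. */
int sketch_read(const struct sketch_block *blk, int *out)
{
	assert(blk != NULL && out != NULL);
	if (blk->has_err)
		return -EIO;
	*out = blk->data;
	return 0;
}
```

With dsm_clear_ok false, sketch_write() makes exactly one clear attempt
and returns -EIO, which is the bounded alternative to the unconditional
retry that Toshi suspected of looping.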
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
