Dan Williams <[email protected]> writes:

>>> I hit an infinite clear loop when DSM Clear Uncorrectable Error
>>> function fails. Haven't looked into the details, but I suspect this
>>> unconditional retry is the cause of this.
>>
>> Thanks Toshi - that makes sense. I think the right thing to do would be
>> if the DSM fails, return an EIO yes? (Or should we ignore the fact that
>> there was an error, clear ->has_err, and let the write take its course
>> (possibly generate a CMCI)
>>
>> It will still be in the badblock list, and for reads ->rw_bytes will
>> still check and fail them.
>>
>> I'll send out a new series with a fix, but we really need to get a unit
>> test for BTT error clearing, and I'm working on implementing the new
>> error injection DSMs in libndctl and nfit_test to do that.
>>
>
> I think as much as possible we should try to not fail writes. Leave
> the badblock entry in place so that we get an error on the next read.
> Upper-level software reacts more aggressively to write errors than
> read errors.

I don't think it's wise to lie about data integrity. If a write cannot
be completed, it *needs* to fail. You can't make any assumptions about
what applications will do with the result.

-Jeff
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm
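
For illustration only, here is a minimal user-space sketch of the "fail the write" option discussed in this thread: try the clear once, and if it fails, return -EIO and leave the bad-block marking in place so later reads still fail. This is not the actual BTT or libnvdimm code. The names fake_block, clear_fn, write_sector, clear_ok and clear_fail are all hypothetical, and the real DSM Clear Uncorrectable Error call is modeled as a plain function pointer.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of one sector; has_err stands in for a badblock entry. */
struct fake_block {
    bool has_err;
};

/* Hypothetical stand-in for the DSM clear call: 0 on success, nonzero on failure. */
typedef int (*clear_fn)(unsigned long sector);

/*
 * Write to a sector that may be marked bad.
 * The clear is attempted exactly once (no unconditional retry), so a
 * failing clear cannot loop forever. On failure the write is reported
 * as failed and has_err stays set, so reads keep returning an error.
 */
static int write_sector(struct fake_block *blk, unsigned long sector,
                        clear_fn clear)
{
    if (blk->has_err) {
        if (clear(sector) != 0)
            return -EIO;        /* do not pretend the write succeeded */
        blk->has_err = false;   /* clear worked; drop the bad marking */
    }
    /* The actual data write would happen here. */
    return 0;
}

/* Two trivial clear implementations for exercising both paths. */
static int clear_ok(unsigned long sector)   { (void)sector; return 0; }
static int clear_fail(unsigned long sector) { (void)sector; return -1; }
```

With clear_fail the caller sees -EIO and the sector stays marked bad; with clear_ok the marking is dropped and the write returns 0.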
