Re: [PATCH v3 01/11] pagemap: Introduce ->memory_failure()
On Mon, Feb 8, 2021 at 2:55 AM Shiyang Ruan wrote:
>
> When memory-failure occurs, we call this function which is implemented
> by each kind of devices. For the fsdax case, pmem device driver
> implements it. Pmem device driver will find out the block device where
> the error page locates in, and try to get the filesystem on this block
> device. And finally call filesystem handler to deal with the error.
> The filesystem will try to recover the corrupted data if possible.
>
> Signed-off-by: Shiyang Ruan
> ---
>  include/linux/memremap.h | 8
>  1 file changed, 8 insertions(+)
>
> diff --git a/include/linux/memremap.h b/include/linux/memremap.h
> index 79c49e7f5c30..0bcf2b1e20bd 100644
> --- a/include/linux/memremap.h
> +++ b/include/linux/memremap.h
> @@ -87,6 +87,14 @@ struct dev_pagemap_ops {
>  	 * the page back to a CPU accessible page.
>  	 */
>  	vm_fault_t (*migrate_to_ram)(struct vm_fault *vmf);
> +
> +	/*
> +	 * Handle the memory failure happens on one page. Notify the processes
> +	 * who are using this page, and try to recover the data on this page
> +	 * if necessary.
> +	 */
> +	int (*memory_failure)(struct dev_pagemap *pgmap, unsigned long pfn,
> +			int flags);
>  };

After the conversation with Dave I don't see the point of this. If there
is a memory_failure() on a page, why not just call memory_failure()? That
already knows how to find the inode and the filesystem can be notified
from there.

Although memory_failure() is inefficient for large range failures, I'm
not seeing a better option, so I'm going to test calling
memory_failure() over a large range whenever an in-use dax-device is
hot-removed.
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
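The "call memory_failure() over a large range" idea in the reply above can be pictured with a small userspace sketch. This is not kernel code and not the approach that was eventually tested: `fail_pfn_range()`, `pfn_failure_fn`, and `demo_handler()` are hypothetical names, and the callback type only mirrors the shape of the kernel's `int memory_failure(unsigned long pfn, int flags)`. It shows why the per-page walk costs one call per page in the range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type; it only mirrors the shape of the kernel's
 * int memory_failure(unsigned long pfn, int flags). */
typedef int (*pfn_failure_fn)(unsigned long pfn, int flags);

/* Report every page in [start_pfn, start_pfn + nr_pages) to the handler,
 * one call per page. Returns how many pages the handler could not process
 * (nonzero return). The cost is linear in nr_pages, which is the
 * inefficiency for large ranges mentioned in the thread. */
static unsigned long fail_pfn_range(unsigned long start_pfn,
                                    unsigned long nr_pages, int flags,
                                    pfn_failure_fn handler)
{
    unsigned long errors = 0;

    assert(handler != NULL);
    for (unsigned long i = 0; i < nr_pages; i++) {
        if (handler(start_pfn + i, flags) != 0)
            errors++;
    }
    return errors;
}

/* Demo handler (an assumption for illustration): pretends that pages
 * with an odd pfn cannot be handled. */
static int demo_handler(unsigned long pfn, int flags)
{
    (void)flags;
    return (pfn & 1) ? -1 : 0;
}
```

A caller would pass the first pfn of the removed device's range and its page count; a real implementation would presumably want a range-based interface to avoid the per-page walk.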
Re: [block] 52f019d43c: ndctl.test-libndctl.fail
On Fri, 2021-03-05 at 08:42 +0100, Christoph Hellwig wrote:
> Dan,
>
> can you make any sense of this report?
[..]
> > check_set_config_data: dimm: 0 read2 data miscompare: 0
> > check_set_config_data: dimm: 0x1 read2 data miscompare: 0
> > check_set_config_data: dimm: 0x100 read2 data miscompare: 0
> > check_set_config_data: dimm: 0x101 read2 data miscompare: 0
> > check_dax_autodetect: dax_ndns: 0x558a74d92f00 ndns: 0x558a74d92f00
> > check_dax_autodetect: dax_ndns: 0x558a74d91f40 ndns: 0x558a74d91f40
> > check_pfn_autodetect: pfn_ndns: 0x558a74d91f40 ndns: 0x558a74d91f40
> > check_pfn_autodetect: pfn_ndns: 0x558a74d8c5e0 ndns: 0x558a74d8c5e0
> > check_btt_autodetect: btt_ndns: 0x558a74d8c5e0 ndns: 0x558a74d8c5e0
> > check_btt_autodetect: btt_ndns: 0x558a74da1390 ndns: 0x558a74da1390
> > check_btt_autodetect: btt_ndns: 0x558a74d8c5e0 ndns: 0x558a74d8c5e0
> > check_btt_autodetect: btt_ndns: 0x558a74d91f40 ndns: 0x558a74d91f40
> > namespace7.0: failed to write /dev/pmem7
> > check_namespaces: namespace7.0 validate_bdev failed
> > ndctl-test1 failed: -6
> > libkmod: ERROR ../libkmod/libkmod-module.c:793 kmod_module_remove_module:
> > could not remove 'nfit_test': Resource temporarily unavailable
> > test-libndctl: FAIL

Yes, it looks like my unit test checks for exactly the behavior you
changed. It was convenient to test that the device could be switched
back to rw via BLKROSET, but I don't require that. The new behaviour of
letting the disk->ro take precedence makes more sense to me, so I'll
update the test for the new behaviour. I.e. I don't think regressing a
unit test counts as a userspace regression.
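For readers unfamiliar with the ioctl discussed above, here is a minimal userspace sketch of what "switching a device back to rw via BLKROSET" looks like, followed by a read-back with BLKROGET. It is not the ndctl test code; `bdev_set_ro()` is a hypothetical helper name. The read-back is there on the assumption that, once disk->ro takes precedence, a request for read-write may leave the device effectively read-only, so a caller should check the resulting state rather than assume the request took effect.

```c
#include <assert.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKROSET, BLKROGET */
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: request read-only (ro = 1) or read-write (ro = 0)
 * on a block device, then read the flag back.
 * Returns the state reported by BLKROGET (1 = read-only, 0 = writable),
 * or -1 if the device cannot be opened or an ioctl fails.
 * BLKROSET normally requires CAP_SYS_ADMIN. */
static int bdev_set_ro(const char *path, int ro)
{
    int state = 0;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    if (ioctl(fd, BLKROSET, &ro) < 0 || ioctl(fd, BLKROGET, &state) < 0) {
        close(fd);
        return -1;
    }

    close(fd);
    return state != 0;
}
```

A test written this way would compare the returned state against the requested one and treat a mismatch as "the disk-level read-only setting won", instead of failing on the first write to the device.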