On Mon 04-06-18 07:31:25, Dan Williams wrote:
[...]
> I'm trying to solve this real world problem when real poison is
> consumed through a dax mapping:
>
> mce: Uncorrected hardware memory error in user-access at af34214200
> {1}[Hardware Error]: It has been corrected by h/w and requires
> no further action
> mce: [Hardware Error]: Machine check events logged
> {1}[Hardware Error]: event severity: corrected
> Memory failure: 0xaf34214: reserved kernel page still
> referenced by 1 users
> [..]
> Memory failure: 0xaf34214: recovery action for reserved kernel
> page: Failed
> mce: Memory error not recovered
>
> ...i.e. currently all poison consumed through dax mappings is
> needlessly system fatal.
Thanks. That should be a part of the changelog. It would be great to
describe why this cannot be simply handled by hwpoison code without any
ZONE_DEVICE specific hacks? The error is recoverable so why does
hwpoison code even care?
--
Michal Hocko
SUSE Labs
_______________________________________________
Linux-nvdimm mailing list
[email protected]
https://lists.01.org/mailman/listinfo/linux-nvdimm