On Mon 04-06-18 07:31:25, Dan Williams wrote:
[...]
> I'm trying to solve this real world problem when real poison is
> consumed through a dax mapping:
> 
>         mce: Uncorrected hardware memory error in user-access at af34214200
>         {1}[Hardware Error]: It has been corrected by h/w and requires no further action
>         mce: [Hardware Error]: Machine check events logged
>         {1}[Hardware Error]: event severity: corrected
>         Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
>         [..]
>         Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
>         mce: Memory error not recovered
> 
> ...i.e. currently all poison consumed through dax mappings is
> needlessly system fatal.

Thanks. That should be a part of the changelog. It would also be great to
describe why this cannot simply be handled by the hwpoison code without any
ZONE_DEVICE-specific hacks. The error is recoverable, so why does the
hwpoison code even care?

-- 
Michal Hocko
SUSE Labs