On Fri, Jun 08, 2018 at 04:51:19PM -0700, Dan Williams wrote:
>     mce: Uncorrected hardware memory error in user-access at af34214200
>     {1}[Hardware Error]: It has been corrected by h/w and requires no further action
>     mce: [Hardware Error]: Machine check events logged
>     {1}[Hardware Error]: event severity: corrected
>     Memory failure: 0xaf34214: reserved kernel page still referenced by 1 users
>     [..]
>     Memory failure: 0xaf34214: recovery action for reserved kernel page: Failed
>     mce: Memory error not recovered
> 
> In contrast to typical memory, dev_pagemap pages may be dax mapped. With
> dax there is no possibility to map in another page dynamically since dax
> establishes 1:1 physical address to file offset associations. Also
> dev_pagemap pages associated with NVDIMM / persistent memory devices can
> internally remap/repair addresses with poison. While memory_failure()
> assumes that it can discard typical poisoned pages and keep them
> unmapped indefinitely, dev_pagemap pages may be returned to service
> after the error is cleared.
> 
> Teach memory_failure() to detect and handle MEMORY_DEVICE_HOST
> dev_pagemap pages that have poison consumed by userspace. Mark the
> memory as UC instead of unmapping it completely to allow ongoing access
> via the device driver (nd_pmem). Later, nd_pmem will grow support for
> marking the page back to WB when the error is cleared.
> 
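(Aside, mostly thinking out loud: I'd guess the "mark as UC" step boils down
to something like the sketch below on top of the existing x86 set_memory_uc()
helper. That's just my assumption about the shape of it, not what this hunk
does -- the actual mechanism lives elsewhere in the series.)

	/*
	 * Hypothetical sketch only: flip a single poisoned pfn to
	 * uncached instead of unmapping it, so nd_pmem can keep
	 * accessing the range and later repair/clear the error.
	 */
	unsigned long vaddr = (unsigned long) page_address(pfn_to_page(pfn));

	set_memory_uc(vaddr, 1);
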
> Cc: Jan Kara <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Jérôme Glisse <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Naoya Horiguchi <[email protected]>
> Cc: Ross Zwisler <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> ---
<>
> +static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
> +             struct dev_pagemap *pgmap)
> +{
> +     const bool unmap_success = true;
> +     unsigned long size;
> +     struct page *page;
> +     LIST_HEAD(tokill);
> +     int rc = -EBUSY;
> +     loff_t start;
> +
> +     /*
> +      * Prevent the inode from being freed while we are interrogating
> +      * the address_space, typically this would be handled by
> +      * lock_page(), but dax pages do not use the page lock.
> +      */
> +     page = dax_lock_page(pfn);
> +     if (!page)
> +             goto out;
> +
> +     if (hwpoison_filter(page)) {
> +             rc = 0;
> +             goto unlock;
> +     }
> +
> +     switch (pgmap->type) {
> +     case MEMORY_DEVICE_PRIVATE:
> +     case MEMORY_DEVICE_PUBLIC:
> +             /*
> +              * TODO: Handle HMM pages which may need coordination
> +              * with device-side memory.
> +              */
> +             goto unlock;
> +     default:
> +             break;
> +     }
> +
> +     /*
> +      * If the page is not mapped in userspace then report it as
> +      * unhandled.
> +      */
> +     size = dax_mapping_size(page);
> +     if (!size) {
> +             pr_err("Memory failure: %#lx: failed to unmap page\n", pfn);
> +             goto unlock;
> +     }
> +
> +     SetPageHWPoison(page);
> +
> +     /*
> +      * Unlike System-RAM there is no possibility to swap in a
> +      * different physical page at a given virtual address, so all
> +      * userspace consumption of ZONE_DEVICE memory necessitates
> +      * SIGBUS (i.e. MF_MUST_KILL)
> +      */
> +     flags |= MF_ACTION_REQUIRED | MF_MUST_KILL;
> +     collect_procs(page, &tokill, flags & MF_ACTION_REQUIRED);

You know "flags & MF_ACTION_REQUIRED" will always be true, so you can just
pass in MF_ACTION_REQUIRED or even just "true".
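
I.e. something like (untested):

	collect_procs(page, &tokill, true);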

> +
> +     start = (page->index << PAGE_SHIFT) & ~(size - 1);
> +     unmap_mapping_range(page->mapping, start, start + size, 0);
> +
> +     kill_procs(&tokill, flags & MF_MUST_KILL, !unmap_success, ilog2(size),

You know "flags & MF_MUST_KILL" will always be true, so you can just pass in
MF_MUST_KILL or even just "true".

Also, you can get rid of the constant "unmap_success" if you want and just
pass in false as the 3rd argument.
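
I.e. something like (untested; the arguments after ilog2(size) stay as they
are in the patch):

	kill_procs(&tokill, true, false, ilog2(size), ...);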