[ add Tony ]

Hi Jane,

I was out for a few days...

On Tue, Jun 22, 2021 at 10:22 AM Jane Chu <[email protected]> wrote:
>
>
> On 6/21/2021 5:15 PM, Luis Chamberlain wrote:
> > On Tue, Jun 15, 2021 at 11:55:19AM -0700, Jane Chu wrote:
> >> Hi, Dan,
> >>
> >> It appears the fact pmem region is of WB memtype renders set_memory_uc()
> >>
> >> to fail due to inconsistency between the requested memtype(UC-) and the
> >> cached
> >>
> >> memtype(WB).
> >>
> >> # cat /proc/iomem |grep -A 8 -i persist
> >> 1840000000-d53fffffff : Persistent Memory
> >>    1840000000-1c3fffffff : namespace0.0
> >>    5740000000-76bfffffff : namespace2.0
> >>
> >> # cat /sys/kernel/debug/x86/pat_memtype_list
> >> PAT memtype list:
> >> PAT: [mem 0x0000001840000000-0x0000001c40000000] write-back
> >> PAT: [mem 0x0000005740000000-0x00000076c0000000] write-back
> >>
> >> [10683.418072] Memory failure: 0x1850600: recovery action for dax page:
> >> Recovered
> >> [10683.426147] x86/PAT: fsdax_poison_v1:5018 conflicting memory types
> >> 1850600000-1850601000  uncached-minus<->write-back
> >>
> >> cscope search shows that unlike pmem, set_memory_uc() is primarily used by
> >> drivers dealing with ioremap(),
> >
> > Yes, when a driver *knows* the type must follow certain rules, it
> > requests it.
> >
> >> perhaps the pmem case needs another way to suppress prefetch?
> >>
> >> Your thoughts?
> >
> > The way to think about this problem is, who did the ioremap call for the
> > ioremap'd area? That's the driver that needs changing.
>
> I'm not sure if the pmem driver is doing something wrong, but rather the
> pmem memory failure handler.

This is a fixup that got lost in the shuffle. The issue is that the
memtype tracking needs to be updated before the set_memory_uc(), or
the memtype tracking needs to be bypassed.

You'll notice that this commit:

284ce4011ba6 x86/memory_failure: Introduce {set, clear}_mce_nospec()

...effectively replaced set_memory_np() with set_memory_uc(). However,
set_memory_np() does not check memtype and set_memory_uc() does.

I believe it should be sufficient to do the following, i.e. skip the
memtype sanity checking because the memory-failure code should have
already shot down all the conflicting WB mappings:

diff --git a/arch/x86/include/asm/set_memory.h
b/arch/x86/include/asm/set_memory.h
index 43fa081a1adb..0bf2274c5186 100644
--- a/arch/x86/include/asm/set_memory.h
+++ b/arch/x86/include/asm/set_memory.h
@@ -114,8 +114,13 @@ static inline int set_mce_nospec(unsigned long
pfn, bool unmap)

        if (unmap)
                rc = set_memory_np(decoy_addr, 1);
-       else
-               rc = set_memory_uc(decoy_addr, 1);
+       else {
+               /*
+                * Bypass memtype checks since memory-failure has shot
+                * down mappings.
+                */
+               rc = _set_memory_uc(decoy_addr, 1);
+       }
        if (rc)
                pr_warn("Could not invalidate pfn=0x%lx from 1:1 map\n", pfn);
        return rc;

Reply via email to