On Thu, Sep 30, 2021 at 1:54 PM Borislav Petkov <[email protected]> wrote:
>
> On Thu, Sep 30, 2021 at 01:39:03PM -0700, Dan Williams wrote:
> > Yes, that's a good way to think about it. The only way to avoid poison
> > for page allocator pages is to just ditch the page. In the case of
> > PMEM the driver can do this fine grained dance because it gets precise
> > sub-page poison lists to consult and implements a non-mmap path to
> > access the page contents.
>
> Ok, good.
>
> Now, before we do anything further here, I'd like for this very much
> non-obvious situation to be documented in detail so that we know what's
> going on there and what that whole_page notion even means. Because this
> is at least bending the meaning of page states like poison and what that
> really means for the underlying thing - PMEM or general purpose DIMMs.

Hmm, memory type does not matter in this path.

>
> And then that test could be something like:
>
>         /*
>          * Normal DRAM gets poisoned as a whole page, yadda yadda...

No, the whole_page case is equally relevant for PMEM...

>          */
>         if (whole_page) {
>
>         /*
>          * Special handling for PMEM case, driver can handle accessing sub-page ranges
>          * even if the whole "page" is poisoned, blabla
>         } else {

...and UC is acceptable to DRAM.

>                 rc = _set_memory_uc(decoy_addr, 1);
>         ...
>
> so that it is crystal clear what's going on there.
>

The only distinction that matters here is whether the blast-radius of
the poison reported by the hardware is sub-page or whole-page. Memory
type does not matter.

I.e. it's fine if a DRAM page with a single cacheline error only gets
marked UC. Speculation is disabled and the page allocator will still
throw it away and never use it again. Similarly NP is fine for PMEM
when the machine-check-registers indicate that the entire page is full
of poison. The driver will record that and block any attempt to
recover any data in that page.
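
As a rough illustration of that rule, here is a standalone user-space
model in C. It is not the kernel code: the enum and function names are
hypothetical, made up for this sketch, and the only things taken from
the discussion are the whole_page flag and the NP/UC outcomes. The
memory type is passed in only to show that it is ignored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration only; not kernel identifiers. */
enum mem_type { MEM_DRAM, MEM_PMEM };
enum poison_mapping { MAP_NP, MAP_UC };

/*
 * Pick how the poisoned page's kernel mapping is treated.
 * Only the reported blast radius decides: whole-page poison is
 * unmapped (NP), sub-page poison is left mapped but uncacheable (UC).
 */
static enum poison_mapping pick_poison_mapping(enum mem_type type,
					       bool whole_page)
{
	(void)type;	/* deliberately unused: memory type does not matter */

	return whole_page ? MAP_NP : MAP_UC;
}

/* All four combinations: the result follows whole_page, never type. */
static void check_poison_mapping(void)
{
	assert(pick_poison_mapping(MEM_DRAM, false) == MAP_UC);
	assert(pick_poison_mapping(MEM_PMEM, false) == MAP_UC);
	assert(pick_poison_mapping(MEM_DRAM, true) == MAP_NP);
	assert(pick_poison_mapping(MEM_PMEM, true) == MAP_NP);
}
```

What happens to the page afterwards (the page allocator discarding a
DRAM page, the PMEM driver consulting its poison list) is outside this
sketch.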
