> Both the RFC patch and the above 5-step recovery plan look neat, step 4)
> is nice to carry forward on icelake when a single instruction to clear
> poison is available.
Jane,
Clearing poison has some challenges.
On persistent memory it probably works (as the DIMM is going to remap that
now list as "fragile") and injecting uncorrected
errors into buffers that are then copied using copy_mc_to_kernel()
result in recovered machine checks with the return value correctly
indicating the amount remaining.
Reviewed-by: Tony Luck
-Tony
___
Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org
To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
On Tue, Jun 23, 2020 at 11:17:41PM +0100, Matthew Wilcox wrote:
> It might also be nice to have an madvise() MADV_ZERO option so the
> application doesn't have to look up the fd associated with that memory
> range, but we haven't floated that idea with the customer yet; I just
> thought of it now.
On Tue, Jun 23, 2020 at 09:17:45PM +0100, Matthew Wilcox wrote:
>
> Hardware actually tells us the blast radius of the error, but we ignore
> it and take out the entire page. We've had a customer request to know
> exactly how much of the page is damaged so they can avoid reconstructing
> an
> When a copy function hits a bad page and the page is not yet known to
> be bad, what does it do? (I.e. the page was believed to be fine but
> the copy function gets #MC.) Does it unmap it right away? What does
> it return?
I suspect that we will only ever find a handful of situations where
> Now maybe copy_to_user() should *always* work this way, but I’m not convinced.
> Certainly put_user() shouldn’t — the result wouldn’t even be well defined.
> And I’m
> unconvinced that it makes much sense for the majority of copy_to_user()
> callers
> that are also directly accessing the
On Thu, Apr 30, 2020 at 12:50:40PM -0700, Linus Torvalds wrote:
I see your point about the namimg being important. I think Dan's
case is indeed "copy from pmem to user" where only options for faulting
are #MC on the source addresses, and #PF on the destination.
> The only *fundamental* access
On Thu, Apr 30, 2020 at 11:42:20AM -0700, Andy Lutomirski wrote:
> I suppose there could be a consistent naming like this:
>
> copy_from_user()
> copy_to_user()
>
> copy_from_unchecked_kernel_address() [what probe_kernel_read() is]
> copy_to_unchecked_kernel_address() [what probe_kernel_write()
> (a) is a trap, not an exception - so the instruction has been done,
> and you don't need to try to emulate it or anything to continue.
Maybe for errors on the data side of the pipeline. On the instruction
side we can usually recover from user space instruction fetches by
just throwing away the
> By "asynchronous" I don't mean "hours later".
>
> Make it be "interrupts are enabled, before serializing instruction".
>
> Yes, we want bounded error handling latency. But that doesn't mean
> "synchronous"
Another X86 vendor seems to be adding something like that. See MCOMMIT
in
On Mon, Apr 20, 2020 at 01:07:09PM -0700, Linus Torvalds wrote:
> On Mon, Apr 20, 2020 at 12:29 PM Dan Williams
> wrote:
> >
> > I didn't consider asynchronous to be
> > better because that means there is a gap between when the data
> > corruption is detected and when it might escape the system
Possible race accessing memdev structures after dropping the
mutex. Dan Williams says this could race against another thread
that is doing:
# echo "ACPI0012:00" > /sys/bus/acpi/drivers/nfit/unbind
Reported-by: Jane Chu
Signed-off-by: Tony Luck
---
This time with a commit message and credit
On Fri, Jan 11, 2019 at 02:07:00PM -0800, Dan Williams wrote:
> No harm in fixing it up, I'd take that patch if you wrote it up.
Something like this?
diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c
index 011d3db19c80..22945bf803c8 100644
--- a/drivers/acpi/nfit/core.c
+++
currently
no race here.
-Tony
From: Jane Chu [mailto:jane@oracle.com]
Sent: Friday, January 11, 2019 1:43 PM
To: Luck, Tony
Cc: linux-nvdimm
Subject: 23222f8f8dce6: acpi, nfit: Add function to look up nvdimm device and
provide SMBIOS handle
Hi, Tony,
I happen to be looking at the a
> I would recommend to remove this section at all.
> New maintainers won't come out of blue, but will be come
> from existing community and such individuals for sure will see
> and judge by themselves to whom they trust and to whom not.
Perhaps this is more of a hint to contributors than to
On Wed, Jun 06, 2018 at 09:42:28PM -0700, Dan Williams wrote:
> > Cc: Thomas Gleixner
> > Cc: Ingo Molnar
> > Cc: "H. Peter Anvin"
> > Cc: Tony Luck
>
> Tony, safe to assume you are ok with this patch now that the
> decoy_addr approach is back
On Mon, Jun 04, 2018 at 10:39:48AM -0700, Dan Williams wrote:
> On Mon, Jun 4, 2018 at 10:08 AM, Luck, Tony wrote:
> > On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote:
> >> +static inline int set_mce_nospec(unsigned long pfn)
> >> +{
> >>
On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote:
> +static inline int set_mce_nospec(unsigned long pfn)
> +{
> + int rc;
> +
> + rc = set_memory_uc((unsigned long) __va(PFN_PHYS(pfn)), 1);
You should really do the decoy_addr thing here that I had in mce_unmap_kpfn().
Putting
> I thought the cache-aligned might make sense, since usually we'd expect the
> failure to be at a cache-line level, but our copy_tofrom_user does accurate
> accounting
That's one of the wrinkles in the current x86 memcpy_mcsafe(). It starts by
checking alignment of the source address, and moves
On Sat, Mar 10, 2018 at 02:22:17PM +0100, Jean Delvare wrote:
> Note that it is possible to store MB values (up to 16 MB) using kB as
> the unit. The specification allows for it, and a few systems use that
> option. For example [1], the DMI data of a Supermicro X8STi board looks
> like:
>
>
On Fri, Mar 09, 2018 at 11:20:53AM +0100, Jean Delvare wrote:
> > + else if (size & 0x8000)
> > + bytes = (u64)(size & 0x7fff) << 10;
> > + else if (size != 0x7fff)
> > + bytes = (u64)size << 20;
> > + else
> > + bytes = (u64)get_unaligned((u32 *)[0x1C]) << 20;
On Tue, Dec 05, 2017 at 10:44:41PM +0100, Borislav Petkov wrote:
> On Tue, Dec 05, 2017 at 12:03:37PM -0800, Luck, Tony wrote:
> > I could. But what happens when someone ends up on a system with
> > an edac driver configured without ACPI_NFIT that does have NVDIMMs?
>
> Sam
On Tue, Dec 05, 2017 at 11:54:51AM +0100, Borislav Petkov wrote:
> On Thu, Nov 30, 2017 at 12:40:42PM -0800, Tony Luck wrote:
> > config EDAC_SKX
> > tristate "Intel Skylake server Integrated MC"
> > depends on PCI && X86_64 && X86_MCE_INTEL && PCI_MMCONFIG
> > + select DMI
> > +
On Mon, Dec 04, 2017 at 11:37:11PM +0100, Borislav Petkov wrote:
> On Thu, Nov 30, 2017 at 12:40:41PM -0800, Tony Luck wrote:
> WTF, there are *two* string arrays with memory types?! How did that
> happen?
>
> Can you please remove that above in a prepatch and switch to
> edac_mem_types? Or
On Mon, Dec 04, 2017 at 10:38:14PM +0100, Borislav Petkov wrote:
> On Thu, Nov 30, 2017 at 12:40:40PM -0800, Tony Luck wrote:
> > dmi_memdev[nr].bank = dmi_string(dm, d[0x11]);
>
> < newline here.
>
> > + size = get_unaligned((u16 *)[0xC]);
> > + if (size == 0)
> > + bytes
> > > > +if (set_memory_np(decoy_addr, 1))
> > > > +pr_warn("Could not invalidate pfn=0x%lx from 1:1 map \n",
>
> Another concept to consider is mapping the page as UC rather than
> completely unmapping it.
UC would also avoid the speculative prefetch issue. The Vol 3, Section 11.3
SDM says:
On Thu, Jun 22, 2017 at 10:07:18PM -0700, Dan Williams wrote:
> On Wed, Jun 21, 2017 at 1:30 PM, Luck, Tony <tony.l...@intel.com> wrote:
> >> Persistent memory does have unpoisoning and would require this inverse
> >> operation - see drivers/nvdimm/pmem.c pme
> Persistent memory does have unpoisoning and would require this inverse
> operation - see drivers/nvdimm/pmem.c pmem_clear_poison() and core.c
> nvdimm_clear_poison().
Nice. Well this code will need to cooperate with that ... in particular if the
page
is in an area that can be unpoisoned ...
On Fri, Apr 21, 2017 at 02:35:51PM -0600, Vishal Verma wrote:
> On 04/21, Luck, Tony wrote:
> > Needs extra parentheses to make it right. Vishal, sorry I led you astray.
> >
> > if (!((mce->status & 0xef80) == BIT(7)))
>
> Is this still right though? Anyt
On Fri, Apr 21, 2017 at 01:19:16PM -0700, Dan Williams wrote:
> On Fri, Apr 21, 2017 at 1:16 PM, Luck, Tony <tony.l...@intel.com> wrote:
> >>> > + if (!(mce->status & 0xef80) == BIT(7))
> >>>
> >>> Can we get a define for this, or a
On Thu, Apr 13, 2017 at 02:12:16PM +0200, Borislav Petkov wrote:
> On Thu, Apr 13, 2017 at 01:31:59PM +0200, Borislav Petkov wrote:
> > @@ -321,18 +321,8 @@ static void __print_mce(struct mce *m)
> >
> > static void print_mce(struct mce *m)
> > {
> > - int ret = 0;
> > -
> >
On Thu, Apr 13, 2017 at 12:16:39AM +0200, Borislav Petkov wrote:
> ... and it is midnight here so I could be talking crap but we probably
> should really split the reporting "chain" into two:
This shouldn't be too painful. Users ask to be put on the chain via
the wrapper:
void
On Wed, Apr 12, 2017 at 01:27:05PM -0700, Verma, Vishal L wrote:
> > /* We only care about memory errors */
> > if (!(mce->status & MCACOD))
> > return NOTIFY_DONE;
N.B. that isn't a valid test that this is a memory error. You need
if (!(m->status & 0xef80) ==
> Adding Tony so he can either confirm, or point and laugh at my
> assumptions. In general you're right that there are machine check
> events that are not recoverable, but I'm thinking of problems like bus
> lockups and other disasters out of the direct cpu-to-memory data path.
> The question is
34 matches
Mail list logo