RE: [RFC] Make the memory failure blast radius more precise

2020-06-24 Thread Luck, Tony
> Both the RFC patch and the above 5-step recovery plan look neat, step 4) > is nice to carry forward on icelake when a single instruction to clear > poison is available. Jane, Clearing poison has some challenges. On persistent memory it probably works (as the DIMM is going to remap that

Re: [PATCH v6 0/2] Renovate memcpy_mcsafe with copy_mc_to_{user, kernel}

2020-06-23 Thread Luck, Tony
now list as "fragile") and injecting uncorrected errors into buffers that are then copied using copy_mc_to_kernel() result in recovered machine checks with the return value correctly indicating the amount remaining. Reviewed-by: Tony Luck -Tony ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org

Re: [RFC] Make the memory failure blast radius more precise

2020-06-23 Thread Luck, Tony
On Tue, Jun 23, 2020 at 11:17:41PM +0100, Matthew Wilcox wrote: > It might also be nice to have an madvise() MADV_ZERO option so the > application doesn't have to look up the fd associated with that memory > range, but we haven't floated that idea with the customer yet; I just > thought of it now.

Re: [RFC] Make the memory failure blast radius more precise

2020-06-23 Thread Luck, Tony
On Tue, Jun 23, 2020 at 09:17:45PM +0100, Matthew Wilcox wrote: > > Hardware actually tells us the blast radius of the error, but we ignore > it and take out the entire page. We've had a customer request to know > exactly how much of the page is damaged so they can avoid reconstructing > an

RE: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

2020-05-04 Thread Luck, Tony
> When a copy function hits a bad page and the page is not yet known to > be bad, what does it do? (I.e. the page was believed to be fine but > the copy function gets #MC.) Does it unmap it right away? What does > it return? I suspect that we will only ever find a handful of situations where

RE: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

2020-05-01 Thread Luck, Tony
> Now maybe copy_to_user() should *always* work this way, but I’m not convinced. > Certainly put_user() shouldn’t — the result wouldn’t even be well defined. > And I’m > unconvinced that it makes much sense for the majority of copy_to_user() > callers > that are also directly accessing the

Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

2020-04-30 Thread Luck, Tony
On Thu, Apr 30, 2020 at 12:50:40PM -0700, Linus Torvalds wrote: I see your point about the namimg being important. I think Dan's case is indeed "copy from pmem to user" where only options for faulting are #MC on the source addresses, and #PF on the destination. > The only *fundamental* access

Re: [PATCH v2 0/2] Replace and improve "mcsafe" with copy_safe()

2020-04-30 Thread Luck, Tony
On Thu, Apr 30, 2020 at 11:42:20AM -0700, Andy Lutomirski wrote: > I suppose there could be a consistent naming like this: > > copy_from_user() > copy_to_user() > > copy_from_unchecked_kernel_address() [what probe_kernel_read() is] > copy_to_unchecked_kernel_address() [what probe_kernel_write()

RE: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast

2020-04-20 Thread Luck, Tony
> (a) is a trap, not an exception - so the instruction has been done, > and you don't need to try to emulate it or anything to continue. Maybe for errors on the data side of the pipeline. On the instruction side we can usually recover from user space instruction fetches by just throwing away the

RE: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast

2020-04-20 Thread Luck, Tony
> By "asynchronous" I don't mean "hours later". > > Make it be "interrupts are enabled, before serializing instruction". > > Yes, we want bounded error handling latency. But that doesn't mean > "synchronous" Another X86 vendor seems to be adding something like that. See MCOMMIT in

Re: [PATCH] x86/memcpy: Introduce memcpy_mcsafe_fast

2020-04-20 Thread Luck, Tony
On Mon, Apr 20, 2020 at 01:07:09PM -0700, Linus Torvalds wrote: > On Mon, Apr 20, 2020 at 12:29 PM Dan Williams > wrote: > > > > I didn't consider asynchronous to be > > better because that means there is a gap between when the data > > corruption is detected and when it might escape the system

[PATCH] acpi/nfit: Fix race accessing memdev in nfit_get_smbios_id()

2019-01-11 Thread Luck, Tony
Possible race accessing memdev structures after dropping the mutex. Dan Williams says this could race against another thread that is doing: # echo "ACPI0012:00" > /sys/bus/acpi/drivers/nfit/unbind Reported-by: Jane Chu Signed-off-by: Tony Luck --- This time with a commit message and credit

Re: 23222f8f8dce6: acpi, nfit: Add function to look up nvdimm device and provide SMBIOS handle

2019-01-11 Thread Luck, Tony
On Fri, Jan 11, 2019 at 02:07:00PM -0800, Dan Williams wrote: > No harm in fixing it up, I'd take that patch if you wrote it up. Something like this? diff --git a/drivers/acpi/nfit/core.c b/drivers/acpi/nfit/core.c index 011d3db19c80..22945bf803c8 100644 --- a/drivers/acpi/nfit/core.c +++

RE: 23222f8f8dce6: acpi, nfit: Add function to look up nvdimm device and provide SMBIOS handle

2019-01-11 Thread Luck, Tony
currently no race here. -Tony From: Jane Chu [mailto:jane@oracle.com] Sent: Friday, January 11, 2019 1:43 PM To: Luck, Tony Cc: linux-nvdimm Subject: 23222f8f8dce6: acpi, nfit: Add function to look up nvdimm device and provide SMBIOS handle Hi, Tony, I happen to be looking at the a

RE: [Ksummit-discuss] [RFC PATCH 3/3] libnvdimm, MAINTAINERS: Subsystem Profile

2018-11-15 Thread Luck, Tony
> I would recommend to remove this section at all. > New maintainers won't come out of blue, but will be come > from existing community and such individuals for sure will see > and judge by themselves to whom they trust and to whom not. Perhaps this is more of a hint to contributors than to

Re: [PATCH v3 08/12] x86/memory_failure: Introduce {set, clear}_mce_nospec()

2018-06-07 Thread Luck, Tony
On Wed, Jun 06, 2018 at 09:42:28PM -0700, Dan Williams wrote: > > Cc: Thomas Gleixner > > Cc: Ingo Molnar > > Cc: "H. Peter Anvin" > > Cc: Tony Luck > > Tony, safe to assume you are ok with this patch now that the > decoy_addr approach is back

Re: [PATCH v2 07/11] x86, memory_failure: Introduce {set, clear}_mce_nospec()

2018-06-04 Thread Luck, Tony
On Mon, Jun 04, 2018 at 10:39:48AM -0700, Dan Williams wrote: > On Mon, Jun 4, 2018 at 10:08 AM, Luck, Tony wrote: > > On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote: > >> +static inline int set_mce_nospec(unsigned long pfn) > >> +{ > >>

Re: [PATCH v2 07/11] x86, memory_failure: Introduce {set, clear}_mce_nospec()

2018-06-04 Thread Luck, Tony
On Sat, Jun 02, 2018 at 10:23:20PM -0700, Dan Williams wrote: > +static inline int set_mce_nospec(unsigned long pfn) > +{ > + int rc; > + > + rc = set_memory_uc((unsigned long) __va(PFN_PHYS(pfn)), 1); You should really do the decoy_addr thing here that I had in mce_unmap_kpfn(). Putting

RE: [RESEND 2/3] powerpc/memcpy: Add memcpy_mcsafe for pmem

2018-04-06 Thread Luck, Tony
> I thought the cache-aligned might make sense, since usually we'd expect the > failure to be at a cache-line level, but our copy_tofrom_user does accurate > accounting That's one of the wrinkles in the current x86 memcpy_mcsafe(). It starts by checking alignment of the source address, and moves

Re: [PATCH 4/5] firmware: dmi: Add function to look up a handle and return DIMM size

2018-03-12 Thread Luck, Tony
On Sat, Mar 10, 2018 at 02:22:17PM +0100, Jean Delvare wrote: > Note that it is possible to store MB values (up to 16 MB) using kB as > the unit. The specification allows for it, and a few systems use that > option. For example [1], the DMI data of a Supermicro X8STi board looks > like: > >

Re: [PATCH 4/5] firmware: dmi: Add function to look up a handle and return DIMM size

2018-03-09 Thread Luck, Tony
On Fri, Mar 09, 2018 at 11:20:53AM +0100, Jean Delvare wrote: > > + else if (size & 0x8000) > > + bytes = (u64)(size & 0x7fff) << 10; > > + else if (size != 0x7fff) > > + bytes = (u64)size << 20; > > + else > > + bytes = (u64)get_unaligned((u32 *)[0x1C]) << 20;

Re: [RFC PATCH 4/4] EDAC, skx_edac: Detect non-volatile DIMMs

2017-12-05 Thread Luck, Tony
On Tue, Dec 05, 2017 at 10:44:41PM +0100, Borislav Petkov wrote: > On Tue, Dec 05, 2017 at 12:03:37PM -0800, Luck, Tony wrote: > > I could. But what happens when someone ends up on a system with > > an edac driver configured without ACPI_NFIT that does have NVDIMMs? > > Sam

Re: [RFC PATCH 4/4] EDAC, skx_edac: Detect non-volatile DIMMs

2017-12-05 Thread Luck, Tony
On Tue, Dec 05, 2017 at 11:54:51AM +0100, Borislav Petkov wrote: > On Thu, Nov 30, 2017 at 12:40:42PM -0800, Tony Luck wrote: > > config EDAC_SKX > > tristate "Intel Skylake server Integrated MC" > > depends on PCI && X86_64 && X86_MCE_INTEL && PCI_MMCONFIG > > + select DMI > > +

Re: [RFC PATCH 3/4] edac: Add new memory type for non-volatile DIMMs

2017-12-04 Thread Luck, Tony
On Mon, Dec 04, 2017 at 11:37:11PM +0100, Borislav Petkov wrote: > On Thu, Nov 30, 2017 at 12:40:41PM -0800, Tony Luck wrote: > WTF, there are *two* string arrays with memory types?! How did that > happen? > > Can you please remove that above in a prepatch and switch to > edac_mem_types? Or

Re: [RFC PATCH 2/4] firmware: dmi: Add function to look up a handle and return DIMM size

2017-12-04 Thread Luck, Tony
On Mon, Dec 04, 2017 at 10:38:14PM +0100, Borislav Petkov wrote: > On Thu, Nov 30, 2017 at 12:40:40PM -0800, Tony Luck wrote: > > dmi_memdev[nr].bank = dmi_string(dm, d[0x11]); > > < newline here. > > > + size = get_unaligned((u16 *)[0xC]); > > + if (size == 0) > > + bytes

RE: [PATCH] mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages

2017-06-27 Thread Luck, Tony
> > > > +if (set_memory_np(decoy_addr, 1)) > > > > +pr_warn("Could not invalidate pfn=0x%lx from 1:1 map \n", > > Another concept to consider is mapping the page as UC rather than > completely unmapping it. UC would also avoid the speculative prefetch issue. The Vol 3, Section 11.3 SDM says:

Re: [PATCH] mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages

2017-06-23 Thread Luck, Tony
On Thu, Jun 22, 2017 at 10:07:18PM -0700, Dan Williams wrote: > On Wed, Jun 21, 2017 at 1:30 PM, Luck, Tony <tony.l...@intel.com> wrote: > >> Persistent memory does have unpoisoning and would require this inverse > >> operation - see drivers/nvdimm/pmem.c pme

RE: [PATCH] mm/hwpoison: Clear PRESENT bit for kernel 1:1 mappings of poison pages

2017-06-21 Thread Luck, Tony
> Persistent memory does have unpoisoning and would require this inverse > operation - see drivers/nvdimm/pmem.c pmem_clear_poison() and core.c > nvdimm_clear_poison(). Nice. Well this code will need to cooperate with that ... in particular if the page is in an area that can be unpoisoned ...

Re: [PATCH] acpi, nfit: fix the memory error check in nfit_handle_mce

2017-04-21 Thread Luck, Tony
On Fri, Apr 21, 2017 at 02:35:51PM -0600, Vishal Verma wrote: > On 04/21, Luck, Tony wrote: > > Needs extra parentheses to make it right. Vishal, sorry I led you astray. > > > > if (!((mce->status & 0xef80) == BIT(7))) > > Is this still right though? Anyt

Re: [PATCH] acpi, nfit: fix the memory error check in nfit_handle_mce

2017-04-21 Thread Luck, Tony
On Fri, Apr 21, 2017 at 01:19:16PM -0700, Dan Williams wrote: > On Fri, Apr 21, 2017 at 1:16 PM, Luck, Tony <tony.l...@intel.com> wrote: > >>> > + if (!(mce->status & 0xef80) == BIT(7)) > >>> > >>> Can we get a define for this, or a

Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'

2017-04-18 Thread Luck, Tony
On Thu, Apr 13, 2017 at 02:12:16PM +0200, Borislav Petkov wrote: > On Thu, Apr 13, 2017 at 01:31:59PM +0200, Borislav Petkov wrote: > > @@ -321,18 +321,8 @@ static void __print_mce(struct mce *m) > > > > static void print_mce(struct mce *m) > > { > > - int ret = 0; > > - > >

Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'

2017-04-12 Thread Luck, Tony
On Thu, Apr 13, 2017 at 12:16:39AM +0200, Borislav Petkov wrote: > ... and it is midnight here so I could be talking crap but we probably > should really split the reporting "chain" into two: This shouldn't be too painful. Users ask to be put on the chain via the wrapper: void

Re: [RFC PATCH] x86, mce: change the mce notifier to 'blocking' from 'atomic'

2017-04-12 Thread Luck, Tony
On Wed, Apr 12, 2017 at 01:27:05PM -0700, Verma, Vishal L wrote: > >   /* We only care about memory errors */ > >   if (!(mce->status & MCACOD)) > >   return NOTIFY_DONE; N.B. that isn't a valid test that this is a memory error. You need if (!(m->status & 0xef80) ==

RE: [PATCH] acpi, nfit: skip ARS on machine-check-recovery capable platforms

2017-02-09 Thread Luck, Tony
> Adding Tony so he can either confirm, or point and laugh at my > assumptions. In general you're right that there are machine check > events that are not recoverable, but I'm thinking of problems like bus > lockups and other disasters out of the direct cpu-to-memory data path. > The question is