Re: [PATCH v2 8/9] KVM: x86: add force_intercept_exceptions_mask

2021-04-27 Thread Borislav Petkov
s is based on an idea first shown here: > https://patchwork.kernel.org/project/kvm/patch/20160301192822.gd22...@pd.tnic/ > > CC: Borislav Petkov > Signed-off-by: Maxim Levitsky > --- > arch/x86/kvm/x86.c | 3 +++ > arch/x86/kvm/x86.h | 2 ++ > 2 files changed, 5 insertions(+

Re: [PATCH v8 18/26] ACPI / APEI: Make GHES estatus header validation more user friendly

2019-02-01 Thread Borislav Petkov
) message about the nature of the header format error. > > Suggested-by: Borislav Petkov > Signed-off-by: James Morse > --- > drivers/acpi/apei/ghes.c | 46 > 1 file changed, 32 insertions(+), 14 deletions(-) > > diff --git a/drive

Re: [PATCH v8 09/26] ACPI / APEI: Generalise the estatus queue's notify code

2019-02-01 Thread Borislav Petkov
On Tue, Jan 29, 2019 at 06:48:45PM +, James Morse wrote: > +static int ghes_in_nmi_spool_from_list(struct list_head *rcu_list) > +{ > + int err, ret = -ENOENT; > + struct ghes *ghes; > + > + rcu_read_lock(); > + list_for_each_entry_rcu(ghes, rcu_list, list) { > +

Re: [PATCH v8 04/26] ACPI / APEI: Make hest.c manage the estatus memory pool

2019-02-01 Thread Borislav Petkov
s.c | 33 ++--- > drivers/acpi/apei/hest.c | 10 +- > include/acpi/ghes.h | 2 ++ > 3 files changed, 17 insertions(+), 28 deletions(-) Reviewed-by: Borislav Petkov -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-postin

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-31 Thread Borislav Petkov
On Wed, Jan 23, 2019 at 06:40:08PM +, James Morse wrote: > My SMM comment was because the CPU must jump from user-space->SMM, which > injects > an NMI into the kernel. The kernel's EIP must point into user-space, so > returning from the NMI without doing the memory_failure() work puts us back

Re: [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications

2019-01-31 Thread Borislav Petkov
On Wed, Jan 23, 2019 at 06:33:02PM +, James Morse wrote: > Was the best I had, but this trips the BUILD_BUG() too early. > With it, x86 BUILD_BUG()s. With just the -1 the path gets pruned out, and > there > are no 'sdei' symbols in the object file. > > ...at this point, I stopped caring!

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-31 Thread Borislav Petkov
On Tue, Jan 29, 2019 at 06:48:33PM +, James Morse wrote: > If firmware has never generated CPER records, so it has never written to void > *error_status_address, yes. I guess this is the bit of information I was missing. > There seem to be two ways of doing this. This zero check implies an

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-29 Thread Borislav Petkov
On Wed, Jan 23, 2019 at 06:36:38PM +, James Morse wrote: > Do you consider ENOENT an error? We don't ack in that case as the > memory wasn't in use. So let's see: if (!*buf_paddr) return -ENOENT; can happen when apei_read() has returned 0 but it has managed to do

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-22 Thread Borislav Petkov
On Mon, Dec 10, 2018 at 07:15:13PM +, James Morse wrote: > What happens if we miss MF_ACTION_REQUIRED? AFAICU, the logic is to force-send a signal to the user process, i.e., force_sig_info() which cannot be ignored. IOW, an "enlightened" process would know how to do recovery action from a

Re: [PATCH v7 22/25] ACPI / APEI: Kick the memory_failure() queue for synchronous errors

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:06:10PM +, James Morse wrote: > memory_failure() offlines or repairs pages of memory that have been > discovered to be corrupt. These may be detected by an external > component, (e.g. the memory controller), and notified via an IRQ. > In this case the work is queued

Re: [PATCH v7 20/25] ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:06:08PM +, James Morse wrote: > Now that ghes notification helpers provide the fixmap slots and > take the lock themselves, multiple NMI-like notifications can > be used on arm64. > > These should be named after their notification method as they can't > all be

Re: [PATCH v7 19/25] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()

2019-01-21 Thread Borislav Petkov
the new > ghes_peek_estatus() to know how much memory to allocate from > the ghes_estatus_pool before reading the records. > > Reported-by: Borislav Petkov > Signed-off-by: James Morse > > Change since v6: > * Added a comment explaining the 'ack-error, then goto

Re: [PATCH v7 18/25] ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:06:06PM +, James Morse wrote: > ghes_read_estatus() reads the record address, then the record's > header, then performs some sanity checks before reading the > records into the provided estatus buffer. > > To provide this estatus buffer the caller must know the size

Re: [PATCH v7 17/25] ACPI / APEI: Pass ghes and estatus separately to avoid a later copy

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:06:05PM +, James Morse wrote: > The NMI-like notifications scribble over ghes->estatus, before > copying it somewhere else. If this interrupts the ghes_probe() code > calling ghes_proc() on each struct ghes, the data is corrupted. > > All the NMI-like notifications

Re: [PATCH v7 11/25] ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI

2019-01-21 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:05:59PM +, James Morse wrote: > The estatus-queue code is currently hidden by the NOTIFY_NMI #ifdefs. > Once NOTIFY_SEA starts using the estatus-queue we can stop hiding > it as each architecture has a user that can't be turned off. > > Split the existing

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-11 Thread Borislav Petkov
On Fri, Jan 11, 2019 at 06:09:28PM +, James Morse wrote: > We can return on ENOENT out earlier, as nothing needs doing in that case. Its > what the GHES_TO_CLEAR spaghetti is for, we can probably move the ack thing > into > ghes_clear_estatus(), that way that thing means 'I'm done with this

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-11 Thread Borislav Petkov
On Fri, Jan 11, 2019 at 06:25:21PM +, James Morse wrote: > We ack it in the corrupt-record case too, because we are done with the > memory. Ok, so the only thing that we need to do unconditionally is ACK in order to free the memory. Or is there an exception to that set of steps in error

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-11 Thread Borislav Petkov
On Fri, Jan 11, 2019 at 10:32:23AM -0500, Tyler Baicar wrote: > The kernel would have no way of knowing what to do here. What do you mean, there's no way of knowing what to do? It needs to clear registers so that the next error can get reported properly. Or of the status read failed and it

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2019-01-11 Thread Borislav Petkov
On Thu, Jan 10, 2019 at 04:01:27PM -0500, Tyler Baicar wrote: > On Thu, Jan 10, 2019 at 1:23 PM James Morse wrote: > > >> > > >> +if (is_hest_type_generic_v2(ghes) && > > >> ghes_ack_error(ghes->generic_v2)) > > > > > > Since ghes_ack_error() is always prepended with this check, you could >

Re: [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code

2019-01-11 Thread Borislav Petkov
On Thu, Jan 10, 2019 at 06:21:21PM +, James Morse wrote: > Something like: > ghes_notify_nmi() -> in_nmi_spool_from_list(list) -> > in_nmi_queue_one_entry(ghes). Yah, but make that ghes_notify_nmi() -> ghes_nmi_spool_from_list(list) -> ghes_nmi_queue_one_entry(ghes). to denote it is the

(ghes|hest)_disable

2019-01-11 Thread Borislav Petkov
Ok, lemme split this out into a separate thread and add Tony. On Thu, Jan 10, 2019 at 06:20:35PM +, James Morse wrote: > > Grrr, what an effing mess that code is! There's hest_disable *and* > > ghes_disable. Do we really need them both? > > ghes_disable lets you ignore the firmware-first

Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool

2018-12-19 Thread Borislav Petkov
On Fri, Dec 14, 2018 at 01:56:16PM +, James Morse wrote: > /me digs a bit, > > ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register(). > Its caller is behind a 'if (!ghes_disable)' in acpi_hest_init(), and is after > another 2 calls to apei_hest_parse(). > > If ghes_disable

Re: [PATCH v7 10/25] ACPI / APEI: Tell firmware the estatus queue consumed the records

2018-12-11 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:05:58PM +, James Morse wrote: > ACPI has a GHESv2 which is used on hardware reduced platforms to > explicitly acknowledge that the memory for CPER records has been > consumed. This lets an external agent know it can re-use this > memory for something else. > >

Re: [PATCH v7 09/25] ACPI / APEI: Generalise the estatus queue's notify code

2018-12-11 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:05:57PM +, James Morse wrote: > Refactor the estatus queue's pool notification routine from > NOTIFY_NMI's handlers. This will allow another notification > method to use the estatus queue without duplicating this code. > > This patch adds

Re: [PATCH v7 07/25] ACPI / APEI: Remove spurious GHES_TO_CLEAR check

2018-12-11 Thread Borislav Petkov
- continue; > - > __process_error(ghes); > ghes_clear_estatus(ghes, buf_paddr); > } > -- Reviewed-by: Borislav Petkov -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-posting and trim the reply. __

Re: [PATCH v7 06/25] ACPI / APEI: Don't store CPER records physical address in struct ghes

2018-12-11 Thread Borislav Petkov
s->flags &= ~GHES_TO_CLEAR; < newline here. > + if (buf_paddr) { Also, you can save yourself an indendation level: if (!buf_paddr) return; ghes_copy... > + ghes_copy_tofrom_phys(ghes->estatus, buf_paddr, > +

Re: [PATCH v7 05/25] ACPI / APEI: Make estatus pool allocation a static size

2018-12-11 Thread Borislav Petkov
and shrinks the pool. > > Suggested-by: Borislav Petkov > Signed-off-by: James Morse > --- > drivers/acpi/apei/ghes.c | 49 +--- > drivers/acpi/apei/hest.c | 2 +- > include/acpi/ghes.h | 2 +- > 3 files changed, 8 insertions(+), 45

Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool

2018-12-11 Thread Borislav Petkov
On Mon, Dec 03, 2018 at 06:05:52PM +, James Morse wrote: > ghes.c has a memory pool it uses for the estatus cache and the estatus > queue. The cache is initialised when registering the platform driver. > For the queue, an NMI-like notification has to grow/shrink the pool > as it is registered

Re: [PATCH v7 03/25] ACPI / APEI: Switch estatus pool to use vmalloc memory

2018-12-04 Thread Borislav Petkov
if (ret) > - return ret; > - } > > - return 0; > + addr = (unsigned long)vmalloc(PAGE_ALIGN(len)); > + if (!addr) > + return -ENOMEM; ... and if we return here due to the ENOMEM, that increment above remains. I see you'

Re: [PATCH v7 02/25] ACPI / APEI: Remove silent flag from ghes_read_estatus()

2018-12-04 Thread Borislav Petkov
hanges since v6: > * Moved earlier in the series, > * Tinkered with the commit message. > * switched to pr_warn_ratelimited() to shut checkpatch up > > shutup checkpatch > --- > drivers/acpi/apei/ghes.c | 15 +++ > 1 file changed, 7 insertions(+), 8 deletio

Re: [PATCH v6 17/18] mm/memory-failure: increase queued recovery work's priority

2018-10-15 Thread Borislav Petkov
+ Peter. On Fri, Sep 21, 2018 at 11:17:04PM +0100, James Morse wrote: > arm64 can take an NMI-like error notification when user-space steps in > some corrupt memory. APEI's GHES code will call memory_failure_queue() > to schedule the recovery work. We then return to user-space, possibly > taking

Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol

2018-10-12 Thread Borislav Petkov
On Fri, Oct 12, 2018 at 06:17:48PM +0100, James Morse wrote: > Ripping out the existing #ifdefs and replacing them with IS_ENABLED() would > let > the compiler work out the estatus stuff is unused, and saves us describing the > what-uses-it logic in Kconfig. > > But this does expose the x86 nmi

Re: [PATCH v6 15/18] ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()

2018-10-12 Thread Borislav Petkov
he > ghes_peek_estatus() so we know how much memory to allocate from > the ghes_estatus_pool before we read the records. > > Reported-by: Borislav Petkov > Signed-off-by: James Morse > --- > drivers/acpi/apei/ghes.c | 45 &

Re: [PATCH v6 14/18] ACPI / APEI: Split ghes_read_estatus() to read CPER length

2018-10-12 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:17:01PM +0100, James Morse wrote: > ghes_read_estatus() reads the record address, then the record's > header, then performs some sanity checks before reading the > records into the provided estatus buffer. > > We either need to know the size of the records before we

Re: [PATCH v6 13/18] ACPI / APEI: Don't update struct ghes' flags in read/clear estatus

2018-10-12 Thread Borislav Petkov
the struct ghes. > > Signed-off-by: James Morse > --- > drivers/acpi/apei/ghes.c | 26 -- > include/acpi/ghes.h | 1 - > 2 files changed, 12 insertions(+), 15 deletions(-) Nice. Reviewed-by: Borislav Petkov -- Regards/Gruss, Boris. Go

Re: [PATCH v6 11/18] ACPI / APEI: Remove silent flag from ghes_read_estatus()

2018-10-12 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:58PM +0100, James Morse wrote: > Subsequent patches will split up ghes_read_estatus(), at which > point passing around the 'silent' flag gets annoying. This is to > suppress prink() messages, which prior to 42a0bb3f7138 ("printk/nmi: > generic solution for safe printk

Re: [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus

2018-10-12 Thread Borislav Petkov
Nitpick: Subject: Re: [PATCH v6 10/18] ACPI / APEI: preparatory split of ghes->estatus Pls have an active formulation in your Subject and start it with a capital letter, i.e., something like: "Split ghes->estatus in preparation for... " On Fri, Sep 21, 2018 at 11:16:57PM +0100, James

Re: [PATCH v6 08/18] ACPI / APEI: Move locking to the notification helper

2018-10-12 Thread Borislav Petkov
proc(), and are called in process > or IRQ context. Move the spin_lock_irqsave() around their ghes_proc() > calls. Right, should ghes_proc() be renamed to ghes_proc_irq() now, to be absolutely clear on the processing context it is operating in? Other than that: Reviewed-by: Borislav Petko

Re: [PATCH v6 07/18] arm64: KVM/mm: Move SEA handling behind a single 'claim' interface

2018-10-12 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:54PM +0100, James Morse wrote: > To split up APEIs in_nmi() path, we need the nmi-like callers to always > be in_nmi(). Add a helper to do the work and claim the notification. > > When KVM or the arch code takes an exception that might be a RAS > notification, it

Re: [PATCH v6 06/18] KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing

2018-10-12 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:53PM +0100, James Morse wrote: > To split up APEIs in_nmi() path, we need any nmi-like callers to always > be in_nmi(). KVM shouldn't have to know about this, pull the RAS plumbing > out into a header file. > > Currently guest synchronous external aborts are claimed

Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol

2018-10-04 Thread Borislav Petkov
On Wed, Oct 03, 2018 at 06:50:36PM +0100, James Morse wrote: > I'm all in favour of letting the compiler work it out, but the existing ghes > code has #ifdef/#else all over the place. This is 'keeping the style'. Yeah, but this "style" is not the optimal one and we should simplify/clean up and

Re: [PATCH v6 00/18] APEI in_nmi() rework

2018-10-04 Thread Borislav Petkov
On Wed, Oct 03, 2018 at 06:50:38PM +0100, James Morse wrote: ... > The non-ghes HEST entries have a "number of records to pre-allocate" too, we > could make this memory pool something hest.c looks after, but I can't see if > the > other error sources use those values. Thanks for the detailed

Re: [PATCH v6 05/18] ACPI / APEI: Make estatus queue a Kconfig symbol

2018-10-01 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:52PM +0100, James Morse wrote: > Now that there are two users of the estatus queue, and likely to be more, > make it a Kconfig symbol selected by the appropriate notification. We > can move the ARCH_HAVE_NMI_SAFE_CMPXCHG checks in here too. Ok, question: why do we

Re: [PATCH v6 04/18] ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue

2018-09-28 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:51PM +0100, James Morse wrote: > Now that the estatus queue can be used by more than one notification > method, we can move notifications that have NMI-like behaviour over to > it, and start abstracting GHES's single in_nmi() path. > > Switch NOTIFY_SEA over to use

Re: [PATCH v6 00/18] APEI in_nmi() rework

2018-09-25 Thread Borislav Petkov
On Fri, Sep 21, 2018 at 11:16:47PM +0100, James Morse wrote: > Hello, > > The GHES driver has collected quite a few bugs: > > ghes_proc() at ghes_probe() time can be interrupted by an NMI that > will clobber the ghes->estatus fields, flags, and the buffer_paddr. > > ghes_copy_tofrom_phys() uses

Re: [PATCH v3 07/12] ACPI / APEI: Make the nmi_fixmap_idx per-ghes to allow multiple in_nmi() users

2018-05-17 Thread Borislav Petkov
On Wed, May 16, 2018 at 11:38:16AM -0400, Tyler Baicar wrote: > I haven't seen a deadlock from that, but it looks possible. What if > the ghes_proc() call in ghes_probe() is moved before the second switch > statement? That way it is before the NMI/IRQ/poll is setup. At quick > glance I think that

Re: [PATCH v3 07/12] ACPI / APEI: Make the nmi_fixmap_idx per-ghes to allow multiple in_nmi() users

2018-05-17 Thread Borislav Petkov
On Wed, May 16, 2018 at 03:51:14PM +0100, James Morse wrote: > The first two overload existing architectural behavior, the third improves all > this with a third standard option. Its the standard story! :-) > I thought this was safe because its just ghes_copy_tofrom_phys()s access to > the >

Re: [PATCH v3 07/12] ACPI / APEI: Make the nmi_fixmap_idx per-ghes to allow multiple in_nmi() users

2018-05-16 Thread Borislav Petkov
On Tue, May 08, 2018 at 09:45:01AM +0100, James Morse wrote: > NOTIFY_NMI is x86's NMI, arm doesn't have anything that behaves in the same > way, > so doesn't use it. The equivalent notifications with NMI-like behaviour are: > * SEA (synchronous external abort) > * SEI (SError Interrupt) > * SDEI

Re: [PATCH v3 07/12] ACPI / APEI: Make the nmi_fixmap_idx per-ghes to allow multiple in_nmi() users

2018-05-05 Thread Borislav Petkov
On Fri, Apr 27, 2018 at 04:35:05PM +0100, James Morse wrote: > Arm64 has multiple NMI-like notifications, but ghes.c only has one > in_nmi() path, risking deadlock if one NMI-like notification can > interrupt another. > > To support this we need a fixmap entry and lock for each notification >

Re: [PATCH v3 02/12] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-05-05 Thread Borislav Petkov
On Fri, Apr 27, 2018 at 04:35:00PM +0100, James Morse wrote: > To support asynchronous NMI-like notifications on arm64 we need to use > the estatus-queue. These patches refactor it to allow multiple APEI > notification types to use it. > > Refactor the estatus queue's pool grow/shrink code and

Re: [PATCH v3 01/12] ACPI / APEI: Move the estatus queue code up, and under its own ifdef

2018-05-05 Thread Borislav Petkov
memory) that can't be handled immediatly. > --- > drivers/acpi/apei/ghes.c | 265 > ------- > 1 file changed, 137 insertions(+), 128 deletions(-) Reviewed-by: Borislav Petkov <b...@suse.de> -- Regards/Gruss, Boris. Good mailing practices for 400: avoid top-pos

Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-04-17 Thread Borislav Petkov
>= GHES_SEV_PANIC) { > -#ifdef CONFIG_X86 > - oops_begin(); > -#endif > ghes_print_queued_estatus(); > __ghes_panic(ghes); > } > -%<- Acked-by: Borislav Petkov <b...@suse.de> -- Regards/Gru

Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-03-27 Thread Borislav Petkov
Hi James, On Mon, Mar 19, 2018 at 02:29:13PM +, James Morse wrote: > I don't think the die_lock really helps here, do we really want to wait for a > remote CPU to finish printing an OOPs about user-space's bad memory accesses, > before we bring the machine down due to this system-wide fatal

Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-03-08 Thread Borislav Petkov
On Wed, Mar 07, 2018 at 06:15:02PM +, James Morse wrote: > Today its just x86 and arm64. arm64 doesn't have a hook to do this. I'm happy > to > add an empty declaration or leave it under an ifdef until someone complains > about any behaviour I missed! So I did some more staring at the code

Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-03-01 Thread Borislav Petkov
On Thu, Mar 01, 2018 at 06:06:59PM +, Punit Agrawal wrote: > You're looking at support for the 32-bit ARM systems. I know. That's why I'm asking. > The 64-bit support lives in arch/arm64 and the die() there doesn't > contain an oops_begin()/oops_end(). But the lack of oops_begin() on > arm64

Re: [PATCH 02/11] ACPI / APEI: Generalise the estatus queue's add/remove and notify code

2018-03-01 Thread Borislav Petkov
On Thu, Feb 15, 2018 at 06:55:57PM +, James Morse wrote: > Keep the oops_begin() call for x86, That oops_begin() in generic code is such a layering violation, grrr. > arm64 doesn't have one of these, > and APEI is the only thing outside arch code calling this.. So looking at:

Re: [PATCH 01/11] ACPI / APEI: Move the estatus queue code up, and under its own ifdef

2018-02-23 Thread Borislav Petkov
On Fri, Feb 23, 2018 at 06:02:21PM +, James Morse wrote: > Sure. I reckon your English grammar is better than mine, is this better?: Bah, you must be joking :-) > | In any NMI-like handler, memory from ghes_estatus_pool is used to save > | estatus, and added to the ghes_estatus_llist.

Re: [PATCH 01/11] ACPI / APEI: Move the estatus queue code up, and under its own ifdef

2018-02-20 Thread Borislav Petkov
On Thu, Feb 15, 2018 at 06:55:56PM +, James Morse wrote: > +#ifdef CONFIG_HAVE_ACPI_APEI_NMI > +/* > + * While printk() now has an in_nmi() path, the handling for CPER records > + * does not. For example, memory_failure_queue() takes spinlocks and calls > + * schedule_work_on(). > + * > + * So

Re: [PATCH 00/11] APEI in_nmi() rework and arm64 SDEI wire-up

2018-02-19 Thread Borislav Petkov
On Thu, Feb 15, 2018 at 06:55:55PM +, James Morse wrote: > Hello! > > The aim of this series is to wire arm64's SDEI into APEI. > > What's SDEI? Its ARM's "Software Delegated Exception Interface" [0]. It's > used by firmware to tell the OS about firmware-first RAS events. > > These Software

Re: [PATCH V17 00/11] Add UEFI 2.6 and ACPI 6.1 updates for RAS on ARM64

2017-05-23 Thread Borislav Petkov
the error. ... > V17:Rebase on tip > Change trace event helper function names > Remove unneeded prefixes from commit text The whole series: Reviewed-by: Borislav Petkov <b...@suse.de> Thanks! -- Regards/Gruss, Boris. ECO tip #101: Trim y

Re: [PATCH V16 08/11] efi: print unrecognized CPER section

2017-05-16 Thread Borislav Petkov
On Tue, May 16, 2017 at 10:44:43AM -0600, Baicar, Tyler wrote: > I meant to respond to this comment after I sent the v16 patch series, but > you beat me to it :) > > These prefixes are common to all the GHES/CPER printing to the kernel logs. I don't mean that - I meant to remove them from this

Re: [PATCH V16 10/11] trace, ras: add ARM processor error trace event

2017-05-16 Thread Borislav Petkov
On Mon, May 15, 2017 at 03:27:59PM -0600, Tyler Baicar wrote: > Currently there are trace events for the various RAS > errors with the exception of ARM processor type errors. > Add a new trace event for such errors so that the user > will know when they occur. These trace events are > consistent

Re: [PATCH V16 09/11] ras: acpi / apei: generate trace event for unrecognized CPER section

2017-05-16 Thread Borislav Petkov
On Mon, May 15, 2017 at 03:27:58PM -0600, Tyler Baicar wrote: > The UEFI spec includes non-standard section type support in the > Common Platform Error Record. This is defined in section N.2.3 of > UEFI version 2.5. > > Currently if the CPER section's type (UUID) does not match any > section type

Re: [PATCH V16 08/11] efi: print unrecognized CPER section

2017-05-16 Thread Borislav Petkov
On Mon, May 15, 2017 at 03:27:57PM -0600, Tyler Baicar wrote: > UEFI spec allows for non-standard section in Common Platform Error > Record. This is defined in section N.2.3 of UEFI version 2.5. > > Currently if the CPER section's type (UUID) does not match with > one of the section types that

Re: [PATCH V15 11/11] arm/arm64: KVM: add guest SEA support

2017-05-08 Thread Borislav Petkov
On Mon, May 08, 2017 at 01:54:44PM -0600, Baicar, Tyler wrote: > This was discussed in the v12 and v13 patch series. There is existing > code in kvm_handle_guest_abort for injecting an abort back into the > guest. We only want to do that if it was an abort that was not handled > by the firmware

Re: [PATCH V15 06/11] acpi: apei: handle SEA notification type for ARMv8

2017-04-25 Thread Borislav Petkov
On Tue, Apr 25, 2017 at 11:41:39AM -0600, Baicar, Tyler wrote: > I originally had this as a notifier, but Will requested to remove the > notifier. That conversation is here: https://lkml.org/lkml/2017/1/18/1018 Yeah, he mentioned on IRC. I just think notifiers would be the cleaner thing but

Re: [PATCH V15 06/11] acpi: apei: handle SEA notification type for ARMv8

2017-04-25 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 05:05:18PM -0600, Tyler Baicar wrote: > ARM APEI extension proposal added SEA (Synchronous External Abort) > notification type for ARMv8. > Add a new GHES error source handling function for SEA. If an error > source's notification type is SEA, then this function can be

Re: [PATCH V15 04/11] efi: parse ARM processor error

2017-04-25 Thread Borislav Petkov
On Tue, Apr 25, 2017 at 10:05:31AM -0600, Baicar, Tyler wrote: > That seems like something that should be done outside of these patches (if > added to the kernel at all). The decoding for this information would all be > vendor specific, so I'm not sure if we want to pollute the EFI code with >

Re: [PATCH V15 04/11] efi: parse ARM processor error

2017-04-24 Thread Borislav Petkov
On Fri, Apr 21, 2017 at 12:22:09PM -0600, Baicar, Tyler wrote: > I guess it's not really needed. It just may be useful considering there can > be numerous error info structures, numerous context info structures, and a > variable length vendor information section. I can move this print to only in >

Re: [PATCH V15 03/11] cper: add timestamp print to CPER status printing

2017-04-21 Thread Borislav Petkov
On Fri, Apr 21, 2017 at 12:08:43PM -0600, Baicar, Tyler wrote: > The timestamp may still be useful when it is imprecise. In the polling case, > you may only poll every minute or so, so the time may be useful. Well, what is in the timestamp when !precise? Some random time or some timestamp from a

Re: [PATCH V15 04/11] efi: parse ARM processor error

2017-04-21 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 05:05:16PM -0600, Tyler Baicar wrote: > Add support for ARM Common Platform Error Record (CPER). > UEFI 2.6 specification adds support for ARM specific > processor error information to be reported as part of the > CPER records. This provides more detail on for processor

Re: [PATCH V15 03/11] cper: add timestamp print to CPER status printing

2017-04-21 Thread Borislav Petkov
On Fri, Apr 21, 2017 at 10:04:35AM -0600, Baicar, Tyler wrote: > This is basically what I already had in v14...you asked to move it into a > different if-statement? https://lkml.org/lkml/2017/4/12/397 Well, clearly I've been smoking some nasty potent sh*t. :-\ /me goes and looks at the spec:

Re: [PATCH V15 03/11] cper: add timestamp print to CPER status printing

2017-04-21 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 05:05:15PM -0600, Tyler Baicar wrote: > The ACPI 6.1 spec added a timestamp to the HEST generic data HEST? I see the timestamp in Table 18-343 Generic Error Data Entry where those things are "One or more Generic Error Data Entry structures may be recorded in the Generic

Re: [PATCH V15 02/11] ras: acpi/apei: cper: add support for generic data v3 structure

2017-04-20 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 05:05:14PM -0600, Tyler Baicar wrote: > The ACPI 6.1 spec adds a new version of the generic data structure. which data structure? HEST? > Add support to handle the new structure as well as properly verify > and iterate through the generic data entries. > > Signed-off-by:

Re: [PATCH V15 01/11] acpi: apei: read ack upon ghes record consumption

2017-04-19 Thread Borislav Petkov
On Wed, Apr 19, 2017 at 02:31:13PM -0600, Baicar, Tyler wrote: > Will do. You don't necessarily have to reply with "will do" if you agree with the review. Also, please wait until I've gone through the whole pile before sending it again. Thanks. -- Regards/Gruss, Boris. Good mailing

Re: [PATCH V15 01/11] acpi: apei: read ack upon ghes record consumption

2017-04-19 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 05:05:13PM -0600, Tyler Baicar wrote: > A RAS (Reliability, Availability, Serviceability) controller > may be a separate processor running in parallel with OS > execution, and may generate error records for consumption by > the OS. If the RAS controller produces multiple

Re: [PATCH V14 02/10] ras: acpi/apei: cper: generic error data entry v3 per ACPI 6.1

2017-04-13 Thread Borislav Petkov
On Thu, Apr 13, 2017 at 02:30:21PM -0600, Baicar, Tyler wrote: > I do not agree with this. The struct being passed to this function is > already named acpi_hest_generic_data in the existing code and all over this > code is named gdata not just d. And I'm saying they're too long - the preexisting

Re: [PATCH V14 01/10] acpi: apei: read ack upon ghes record consumption

2017-04-12 Thread Borislav Petkov
On Tue, Mar 28, 2017 at 01:30:31PM -0600, Tyler Baicar wrote: > A RAS (Reliability, Availability, Serviceability) controller > may be a separate processor running in parallel with OS > execution, and may generate error records for consumption by > the OS. If the RAS controller produces multiple

Re: [PATCH V14 03/10] efi: parse ARM processor error

2017-04-12 Thread Borislav Petkov
On Tue, Mar 28, 2017 at 01:30:33PM -0600, Tyler Baicar wrote: > Add support for ARM Common Platform Error Record (CPER). > UEFI 2.6 specification adds support for ARM specific > processor error information to be reported as part of the > CPER records. This provides more detail on for processor