Re: kexec regression since 4.9 caused by efi

2017-03-09 Thread Dave Young
On 03/09/17 at 12:53pm, Ard Biesheuvel wrote:
> On 9 March 2017 at 10:54, Omar Sandoval  wrote:
> > On Thu, Mar 09, 2017 at 02:38:06PM +0800, Dave Young wrote:
> >> Add efi/kexec list.
> >>
> >> On 03/08/17 at 12:16pm, Omar Sandoval wrote:
> >
> > [snip]
> >
> >> I have no more clue yet from your provided log, but the runtime value is
> >> odd to me. It is set in below code:
> >>
> >> arch/x86/platform/efi/efi.c: efi_systab_init()
> >>   efi_systab.runtime = data ?
> >>(void *)(unsigned long)data->runtime :
> >>(void *)(unsigned long)systab64->runtime;
> >>
> >> Here data is the setup_data passed by kexec-tools from normal kernel to
> >> kexec kernel, efi_setup_data structure is like below:
> >> struct efi_setup_data {
> >> u64 fw_vendor;
> >> u64 runtime;
> >> u64 tables;
> >> u64 smbios;
> >> u64 reserved[8];
> >> };
> >>
> >> kexec-tools get the runtime address from /sys/firmware/efi/runtime
> >>
> >> So can you do some debugging on your side, eg. see the sysfs runtime
> >> value is correct or not. And add some printk in efi init path etc.
> >
> > The attached patch fixes this for me.
> 
> Hi Omar,
> 
> Thanks for tracking this down.
> 
> I wonder if this is an unintended side effect of the way we repurpose
> the EFI_MEMORY_RUNTIME attribute in efi_arch_mem_reserve(). AFAIUI,
> splitting memory map entries should only be necessary for regions that
> are not runtime memory regions to begin with, and so whether their
> virtual mapping address makes sense or not should be irrelevant.

In this case the ESRT chunk is Runtime Data, which does not need to be
reserved explicitly. I think efi_arch_mem_reserve() is meant for boot areas.

Could there be ESRT data that belongs to boot data? If we are sure it is
all runtime, the better fix may be to just drop the efi_mem_reserve()
call in esrt.c.
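
To make the distinction concrete, here is a rough standalone sketch of the
check being discussed: a region that already carries EFI_MEMORY_RUNTIME would
not need an explicit reservation. The descriptor type and the function name
(struct region_desc, needs_explicit_reserve) are hypothetical and not kernel
API; only the attribute bit value comes from the UEFI spec.

```c
#include <stdbool.h>
#include <stdint.h>

/* UEFI memory attribute bit: region must stay mapped for runtime services. */
#define EFI_MEMORY_RUNTIME UINT64_C(0x8000000000000000)

/* Hypothetical, reduced stand-in for an EFI memory descriptor. */
struct region_desc {
	uint64_t phys_addr;
	uint64_t num_pages;
	uint64_t attribute;
};

/*
 * Assumption for illustration: only regions that are not already
 * runtime regions (e.g. boot services data) need an explicit reserve.
 */
bool needs_explicit_reserve(const struct region_desc *md)
{
	return (md->attribute & EFI_MEMORY_RUNTIME) == 0;
}
```

With a check like this, an ESRT that lives in Runtime Data would be skipped,
while one that lives in boot data would still be reserved.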

> 
> Perhaps this only illustrates my lack of understanding of the x86 way
> of doing this, so perhaps Matt can shed some light on this?
> 
> Thanks,
> Ard.

Thanks
Dave
--
To unsubscribe from this list: send the line "unsubscribe linux-efi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC PATCH v2 13/32] KVM: SVM: Enable SEV by setting the SEV_ENABLE CPU feature

2017-03-09 Thread Borislav Petkov
On Thu, Mar 02, 2017 at 10:15:01AM -0500, Brijesh Singh wrote:
> From: Tom Lendacky 
> 
> Modify the SVM cpuid update function to indicate if Secure Encrypted
> Virtualization (SEV) is active in the guest by setting the SEV KVM CPU
> features bit. SEV is active if Secure Memory Encryption is enabled in
> the host and the SEV_ENABLE bit of the VMCB is set.
> 
> Signed-off-by: Tom Lendacky 
> ---
>  arch/x86/kvm/cpuid.c |4 +++-
>  arch/x86/kvm/svm.c   |   18 ++
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 1639de8..e0c40a8 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -601,7 +601,7 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>   entry->edx = 0;
>   break;
>   case 0x80000000:
> - entry->eax = min(entry->eax, 0x8000001a);
> + entry->eax = min(entry->eax, 0x8000001f);
>   break;
>   case 0x80000001:
>   entry->edx &= kvm_cpuid_8000_0001_edx_x86_features;
> @@ -634,6 +634,8 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
>   break;
>   case 0x8000001d:
>   break;
> + case 0x8000001f:
> + break;

I guess those three cases can be unified:

case 0x8000001a:
case 0x8000001d:
case 0x8000001f:
	break;

...

> + sev_info = kvm_find_cpuid_entry(vcpu, 0x8000001f, 0);
> + if (!sev_info)
> + return;
> +
> + if (ca->nested_ctl & SVM_NESTED_CTL_SEV_ENABLE) {
> + features->eax |= (1 << KVM_FEATURE_SEV);
> + cpuid(0x8000001f, &sev_info->eax, &sev_info->ebx,
> +   &sev_info->ecx, &sev_info->edx);
> + }

Right, as already mentioned in the previous mail: can we communicate SEV
status to the guest solely through the 0x8000001f leaf? Then we won't
need KVM_FEATURE_SEV and this way we'll be hypervisor-agnostic, as Paolo
suggested.

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)


Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active

2017-03-09 Thread Borislav Petkov
On Thu, Mar 09, 2017 at 05:13:33PM +0100, Paolo Bonzini wrote:
> This is not how you check if running under a hypervisor; you should
> check the HYPERVISOR bit, i.e. bit 31 of cpuid(1).ecx.  This in turn
> tells you if leaf 0x40000000 is valid.

Ah, good point, I already do that in the microcode loader :)

/*
 * CPUID(1).ECX[31]: reserved for hypervisor use. This is still not
 * completely accurate as xen pv guests don't see that CPUID bit set but
 * that's good enough as they don't land on the BSP path anyway.
 */
if (native_cpuid_ecx(1) & BIT(31))
return *res;

> That said, the main issue with this function is that it hardcodes the
> behavior for KVM.  It is possible that another hypervisor defines its
> 0x40000001 leaf in such a way that KVM_FEATURE_SEV has a different meaning.
> 
> Instead, AMD should define a "well-known" bit in its own space (i.e.
> 0x800000xx) that is only used by hypervisors that support SEV.  This is
> similar to how Intel defined one bit in leaf 1 to say "is leaf
> 0x40000000 valid".
> 
> > +   if (eax > 0x40000000) {
> > +   eax = 0x40000001;
> > +   ecx = 0;
> > +   native_cpuid(&eax, &ebx, &ecx, &edx);
> > +   if (!(eax & BIT(KVM_FEATURE_SEV)))
> > +   goto out;
> > +
> > +   eax = 0x8000001f;
> > +   ecx = 0;
> > +   native_cpuid(&eax, &ebx, &ecx, &edx);
> > +   if (!(eax & 1))

Right, so this is testing CPUID_0x8000001f_ECX(0)[0], SME. Why not
simply set that bit for the guest too, in kvm?
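
For reference, a minimal standalone sketch of decoding that leaf from raw
register values. It assumes, per my reading of AMD's documentation, that
EAX[0] reports SME and EAX[1] reports SEV, and it takes the C-bit position
from EBX[5:0] the same way the quoted patch computes sme_me_mask. The struct
and function names are made up for illustration and are not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the decoded leaf 0x8000001f fields. */
struct sev_cpuid_info {
	bool sme;          /* EAX[0]: Secure Memory Encryption supported */
	bool sev;          /* EAX[1]: Secure Encrypted Virtualization supported */
	unsigned int c_bit; /* EBX[5:0]: position of the encryption bit */
	uint64_t me_mask;  /* mask with only the encryption bit set */
};

/* Decode EAX/EBX as returned by CPUID leaf 0x8000001f, subleaf 0. */
struct sev_cpuid_info decode_leaf_8000001f(uint32_t eax, uint32_t ebx)
{
	struct sev_cpuid_info info;

	info.sme = (eax & (UINT32_C(1) << 0)) != 0;
	info.sev = (eax & (UINT32_C(1) << 1)) != 0;
	info.c_bit = ebx & 0x3f;
	/* 64-bit shift so that bit positions above 31 are representable */
	info.me_mask = UINT64_C(1) << info.c_bit;
	return info;
}
```

For example, eax = 0x3 and ebx = 0x2f would decode as SME and SEV both
present with the encryption bit at position 47.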

-- 
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 
(AG Nürnberg)


Re: [RFC PATCH v2 12/32] x86: Add early boot support when running with SEV active

2017-03-09 Thread Paolo Bonzini


On 09/03/2017 15:07, Borislav Petkov wrote:
> + /* Check if running under a hypervisor */
> + eax = 0x40000000;
> + ecx = 0;
> + native_cpuid(&eax, &ebx, &ecx, &edx);

This is not how you check if running under a hypervisor; you should
check the HYPERVISOR bit, i.e. bit 31 of cpuid(1).ecx.  This in turn
tells you if leaf 0x40000000 is valid.

That said, the main issue with this function is that it hardcodes the
behavior for KVM.  It is possible that another hypervisor defines its
0x40000001 leaf in such a way that KVM_FEATURE_SEV has a different meaning.

Instead, AMD should define a "well-known" bit in its own space (i.e.
0x800000xx) that is only used by hypervisors that support SEV.  This is
similar to how Intel defined one bit in leaf 1 to say "is leaf
0x40000000 valid".
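
As a rough illustration of the ordering I mean, here is a standalone sketch
that works on raw register values rather than executing CPUID. The function
names are hypothetical, and it assumes the hypervisor reports its highest
supported leaf in EAX of leaf 0x40000000, which is what the quoted code
relies on as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CPUID_1_ECX_HYPERVISOR (UINT32_C(1) << 31)
#define HYPERVISOR_LEAF_BASE   UINT32_C(0x40000000)

/* True if CPUID(1).ECX[31], the HYPERVISOR bit, is set. */
bool hypervisor_present(uint32_t cpuid1_ecx)
{
	return (cpuid1_ecx & CPUID_1_ECX_HYPERVISOR) != 0;
}

/*
 * Decide whether a hypervisor leaf may be queried. max_hv_leaf is the
 * EAX value read from leaf 0x40000000; it is only meaningful when the
 * HYPERVISOR bit is set, so that bit is checked first.
 */
bool hv_leaf_usable(uint32_t cpuid1_ecx, uint32_t max_hv_leaf, uint32_t leaf)
{
	assert(leaf >= HYPERVISOR_LEAF_BASE);

	if (!hypervisor_present(cpuid1_ecx))
		return false;
	return leaf <= max_hv_leaf;
}
```

The point of the ordering is that on bare metal the 0x4000xxxx range is not
guaranteed to return anything meaningful, so its contents should never be
interpreted before the bit in leaf 1 has been checked.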

Thanks,

Paolo

> + if (eax > 0x40000000) {
> + eax = 0x40000001;
> + ecx = 0;
> + native_cpuid(&eax, &ebx, &ecx, &edx);
> + if (!(eax & BIT(KVM_FEATURE_SEV)))
> + goto out;
> +
> + eax = 0x8000001f;
> + ecx = 0;
> + native_cpuid(&eax, &ebx, &ecx, &edx);
> + if (!(eax & 1))
> + goto out;
> +
> + sme_me_mask = 1UL << (ebx & 0x3f);
> + sev_enabled = 1;
> +
> + goto out;
> + }
> +


Re: kexec regression since 4.9 caused by efi

2017-03-09 Thread Ard Biesheuvel
On 9 March 2017 at 10:54, Omar Sandoval  wrote:
> On Thu, Mar 09, 2017 at 02:38:06PM +0800, Dave Young wrote:
>> Add efi/kexec list.
>>
>> On 03/08/17 at 12:16pm, Omar Sandoval wrote:
>
> [snip]
>
>> I have no more clue yet from your provided log, but the runtime value is
>> odd to me. It is set in below code:
>>
>> arch/x86/platform/efi/efi.c: efi_systab_init()
>>   efi_systab.runtime = data ?
>>(void *)(unsigned long)data->runtime :
>>(void *)(unsigned long)systab64->runtime;
>>
>> Here data is the setup_data passed by kexec-tools from normal kernel to
>> kexec kernel, efi_setup_data structure is like below:
>> struct efi_setup_data {
>> u64 fw_vendor;
>> u64 runtime;
>> u64 tables;
>> u64 smbios;
>> u64 reserved[8];
>> };
>>
>> kexec-tools get the runtime address from /sys/firmware/efi/runtime
>>
>> So can you do some debugging on your side, eg. see the sysfs runtime
>> value is correct or not. And add some printk in efi init path etc.
>
> The attached patch fixes this for me.

Hi Omar,

Thanks for tracking this down.

I wonder if this is an unintended side effect of the way we repurpose
the EFI_MEMORY_RUNTIME attribute in efi_arch_mem_reserve(). AFAIUI,
splitting memory map entries should only be necessary for regions that
are not runtime memory regions to begin with, and so whether their
virtual mapping address makes sense or not should be irrelevant.

Perhaps this only illustrates my lack of understanding of the x86 way
of doing this, so perhaps Matt can shed some light on this?

Thanks,
Ard.


Re: kexec regression since 4.9 caused by efi

2017-03-09 Thread Omar Sandoval
On Thu, Mar 09, 2017 at 02:38:06PM +0800, Dave Young wrote:
> Add efi/kexec list.
> 
> On 03/08/17 at 12:16pm, Omar Sandoval wrote:

[snip]

> I have no more clue yet from your provided log, but the runtime value is
> odd to me. It is set in below code:
> 
> arch/x86/platform/efi/efi.c: efi_systab_init()
>   efi_systab.runtime = data ?
>(void *)(unsigned long)data->runtime :
>(void *)(unsigned long)systab64->runtime;
> 
> Here data is the setup_data passed by kexec-tools from normal kernel to
> kexec kernel, efi_setup_data structure is like below: 
> struct efi_setup_data {
> u64 fw_vendor;
> u64 runtime;
> u64 tables;
> u64 smbios;
> u64 reserved[8];
> };
> 
> kexec-tools get the runtime address from /sys/firmware/efi/runtime
> 
> So can you do some debugging on your side, eg. see the sysfs runtime
> value is correct or not. And add some printk in efi init path etc.

The attached patch fixes this for me.
From 4b343f0b0b408469f28c973ea52877797a166313 Mon Sep 17 00:00:00 2001
Message-Id: <4b343f0b0b408469f28c973ea52877797a166313.1489053164.git.osan...@fb.com>
From: Omar Sandoval 
Date: Thu, 9 Mar 2017 01:46:19 -0800
Subject: [PATCH] efi: adjust virt_addr when splitting descriptors in
 efi_memmap_insert()

When we split efi memory descriptors, we adjust the physical address but
not the virtual address it maps to. This leads to bogus memory mappings
later when these virtual addresses are used.

This fixes a kexec boot regression since 8e80632fb23f ("efi/esrt: Use
efi_mem_reserve() and avoid a kmalloc()"), although the bug was only
exposed by that commit.

Signed-off-by: Omar Sandoval 
---
 drivers/firmware/efi/memmap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/firmware/efi/memmap.c b/drivers/firmware/efi/memmap.c
index 78686443cb37..ca614db76faf 100644
--- a/drivers/firmware/efi/memmap.c
+++ b/drivers/firmware/efi/memmap.c
@@ -298,6 +298,7 @@ void __init efi_memmap_insert(struct efi_memory_map *old_memmap, void *buf,
 			memcpy(new, old, old_memmap->desc_size);
 			md = new;
 			md->phys_addr = m_end + 1;
+			md->virt_addr += md->phys_addr - start;
 			md->num_pages = (end - md->phys_addr + 1) >>
 EFI_PAGE_SHIFT;
 		}
@@ -312,6 +313,7 @@ void __init efi_memmap_insert(struct efi_memory_map *old_memmap, void *buf,
 			md = new;
 			md->attribute |= m_attr;
 			md->phys_addr = m_start;
+			md->virt_addr += md->phys_addr - start;
 			md->num_pages = (m_end - m_start + 1) >>
 EFI_PAGE_SHIFT;
 			/* last part */
@@ -319,6 +321,7 @@ void __init efi_memmap_insert(struct efi_memory_map *old_memmap, void *buf,
 			memcpy(new, old, old_memmap->desc_size);
 			md = new;
 			md->phys_addr = m_end + 1;
+			md->virt_addr += md->phys_addr - start;
 			md->num_pages = (end - m_end) >>
 EFI_PAGE_SHIFT;
 		}
@@ -333,6 +336,7 @@ void __init efi_memmap_insert(struct efi_memory_map *old_memmap, void *buf,
 			memcpy(new, old, old_memmap->desc_size);
 			md = new;
 			md->phys_addr = m_start;
+			md->virt_addr += md->phys_addr - start;
 			md->num_pages = (end - md->phys_addr + 1) >>
 EFI_PAGE_SHIFT;
 			md->attribute |= m_attr;
-- 
2.12.0
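
To show the arithmetic outside the kernel, here is a small standalone sketch
of one of the split cases (keeping the tail of a descriptor). It uses a
reduced stand-in for efi_memory_desc_t; struct mem_desc and split_tail() are
names invented for this example, and EFI_PAGE_SHIFT is assumed to be 12
(4 KiB pages).

```c
#include <assert.h>
#include <stdint.h>

#define EFI_PAGE_SHIFT 12

/* Reduced, hypothetical stand-in for the kernel's efi_memory_desc_t. */
struct mem_desc {
	uint64_t phys_addr;
	uint64_t virt_addr;
	uint64_t num_pages;
};

/*
 * Return the part of *md that starts at physical address new_start.
 * The virtual address is shifted by the same offset as the physical
 * address, which is the adjustment the patch adds.
 */
struct mem_desc split_tail(const struct mem_desc *md, uint64_t new_start)
{
	uint64_t start = md->phys_addr;
	uint64_t end = start + (md->num_pages << EFI_PAGE_SHIFT) - 1;
	struct mem_desc tail = *md;

	assert(new_start >= start && new_start <= end);

	tail.phys_addr = new_start;
	tail.virt_addr += tail.phys_addr - start;
	tail.num_pages = (end - tail.phys_addr + 1) >> EFI_PAGE_SHIFT;
	return tail;
}
```

For a 16-page region at physical 0x100000 mapped at 0xfffffffe00100000,
splitting at 0x104000 gives a 12-page tail whose virt_addr is
0xfffffffe00104000. Without the adjustment the tail would keep the original
virt_addr and alias the first four pages.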



Re: [PATCH V12 09/10] trace, ras: add ARM processor error trace event

2017-03-09 Thread Xie XiuQi
Hi Tyler Baicar,

On 2017/3/7 4:45, Tyler Baicar wrote:
> Currently there are trace events for the various RAS
> errors with the exception of ARM processor type errors.
> Add a new trace event for such errors so that the user
> will know when they occur. These trace events are
> consistent with the ARM processor error section type
> defined in UEFI 2.6 spec section N.2.4.4.
> 
> Signed-off-by: Tyler Baicar 
> Acked-by: Steven Rostedt 
> ---
>  drivers/acpi/apei/ghes.c|  8 +++-
>  drivers/firmware/efi/cper.c |  1 +
>  drivers/ras/ras.c   |  1 +
>  include/ras/ras_event.h | 34 ++
>  4 files changed, 43 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index 842c0cc..81d7b79 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -514,7 +514,13 @@ static void ghes_do_proc(struct ghes *ghes,
>   }
>  #endif
>  #ifdef CONFIG_RAS
> - else if (trace_unknown_sec_event_enabled()) {
> + else if (!uuid_le_cmp(sec_type, CPER_SEC_PROC_ARM) &&
> +  trace_arm_event_enabled()) {
> + struct cper_sec_proc_arm *arm_err;
> +
> + arm_err = acpi_hest_generic_data_payload(gdata);
> + trace_arm_event(arm_err);
> + } else if (trace_unknown_sec_event_enabled()) {
>   void *unknown_err = acpi_hest_generic_data_payload(gdata);
>   trace_unknown_sec_event(&sec_type,
>   fru_id, fru_text, sec_sev,
> diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
> index 545a6c2..e9fb56a 100644
> --- a/drivers/firmware/efi/cper.c
> +++ b/drivers/firmware/efi/cper.c
> @@ -35,6 +35,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define INDENT_SP" "
>  
> diff --git a/drivers/ras/ras.c b/drivers/ras/ras.c
> index fb2500b..8ba5a94 100644
> --- a/drivers/ras/ras.c
> +++ b/drivers/ras/ras.c
> @@ -28,3 +28,4 @@ static int __init ras_init(void)
>  #endif
>  EXPORT_TRACEPOINT_SYMBOL_GPL(mc_event);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(unknown_sec_event);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(arm_event);
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index 5861b6f..b36db48 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
> @@ -162,6 +162,40 @@
>  );
>  
>  /*
> + * ARM Processor Events Report
> + *
> + * This event is generated when hardware detects an ARM processor error
> + * has occurred. UEFI 2.6 spec section N.2.4.4.
> + */
> +TRACE_EVENT(arm_event,
> +
> + TP_PROTO(const struct cper_sec_proc_arm *proc),
> +
> + TP_ARGS(proc),
> +
> + TP_STRUCT__entry(
> + __field(u64, mpidr)
> + __field(u64, midr)
> + __field(u32, running_state)
> + __field(u32, psci_state)
> + __field(u8, affinity)
> + ),
> +
> + TP_fast_assign(
> + __entry->affinity = proc->affinity_level;
> + __entry->mpidr = proc->mpidr;
> + __entry->midr = proc->midr;
> + __entry->running_state = proc->running_state;
> + __entry->psci_state = proc->psci_state;
> + ),
> +
> + TP_printk("affinity level: %d; MPIDR: %016llx; MIDR: %016llx; "
> +   "running state: %d; PSCI state: %d",
> +   __entry->affinity, __entry->mpidr, __entry->midr,
> +   __entry->running_state, __entry->psci_state)
> +);
> +

I think these fields are not enough. We also need to export the ARM processor
error information (UEFI 2.6 spec section N.2.4.4.1), or at least the error
type, address, etc., so that userspace (such as the rasdaemon tool) can tell
what error occurred.
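
To make the request more concrete, here is a rough sketch of the per-error
record as I read it from section N.2.4.4.1. The struct and function names are
only for illustration, the layout should be double-checked against the spec,
and the packed attribute assumes GCC or Clang.

```c
#include <stdint.h>

/*
 * Sketch of the ARM Processor Error Information structure. Packed
 * because multiple_error sits at an odd byte offset.
 */
struct arm_err_info {
	uint8_t  version;
	uint8_t  length;
	uint16_t validation_bits;
	uint8_t  type;
	uint16_t multiple_error;
	uint8_t  flags;
	uint64_t error_info;
	uint64_t virt_fault_addr;
	uint64_t physical_fault_addr;
} __attribute__((packed));

/* The record is expected to be 32 bytes long. */
_Static_assert(sizeof(struct arm_err_info) == 32,
	       "unexpected arm_err_info size");

/* Map the type field to a readable name for a trace or log line. */
const char *arm_err_type_name(uint8_t type)
{
	switch (type) {
	case 0: return "cache error";
	case 1: return "TLB error";
	case 2: return "bus error";
	case 3: return "micro-architectural error";
	default: return "unknown";
	}
}
```

Exporting at least type, error_info and the two fault addresses in the trace
event would let a tool like rasdaemon report which kind of error hit which
address.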

Thanks,
Xie XiuQi

> +/*
>   * Unknown Section Report
>   *
>   * This event is generated when hardware detected a hardware
> 
