Re: [PATCH 12/13] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case

2023-10-20 Thread kirill.shute...@linux.intel.com
On Fri, Oct 20, 2023 at 03:29:24AM +, Huang, Kai wrote: > On Thu, 2023-10-05 at 16:14 +0300, Kirill A. Shutemov wrote: > > ACPI MADT doesn't allow to offline CPU after it got woke up. It limits > > kexec: target kernel won't be able to use more than one CPU. > > > > Zero out mailbox address

Re: [PATCH 10/13] x86/tdx: Convert shared memory back to private on kexec

2023-10-20 Thread Kirill A. Shutemov
On Fri, Oct 20, 2023 at 12:21:11PM +0300, Kirill A. Shutemov wrote: > On Fri, Oct 06, 2023 at 02:24:11PM -0500, Kalra, Ashish wrote: > > > > On 10/5/2023 5:28 PM, Kirill A. Shutemov wrote: > > > On Thu, Oct 05, 2023 at 05:01:23PM -0500, Kalra, Ashish wrote: > > > > On 10/5/2023 4:28 PM, Kirill A.

Re: [PATCH 13/13] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2023-10-20 Thread kirill.shute...@linux.intel.com
On Fri, Oct 20, 2023 at 09:49:59AM +, Huang, Kai wrote: > On Thu, 2023-10-05 at 16:14 +0300, Kirill A. Shutemov wrote: > >  struct acpi_madt_multiproc_wakeup { > >   struct acpi_subtable_header header; > > - u16 mailbox_version; > > + u16 version; > >   u32 reserved; /*

Re: [PATCH 10/13] x86/tdx: Convert shared memory back to private on kexec

2023-10-20 Thread Kirill A. Shutemov
On Fri, Oct 06, 2023 at 02:24:11PM -0500, Kalra, Ashish wrote: > > On 10/5/2023 5:28 PM, Kirill A. Shutemov wrote: > > On Thu, Oct 05, 2023 at 05:01:23PM -0500, Kalra, Ashish wrote: > > > On 10/5/2023 4:28 PM, Kirill A. Shutemov wrote: > > > > On Thu, Oct 05, 2023 at 01:41:38PM -0500, Kalra,

Re: [PATCH 13/13] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2023-10-20 Thread Huang, Kai
On Thu, 2023-10-05 at 16:14 +0300, Kirill A. Shutemov wrote: >  struct acpi_madt_multiproc_wakeup { >   struct acpi_subtable_header header; > - u16 mailbox_version; > + u16 version; >   u32 reserved; /* reserved - must be zero */ > - u64 base_address; > + u64

Re: [PATCH 13/13] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2023-10-20 Thread Huang, Kai
> --- /dev/null > +++ b/arch/x86/kernel/acpi/madt.S > @@ -0,0 +1,28 @@ > +#include > +#include > +#include > +#include > + > + .text > + .align PAGE_SIZE > +SYM_FUNC_START(asm_acpi_mp_play_dead) > + /* Load address of reset vector into RCX to jump when kernel is ready */ > +

Re: [PATCH 03/13] cpu/hotplug, x86/acpi: Disable CPU hotplug for ACPI MADT wakeup

2023-10-20 Thread Huang, Kai
On Tue, 2023-10-10 at 10:24 +, Huang, Kai wrote: > >  /* Physical address of the Multiprocessor Wakeup Structure mailbox */ > > @@ -74,6 +75,9 @@ int __init acpi_parse_mp_wake(union acpi_subtable_headers > > *header, > >   > > > >   acpi_mp_wake_mailbox_paddr = mp_wake->base_address; > >  

Re: [PATCHv9 2/2] powerpc/setup: Loosen the mapping between cpu logical id and its seq in dt

2023-10-20 Thread Hari Bathini
On 18/10/23 1:51 pm, Pingfan Liu wrote: On Tue, Oct 17, 2023 at 6:39 PM Hari Bathini wrote: On 17/10/23 7:58 am, Pingfan Liu wrote: *** Idea *** For kexec -p, the boot cpu can be not the cpu0, this causes the problem of allocating memory for paca_ptrs[]. However, in theory, there is no

Re: [PATCH 13/13] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2023-10-20 Thread kirill.shute...@linux.intel.com
On Fri, Oct 20, 2023 at 11:21:34AM +, Huang, Kai wrote: > > > --- /dev/null > > +++ b/arch/x86/kernel/acpi/madt.S > > @@ -0,0 +1,28 @@ > > +#include > > +#include > > +#include > > +#include > > + > > + .text > > + .align PAGE_SIZE > > +SYM_FUNC_START(asm_acpi_mp_play_dead) > > + /*

[PATCHv2 12/13] x86/acpi: Rename fields in acpi_madt_multiproc_wakeup structure

2023-10-20 Thread Kirill A. Shutemov
To prepare for the addition of support for MADT wakeup structure version 1, it is necessary to provide more appropriate names for the fields in the structure. The field 'mailbox_version' is renamed to 'version'. This field signifies the version of the structure and the related protocols, rather than

[PATCHv2 13/13] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method

2023-10-20 Thread Kirill A. Shutemov
MADT Multiprocessor Wakeup structure version 1 brings support for CPU offlining: BIOS provides a reset vector where the CPU has to jump to offline itself. The new TEST mailbox command can be used to test whether the CPU was offlined successfully and BIOS has control over it. Add CPU offlining support for ACPI

[PATCHv2 07/13] x86/mm: Return correct level from lookup_address() if pte is none

2023-10-20 Thread Kirill A. Shutemov
lookup_address() only returns the correct page table level for the entry if the entry is not none. Make the helper always return the correct 'level'. This allows implementing an iterator over kernel page tables using lookup_address(). Add one more entry into enum pg_level to indicate size of VA covered by

[PATCHv2 10/13] x86/mm: Make e820_end_ram_pfn() cover E820_TYPE_ACPI ranges

2023-10-20 Thread Kirill A. Shutemov
e820__end_of_ram_pfn() is used to calculate max_pfn which, among other things, guides where direct mapping ends. Any memory above max_pfn is not going to be present in the direct mapping. e820__end_of_ram_pfn() finds the end of the ram based on the highest E820_TYPE_RAM range. But it doesn't
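The snippet above is cut off, so what follows is only a rough user-space sketch of the idea named in the subject line, not the patch itself. It assumes a simplified memory map and computes the highest page frame number either from RAM ranges alone or from RAM plus ACPI ranges. All names here (sketch_entry, sketch_end_pfn, sketch_map, SKETCH_*) are hypothetical and are not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel's e820 types. */
enum sketch_type { SKETCH_RAM = 1, SKETCH_RESERVED = 2, SKETCH_ACPI = 3 };

struct sketch_entry {
	uint64_t addr;          /* start of the range, in bytes */
	uint64_t size;          /* length of the range, in bytes */
	enum sketch_type type;
};

#define SKETCH_PAGE_SHIFT 12    /* assumes 4 KiB pages */

/*
 * Return the highest end PFN over all counted ranges. RAM ranges are
 * always counted; ACPI ranges are counted only when include_acpi is set.
 */
static uint64_t sketch_end_pfn(const struct sketch_entry *map, size_t n,
			       int include_acpi)
{
	uint64_t max_pfn = 0;

	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		int counted = map[i].type == SKETCH_RAM ||
			      (include_acpi && map[i].type == SKETCH_ACPI);
		uint64_t end_pfn;

		if (!counted)
			continue;

		end_pfn = (map[i].addr + map[i].size) >> SKETCH_PAGE_SHIFT;
		if (end_pfn > max_pfn)
			max_pfn = end_pfn;
	}
	return max_pfn;
}

/* Made-up example map: an ACPI range sits just above the last RAM range. */
static const struct sketch_entry sketch_map[] = {
	{ 0x00000000ULL, 0x00100000ULL, SKETCH_RAM },
	{ 0x00100000ULL, 0x7ff00000ULL, SKETCH_RAM },
	{ 0x80000000ULL, 0x00010000ULL, SKETCH_ACPI },
	{ 0xfec00000ULL, 0x00001000ULL, SKETCH_RESERVED },
};
```

With this made-up map, counting only RAM stops at the end of the second range, while also counting ACPI ranges moves the limit past the ACPI tables, so they would fall inside the span the direct mapping is sized for.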

[PATCHv2 09/13] x86/tdx: Convert shared memory back to private on kexec

2023-10-20 Thread Kirill A. Shutemov
TDX guests allocate shared buffers to perform I/O. It is done by allocating pages normally from the buddy allocator and converting them to shared with set_memory_decrypted(). The second kernel has no idea what memory is converted this way. It only sees E820_TYPE_RAM. Accessing shared memory via

[PATCHv2 02/13] kernel/cpu: Add support for declaring CPU offlining not supported

2023-10-20 Thread Kirill A. Shutemov
ACPI MADT doesn't allow a CPU to be offlined after it has been woken up. Currently CPU offlining is prevented based on the confidential computing attribute which is set for Intel TDX. But TDX is not the only possible user of the wake up method. Introduce cpu_hotplug_not_supported() that can be called to

[PATCHv2 01/13] x86/acpi: Extract ACPI MADT wakeup code into a separate file

2023-10-20 Thread Kirill A. Shutemov
In order to prepare for the expansion of support for the ACPI MADT wakeup method, move the relevant code into a separate file. Introduce a new configuration option to clearly indicate dependencies without the use of ifdefs. There have been no functional changes. Signed-off-by: Kirill A.

[PATCHv2 11/13] x86/acpi: Do not attempt to bring up secondary CPUs in kexec case

2023-10-20 Thread Kirill A. Shutemov
ACPI MADT doesn't allow a CPU to be offlined after it has been woken up. This limits kexec: the second kernel won't be able to use more than one CPU. Now acpi_mp_wake_mailbox_paddr already has the mailbox address. acpi_wakeup_cpu() will use it to bring up secondary CPUs. Zero out mailbox address in the

[PATCHv2 05/13] x86/kexec: Keep CR4.MCE set during kexec for TDX guest

2023-10-20 Thread Kirill A. Shutemov
TDX guests are not allowed to clear CR4.MCE. An attempt to clear it leads to #VE. Use alternatives to keep the flag during kexec for TDX guests. The change doesn't affect non-TDX-guest environments. Signed-off-by: Kirill A. Shutemov Reviewed-by: Kai Huang ---

[PATCHv2 06/13] x86/mm: Make x86_platform.guest.enc_status_change_*() return errno

2023-10-20 Thread Kirill A. Shutemov
TDX is going to have more than one reason to fail enc_status_change_prepare(). Change the callback to return errno instead of assuming -EIO; enc_status_change_finish() is changed too, to keep the interface symmetric. Signed-off-by: Kirill A. Shutemov --- arch/x86/coco/tdx/tdx.c | 20
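The interface change described above can be pictured with a small user-space sketch: callbacks that return 0 or a negative errno, and a caller that passes the specific error through instead of collapsing every failure into -EIO. Every name below (enc_ops_sketch, change_enc_status_sketch, sketch_prepare, sketch_finish) is hypothetical, and the signatures are assumptions for illustration, not the kernel's actual x86_platform.guest callbacks.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified callback table: both hooks return 0 or -errno. */
struct enc_ops_sketch {
	int (*prepare)(unsigned long vaddr, int npages, bool enc);
	int (*finish)(unsigned long vaddr, int npages, bool enc);
};

/* Caller propagates the callback's own error code unchanged. */
static int change_enc_status_sketch(const struct enc_ops_sketch *ops,
				    unsigned long vaddr, int npages, bool enc)
{
	int ret;

	assert(ops != NULL && ops->prepare != NULL && ops->finish != NULL);

	ret = ops->prepare(vaddr, npages, enc);
	if (ret)
		return ret;

	/* ... the page table update would happen here in real code ... */

	return ops->finish(vaddr, npages, enc);
}

/* Example callbacks, made up so the sketch can be exercised. */
static int sketch_prepare(unsigned long vaddr, int npages, bool enc)
{
	(void)vaddr;
	(void)enc;
	return npages > 0 ? 0 : -EINVAL;    /* one of several possible reasons */
}

static int sketch_finish(unsigned long vaddr, int npages, bool enc)
{
	(void)vaddr;
	(void)npages;
	(void)enc;
	return 0;
}

static const struct enc_ops_sketch sketch_ops = {
	.prepare = sketch_prepare,
	.finish  = sketch_finish,
};
```

Keeping both hooks on the same return convention means the caller needs only one error path, which is presumably the symmetry the commit message refers to.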

[PATCHv2 00/13] x86/tdx: Add kexec support

2023-10-20 Thread Kirill A. Shutemov
The patchset adds bits and pieces to get kexec (and crashkernel) to work on TDX guest. The last patch implements CPU offlining according to the approved ACPI spec change proposal[1]. It unlocks kexec with all CPUs visible in the target kernel. It requires BIOS-side enabling. If it is missing we fall back

[PATCHv2 08/13] x86/tdx: Account shared memory

2023-10-20 Thread Kirill A. Shutemov
The kernel will convert all shared memory back to private during kexec. The direct mapping page tables will provide information on which memory is shared. It is extremely important to convert all shared memory. If a page is missed, it will cause the second kernel to crash when it accesses it.

Re: [PATCHv2 04/13] x86/kvm: Do not try to disable kvmclock if it was not enabled

2023-10-20 Thread Sean Christopherson
On Fri, Oct 20, 2023, Kirill A. Shutemov wrote: > kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is > present in the VM. It leads to write to a MSR that doesn't exist on some > configurations, namely in TDX guest: > > unchecked MSR access error: WRMSR to 0x12 (tried to

[PATCHv2 03/13] cpu/hotplug, x86/acpi: Disable CPU offlining for ACPI MADT wakeup

2023-10-20 Thread Kirill A. Shutemov
ACPI MADT doesn't allow a CPU to be offlined after it has been woken up. Currently hotplug is prevented based on the confidential computing attribute which is set for Intel TDX. But TDX is not the only possible user of the wake up method. Disable CPU offlining on ACPI MADT wakeup enumeration. Signed-off-by:

[PATCHv2 04/13] x86/kvm: Do not try to disable kvmclock if it was not enabled

2023-10-20 Thread Kirill A. Shutemov
kvm_guest_cpu_offline() tries to disable kvmclock regardless of whether it is present in the VM. This leads to a write to an MSR that doesn't exist on some configurations, namely in TDX guest: unchecked MSR access error: WRMSR to 0x12 (tried to write 0x) at rIP:

Re: [PATCH 03/13] cpu/hotplug, x86/acpi: Disable CPU hotplug for ACPI MADT wakeup

2023-10-20 Thread kirill.shute...@linux.intel.com
On Fri, Oct 20, 2023 at 11:58:58AM +, Huang, Kai wrote: > On Tue, 2023-10-10 at 10:24 +, Huang, Kai wrote: > > >  /* Physical address of the Multiprocessor Wakeup Structure mailbox */ > > > @@ -74,6 +75,9 @@ int __init acpi_parse_mp_wake(union > > > acpi_subtable_headers *header, > > >  

Re: [PATCHv2 04/13] x86/kvm: Do not try to disable kvmclock if it was not enabled

2023-10-20 Thread Vitaly Kuznetsov
"Kirill A. Shutemov" writes: > kvm_guest_cpu_offline() tries to disable kvmclock regardless if it is > present in the VM. It leads to write to a MSR that doesn't exist on some > configurations, namely in TDX guest: > > unchecked MSR access error: WRMSR to 0x12 (tried to write >

Re: [PATCHv2 01/13] x86/acpi: Extract ACPI MADT wakeup code into a separate file

2023-10-20 Thread Kuppuswamy Sathyanarayanan
On 10/20/2023 8:12 AM, Kirill A. Shutemov wrote: > In order to prepare for the expansion of support for the ACPI MADT > wakeup method, move the relevant code into a separate file. > > Introduce a new configuration option to clearly indicate dependencies > without the use of ifdefs. > > There

Re: [PATCHv2 04/13] x86/kvm: Do not try to disable kvmclock if it was not enabled

2023-10-20 Thread Sean Christopherson
On Fri, Oct 20, 2023, Vitaly Kuznetsov wrote: > > --- > > arch/x86/kernel/kvmclock.c | 12 > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/arch/x86/kernel/kvmclock.c b/arch/x86/kernel/kvmclock.c > > index fb8f52149be9..f2fff625576d 100644 > > ---

Re: [PATCH v2 7/7] ima: record log size at kexec load and execute

2023-10-20 Thread Tushar Sugandhi
On 10/12/23 17:27, Stefan Berger wrote: On 10/5/23 14:26, Tushar Sugandhi wrote: The window between kexec 'load' and 'execute' could be arbitrarily long. Even with the large chunk of memory allocated at kexec 'load', it may run out which would result in missing events in IMA log after the

Re: [PATCH v2 6/7] ima: make the memory for events between kexec load and exec configurable

2023-10-20 Thread Stefan Berger
On 10/20/23 16:39, Tushar Sugandhi wrote: On 10/12/23 17:27, Stefan Berger wrote: On 10/5/23 14:26, Tushar Sugandhi wrote: IMA currently allocates half a PAGE_SIZE for the extra events that would be measured between kexec 'load' and 'execute'.  Depending on the IMA policy and the system

Re: [PATCH v2 1/7] ima: refactor ima_dump_measurement_list to move memory allocation to a separate function

2023-10-20 Thread Tushar Sugandhi
Thanks a lot Stefan for reviewing this series. Really appreciate it. On 10/12/23 17:28, Stefan Berger wrote: On 10/5/23 14:25, Tushar Sugandhi wrote: IMA allocates memory and dumps the measurement during kexec soft reboot as a single function call ima_dump_measurement_list().  It gets called

Re: [PATCH v2 6/7] ima: make the memory for events between kexec load and exec configurable

2023-10-20 Thread Tushar Sugandhi
On 10/20/23 14:16, Stefan Berger wrote: > No, what I mean is you should ask the user for how many extra kilobytes (kb) to allocate - not ask for pages. > Stefan Ok. Will do. I will align the input config value to the PAGE_SIZE as well.
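As a rough illustration of the exchange above (take the setting in kilobytes, then align it to the page size), here is a minimal user-space sketch. The helper name extra_kb_to_page_aligned_bytes is hypothetical, and rounding up rather than down is an assumption, since the message does not say which direction the alignment goes.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert a user-supplied size in KiB into a byte
 * count rounded up to a whole number of pages. page_size must be a
 * non-zero power of two.
 */
static size_t extra_kb_to_page_aligned_bytes(size_t extra_kb, size_t page_size)
{
	size_t bytes;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);

	bytes = extra_kb * 1024;
	return (bytes + page_size - 1) & ~(page_size - 1);
}
```

With 4 KiB pages, a request for 5 KiB would come out as 8192 bytes (two pages), and a request for 4 KiB stays at 4096 bytes.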

Re: [PATCH v2 2/7] ima: move ima_dump_measurement_list call from kexec load to execute

2023-10-20 Thread Tushar Sugandhi
On 10/12/23 17:28, Stefan Berger wrote: On 10/5/23 14:25, Tushar Sugandhi wrote: In the current IMA implementation, ima_dump_measurement_list() is called during the kexec 'load' operation.  This can result in loss of IMA measurements taken between the 'load' and 'execute' phases when the

Re: [PATCH v2 1/7] ima: refactor ima_dump_measurement_list to move memory allocation to a separate function

2023-10-20 Thread Stefan Berger
On 10/20/23 16:33, Tushar Sugandhi wrote: Thanks a lot Stefan for reviewing this series. Really appreciate it. You are welcome. What may be a bit problematic is the fact that between the time the buffer for the flattened IMA log is allocated (kexec 'load') and the time it is filled

Re: [PATCH v2 3/7] ima: kexec: map source pages containing IMA buffer to image post kexec load

2023-10-20 Thread Tushar Sugandhi
On 10/12/23 17:29, Stefan Berger wrote: On 10/5/23 14:25, Tushar Sugandhi wrote: Currently, the mechanism to map and unmap segments to the kimage structure is not available to the subsystems outside of kexec.  This functionality is needed when IMA is allocating the memory segments during

Re: [PATCH v2 6/7] ima: make the memory for events between kexec load and exec configurable

2023-10-20 Thread Tushar Sugandhi
On 10/12/23 17:27, Stefan Berger wrote: On 10/5/23 14:26, Tushar Sugandhi wrote: IMA currently allocates half a PAGE_SIZE for the extra events that would be measured between kexec 'load' and 'execute'.  Depending on the IMA policy and the system state, that memory may not be sufficient to

Re: [PATCH v2 1/7] ima: refactor ima_dump_measurement_list to move memory allocation to a separate function

2023-10-20 Thread Tushar Sugandhi
On 10/20/23 14:21, Stefan Berger wrote: On 10/20/23 16:33, Tushar Sugandhi wrote: Thanks a lot Stefan for reviewing this series. Really appreciate it. You are welcome. What may be a bit problematic is the fact that between the time the buffer for the flattened IMA log is allocated