Re: [PATCH linux-next 1/3] x86, crash: don't nest CONFIG_CRASH_DUMP ifdef inside CONFIG_KEXEC_CODE ifdef scope

2024-02-02 Thread Nathan Chancellor
This series resolves the build issues I was seeing. Please feel free to
carry

  Tested-by: Nathan Chancellor  # build

forward if there are any more revisions without drastic changes.

On Mon, Jan 29, 2024 at 09:50:31PM +0800, Baoquan He wrote:
> Michael pointed out that the #ifdef CONFIG_CRASH_DUMP is nested inside
> arch/x86/xen/enlighten_hvm.c.
> 
> Although the nesting works well too since CONFIG_CRASH_DUMP has
> dependency on CONFIG_KEXEC_CORE, it may cause confuse because there
> are places where it's not nested, and people may think it need be nested
> even though it doesn't have to.
> 
> Fix that by moving  CONFIG_CRASH_DUMP ifdeffery of codes out of
> CONFIG_KEXEC_CODE ifdeffery scope.
> 
> And also fix a building error Nathan reported as below by replacing
> CONFIG_KEXEC_CORE ifdef with CONFIG_VMCORE_INFO ifdef.
> 
> 
> $ curl -LSso .config 
> https://git.alpinelinux.org/aports/plain/community/linux-edge/config-edge.x86_64
> $ make -skj"$(nproc)" ARCH=x86_64 CROSS_COMPILE=x86_64-linux- olddefconfig all
> ...
> x86_64-linux-ld: arch/x86/xen/mmu_pv.o: in function `paddr_vmcoreinfo_note':
> mmu_pv.c:(.text+0x3af3): undefined reference to `vmcoreinfo_note'
> 
> 
> Link: 
> https://lore.kernel.org/all/sn6pr02mb4157931105fa68d72e3d3db8d4...@sn6pr02mb4157.namprd02.prod.outlook.com/T/#u
> Link: 
> https://lore.kernel.org/all/20240126045551.GA126645@dev-arch.thelio-3990X/T/#u
> Signed-off-by: Baoquan He 
> ---
>  arch/x86/kernel/cpu/mshyperv.c | 10 ++
>  arch/x86/kernel/reboot.c   |  2 +-
>  arch/x86/xen/enlighten_hvm.c   |  4 ++--
>  arch/x86/xen/mmu_pv.c  |  2 +-
>  4 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
> index f8163a59026b..2e8cd5a4ae85 100644
> --- a/arch/x86/kernel/cpu/mshyperv.c
> +++ b/arch/x86/kernel/cpu/mshyperv.c
> @@ -209,6 +209,7 @@ static void hv_machine_shutdown(void)
>   if (kexec_in_progress)
>   hyperv_cleanup();
>  }
> +#endif /* CONFIG_KEXEC_CORE */
>  
>  #ifdef CONFIG_CRASH_DUMP
>  static void hv_machine_crash_shutdown(struct pt_regs *regs)
> @@ -222,8 +223,7 @@ static void hv_machine_crash_shutdown(struct pt_regs 
> *regs)
>   /* Disable the hypercall page when there is only 1 active CPU. */
>   hyperv_cleanup();
>  }
> -#endif
> -#endif /* CONFIG_KEXEC_CORE */
> +#endif /* CONFIG_CRASH_DUMP */
>  #endif /* CONFIG_HYPERV */
>  
>  static uint32_t  __init ms_hyperv_platform(void)
> @@ -497,9 +497,11 @@ static void __init ms_hyperv_init_platform(void)
>   no_timer_check = 1;
>  #endif
>  
> -#if IS_ENABLED(CONFIG_HYPERV) && defined(CONFIG_KEXEC_CORE)
> +#if IS_ENABLED(CONFIG_HYPERV)
> +#if defined(CONFIG_KEXEC_CORE)
>   machine_ops.shutdown = hv_machine_shutdown;
> -#ifdef CONFIG_CRASH_DUMP
> +#endif
> +#if defined(CONFIG_CRASH_DUMP)
>   machine_ops.crash_shutdown = hv_machine_crash_shutdown;
>  #endif
>  #endif
> diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> index 1287b0d5962f..f3130f762784 100644
> --- a/arch/x86/kernel/reboot.c
> +++ b/arch/x86/kernel/reboot.c
> @@ -826,7 +826,7 @@ void machine_halt(void)
>   machine_ops.halt();
>  }
>  
> -#ifdef CONFIG_KEXEC_CORE
> +#ifdef CONFIG_CRASH_DUMP
>  void machine_crash_shutdown(struct pt_regs *regs)
>  {
>   machine_ops.crash_shutdown(regs);
> diff --git a/arch/x86/xen/enlighten_hvm.c b/arch/x86/xen/enlighten_hvm.c
> index 09e3db7ff990..0b367c1e086d 100644
> --- a/arch/x86/xen/enlighten_hvm.c
> +++ b/arch/x86/xen/enlighten_hvm.c
> @@ -148,6 +148,7 @@ static void xen_hvm_shutdown(void)
>   if (kexec_in_progress)
>   xen_reboot(SHUTDOWN_soft_reset);
>  }
> +#endif
>  
>  #ifdef CONFIG_CRASH_DUMP
>  static void xen_hvm_crash_shutdown(struct pt_regs *regs)
> @@ -156,7 +157,6 @@ static void xen_hvm_crash_shutdown(struct pt_regs *regs)
>   xen_reboot(SHUTDOWN_soft_reset);
>  }
>  #endif
> -#endif
>  
>  static int xen_cpu_up_prepare_hvm(unsigned int cpu)
>  {
> @@ -238,10 +238,10 @@ static void __init xen_hvm_guest_init(void)
>  
>  #ifdef CONFIG_KEXEC_CORE
>   machine_ops.shutdown = xen_hvm_shutdown;
> +#endif
>  #ifdef CONFIG_CRASH_DUMP
>   machine_ops.crash_shutdown = xen_hvm_crash_shutdown;
>  #endif
> -#endif
>  }
>  
>  static __init int xen_parse_nopv(char *arg)
> diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
> index 218773cfb009..e21974f2cf2d 100644
> --- a/arch/x86/xen/mmu_pv.c
> +++ b/arch/x86/xen/mmu_pv.c
> @@ -2520,7 +2520,7 @@ int xen_remap_pfn(struct vm_area_struct *vma, unsigned 
> long addr,
>  }
>  EXPORT_SYMBOL_GPL(xen_remap_pfn);
>  
> -#ifdef CONFIG_KEXEC_CORE
> +#ifdef CONFIG_VMCORE_INFO
>  phys_addr_t paddr_vmcoreinfo_note(void)
>  {
>   if (xen_pv_domain())
> -- 
> 2.41.0
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v3 00/17] kexec: Allow preservation of ftrace buffers

2024-02-02 Thread Alexander Graf

Hi Philipp,

On 29.01.24 17:34, Philipp Rudo wrote:

Hi Alex,

adding linux-integrity as there are some synergies with IMA_KEXEC (in case we
get KHO to work).

Fist of all I believe that having a generic framework to pass information from
one kernel to the other across kexec would be a good thing. But I'm afraid that



Thanks, I'm happy to hear that you agree with the basic motivation :). 
There are fundamentally 2 problems with passing data:


  * Passing structured data in a cross-architecture way
  * Passing memory

KHO tackles both. It proposes a common FDT based format that allows us 
to pass per-subsystem properties. That way, a subsystem does not need to 
know whether it's running on ARM, x86, RISC-V or s390x. It just gains 
awareness for KHO and can pass data.


On top of that, it proposes a standardized "mem" property (and some 
magic around that) which allows subsystems to pass memory.




you are ignoring some fundamental problems which makes it extremely hard, if
not impossible, to reliably transfer the kernel's state from one kernel to the
other.

One thing I don't understand is how reusing the scratch area is working. Sure
you pass it's location via the dt/boot_params but I don't see any code that
makes it a CMA region. So IIUC the scratch area won't be available for the 2nd
kernel. Which is probably for the better as IIUC the 2nd kernel gets loaded and
runs inside that area and I don't believe the CMA design ever considered that
the kernel image could be included in a CMA area.



That one took me a lot to figure out sensibly (with recursion all the 
way down) while building KHO :). I hope I detailed it sensibly in the 
documentation - please let me know how to improve it in case it's 
unclear: https://lore.kernel.org/lkml/20240117144704.602-8-g...@amazon.com/


Let me explain inline using different words as well what happens:

The first (and only the first) kernel that boots allocates a CMA region 
as "scratch region". It loads the new kernel into that region. It passes 
that region as "scratch region" to the next kernel. The next kernel now 
takes it and marks every page block that the scratch region spans as CMA:


https://lore.kernel.org/lkml/20240117144704.602-3-g...@amazon.com/

The CMA hint doesn't mean we create an actual CMA region. It mostly 
means that the kernel won't use this memory for any kernel allocations. 
Kernel allocations up to this point are allocations we don't need to 
pass on with KHO again. Kernel allocations past that point may be 
allocations that we want to pass, so we just never place them into the 
"scratch region" again.


And because we now already have a scratch region from the previous 
kernel, we keep reusing that forever with any new KHO kexec.




Staying at reusing the scratch area. One thing that is broken for sure is that
you reuse the scratch area without ever checking the kho_scratch parameter of
the 2nd kernel's command line. Remember, with kexec you are dealing with two
different kernels with two different command lines. Meaning you can only reuse
the scratch area if the requested size in the 2nd kernel is identical to the
one of the 1st kernel. In all other cases you need to adjust the scratch area's
size or reserve a new one.



Hm. So you're saying a user may want to change the size of the scratch 
area with a KHO kexec. That's insanely risky because you (as rightfully 
pointed out below) may have significant fragmentation at that point. And 
we will only know when we're in the new kernel so it's too late to 
abort. IMHO it's better to just declare the scratch region as immutable 
during KHO to avoid that pitfall.




This directly leads to the next problem. In kho_reserve_previous_mem you are
reusing the different memory regions wherever the 1st kernel allocated them.
But that also means you are handing over the 1st kernel's memory
fragmentation to the 2nd kernel and you do that extremely early during boot.
Which means that users who need to allocate large continuous physical memory,
like the scratch area or the crashkernel memory, will have increasing chance to
not find a suitable area. Which IMHO is unacceptable.



Correct :). It basically means you want to pass large allocations from 
the 1st kernel that you want to preserve on to the next. So if the 1st 
kernel allocated a large crash area, it's safest to pass that allocation 
using KHO to ensure the next kernel also has the region fully reserved. 
Otherwise the next kernel may accidentally place data into the 
previously reserved crash region (which would be contiguously free at 
early init of the 2nd kernel) and fragment it again.




Finally, and that's the big elephant in the room, is your lax handling of the
unstable kernel internal ABI. Remember, you are dealing with two different
kernels, that also means two different source levels and two different configs.
So only because both the 1st and 2nd kernel have a e.g. struct buffer_page
doesn't means that they have the same struct 

Re: [PATCH] kexec-tools: purgatory: fix build on `binutils-2.42`

2024-02-02 Thread Simon Horman
On Fri, Feb 02, 2024 at 12:16:38PM +0800, Baoquan He wrote:
> On 02/02/24 at 11:01am, Coiby Xu wrote:
> > Hi,
> > 
> > FYI, before this patch, Michel already sent "[PATCH] Fix building on x86_64
> > with binutils 2.41" to address the same issue. Currently I almost know
> > nothing about
> > assembly but Michel's patch seems to be more complete because two more
> > files are touched.
> 
> Thanks for telling, I didn't notice that one. I didn't know these
> either, leave them to Sergei and Michel to decide what is the
> appropriate solution.

Hi,

I did go ahead and apply Michael's patch.
Let's follow-up with incremental changes if necessary.

...

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[PATCH kexec-tools] workflow: update to use checkout@v4

2024-02-02 Thread Simon Horman
Update to use checkout@v4.

This addresses the following warning that appears in GitHub runs:

"Node.js 16 actions are deprecated. Please update the following actions
 to use Node.js 20: actions/checkout@v3. For more information
 see: 
https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.

Link: https://github.com/horms/kexec-tools/actions/runs/7753454923
Signed-off-by: Simon Horman 
---
 .github/workflows/build.yml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index d0007f14b274..46edde66e384 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -28,7 +28,7 @@ jobs:
 libxen: libxen
 steps:
 - name: Checkout
-  uses: actions/checkout@v3
+  uses: actions/checkout@v4
 
 - name: Set Environment
   env:


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] Fix building on x86_64 with binutils 2.41

2024-02-02 Thread Simon Horman
On Tue, Jan 30, 2024 at 04:14:31AM -0600, Michel Lind via B4 Relay wrote:
> From: Michel Lind 
> 
> Newer versions of the GNU assembler (observed with binutils 2.41) will
> complain about the ".arch i386" in files assembled with "as --64",
> with the message "Error: 64bit mode not supported on 'i386'".
> 
> Fix by moving ".arch i386" below the relevant ".code32" directive, so
> that the assembler is no longer expecting 64-bit instructions to be used
> by the time that the ".arch i386" directive is encountered.
> 
> Based on similar iPXE fix:
> https://github.com/ipxe/ipxe/commit/6ca597eee
> 
> Signed-off-by: Michel Lind 

Thanks Michael,

applied.

- Fix building on x86_64 with binutils 2.41
  
https://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git/commit/?id=328de8e00e29

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec