subject:"Re\: \[PULL 10\/56\] x86\: don't let decompressed kernel image clobber setup

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread H. Peter Anvin

On January 31, 2023 1:22:43 PM PST, "Jason A. Donenfeld"  
wrote:
>On Tue, Jan 31, 2023, 15:55 H. Peter Anvin  wrote:
>
>> On January 30, 2023 12:19:14 PM PST, "Michael S. Tsirkin" 
>> wrote:
>> >From: "Jason A. Donenfeld" 
>> >
>> >The setup_data links are appended to the compressed kernel image. Since
>> >the kernel image is typically loaded at 0x10, setup_data lives at
>> >`0x10 + compressed_size`, which does not get relocated during the
>> >kernel's boot process.
>> >
>> >The kernel typically decompresses the image starting at address
>> >0x100 (note: there's one more zero there than the compressed image
>> >above). This usually is fine for most kernels.
>> >
>> >However, if the compressed image is actually quite large, then
>> >setup_data will live at a `0x10 + compressed_size` that extends into
>> >the decompressed zone at 0x100. In other words, if compressed_size
>> >is larger than `0x100 - 0x10`, then the decompression step will
>> >clobber setup_data, resulting in crashes.
>> >
>> >Visually, what happens now is that QEMU appends setup_data to the kernel
>> >image:
>> >
>> >  kernel imagesetup_data
>> >   |--|||
>> >0x10  0x10+l1 0x10+l1+l2
>> >
>> >The problem is that this decompresses to 0x100 (one more zero). So
>> >if l1 is > (0x100-0x10), then this winds up looking like:
>> >
>> >  kernel imagesetup_data
>> >   |--|||
>> >0x10  0x10+l1 0x10+l1+l2
>> >
>> > d e c o m p r e s s e d   k e r n e l
>> >
>>  |-|
>> >0x100
>>  0x100+l3
>> >
>> >The decompressed kernel seemingly overwriting the compressed kernel
>> >image isn't a problem, because that gets relocated to a higher address
>> >early on in the boot process, at the end of startup_64. setup_data,
>> >however, stays in the same place, since those links are self referential
>> >and nothing fixes them up.  So the decompressed kernel clobbers it.
>> >
>> >Fix this by appending setup_data to the cmdline blob rather than the
>> >kernel image blob, which remains at a lower address that won't get
>> >clobbered.
>> >
>> >This could have been done by overwriting the initrd blob instead, but
>> >that poses big difficulties, such as no longer being able to use memory
>> >mapped files for initrd, hurting performance, and, more importantly, the
>> >initrd address calculation is hard coded in qboot, and it always grows
>> >down rather than up, which means lots of brittle semantics would have to
>> >be changed around, incurring more complexity. In contrast, using cmdline
>> >is simple and doesn't interfere with anything.
>> >
>> >The microvm machine has a gross hack where it fiddles with fw_cfg data
>> >after the fact. So this hack is updated to account for this appending,
>> >by reserving some bytes.
>> >
>> >Fixup-by: Michael S. Tsirkin 
>> >Cc: x...@kernel.org
>> >Cc: Philippe Mathieu-Daudé 
>> >Cc: H. Peter Anvin 
>> >Cc: Borislav Petkov 
>> >Cc: Eric Biggers 
>> >Signed-off-by: Jason A. Donenfeld 
>> >Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
>> >Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
>> >Reviewed-by: Michael S. Tsirkin 
>> >Signed-off-by: Michael S. Tsirkin 
>> >Tested-by: Eric Biggers 
>> >Tested-by: Mathias Krause 
>> >---
>> > include/hw/i386/microvm.h |  5 ++--
>> > include/hw/nvram/fw_cfg.h |  9 +++
>> > hw/i386/microvm.c | 15 +++
>> > hw/i386/x86.c | 52 +--
>> > hw/nvram/fw_cfg.c |  9 +++
>> > 5 files changed, 59 insertions(+), 31 deletions(-)
>> >
>> >diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
>> >index fad97a891d..e8af61f194 100644
>> >--- a/include/hw/i386/microvm.h
>> >+++ b/include/hw/i386/microvm.h
>> >@@ -50,8 +50,9 @@
>> >  */
>> >
>> > /* Platform virtio definitions */
>> >-#define VIRTIO_MMIO_BASE  0xfeb0
>> >-#define VIRTIO_CMDLINE_MAXLEN 64
>> >+#define VIRTIO_MMIO_BASE0xfeb0
>> >+#define VIRTIO_CMDLINE_MAXLEN   64
>> >+#define VIRTIO_CMDLINE_TOTAL_MAX_LEN((VIRTIO_CMDLINE_MAXLEN + 1) *
>> 16)
>> >
>> > #define GED_MMIO_BASE 0xfea0
>> > #define GED_MMIO_BASE_MEMHP   (GED_MMIO_BASE + 0x100)
>> >diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
>> >index 2e503904dc..990dcdbb2e 100644
>> >--- a/include/hw/nvram/fw_cfg.h
>> >+++ b/include/hw/nvram/fw_cfg.h
>> >@@ -139,6 +139,15 @@ void fw_cfg_add_bytes_callback(FWCfgState *s,
>> uint16_t key,
>> >void *data, size_t len,
>> >bool read_only);
>> >
>> >+/**
>> >+ * fw_cfg_read_bytes_ptr:
>> >+ * @s: fw_cfg device being modified
>> >+ * @key: selector key value for new fw_cfg item
>> >+ *

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread Jason A. Donenfeld

On Mon, Jan 30, 2023 at 03:19:59PM -0500, Michael S. Tsirkin wrote:
> From: "Jason A. Donenfeld" 
> 
> The setup_data links are appended to the compressed kernel image. Since
> the kernel image is typically loaded at 0x10, setup_data lives at
> `0x10 + compressed_size`, which does not get relocated during the
> kernel's boot process.
> 
> The kernel typically decompresses the image starting at address
> 0x100 (note: there's one more zero there than the compressed image
> above). This usually is fine for most kernels.
> 
> However, if the compressed image is actually quite large, then
> setup_data will live at a `0x10 + compressed_size` that extends into
> the decompressed zone at 0x100. In other words, if compressed_size
> is larger than `0x100 - 0x10`, then the decompression step will
> clobber setup_data, resulting in crashes.
> 
> Visually, what happens now is that QEMU appends setup_data to the kernel
> image:
> 
>   kernel imagesetup_data
>|--|||
> 0x10  0x10+l1 0x10+l1+l2
> 
> The problem is that this decompresses to 0x100 (one more zero). So
> if l1 is > (0x100-0x10), then this winds up looking like:
> 
>   kernel imagesetup_data
>|--|||
> 0x10  0x10+l1 0x10+l1+l2
> 
>  d e c o m p r e s s e d   k e r n e l
>  
> |-|
> 0x100 
> 0x100+l3
> 
> The decompressed kernel seemingly overwriting the compressed kernel
> image isn't a problem, because that gets relocated to a higher address
> early on in the boot process, at the end of startup_64. setup_data,
> however, stays in the same place, since those links are self referential
> and nothing fixes them up.  So the decompressed kernel clobbers it.
> 
> Fix this by appending setup_data to the cmdline blob rather than the
> kernel image blob, which remains at a lower address that won't get
> clobbered.
> 
> This could have been done by overwriting the initrd blob instead, but
> that poses big difficulties, such as no longer being able to use memory
> mapped files for initrd, hurting performance, and, more importantly, the
> initrd address calculation is hard coded in qboot, and it always grows
> down rather than up, which means lots of brittle semantics would have to
> be changed around, incurring more complexity. In contrast, using cmdline
> is simple and doesn't interfere with anything.
> 
> The microvm machine has a gross hack where it fiddles with fw_cfg data
> after the fact. So this hack is updated to account for this appending,
> by reserving some bytes.
> 
> Fixup-by: Michael S. Tsirkin 
> Cc: x...@kernel.org
> Cc: Philippe Mathieu-Daudé 
> Cc: H. Peter Anvin 
> Cc: Borislav Petkov 
> Cc: Eric Biggers 
> Signed-off-by: Jason A. Donenfeld 
> Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
> Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Tested-by: Eric Biggers 
> Tested-by: Mathias Krause 
> ---
>  include/hw/i386/microvm.h |  5 ++--
>  include/hw/nvram/fw_cfg.h |  9 +++
>  hw/i386/microvm.c | 15 +++
>  hw/i386/x86.c | 52 +--
>  hw/nvram/fw_cfg.c |  9 +++
>  5 files changed, 59 insertions(+), 31 deletions(-)

Cc: qemu-sta...@nongnu.org
Fixes: 67f7e426 ("hw/i386: pass RNG seed via setup_data entry")

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread Michael S. Tsirkin

On Tue, Jan 31, 2023 at 08:39:08PM +0100, Jason A. Donenfeld wrote:
> On Mon, Jan 30, 2023 at 03:19:59PM -0500, Michael S. Tsirkin wrote:
> > From: "Jason A. Donenfeld" 
> > 
> > The setup_data links are appended to the compressed kernel image. Since
> > the kernel image is typically loaded at 0x10, setup_data lives at
> > `0x10 + compressed_size`, which does not get relocated during the
> > kernel's boot process.
> > 
> > The kernel typically decompresses the image starting at address
> > 0x100 (note: there's one more zero there than the compressed image
> > above). This usually is fine for most kernels.
> > 
> > However, if the compressed image is actually quite large, then
> > setup_data will live at a `0x10 + compressed_size` that extends into
> > the decompressed zone at 0x100. In other words, if compressed_size
> > is larger than `0x100 - 0x10`, then the decompression step will
> > clobber setup_data, resulting in crashes.
> > 
> > Visually, what happens now is that QEMU appends setup_data to the kernel
> > image:
> > 
> >   kernel imagesetup_data
> >|--|||
> > 0x10  0x10+l1 0x10+l1+l2
> > 
> > The problem is that this decompresses to 0x100 (one more zero). So
> > if l1 is > (0x100-0x10), then this winds up looking like:
> > 
> >   kernel imagesetup_data
> >|--|||
> > 0x10  0x10+l1 0x10+l1+l2
> > 
> >  d e c o m p r e s s e d   k e r n e l
> >  
> > |-|
> > 0x100   
> >   0x100+l3
> > 
> > The decompressed kernel seemingly overwriting the compressed kernel
> > image isn't a problem, because that gets relocated to a higher address
> > early on in the boot process, at the end of startup_64. setup_data,
> > however, stays in the same place, since those links are self referential
> > and nothing fixes them up.  So the decompressed kernel clobbers it.
> > 
> > Fix this by appending setup_data to the cmdline blob rather than the
> > kernel image blob, which remains at a lower address that won't get
> > clobbered.
> > 
> > This could have been done by overwriting the initrd blob instead, but
> > that poses big difficulties, such as no longer being able to use memory
> > mapped files for initrd, hurting performance, and, more importantly, the
> > initrd address calculation is hard coded in qboot, and it always grows
> > down rather than up, which means lots of brittle semantics would have to
> > be changed around, incurring more complexity. In contrast, using cmdline
> > is simple and doesn't interfere with anything.
> > 
> > The microvm machine has a gross hack where it fiddles with fw_cfg data
> > after the fact. So this hack is updated to account for this appending,
> > by reserving some bytes.
> > 
> > Fixup-by: Michael S. Tsirkin 
> > Cc: x...@kernel.org
> > Cc: Philippe Mathieu-Daudé 
> > Cc: H. Peter Anvin 
> > Cc: Borislav Petkov 
> > Cc: Eric Biggers 
> > Signed-off-by: Jason A. Donenfeld 
> > Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
> > Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
> > Reviewed-by: Michael S. Tsirkin 
> > Signed-off-by: Michael S. Tsirkin 
> > Tested-by: Eric Biggers 
> > Tested-by: Mathias Krause 
> 
> This one should wind up in the stable point release too. Dunno what the
> procedure for that is.
> 
> Jason

If you want that you need to include
Cc: qemu-sta...@nongnu.org
Fixes:  ("subject")

you can still reply to the original mail with this.

-- 
MST

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread Jason A. Donenfeld

On Tue, Jan 31, 2023, 15:55 H. Peter Anvin  wrote:

> On January 30, 2023 12:19:14 PM PST, "Michael S. Tsirkin" 
> wrote:
> >From: "Jason A. Donenfeld" 
> >
> >The setup_data links are appended to the compressed kernel image. Since
> >the kernel image is typically loaded at 0x10, setup_data lives at
> >`0x10 + compressed_size`, which does not get relocated during the
> >kernel's boot process.
> >
> >The kernel typically decompresses the image starting at address
> >0x100 (note: there's one more zero there than the compressed image
> >above). This usually is fine for most kernels.
> >
> >However, if the compressed image is actually quite large, then
> >setup_data will live at a `0x10 + compressed_size` that extends into
> >the decompressed zone at 0x100. In other words, if compressed_size
> >is larger than `0x100 - 0x10`, then the decompression step will
> >clobber setup_data, resulting in crashes.
> >
> >Visually, what happens now is that QEMU appends setup_data to the kernel
> >image:
> >
> >  kernel imagesetup_data
> >   |--|||
> >0x10  0x10+l1 0x10+l1+l2
> >
> >The problem is that this decompresses to 0x100 (one more zero). So
> >if l1 is > (0x100-0x10), then this winds up looking like:
> >
> >  kernel imagesetup_data
> >   |--|||
> >0x10  0x10+l1 0x10+l1+l2
> >
> > d e c o m p r e s s e d   k e r n e l
> >
>  |-|
> >0x100
>  0x100+l3
> >
> >The decompressed kernel seemingly overwriting the compressed kernel
> >image isn't a problem, because that gets relocated to a higher address
> >early on in the boot process, at the end of startup_64. setup_data,
> >however, stays in the same place, since those links are self referential
> >and nothing fixes them up.  So the decompressed kernel clobbers it.
> >
> >Fix this by appending setup_data to the cmdline blob rather than the
> >kernel image blob, which remains at a lower address that won't get
> >clobbered.
> >
> >This could have been done by overwriting the initrd blob instead, but
> >that poses big difficulties, such as no longer being able to use memory
> >mapped files for initrd, hurting performance, and, more importantly, the
> >initrd address calculation is hard coded in qboot, and it always grows
> >down rather than up, which means lots of brittle semantics would have to
> >be changed around, incurring more complexity. In contrast, using cmdline
> >is simple and doesn't interfere with anything.
> >
> >The microvm machine has a gross hack where it fiddles with fw_cfg data
> >after the fact. So this hack is updated to account for this appending,
> >by reserving some bytes.
> >
> >Fixup-by: Michael S. Tsirkin 
> >Cc: x...@kernel.org
> >Cc: Philippe Mathieu-Daudé 
> >Cc: H. Peter Anvin 
> >Cc: Borislav Petkov 
> >Cc: Eric Biggers 
> >Signed-off-by: Jason A. Donenfeld 
> >Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
> >Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
> >Reviewed-by: Michael S. Tsirkin 
> >Signed-off-by: Michael S. Tsirkin 
> >Tested-by: Eric Biggers 
> >Tested-by: Mathias Krause 
> >---
> > include/hw/i386/microvm.h |  5 ++--
> > include/hw/nvram/fw_cfg.h |  9 +++
> > hw/i386/microvm.c | 15 +++
> > hw/i386/x86.c | 52 +--
> > hw/nvram/fw_cfg.c |  9 +++
> > 5 files changed, 59 insertions(+), 31 deletions(-)
> >
> >diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
> >index fad97a891d..e8af61f194 100644
> >--- a/include/hw/i386/microvm.h
> >+++ b/include/hw/i386/microvm.h
> >@@ -50,8 +50,9 @@
> >  */
> >
> > /* Platform virtio definitions */
> >-#define VIRTIO_MMIO_BASE  0xfeb0
> >-#define VIRTIO_CMDLINE_MAXLEN 64
> >+#define VIRTIO_MMIO_BASE0xfeb0
> >+#define VIRTIO_CMDLINE_MAXLEN   64
> >+#define VIRTIO_CMDLINE_TOTAL_MAX_LEN((VIRTIO_CMDLINE_MAXLEN + 1) *
> 16)
> >
> > #define GED_MMIO_BASE 0xfea0
> > #define GED_MMIO_BASE_MEMHP   (GED_MMIO_BASE + 0x100)
> >diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
> >index 2e503904dc..990dcdbb2e 100644
> >--- a/include/hw/nvram/fw_cfg.h
> >+++ b/include/hw/nvram/fw_cfg.h
> >@@ -139,6 +139,15 @@ void fw_cfg_add_bytes_callback(FWCfgState *s,
> uint16_t key,
> >void *data, size_t len,
> >bool read_only);
> >
> >+/**
> >+ * fw_cfg_read_bytes_ptr:
> >+ * @s: fw_cfg device being modified
> >+ * @key: selector key value for new fw_cfg item
> >+ *
> >+ * Reads an existing fw_cfg data pointer.
> >+ */
> >+void *fw_cfg_read_bytes_ptr(FWCfgState *s, uint16_t key);
> >+
> > /**
> >  * fw_cfg_add_string:
> >  * @s: fw_cfg device

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread H. Peter Anvin

On January 30, 2023 12:19:14 PM PST, "Michael S. Tsirkin"  
wrote:
>From: "Jason A. Donenfeld" 
>
>The setup_data links are appended to the compressed kernel image. Since
>the kernel image is typically loaded at 0x10, setup_data lives at
>`0x10 + compressed_size`, which does not get relocated during the
>kernel's boot process.
>
>The kernel typically decompresses the image starting at address
>0x100 (note: there's one more zero there than the compressed image
>above). This usually is fine for most kernels.
>
>However, if the compressed image is actually quite large, then
>setup_data will live at a `0x10 + compressed_size` that extends into
>the decompressed zone at 0x100. In other words, if compressed_size
>is larger than `0x100 - 0x10`, then the decompression step will
>clobber setup_data, resulting in crashes.
>
>Visually, what happens now is that QEMU appends setup_data to the kernel
>image:
>
>  kernel imagesetup_data
>   |--|||
>0x10  0x10+l1 0x10+l1+l2
>
>The problem is that this decompresses to 0x100 (one more zero). So
>if l1 is > (0x100-0x10), then this winds up looking like:
>
>  kernel imagesetup_data
>   |--|||
>0x10  0x10+l1 0x10+l1+l2
>
> d e c o m p r e s s e d   k e r n e l
> 
> |-|
>0x100 
> 0x100+l3
>
>The decompressed kernel seemingly overwriting the compressed kernel
>image isn't a problem, because that gets relocated to a higher address
>early on in the boot process, at the end of startup_64. setup_data,
>however, stays in the same place, since those links are self referential
>and nothing fixes them up.  So the decompressed kernel clobbers it.
>
>Fix this by appending setup_data to the cmdline blob rather than the
>kernel image blob, which remains at a lower address that won't get
>clobbered.
>
>This could have been done by overwriting the initrd blob instead, but
>that poses big difficulties, such as no longer being able to use memory
>mapped files for initrd, hurting performance, and, more importantly, the
>initrd address calculation is hard coded in qboot, and it always grows
>down rather than up, which means lots of brittle semantics would have to
>be changed around, incurring more complexity. In contrast, using cmdline
>is simple and doesn't interfere with anything.
>
>The microvm machine has a gross hack where it fiddles with fw_cfg data
>after the fact. So this hack is updated to account for this appending,
>by reserving some bytes.
>
>Fixup-by: Michael S. Tsirkin 
>Cc: x...@kernel.org
>Cc: Philippe Mathieu-Daudé 
>Cc: H. Peter Anvin 
>Cc: Borislav Petkov 
>Cc: Eric Biggers 
>Signed-off-by: Jason A. Donenfeld 
>Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
>Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
>Reviewed-by: Michael S. Tsirkin 
>Signed-off-by: Michael S. Tsirkin 
>Tested-by: Eric Biggers 
>Tested-by: Mathias Krause 
>---
> include/hw/i386/microvm.h |  5 ++--
> include/hw/nvram/fw_cfg.h |  9 +++
> hw/i386/microvm.c | 15 +++
> hw/i386/x86.c | 52 +--
> hw/nvram/fw_cfg.c |  9 +++
> 5 files changed, 59 insertions(+), 31 deletions(-)
>
>diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
>index fad97a891d..e8af61f194 100644
>--- a/include/hw/i386/microvm.h
>+++ b/include/hw/i386/microvm.h
>@@ -50,8 +50,9 @@
>  */
> 
> /* Platform virtio definitions */
>-#define VIRTIO_MMIO_BASE  0xfeb0
>-#define VIRTIO_CMDLINE_MAXLEN 64
>+#define VIRTIO_MMIO_BASE0xfeb0
>+#define VIRTIO_CMDLINE_MAXLEN   64
>+#define VIRTIO_CMDLINE_TOTAL_MAX_LEN((VIRTIO_CMDLINE_MAXLEN + 1) * 16)
> 
> #define GED_MMIO_BASE 0xfea0
> #define GED_MMIO_BASE_MEMHP   (GED_MMIO_BASE + 0x100)
>diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
>index 2e503904dc..990dcdbb2e 100644
>--- a/include/hw/nvram/fw_cfg.h
>+++ b/include/hw/nvram/fw_cfg.h
>@@ -139,6 +139,15 @@ void fw_cfg_add_bytes_callback(FWCfgState *s, uint16_t 
>key,
>void *data, size_t len,
>bool read_only);
> 
>+/**
>+ * fw_cfg_read_bytes_ptr:
>+ * @s: fw_cfg device being modified
>+ * @key: selector key value for new fw_cfg item
>+ *
>+ * Reads an existing fw_cfg data pointer.
>+ */
>+void *fw_cfg_read_bytes_ptr(FWCfgState *s, uint16_t key);
>+
> /**
>  * fw_cfg_add_string:
>  * @s: fw_cfg device being modified
>diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
>index 170a331e3f..29f30dd6d3 100644
>--- a/hw/i386/microvm.c
>+++ b/hw/i386/microvm.c
>@@ -378,7 +378,8 @@ static void

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

2023-01-31 Thread Jason A. Donenfeld

On Mon, Jan 30, 2023 at 03:19:59PM -0500, Michael S. Tsirkin wrote:
> From: "Jason A. Donenfeld" 
> 
> The setup_data links are appended to the compressed kernel image. Since
> the kernel image is typically loaded at 0x10, setup_data lives at
> `0x10 + compressed_size`, which does not get relocated during the
> kernel's boot process.
> 
> The kernel typically decompresses the image starting at address
> 0x100 (note: there's one more zero there than the compressed image
> above). This usually is fine for most kernels.
> 
> However, if the compressed image is actually quite large, then
> setup_data will live at a `0x10 + compressed_size` that extends into
> the decompressed zone at 0x100. In other words, if compressed_size
> is larger than `0x100 - 0x10`, then the decompression step will
> clobber setup_data, resulting in crashes.
> 
> Visually, what happens now is that QEMU appends setup_data to the kernel
> image:
> 
>   kernel imagesetup_data
>|--|||
> 0x10  0x10+l1 0x10+l1+l2
> 
> The problem is that this decompresses to 0x100 (one more zero). So
> if l1 is > (0x100-0x10), then this winds up looking like:
> 
>   kernel imagesetup_data
>|--|||
> 0x10  0x10+l1 0x10+l1+l2
> 
>  d e c o m p r e s s e d   k e r n e l
>  
> |-|
> 0x100 
> 0x100+l3
> 
> The decompressed kernel seemingly overwriting the compressed kernel
> image isn't a problem, because that gets relocated to a higher address
> early on in the boot process, at the end of startup_64. setup_data,
> however, stays in the same place, since those links are self referential
> and nothing fixes them up.  So the decompressed kernel clobbers it.
> 
> Fix this by appending setup_data to the cmdline blob rather than the
> kernel image blob, which remains at a lower address that won't get
> clobbered.
> 
> This could have been done by overwriting the initrd blob instead, but
> that poses big difficulties, such as no longer being able to use memory
> mapped files for initrd, hurting performance, and, more importantly, the
> initrd address calculation is hard coded in qboot, and it always grows
> down rather than up, which means lots of brittle semantics would have to
> be changed around, incurring more complexity. In contrast, using cmdline
> is simple and doesn't interfere with anything.
> 
> The microvm machine has a gross hack where it fiddles with fw_cfg data
> after the fact. So this hack is updated to account for this appending,
> by reserving some bytes.
> 
> Fixup-by: Michael S. Tsirkin 
> Cc: x...@kernel.org
> Cc: Philippe Mathieu-Daudé 
> Cc: H. Peter Anvin 
> Cc: Borislav Petkov 
> Cc: Eric Biggers 
> Signed-off-by: Jason A. Donenfeld 
> Message-Id: <20221230220725.618763-1-ja...@zx2c4.com>
> Message-ID: <20230128061015-mutt-send-email-...@kernel.org>
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Michael S. Tsirkin 
> Tested-by: Eric Biggers 
> Tested-by: Mathias Krause 

This one should wind up in the stable point release too. Dunno what the
procedure for that is.

Jason

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

Re: [PULL 10/56] x86: don't let decompressed kernel image clobber setup_data

6 matches

Site Navigation

Mail list logo

Footer information