Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-19 Thread Konrad Rzeszutek Wilk
On Sun, Nov 11, 2018 at 10:49:39AM -0800, H. Peter Anvin wrote:
> On 11/10/18 1:03 AM, Juergen Gross wrote:
> > 
> > How would that help? The garabge data written could have the correct
> > terminal sentinel value by chance.
> > 
> > That's why I re-used an existing field in setup_header (the version) to
> > let grub tell the kernel which part of setup_header was written by grub.
> > 
> > That's the only way I could find to let the kernel distinguish between
> > garbage and actual data.
> 
> There is plenty of space *before* the setup_header part of struct boot_params
> too -- look a the various __pad fields, especially (in your case), __pad3[16]
> and __pad4[116] would suit the bill just fine.
> 
> >> It would be enormously helpful if you could find out any more details about
> >> exactly what they are doing to break things.
> > 
> > That's easy:
> > 
> > The memory layout is:
> > 
> > 0x1f1 bytes of data, including the sentinel, the setup_header, and then
> > more data.
> > 
> > grub did read the kernel's setup_header in the correct size into its
> > buffer (which contains random garbage before that), intializes the first
> > 0x1f1 including the sentinel byte, and then writes back the buffer, but
> > using a too large length for that.
> 
> Are you kidding me... it really overwrites it with completely random data, and
> not simply overspilling contents of the file?
> 
> In that case it might not be possible (or desirable) to use those N bytes
> following the setup_heaader, or we need to a bigger sentinel than one byte
> (probability being what it is, 256^n gets to be a pretty big number for any n,
> very quickly drowning in the noise compared to other potential sources of boot
> failures, and most likely less fatal than most.)
> 
> How big is this garbage dump?  I'm going to brave a guess it is 512 bytes.  In
> that case this is hardly a big deal: the E820 map begins at 0x290, and the
> setup_header maximum goes to 0x280, so it is only 15 bytes lost.  If it is
> worse than that, we would risk losing __pad8[48] and __pad9[276], and
> especially the latter would be painful. In those case perhaps we should use
> 0x281..0x290 as a 15-byte sentinel; that is going to be virtually foolproof.
> 
> I'm also thinking that it might be desirable to add a flags field (__pad2
> would be ideal) to struct boot_params; it would let us recycle some of the
> obsolete fields (hd0_info, hd1_info, sys_desc_table, olpc_ofw_header, ...) and
> perhaps be able to add some more robustness against these sort of things. This
> would be the right way to do what your version feedback mechanism would do.
> 
> The reason why the feedback mechanism is fundamentally broken is that it only
> gives the boot loader a way to assert that it supports a certain version of
> the protocol, but it doesn't say *which* bootloader does such an assert -- and
> therefore it is still wide open to implementation error.
> 
> We do, in fact, already have a feedback mechanism: the bootloader ID and
> bootloader version. One way we could deal with this problem is to bump the
> bootloader version reported by Grub, and add a quirk to the kernel that if the
> bootloader ID is Grub (7) and the version is less than a certain number, zero
> those fields. In fact, the more I think about it, this is what we should do.
> 
> That being said, Grub really needs to be kept honest.  They keep making both
> severe design (like refusing to use the BIOS and UEFI entry points provided by
> the kernel by default) and implementation errors, apparently without
> meaningful oversight. I really ask that the people more closely involved with
> Grub try to keep a closer eye on their code as it applies to Linux.

Cc-ing GRUB and Daniel Kiper (maintainer of GRUB).

Could folks please please CC Daniel Kiper on any of these patches in the future?

Thanks.
> 
>   -hpa


Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-19 Thread Konrad Rzeszutek Wilk
On Sun, Nov 11, 2018 at 10:49:39AM -0800, H. Peter Anvin wrote:
> On 11/10/18 1:03 AM, Juergen Gross wrote:
> > 
> > How would that help? The garabge data written could have the correct
> > terminal sentinel value by chance.
> > 
> > That's why I re-used an existing field in setup_header (the version) to
> > let grub tell the kernel which part of setup_header was written by grub.
> > 
> > That's the only way I could find to let the kernel distinguish between
> > garbage and actual data.
> 
> There is plenty of space *before* the setup_header part of struct boot_params
> too -- look a the various __pad fields, especially (in your case), __pad3[16]
> and __pad4[116] would suit the bill just fine.
> 
> >> It would be enormously helpful if you could find out any more details about
> >> exactly what they are doing to break things.
> > 
> > That's easy:
> > 
> > The memory layout is:
> > 
> > 0x1f1 bytes of data, including the sentinel, the setup_header, and then
> > more data.
> > 
> > grub did read the kernel's setup_header in the correct size into its
> > buffer (which contains random garbage before that), intializes the first
> > 0x1f1 including the sentinel byte, and then writes back the buffer, but
> > using a too large length for that.
> 
> Are you kidding me... it really overwrites it with completely random data, and
> not simply overspilling contents of the file?
> 
> In that case it might not be possible (or desirable) to use those N bytes
> following the setup_heaader, or we need to a bigger sentinel than one byte
> (probability being what it is, 256^n gets to be a pretty big number for any n,
> very quickly drowning in the noise compared to other potential sources of boot
> failures, and most likely less fatal than most.)
> 
> How big is this garbage dump?  I'm going to brave a guess it is 512 bytes.  In
> that case this is hardly a big deal: the E820 map begins at 0x290, and the
> setup_header maximum goes to 0x280, so it is only 15 bytes lost.  If it is
> worse than that, we would risk losing __pad8[48] and __pad9[276], and
> especially the latter would be painful. In those case perhaps we should use
> 0x281..0x290 as a 15-byte sentinel; that is going to be virtually foolproof.
> 
> I'm also thinking that it might be desirable to add a flags field (__pad2
> would be ideal) to struct boot_params; it would let us recycle some of the
> obsolete fields (hd0_info, hd1_info, sys_desc_table, olpc_ofw_header, ...) and
> perhaps be able to add some more robustness against these sort of things. This
> would be the right way to do what your version feedback mechanism would do.
> 
> The reason why the feedback mechanism is fundamentally broken is that it only
> gives the boot loader a way to assert that it supports a certain version of
> the protocol, but it doesn't say *which* bootloader does such an assert -- and
> therefore it is still wide open to implementation error.
> 
> We do, in fact, already have a feedback mechanism: the bootloader ID and
> bootloader version. One way we could deal with this problem is to bump the
> bootloader version reported by Grub, and add a quirk to the kernel that if the
> bootloader ID is Grub (7) and the version is less than a certain number, zero
> those fields. In fact, the more I think about it, this is what we should do.
> 
> That being said, Grub really needs to be kept honest.  They keep making both
> severe design (like refusing to use the BIOS and UEFI entry points provided by
> the kernel by default) and implementation errors, apparently without
> meaningful oversight. I really ask that the people more closely involved with
> Grub try to keep a closer eye on their code as it applies to Linux.

Cc-ing GRUB and Daniel Kiper (maintainer of GRUB).

Could folks please please CC Daniel Kiper on any of these patches in the future?

Thanks.
> 
>   -hpa


Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-11 Thread hpa
On November 10, 2018 7:22:29 AM PST, Juergen Gross  wrote:
>On 09/11/2018 23:23, H. Peter Anvin wrote:
>> I just noticed this patch -- I missed it because the cover message
>> seemed far more harmless so I didn't notice this change.
>> 
>> THIS PATCH IS FATALLY WRONG AND NEEDS TO BE IMMEDIATELY REVERTED
>BEFORE
>> ANYONE STARTS RELYING ON IT; IT HAS THE POTENTIAL OF BREAKING THE
>> BOOTLOADER PROTOCOL FOR ALL FUTURE.
>> 
>> It seems to be based on fundamental misconceptions about the various
>> data structures in the protocol, and does so in a way that completely
>> breaks the way the protocol is designed to work.
>> 
>> The protocol is specifically designed such that fields are not
>version
>> dependencies. The version number is strictly to inform the boot
>loader
>> about which capabilities the kernel has, so that the boot loader can
>> know if a certain data field is meaningful and/or honored.
>> 
>>> +Protocol 2.14: (Kernel 4.20) Added acpi_rsdp_addr holding the
>physical
>>> +   address of the ACPI RSDP table.
>>> +   The bootloader updates version with:
>>> +   0x8000 | min(kernel-version, bootloader-version)
>>> +   kernel-version being the protocol version supported by
>>> +   the kernel and bootloader-version the protocol version
>>> +   supported by the bootloader.
>> 
>> [...]
>> 
>>>   MEMORY LAYOUT
>>>
>>>  The traditional memory map for the kernel loader, used for Image or
>>> @@ -197,6 +209,7 @@ Offset  Proto   NameMeaning
>>>  0258/8 2.10+   pref_addressPreferred loading address
>>>  0260/4 2.10+   init_size   Linear memory required during 
>>> initialization
>>>  0264/4 2.11+   handover_offset Offset of handover entry point
>>> +0268/8 2.14+   acpi_rsdp_addr  Physical address of RSDP table
>> 
>> NO.
>> 
>> That is not how struct setup_header works, nor does this belong here.
>> 
>> struct setup_header contains *initialized data*, and has a length
>byte
>> at offset 0x201.  The bootloader is responsible for copying the full
>> structure into the appropriate offset (0x1f1) in struct boot_params.
>> 
>> The length byte isn't actually a requirement, since the maximum
>possible
>> size of this structure is 144 bytes, and the kernel will (obviously)
>not
>> look at the older fields anyway, but it is good practice. The kernel
>or
>> any other entity is free to zero out the bytes past this length
>pointer.
>> 
>> There are only 24 bytes left in this structure, and this would occupy
>8
>> of them for no valid reason.  The *only* valid reason to put a
>> zero-initialized field in struct setup_header is if it used by the
>> 16-bit legacy BIOS boot, which is obviously not the case here.
>> 
>> This field thus belongs in struct boot_params, not struct
>setup_header.
>
>Would you be okay with putting acpi_rsdp_addr at offset 0x0cc (_pad4)?
>
>
>Juergen

I'd prefer if you used __pad3 offset 0x70 to keep the large block, and that way 
your field is also aligned.

However, if you have some specific reason to prefer __pad4 it's no big deal, 
although I'm curious what it would be.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-11 Thread hpa
On November 10, 2018 7:22:29 AM PST, Juergen Gross  wrote:
>On 09/11/2018 23:23, H. Peter Anvin wrote:
>> I just noticed this patch -- I missed it because the cover message
>> seemed far more harmless so I didn't notice this change.
>> 
>> THIS PATCH IS FATALLY WRONG AND NEEDS TO BE IMMEDIATELY REVERTED
>BEFORE
>> ANYONE STARTS RELYING ON IT; IT HAS THE POTENTIAL OF BREAKING THE
>> BOOTLOADER PROTOCOL FOR ALL FUTURE.
>> 
>> It seems to be based on fundamental misconceptions about the various
>> data structures in the protocol, and does so in a way that completely
>> breaks the way the protocol is designed to work.
>> 
>> The protocol is specifically designed such that fields are not
>version
>> dependencies. The version number is strictly to inform the boot
>loader
>> about which capabilities the kernel has, so that the boot loader can
>> know if a certain data field is meaningful and/or honored.
>> 
>>> +Protocol 2.14: (Kernel 4.20) Added acpi_rsdp_addr holding the
>physical
>>> +   address of the ACPI RSDP table.
>>> +   The bootloader updates version with:
>>> +   0x8000 | min(kernel-version, bootloader-version)
>>> +   kernel-version being the protocol version supported by
>>> +   the kernel and bootloader-version the protocol version
>>> +   supported by the bootloader.
>> 
>> [...]
>> 
>>>   MEMORY LAYOUT
>>>
>>>  The traditional memory map for the kernel loader, used for Image or
>>> @@ -197,6 +209,7 @@ Offset  Proto   NameMeaning
>>>  0258/8 2.10+   pref_addressPreferred loading address
>>>  0260/4 2.10+   init_size   Linear memory required during 
>>> initialization
>>>  0264/4 2.11+   handover_offset Offset of handover entry point
>>> +0268/8 2.14+   acpi_rsdp_addr  Physical address of RSDP table
>> 
>> NO.
>> 
>> That is not how struct setup_header works, nor does this belong here.
>> 
>> struct setup_header contains *initialized data*, and has a length
>byte
>> at offset 0x201.  The bootloader is responsible for copying the full
>> structure into the appropriate offset (0x1f1) in struct boot_params.
>> 
>> The length byte isn't actually a requirement, since the maximum
>possible
>> size of this structure is 144 bytes, and the kernel will (obviously)
>not
>> look at the older fields anyway, but it is good practice. The kernel
>or
>> any other entity is free to zero out the bytes past this length
>pointer.
>> 
>> There are only 24 bytes left in this structure, and this would occupy
>8
>> of them for no valid reason.  The *only* valid reason to put a
>> zero-initialized field in struct setup_header is if it used by the
>> 16-bit legacy BIOS boot, which is obviously not the case here.
>> 
>> This field thus belongs in struct boot_params, not struct
>setup_header.
>
>Would you be okay with putting acpi_rsdp_addr at offset 0x0cc (_pad4)?
>
>
>Juergen

I'd prefer if you used __pad3 offset 0x70 to keep the large block, and that way 
your field is also aligned.

However, if you have some specific reason to prefer __pad4 it's no big deal, 
although I'm curious what it would be.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-11 Thread H. Peter Anvin
On 11/10/18 1:03 AM, Juergen Gross wrote:
> 
> How would that help? The garabge data written could have the correct
> terminal sentinel value by chance.
> 
> That's why I re-used an existing field in setup_header (the version) to
> let grub tell the kernel which part of setup_header was written by grub.
> 
> That's the only way I could find to let the kernel distinguish between
> garbage and actual data.

There is plenty of space *before* the setup_header part of struct boot_params
too -- look a the various __pad fields, especially (in your case), __pad3[16]
and __pad4[116] would suit the bill just fine.

>> It would be enormously helpful if you could find out any more details about
>> exactly what they are doing to break things.
> 
> That's easy:
> 
> The memory layout is:
> 
> 0x1f1 bytes of data, including the sentinel, the setup_header, and then
> more data.
> 
> grub did read the kernel's setup_header in the correct size into its
> buffer (which contains random garbage before that), intializes the first
> 0x1f1 including the sentinel byte, and then writes back the buffer, but
> using a too large length for that.

Are you kidding me... it really overwrites it with completely random data, and
not simply overspilling contents of the file?

In that case it might not be possible (or desirable) to use those N bytes
following the setup_heaader, or we need to a bigger sentinel than one byte
(probability being what it is, 256^n gets to be a pretty big number for any n,
very quickly drowning in the noise compared to other potential sources of boot
failures, and most likely less fatal than most.)

How big is this garbage dump?  I'm going to brave a guess it is 512 bytes.  In
that case this is hardly a big deal: the E820 map begins at 0x290, and the
setup_header maximum goes to 0x280, so it is only 15 bytes lost.  If it is
worse than that, we would risk losing __pad8[48] and __pad9[276], and
especially the latter would be painful. In those case perhaps we should use
0x281..0x290 as a 15-byte sentinel; that is going to be virtually foolproof.

I'm also thinking that it might be desirable to add a flags field (__pad2
would be ideal) to struct boot_params; it would let us recycle some of the
obsolete fields (hd0_info, hd1_info, sys_desc_table, olpc_ofw_header, ...) and
perhaps be able to add some more robustness against these sort of things. This
would be the right way to do what your version feedback mechanism would do.

The reason why the feedback mechanism is fundamentally broken is that it only
gives the boot loader a way to assert that it supports a certain version of
the protocol, but it doesn't say *which* bootloader does such an assert -- and
therefore it is still wide open to implementation error.

We do, in fact, already have a feedback mechanism: the bootloader ID and
bootloader version. One way we could deal with this problem is to bump the
bootloader version reported by Grub, and add a quirk to the kernel that if the
bootloader ID is Grub (7) and the version is less than a certain number, zero
those fields. In fact, the more I think about it, this is what we should do.

That being said, Grub really needs to be kept honest.  They keep making both
severe design (like refusing to use the BIOS and UEFI entry points provided by
the kernel by default) and implementation errors, apparently without
meaningful oversight. I really ask that the people more closely involved with
Grub try to keep a closer eye on their code as it applies to Linux.

-hpa


Re: PLEASE REVERT URGENTLY: Re: [PATCH v5 2/3] x86/boot: add acpi rsdp address to setup_header

2018-11-11 Thread H. Peter Anvin
On 11/10/18 1:03 AM, Juergen Gross wrote:
> 
> How would that help? The garabge data written could have the correct
> terminal sentinel value by chance.
> 
> That's why I re-used an existing field in setup_header (the version) to
> let grub tell the kernel which part of setup_header was written by grub.
> 
> That's the only way I could find to let the kernel distinguish between
> garbage and actual data.

There is plenty of space *before* the setup_header part of struct boot_params
too -- look a the various __pad fields, especially (in your case), __pad3[16]
and __pad4[116] would suit the bill just fine.

>> It would be enormously helpful if you could find out any more details about
>> exactly what they are doing to break things.
> 
> That's easy:
> 
> The memory layout is:
> 
> 0x1f1 bytes of data, including the sentinel, the setup_header, and then
> more data.
> 
> grub did read the kernel's setup_header in the correct size into its
> buffer (which contains random garbage before that), intializes the first
> 0x1f1 including the sentinel byte, and then writes back the buffer, but
> using a too large length for that.

Are you kidding me... it really overwrites it with completely random data, and
not simply overspilling contents of the file?

In that case it might not be possible (or desirable) to use those N bytes
following the setup_heaader, or we need to a bigger sentinel than one byte
(probability being what it is, 256^n gets to be a pretty big number for any n,
very quickly drowning in the noise compared to other potential sources of boot
failures, and most likely less fatal than most.)

How big is this garbage dump?  I'm going to brave a guess it is 512 bytes.  In
that case this is hardly a big deal: the E820 map begins at 0x290, and the
setup_header maximum goes to 0x280, so it is only 15 bytes lost.  If it is
worse than that, we would risk losing __pad8[48] and __pad9[276], and
especially the latter would be painful. In those case perhaps we should use
0x281..0x290 as a 15-byte sentinel; that is going to be virtually foolproof.

I'm also thinking that it might be desirable to add a flags field (__pad2
would be ideal) to struct boot_params; it would let us recycle some of the
obsolete fields (hd0_info, hd1_info, sys_desc_table, olpc_ofw_header, ...) and
perhaps be able to add some more robustness against these sort of things. This
would be the right way to do what your version feedback mechanism would do.

The reason why the feedback mechanism is fundamentally broken is that it only
gives the boot loader a way to assert that it supports a certain version of
the protocol, but it doesn't say *which* bootloader does such an assert -- and
therefore it is still wide open to implementation error.

We do, in fact, already have a feedback mechanism: the bootloader ID and
bootloader version. One way we could deal with this problem is to bump the
bootloader version reported by Grub, and add a quirk to the kernel that if the
bootloader ID is Grub (7) and the version is less than a certain number, zero
those fields. In fact, the more I think about it, this is what we should do.

That being said, Grub really needs to be kept honest.  They keep making both
severe design (like refusing to use the BIOS and UEFI entry points provided by
the kernel by default) and implementation errors, apparently without
meaningful oversight. I really ask that the people more closely involved with
Grub try to keep a closer eye on their code as it applies to Linux.

-hpa