Hi,

Read this message in a fixed width text editor with a lot of columns.

On Wed, Dec 28, 2022 at 03:58:12PM -0800, H. Peter Anvin wrote:
> Glad you asked.
>
> So the kernel load addresses are parameterized in the kernel image
> setup header. One of the things that are so parameterized are the size
> and possible realignment of the kernel image in memory.
>
> I'm very confused where you are getting the 64 MB number from. There
> should not be any such limitation.

Currently, QEMU appends it to the kernel image, not to the initramfs as
you suggest below. So, that winds up looking, currently, like:

          kernel image            setup_data
   |--------------------------||----------------|
0x100000                  0x100000+l1     0x100000+l1+l2

The problem is that this decompresses to 0x1000000 (one more zero). So
if l1 is > (0x1000000-0x100000), then this winds up looking like:

          kernel image            setup_data
   |--------------------------||----------------|
0x100000                  0x100000+l1     0x100000+l1+l2

                                 d e c o m p r e s s e d   k e r n e l
                     |-------------------------------------------------------------|
                0x1000000                                                     0x1000000+l3

The decompressed kernel seemingly overwriting the compressed kernel
image isn't a problem, because that gets relocated to a higher address
early on in the boot process. setup_data, however, stays in the same
place, since those links are self referential and nothing fixes them up.
So the decompressed kernel clobbers it.

The solution in this commit adds a bunch of padding between the kernel
image and setup_data to avoid this. That looks like this:

          kernel image                            padding                               setup_data
   |--------------------------||---------------------------------------------------||----------------|
0x100000                  0x100000+l1                                         0x1000000+l3  0x1000000+l3+l2

                                 d e c o m p r e s s e d   k e r n e l
                     |-------------------------------------------------------------|
                0x1000000                                                     0x1000000+l3

This way, the decompressed kernel doesn't clobber setup_data.

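For concreteness, the overlap check and the padding computation in the
diagrams above can be sketched in C roughly as follows. This is only an
illustration under stated assumptions, not the code from the commit: the
macro and function names are made up here, l1, l2 and l3 are the lengths
from the diagrams, and l2 and l3 are assumed to be nonzero.

```c
#include <assert.h>
#include <stdint.h>

/* Addresses taken from the diagrams; the macro names are hypothetical. */
#define KERNEL_LOAD_ADDR 0x100000ULL   /* compressed kernel image is loaded here */
#define DECOMPRESS_ADDR  0x1000000ULL  /* kernel decompresses to here */

/* Do the half-open ranges [a, a+alen) and [b, b+blen) intersect? */
int ranges_overlap(uint64_t a, uint64_t alen, uint64_t b, uint64_t blen)
{
    return a < b + blen && b < a + alen;
}

/* Would setup_data of length l2, placed at addr, be overwritten by a
 * decompressed kernel of length l3? */
int setup_data_clobbered(uint64_t addr, uint64_t l2, uint64_t l3)
{
    return ranges_overlap(addr, l2, DECOMPRESS_ADDR, l3);
}

/* Padding to insert after a kernel image of length l1 so that setup_data
 * starts at 0x1000000+l3 instead of 0x100000+l1. Returns 0 when the
 * unpadded placement is already clear of the decompressed kernel. */
uint64_t setup_data_padding(uint64_t l1, uint64_t l2, uint64_t l3)
{
    uint64_t image_end;

    assert(l1 <= UINT64_MAX - KERNEL_LOAD_ADDR);  /* guard against wraparound */
    image_end = KERNEL_LOAD_ADDR + l1;

    if (!setup_data_clobbered(image_end, l2, l3))
        return 0;

    /* An overlap implies image_end < DECOMPRESS_ADDR + l3, so this
     * subtraction cannot underflow. */
    return DECOMPRESS_ADDR + l3 - image_end;
}
```

With made-up sizes of l1 = 20 MiB, l2 = 4 KiB and l3 = 40 MiB, the
unpadded setup_data would sit at 0x1500000, inside the decompressed
kernel, and the padding comes out to 0x2300000 bytes, which moves
setup_data to 0x3800000, i.e. 0x1000000+l3.
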
The problem is that if 0x1000000+l3-0x100000 is around 62 megabytes,
then the bootloader crashes when trying to dereference setup_data's
->len param at the end of initialize_identity_maps() in ident_map_64.c.
I don't know why it does this. If I could remove the 62 megabyte
restriction, then I could keep with this technique and all would be
well.

> In general, setup_data should be able to go anywhere the initrd can
> go, and so is subject to the same address cap (896 MB for old kernels,
> 4 GB on newer ones; this address too is enumerated in the header.)

It would be theoretically possible to attach it to the initrd image
instead of to the kernel image. As a last resort, I guess I can look
into doing that. However, that's going to require some serious rework
and plumbing of a lot of different components. So if I can make it work
as is, that'd be ideal. However, I need to figure out this weird 62 meg
limitation.

Any ideas on that?

Jason