Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Vivek Goyal wrote:
>
> One would not know the highest used address until the ELF headers have been
> parsed. Maybe it is a two-step movement. First decompress ELF.gz, with the
> ELF parser at the end of the decompressed data. Then it can parse
> the ELF headers and move itself out of the ELF header destination memory
> and then load the ELF segments at the appropriate place.
>
> One will have to be a little careful while moving the ELF parser or while
> decompressing the file to a temporary buffer so that we don't stomp over
> any other data loaded by the boot-loader (like kexec does) or go beyond
> the memory bounds which might have been created in the case of using kdump.
>

The easiest is probably to decode the ELF headers (which can be done in
O(1) space), relocate, reset the decompressor and restart.

Relocation is currently done in the decompressor, but it could also be
done at the kernel entrypoint, as long as the kernel entrypoint code is
all PIC.

	-hpa
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
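As a rough illustration of the "decode the ELF headers first" step, here is a minimal Python sketch (not kernel code) that reads only the fixed-size ELF32 header and program headers and reports the highest address any PT_LOAD segment will occupy. The function name `highest_used_address` is hypothetical, and the sketch assumes a little-endian 32-bit image held in a bytes object; a real decompressor would read the same fields from the start of its output stream.

```python
import struct

# Fixed-size ELF32 structures (little-endian), as laid out in the ELF spec.
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # Elf32_Ehdr, 52 bytes
PHDR = struct.Struct("<8I")                # Elf32_Phdr, 32 bytes
PT_LOAD = 1


def highest_used_address(image):
    """Return max(p_paddr + p_memsz) over all PT_LOAD program headers.

    Only one header-sized record is decoded at a time, so the working
    state does not grow with the size of the kernel image.
    """
    fields = EHDR.unpack_from(image, 0)
    ident, e_phoff, e_phentsize, e_phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 1:
        raise ValueError("not a 32-bit ELF image")

    top = 0
    for i in range(e_phnum):
        phdr = PHDR.unpack_from(image, e_phoff + i * e_phentsize)
        p_type, p_paddr, p_memsz = phdr[0], phdr[3], phdr[5]
        if p_type == PT_LOAD:
            top = max(top, p_paddr + p_memsz)
    return top
```

Once this value is known, the loader can pick a spot above it for itself before restarting the decompression, which is the ordering suggested above.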
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
On Wed, Jun 06, 2007 at 05:42:35PM -0700, H. Peter Anvin wrote:
> Jeremy Fitzhardinge wrote:
> >
> > Certainly, but much harder to implement. The ELF parser needs to be
> > prepared to move itself around to get out of the way of the ELF file.
> > It's a fairly large change from how it works now.
> >
>
> It doesn't if we simply declare that a certain chunk of memory is
> available to it, for the case where it runs in the native configuration.
> Since it doesn't have to support *any* ELF file, just the kernel one,
> that's an option.
>
> On the other hand, I guess with the decompressor/ELF parser being PIC,
> one would simply look for the highest used address, and relocate itself
> above that point. It's not really all that different from what the
> decompressor does today, except that it knows the address a priori.
>

One would not know the highest used address until the ELF headers have been
parsed. Maybe it is a two-step movement. First decompress ELF.gz, with the
ELF parser at the end of the decompressed data. Then it can parse
the ELF headers and move itself out of the ELF header destination memory
and then load the ELF segments at the appropriate place.

One will have to be a little careful while moving the ELF parser or while
decompressing the file to a temporary buffer so that we don't stomp over
any other data loaded by the boot-loader (like kexec does) or go beyond
the memory bounds which might have been created in the case of using kdump.

Thanks
Vivek
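The two hazards mentioned here (overwriting data the boot loader already placed, and running past a memory limit such as the one kdump imposes) amount to an interval check before choosing a temporary buffer. A minimal Python sketch, where `placement_is_safe`, the `reserved` list and `mem_limit` are all hypothetical names chosen for illustration:

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if the half-open ranges [a_start, a_end) and [b_start, b_end) intersect."""
    return a_start < b_end and b_start < a_end


def placement_is_safe(start, size, reserved, mem_limit):
    """Check a candidate buffer against reserved regions and a memory bound.

    reserved  -- list of (start, end) ranges already in use, e.g. an
                 initrd or other data placed by the boot loader
    mem_limit -- first address the loader must not touch
    """
    end = start + size
    if end > mem_limit:
        return False
    return not any(overlaps(start, end, r_start, r_end)
                   for r_start, r_end in reserved)
```

For example, with an initrd reserved at 0x800000-0x900000 and a 16 MiB limit, a 2 MiB buffer at 0x400000 passes, while one at 0x700000 is rejected because its tail would land on the initrd.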
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Rob Landley wrote:
>
> Er, make that objcopy, not objdump.
>
> Sane, maybe not. Something people want to do (and under the mistaken
> assumption I know more about initramfs than they do, have asked me how), yes.
>
> It always boils down to "do you have a vmlinux image lying around? Doing
> this with a bzImage _is_ brain surgery", and has yet to get beyond that
> question. I had about half of a script worked out for this, once...
>

If it can be done today on a vmlinux then it can be done the same way
with the mechanism I have proposed. Period, full stop.

> You can also supply an external initramfs image through the initrd mechanism,
> but this is unpleasant to do with some bootloaders (or lack of bootloaders).
> Plus it doesn't remove the old one, and wasting space makes embedded
> developers itch.

In theory one could create an extended bzImage format which could handle
a concatenated, and easily replaceable, initrd, but if it's done on
vmlinux today it would make a *lot* more sense to have it be done on the
vmlinux and nothing else.

	-hpa
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
On Wednesday 06 June 2007 9:54 pm, H. Peter Anvin wrote:
> Rob Landley wrote:
> > On Wednesday 06 June 2007 7:41 pm, H. Peter Anvin wrote:
> >> This makes vmlinux (normally stripped) recoverable from the bzImage file
> >> and so anything that is currently booting vmlinux would be serviced by
> >> this scheme.
> >
> > Would this make it sane to strip the initramfs image out of vmlinux with
> > objdump and replace it with another one, or are there offsets resolved during
> > the build that stop that for vmlinux?
> >
>
> There probably are offsets resolved during the build. However, that
> wouldn't be all that hard to fix. Still, one can argue whether or not
> it is sane under any definition to do this kind of unpacking-repacking
> of ELF files.

Er, make that objcopy, not objdump.

Sane, maybe not. Something people want to do (and under the mistaken
assumption I know more about initramfs than they do, have asked me how), yes.

It always boils down to "do you have a vmlinux image lying around? Doing
this with a bzImage _is_ brain surgery", and has yet to get beyond that
question. I had about half of a script worked out for this, once...

You can also supply an external initramfs image through the initrd mechanism,
but this is unpleasant to do with some bootloaders (or lack of bootloaders).
Plus it doesn't remove the old one, and wasting space makes embedded
developers itch.

Rob
--
The Google cluster became self-aware at 2:14am EDT August 29, 2007...
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Rob Landley wrote:
> On Wednesday 06 June 2007 7:41 pm, H. Peter Anvin wrote:
>> This makes vmlinux (normally stripped) recoverable from the bzImage file
>> and so anything that is currently booting vmlinux would be serviced by
>> this scheme.
>
> Would this make it sane to strip the initramfs image out of vmlinux with
> objdump and replace it with another one, or are there offsets resolved during
> the build that stop that for vmlinux?
>

There probably are offsets resolved during the build. However, that
wouldn't be all that hard to fix. Still, one can argue whether or not
it is sane under any definition to do this kind of unpacking-repacking
of ELF files.

	-hpa
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
On Wednesday 06 June 2007 7:41 pm, H. Peter Anvin wrote:
> This makes vmlinux (normally stripped) recoverable from the bzImage file
> and so anything that is currently booting vmlinux would be serviced by
> this scheme.

Would this make it sane to strip the initramfs image out of vmlinux with
objdump and replace it with another one, or are there offsets resolved during
the build that stop that for vmlinux?

Rob
--
The Google cluster became self-aware at 2:14am EDT August 29, 2007...
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> It doesn't if we simply declare that a certain chunk of memory is
> available to it, for the case where it runs in the native configuration.
> Since it doesn't have to support *any* ELF file, just the kernel one,
> that's an option.
>

I suppose. But given that it's always built at the same time as - and
linked to - the kernel itself, it can have private knowledge about the
kernel.

> On the other hand, I guess with the decompressor/ELF parser being PIC,
> one would simply look for the highest used address, and relocate itself
> above that point. It's not really all that different from what the
> decompressor does today, except that it knows the address a priori.
>

Yes, it would have to decompress the ELF file into a temp buffer, and
then rearrange itself and the decompressed ELF file to make space for
the ELF file's final location. Seems a bit more complex because it has
to be done in the middle of execution rather than at start of day. But
perhaps that doesn't matter very much.

>> I was thinking of making the ELF file entirely descriptive, since it's
>> just a set of ELF headers inserted into the existing bzImage structure,
>> and it still relies on the bzImage being built properly in the first place.
>>
>
> Again, it's an option. The downside is that you don't get the automatic
> test coverage of having it be exercised as often as possible.

I don't follow your argument at all.

I'm proposing the kernel take the same code path regardless of how it's
booted, with the only two variations:

   1. boot all the way up from 16-bit mode, or
   2. start directly in 32-bit mode

which is essentially the current situation (setup vs code32_start). All
I'm adding is a bit more metadata for the domain builder to work with.
The code will get exercised on every boot in every environment, and the
metadata will be tested by whichever environment cares about it.

You're proposing that we add a third booting variation, where the
bootloader takes on the responsibility for decompressing and loading the
kernel's ELF image. In addition, you're proposing changing the existing
32-bit portion of the boot to perform the same job as the third method,
but in a way which is not reusable by a paravirtual domain builder.

This means that the boot path is unique for each boot environment, and
so will overall get less coverage. Given that one axis of the test
matrix - "number of subarchitectures" - is the same in both cases, and
the other axis - "number of ways of booting" - is larger in your
proposal, it seems to me that yours has the higher testing burden.

Anyway, I added an extra pointer in the boot_params so that you can
implement it that way if you really want (no real reason you can't have
ELF within ELF within bzImage, but it starts to look a bit
engineering-by-compromise at that point). It isn't, however, the
approach I want to take with Xen.

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
>
> Certainly, but much harder to implement. The ELF parser needs to be
> prepared to move itself around to get out of the way of the ELF file.
> It's a fairly large change from how it works now.
>

It doesn't if we simply declare that a certain chunk of memory is
available to it, for the case where it runs in the native configuration.
Since it doesn't have to support *any* ELF file, just the kernel one,
that's an option.

On the other hand, I guess with the decompressor/ELF parser being PIC,
one would simply look for the highest used address, and relocate itself
above that point. It's not really all that different from what the
decompressor does today, except that it knows the address a priori.

> I was thinking of making the ELF file entirely descriptive, since it's
> just a set of ELF headers inserted into the existing bzImage structure,
> and it still relies on the bzImage being built properly in the first place.

Again, it's an option. The downside is that you don't get the automatic
test coverage of having it be exercised as often as possible.

	-hpa
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> I was thinking prescriptive, having the decompressor read the output
> stream and interpret it as ELF. I guess a descriptive approach could be
> made to work, too (I haven't really thought about that avenue of
> approach), but the prescriptive model seems more powerful, at least to me.

Certainly, but much harder to implement. The ELF parser needs to be
prepared to move itself around to get out of the way of the ELF file.
It's a fairly large change from how it works now.

I was thinking of making the ELF file entirely descriptive, since it's
just a set of ELF headers inserted into the existing bzImage structure,
and it still relies on the bzImage being built properly in the first place.

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
>
> I'm not sure I fully understand the mechanism you're proposing. You
> have the 16-bit setup code, the 32-bit decompressor, and an ELF.gz. Once
> the decompressor has extracted the actual ELF file, are you proposing
> that it properly parse the ELF file and follow its instructions to put
> the segments in the appropriate places, or are you assuming that the
> decompressor can just skip that part and plonk the ELF file where it wants?
>
> In other words, do you see the Phdrs as being descriptive or prescriptive?
>

I was thinking prescriptive, having the decompressor read the output
stream and interpret it as ELF. I guess a descriptive approach could be
made to work, too (I haven't really thought about that avenue of
approach), but the prescriptive model seems more powerful, at least to me.

	-hpa
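To make the descriptive/prescriptive distinction concrete: under the prescriptive reading, the loader obeys each PT_LOAD program header, copying `p_filesz` bytes to `p_paddr` and zero-filling up to `p_memsz`. The following Python sketch models that with a bytearray standing in for physical memory. It is only an illustration under those assumptions (little-endian ELF32, hypothetical function name `load_segments`), not the decompressor's actual code.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")  # Elf32_Ehdr, 52 bytes
PHDR = struct.Struct("<8I")                # Elf32_Phdr, 32 bytes
PT_LOAD = 1


def load_segments(elf, memory):
    """Place each PT_LOAD segment where its Phdr says; return e_entry.

    elf    -- bytes of the (already decompressed) ELF file
    memory -- bytearray modelling physical memory, indexed by address
    """
    fields = EHDR.unpack_from(elf, 0)
    e_entry, e_phoff, e_phentsize, e_phnum = fields[4], fields[5], fields[9], fields[10]

    for i in range(e_phnum):
        (p_type, p_offset, _vaddr, p_paddr,
         p_filesz, p_memsz, _flags, _align) = PHDR.unpack_from(
            elf, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue
        data = elf[p_offset:p_offset + p_filesz]
        if len(data) != p_filesz or p_filesz > p_memsz:
            raise ValueError("malformed segment")
        if p_paddr + p_memsz > len(memory):
            raise ValueError("segment does not fit in memory")
        # File-backed part of the segment.
        memory[p_paddr:p_paddr + p_filesz] = data
        # Remainder (bss-like) is zero-filled.
        memory[p_paddr + p_filesz:p_paddr + p_memsz] = bytes(p_memsz - p_filesz)
    return e_entry
```

A descriptive use of the same headers would skip the copying entirely and treat the Phdrs as metadata about an image that has already been laid out correctly by the build.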
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> I still believe that we should provide, in effect, vmlinux as a
> (compressed) ELF file rather than provide the intermediate stage. It
> would reduce the complexity of testing (all information provided about a
> stage have to be both guaranteed to even make sense in the future as
> well as be tested to conform to such information

I'm not sure I follow you. Sure, you're right that the Phdr info
contained within the bzImage needs to be tested for correctness. This
wouldn't normally happen when booting native, but when booting under the
most constrained environment - Xen - it will be tested (and I intend
making the Xen loader as strict as possible). Of course, it won't help
if the Phdrs overmap too much, but I don't think that matters too
much, so long as the mappings are not excessively large.

I'm not sure what you mean about "make sense in the future". If you're
booting the kernel in a new paravirtualized environment, you've
presumably modified the kernel to understand that environment, and
perhaps had to update the boot image format a bit to deal with its
requirements. I agree that updating the bzImage format may require
retesting in all the other environments, but I think that's probably
true for your scheme as well. After all, you're assuming that the
vmlinux itself provides all necessary information to be loaded in any
environment, which is not necessarily true (it may need extra ELF notes,
for example). But if there are any major structural changes needed in
the vmlinux, then that will be equally problematic for both directly
using vmlinux and using ELF-in-bzImage.

So I don't think your argument convincingly sways in any particular
direction.

> ) as well as cover a
> larger number of environments -- any environment where injecting data
> into memory is cheaper than execution is quite unhappy about the current
> system. Such environments include heterogeneous embedded systems (think
> a slow CPU on a PCI card where the host CPU has direct access to the
> memory on the card) as well as simulators/emulators.
>

Well, nothing in this scheme precludes the ELF file from being a plain
uncompressed kernel image. If that's what these environments want, it's
easy to provide with a small update to the Makefiles.

> For environments where so is appropriate it would even be possible to
> run the setup, invoke the code32_setup hook to do the decompression (and
> relocation, if appropriate) in host space.
>

Well, that's what we currently have, and we can't break backwards
compatibility.

> This makes vmlinux (normally stripped) recoverable from the bzImage file
> and so anything that is currently booting vmlinux would be serviced by
> this scheme.
>

I'm not sure I fully understand the mechanism you're proposing. You
have the 16-bit setup code, the 32-bit decompressor, and an ELF.gz. Once
the decompressor has extracted the actual ELF file, are you proposing
that it properly parse the ELF file and follow its instructions to put
the segments in the appropriate places, or are you assuming that the
decompressor can just skip that part and plonk the ELF file where it wants?

In other words, do you see the Phdrs as being descriptive or prescriptive?

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
> This patch makes the payload of the bzImage file an ELF file. In
> other words, the bzImage is structured as follows:
> - boot sector
> - 16bit setup code
> - ELF header
> - decompressor
> - compressed kernel
>
> A bootloader may find the start of the ELF file by looking at the
> setup_size entry in the boot params, and using that to find the offset
> of the ELF header. The ELF Phdrs contain all the mapped memory
> required to decompress and start booting the kernel.
>
> One slightly complex part of this is that the bzImage boot_params need
> to know about the internal structure of the ELF file, at least to the
> extent of being able to point the code32_start entry at the ELF file's
> entrypoint, so that loaders which use this field will still work.
>
> Similarly, the ELF header needs to know how big the kernel vmlinux's
> bss segment is, in order to make sure it is mapped properly.
>
> To handle these two cases, we generate abstracted versions of the
> object files which only contain the symbols we care about (generated
> with objcopy --strip-all --keep-symbol=X), and then include those
> symbol tables with ld -R.

I still believe that we should provide, in effect, vmlinux as a
(compressed) ELF file rather than provide the intermediate stage. It
would reduce the complexity of testing (all information provided about a
stage has to be both guaranteed to even make sense in the future as
well as be tested to conform to such information) as well as cover a
larger number of environments -- any environment where injecting data
into memory is cheaper than execution is quite unhappy about the current
system. Such environments include heterogeneous embedded systems (think
a slow CPU on a PCI card where the host CPU has direct access to the
memory on the card) as well as simulators/emulators.

For environments where that is appropriate it would even be possible to
run the setup, invoke the code32_setup hook to do the decompression (and
relocation, if appropriate) in host space.

This makes vmlinux (normally stripped) recoverable from the bzImage file
and so anything that is currently booting vmlinux would be serviced by
this scheme.

	-hpa
[PATCH RFC 6/7] i386: make the bzImage payload an ELF file
This patch makes the payload of the bzImage file an ELF file. In
other words, the bzImage is structured as follows:
 - boot sector
 - 16bit setup code
 - ELF header
 - decompressor
 - compressed kernel

A bootloader may find the start of the ELF file by looking at the
setup_size entry in the boot params, and using that to find the offset
of the ELF header. The ELF Phdrs contain all the mapped memory
required to decompress and start booting the kernel.

One slightly complex part of this is that the bzImage boot_params need
to know about the internal structure of the ELF file, at least to the
extent of being able to point the code32_start entry at the ELF file's
entrypoint, so that loaders which use this field will still work.

Similarly, the ELF header needs to know how big the kernel vmlinux's
bss segment is, in order to make sure it is mapped properly.

To handle these two cases, we generate abstracted versions of the
object files which only contain the symbols we care about (generated
with objcopy --strip-all --keep-symbol=X), and then include those
symbol tables with ld -R.

Signed-off-by: Jeremy Fitzhardinge <[EMAIL PROTECTED]>
Cc: "Eric W. Biederman" <[EMAIL PROTECTED]>
Cc: H. Peter Anvin <[EMAIL PROTECTED]>
Cc: Vivek Goyal <[EMAIL PROTECTED]>
Cc: Rusty Russell <[EMAIL PROTECTED]>

---
 arch/i386/boot/Makefile               |   11 --
 arch/i386/boot/compressed/Makefile    |   29 +--
 arch/i386/boot/compressed/elfhdr.S    |   60 +
 arch/i386/boot/compressed/head.S      |    9 ++--
 arch/i386/boot/compressed/notes.S     |    7 +++
 arch/i386/boot/compressed/vmlinux.lds |   24 ++---
 arch/i386/boot/header.S               |    7 ---
 arch/i386/boot/setup.ld               |    5 ++
 arch/i386/kernel/head.S               |    1
 arch/i386/kernel/vmlinux.lds.S        |    1
 10 files changed, 131 insertions(+), 23 deletions(-)

===
--- a/arch/i386/boot/Makefile
+++ b/arch/i386/boot/Makefile
@@ -72,14 +72,19 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-LDFLAGS_setup.elf := -T
-$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
+$(obj)/zImage $(obj)/bzImage: \
+	LDFLAGS := \
+		-R $(obj)/compressed/blob-syms \
+		--defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
+
+$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS)\
+		$(obj)/compressed/blob-syms FORCE
 	$(call if_changed,ld)
 
 $(obj)/payload.o: EXTRA_AFLAGS := -Wa,-I$(obj)
 $(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin
 
-$(obj)/compressed/blob: FORCE
+$(obj)/compressed/blob $(obj)/compressed/blob-syms: FORCE
 	$(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@
 
 # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel
===
--- a/arch/i386/boot/compressed/Makefile
+++ b/arch/i386/boot/compressed/Makefile
@@ -4,21 +4,42 @@
 # create a compressed vmlinux image from the original vmlinux
 #
 
-targets:= blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \
+targets:= blob vmlinux.bin vmlinux.bin.gz \
+		elfhdr.o head.o misc.o notes.o piggy.o \
 		vmlinux.bin.all vmlinux.relocs
 
-LDFLAGS_blob := -T
 hostprogs-y:= relocs
 
 CFLAGS := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
 	   -fno-strict-aliasing -fPIC \
 	   $(call cc-option,-ffreestanding) \
 	   $(call cc-option,-fno-stack-protector)
-LDFLAGS := -m elf_i386
+LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
 
-$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE
+OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o)
+
+$(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE
 	$(call if_changed,ld)
 	@:
+
+# Generate a stripped-down object including only the symbols needed
+# so that we can get them with ld -R. Direct stderr to /dev/null to
+# shut useless warning up.
+quiet_cmd_symextract = SYMEXT $@
+      cmd_symextract = objcopy -S \
+			$(addprefix -j,$(EXTRACTSECTS)) \
+			$(addprefix -K,$(EXTRACTSYMS)) \
+			$< $@ 2>/dev/null
+
+$(obj)/blob-syms: EXTRACTSYMS := blob_entry blob_payload
+$(obj)/blob-syms: EXTRACTSECTS := .text.head .data.compressed
+$(obj)/blob-syms: $(obj)/blob FORCE
+	$(call if_changed,symextract)
+
+$(obj)/vmlinux-syms: EXTRACTSYMS := __reserved_end
+$(obj)/vmlinux-syms: EXTRACTSECTS := .bss
+$(obj)/vmlinux-syms: vmlinux FORCE
+	$(call if_changed,symextract)
 
 $(obj)/vmlinux.bin: vmlinux FORCE
 	$(call if_changed,objcopy)
===
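The patch description says a bootloader can locate the ELF header from the setup size in the boot params. A minimal Python sketch of that lookup follows. It assumes the "setup_size" mentioned above corresponds to the `setup_sects` byte at offset 0x1F1 of the standard x86 boot protocol, so that the payload starts at `(setup_sects + 1) * 512`; the function names are hypothetical and this is not code from the patch.

```python
SECTOR_SIZE = 512
SETUP_SECTS_OFFSET = 0x1F1  # setup_sects field in the x86 boot header
ELF_MAGIC = b"\x7fELF"


def payload_offset(bzimage):
    """Return the file offset where the protected-mode payload begins."""
    setup_sects = bzimage[SETUP_SECTS_OFFSET]
    if setup_sects == 0:
        # The boot protocol treats 0 as the historical default of 4.
        setup_sects = 4
    # One boot sector plus setup_sects sectors of 16-bit setup code.
    return (setup_sects + 1) * SECTOR_SIZE


def payload_is_elf(bzimage):
    """Check whether the payload starts with the ELF magic number."""
    off = payload_offset(bzimage)
    return bytes(bzimage[off:off + 4]) == ELF_MAGIC
```

A loader that does not find the ELF magic at that offset could simply fall back to the existing code32_start path, which the patch keeps working.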
[PATCH RFC 6/7] i386: make the bzImage payload an ELF file
This patch makes the payload of the bzImage file an ELF file. In other words, the bzImage is structured as follows:
 - boot sector
 - 16bit setup code
 - ELF header
 - decompressor
 - compressed kernel

A bootloader may find the start of the ELF file by looking at the setup_size entry in the boot params, and using that to find the offset of the ELF header. The ELF Phdrs contain all the mapped memory required to decompress and start booting the kernel.

One slightly complex part of this is that the bzImage boot_params need to know about the internal structure of the ELF file, at least to the extent of being able to point the code32_start entry at the ELF file's entrypoint, so that loaders which use this field will still work.

Similarly, the ELF header needs to know how big the kernel vmlinux's bss segment is, in order to make sure it is mapped properly.

To handle these two cases, we generate abstracted versions of the object files which only contain the symbols we care about (generated with objcopy --strip-all --keep-symbol=X), and then include those symbol tables with ld -R.

Signed-off-by: Jeremy Fitzhardinge [EMAIL PROTECTED]
Cc: Eric W. Biederman [EMAIL PROTECTED]
Cc: H. Peter Anvin [EMAIL PROTECTED]
Cc: Vivek Goyal [EMAIL PROTECTED]
Cc: Rusty Russell [EMAIL PROTECTED]

---
 arch/i386/boot/Makefile               |   11 --
 arch/i386/boot/compressed/Makefile    |   29 +--
 arch/i386/boot/compressed/elfhdr.S    |   60 +
 arch/i386/boot/compressed/head.S      |    9 ++--
 arch/i386/boot/compressed/notes.S     |    7 +++
 arch/i386/boot/compressed/vmlinux.lds |   24 ++---
 arch/i386/boot/header.S               |    7 ---
 arch/i386/boot/setup.ld               |    5 ++
 arch/i386/kernel/head.S               |    1
 arch/i386/kernel/vmlinux.lds.S        |    1
 10 files changed, 131 insertions(+), 23 deletions(-)

===
--- a/arch/i386/boot/Makefile
+++ b/arch/i386/boot/Makefile
@@ -72,14 +72,19 @@ AFLAGS := $(CFLAGS) -D__ASSEMBLY__
 
 SETUP_OBJS = $(addprefix $(obj)/,$(setup-y))
 
-LDFLAGS_setup.elf := -T
-$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS) FORCE
+$(obj)/zImage $(obj)/bzImage: \
+	LDFLAGS := \
+	-R $(obj)/compressed/blob-syms \
+	--defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
+
+$(obj)/setup.elf: $(src)/setup.ld $(SETUP_OBJS)\
+		$(obj)/compressed/blob-syms FORCE
 	$(call if_changed,ld)
 
 $(obj)/payload.o: EXTRA_AFLAGS := -Wa,-I$(obj)
 $(obj)/payload.o: $(src)/payload.S $(obj)/blob.bin
 
-$(obj)/compressed/blob: FORCE
+$(obj)/compressed/blob $(obj)/compressed/blob-syms: FORCE
 	$(Q)$(MAKE) $(build)=$(obj)/compressed IMAGE_OFFSET=$(IMAGE_OFFSET) $@
 
 # Set this if you want to pass append arguments to the zdisk/fdimage/isoimage kernel
===
--- a/arch/i386/boot/compressed/Makefile
+++ b/arch/i386/boot/compressed/Makefile
@@ -4,21 +4,42 @@
 # create a compressed vmlinux image from the original vmlinux
 #
 
-targets:= blob vmlinux.bin vmlinux.bin.gz head.o misc.o piggy.o \
+targets:= blob vmlinux.bin vmlinux.bin.gz \
+	elfhdr.o head.o misc.o notes.o piggy.o \
 	vmlinux.bin.all vmlinux.relocs
-LDFLAGS_blob := -T
 
 hostprogs-y:= relocs
 
 CFLAGS := -m32 -D__KERNEL__ $(LINUX_INCLUDE) -O2 \
 	-fno-strict-aliasing -fPIC \
 	$(call cc-option,-ffreestanding) \
 	$(call cc-option,-fno-stack-protector)
-LDFLAGS := -m elf_i386
+LDFLAGS := -R $(obj)/vmlinux-syms --defsym IMAGE_OFFSET=$(IMAGE_OFFSET) -T
 
-$(obj)/blob: $(src)/vmlinux.lds $(obj)/head.o $(obj)/misc.o $(obj)/piggy.o FORCE
+OBJS=$(addprefix $(obj)/,elfhdr.o head.o misc.o notes.o piggy.o)
+
+$(obj)/blob: $(src)/vmlinux.lds $(obj)/vmlinux-syms $(OBJS) FORCE
 	$(call if_changed,ld)
 	@:
+
+# Generate a stripped-down object including only the symbols needed
+# so that we can get them with ld -R.  Direct stderr to /dev/null to
+# shut useless warning up.
+quiet_cmd_symextract = SYMEXT $@
+      cmd_symextract = objcopy -S \
+		$(addprefix -j,$(EXTRACTSECTS)) \
+		$(addprefix -K,$(EXTRACTSYMS)) \
+		$< $@ 2>/dev/null
+
+$(obj)/blob-syms: EXTRACTSYMS := blob_entry blob_payload
+$(obj)/blob-syms: EXTRACTSECTS := .text.head .data.compressed
+$(obj)/blob-syms: $(obj)/blob FORCE
+	$(call if_changed,symextract)
+
+$(obj)/vmlinux-syms: EXTRACTSYMS := __reserved_end
+$(obj)/vmlinux-syms: EXTRACTSECTS := .bss
+$(obj)/vmlinux-syms: vmlinux FORCE
+	$(call if_changed,symextract)
 
 $(obj)/vmlinux.bin: vmlinux FORCE
 	$(call if_changed,objcopy)
===
--- /dev/null
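The patch description says a bootloader can find the ELF header from the setup size recorded in the boot params. The following is a rough sketch of that lookup, not code from the patch. It assumes the "setup_size entry" corresponds to the one-byte setup_sects field at offset 0x1F1 of the boot sector, counted in 512-byte sectors with 0 meaning 4, as in the existing i386 boot protocol. The helper names (payload_offset, find_elf_header, make_fake_image) are made up for illustration.

```python
SECTOR_SIZE = 512
SETUP_SECTS_OFFSET = 0x1F1   # assumed: one-byte setup_sects field in the boot sector
ELF_MAGIC = b"\x7fELF"


def payload_offset(image: bytes) -> int:
    """Return the file offset of the first byte after the 16-bit setup code."""
    setup_sects = image[SETUP_SECTS_OFFSET]
    if setup_sects == 0:
        # The boot protocol treats 0 as 4 for backwards compatibility.
        setup_sects = 4
    # One boot sector, followed by setup_sects sectors of setup code.
    return (setup_sects + 1) * SECTOR_SIZE


def find_elf_header(image: bytes) -> int:
    """Return the offset of the ELF header, or raise if the payload is not ELF."""
    offset = payload_offset(image)
    if image[offset:offset + 4] != ELF_MAGIC:
        raise ValueError("payload does not start with an ELF header")
    return offset


def make_fake_image(setup_sects: int) -> bytes:
    """Build a dummy image (illustration only): zeroed setup area plus ELF magic."""
    effective = setup_sects if setup_sects else 4
    head = bytearray((effective + 1) * SECTOR_SIZE)
    head[SETUP_SECTS_OFFSET] = setup_sects
    return bytes(head) + ELF_MAGIC + bytes(12)
```

A loader that does not find the ELF magic at that offset is presumably looking at an older bzImage and would fall back to the code32_start path.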
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
> This patch makes the payload of the bzImage file an ELF file. In other words, the bzImage is structured as follows:
>  - boot sector
>  - 16bit setup code
>  - ELF header
>  - decompressor
>  - compressed kernel
>
> A bootloader may find the start of the ELF file by looking at the setup_size entry in the boot params, and using that to find the offset of the ELF header. The ELF Phdrs contain all the mapped memory required to decompress and start booting the kernel.
>
> One slightly complex part of this is that the bzImage boot_params need to know about the internal structure of the ELF file, at least to the extent of being able to point the code32_start entry at the ELF file's entrypoint, so that loaders which use this field will still work.
>
> Similarly, the ELF header needs to know how big the kernel vmlinux's bss segment is, in order to make sure it is mapped properly.
>
> To handle these two cases, we generate abstracted versions of the object files which only contain the symbols we care about (generated with objcopy --strip-all --keep-symbol=X), and then include those symbol tables with ld -R.

I still believe that we should provide, in effect, vmlinux as a (compressed) ELF file rather than provide the intermediate stage. It would reduce the complexity of testing (all information provided about a stage has to be both guaranteed to even make sense in the future as well as be tested to conform to such information) as well as cover a larger number of environments -- any environment where injecting data into memory is cheaper than execution is quite unhappy about the current system. Such environments include heterogeneous embedded systems (think a slow CPU on a PCI card where the host CPU has direct access to the memory on the card) as well as simulators/emulators.

For environments where that is appropriate it would even be possible to run the setup, invoke the code32_setup hook to do the decompression (and relocation, if appropriate) in host space.

This makes vmlinux (normally stripped) recoverable from the bzImage file and so anything that is currently booting vmlinux would be serviced by this scheme.

	-hpa
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> I still believe that we should provide, in effect, vmlinux as a (compressed) ELF file rather than provide the intermediate stage. It would reduce the complexity of testing (all information provided about a stage have to be both guaranteed to even make sense in the future as well as be tested to conform to such information

I'm not sure I follow you. Sure, you're right that the Phdr info contained within the bzImage needs to be tested for correctness. This wouldn't normally happen when booting native, but when booting under the most constrained environment - Xen - it will be tested (and I intend making the Xen loader as strict as possible). Of course, it won't help if the Phdrs overmap too much, but I don't think that matters too much, so long as the mappings are not excessively large.

I'm not sure what you mean about "make sense in the future". If you're booting the kernel in a new paravirtualized environment, you've presumably modified the kernel to understand that environment, and perhaps had to update the boot image format a bit to deal with its requirements.

I agree that updating the bzImage format may require retesting in all the other environments, but I think that's probably true for your scheme as well. After all, you're assuming that the vmlinux itself provides all necessary information to be loaded in any environment, which is not necessarily true (it may need extra ELF notes, for example). But if there are any major structural changes needed in the vmlinux, then that will be equally problematic for both directly using vmlinux and using ELF-in-bzImage.

So I don't think your argument convincingly sways in any particular direction.

> ) as well as cover a larger number of environments -- any environment where injecting data into memory is cheaper than execution is quite unhappy about the current system. Such environments include heterogeneous embedded systems (think a slow CPU on a PCI card where the host CPU has direct access to the memory on the card) as well as simulators/emulators.

Well, nothing in this scheme precludes the ELF file from being a plain uncompressed kernel image. If that's what these environments want, it's easy to provide with a small update to the Makefiles.

> For environments where so is appropriate it would even be possible to run the setup, invoke the code32_setup hook to do the decompression (and relocation, if appropriate) in host space.

Well, that's what we currently have, and we can't break backwards compatibility.

> This makes vmlinux (normally stripped) recoverable from the bzImage file and so anything that is currently booting vmlinux would be serviced by this scheme.

I'm not sure I fully understand the mechanism you're proposing. You have the 16-bit setup code, the 32-bit decompressor, and an ELF.gz. Once the decompressor has extracted the actual ELF file, are you proposing that it properly parse the ELF file and follow its instructions to put the segments in the appropriate places, or are you assuming that the decompressor can just skip that part and plonk the ELF file where it wants?

In other words, do you see the Phdrs as being descriptive or prescriptive?

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
> I'm not sure I fully understand the mechanism you're proposing. You have the 16-bit setup code, the 32-bit decompressor, and an ELF.gz. Once the decompressor has extracted the actual ELF file, are you proposing that it properly parse the ELF file and follow its instructions to put the segments in the appropriate places, or are you assuming that the decompressor can just skip that part and plonk the ELF file where it wants?
>
> In other words, do you see the Phdrs as being descriptive or prescriptive?

I was thinking prescriptive, having the decompressor read the output stream and interpret it as ELF. I guess a descriptive approach could be made to work, too (I haven't really thought about that avenue of approach), but the prescriptive model seems more powerful, at least to me.

	-hpa
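To make the "prescriptive" reading concrete: a loader that treats the Phdrs as instructions walks the program header table and places each PT_LOAD segment where its header says, zero-filling the part of p_memsz beyond p_filesz (the bss). The sketch below is only an illustration of that idea in user-space Python, not the decompressor's code. It models physical memory as a bytearray starting at a hypothetical base address, and load_segments and build_elf are made-up names.

```python
import struct

EHDR = struct.Struct("<16sHHIIIIIHHHHHH")   # Elf32_Ehdr, 52 bytes, little-endian
PHDR = struct.Struct("<8I")                 # Elf32_Phdr, 32 bytes
PT_LOAD = 1


def load_segments(elf: bytes, memory: bytearray, base: int) -> int:
    """Copy every PT_LOAD segment to its p_paddr and return e_entry.

    `memory` models physical memory starting at address `base`.
    """
    fields = EHDR.unpack_from(elf, 0)
    ident, e_entry, e_phoff = fields[0], fields[4], fields[5]
    e_phentsize, e_phnum = fields[9], fields[10]
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    for i in range(e_phnum):
        p_type, p_offset, _vaddr, p_paddr, p_filesz, p_memsz, _flags, _align = \
            PHDR.unpack_from(elf, e_phoff + i * e_phentsize)
        if p_type != PT_LOAD:
            continue
        start = p_paddr - base
        if start < 0 or start + p_memsz > len(memory):
            raise ValueError("segment falls outside available memory")
        if p_filesz > p_memsz or p_offset + p_filesz > len(elf):
            raise ValueError("malformed segment")
        # File-backed part of the segment.
        memory[start:start + p_filesz] = elf[p_offset:p_offset + p_filesz]
        # Remainder (bss) is zero-filled.
        memory[start + p_filesz:start + p_memsz] = bytes(p_memsz - p_filesz)
    return e_entry


def build_elf(entry, segments):
    """Build a minimal ELF32 image from (paddr, data, memsz) tuples (demo only)."""
    phoff = EHDR.size
    data_off = phoff + PHDR.size * len(segments)
    phdrs, blobs = b"", b""
    for paddr, data, memsz in segments:
        phdrs += PHDR.pack(PT_LOAD, data_off + len(blobs), paddr, paddr,
                           len(data), memsz, 7, 4096)
        blobs += data
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = EHDR.pack(ident, 2, 3, 1, entry, phoff, 0, 0,
                     EHDR.size, PHDR.size, len(segments), 0, 0, 0)
    return ehdr + phdrs + blobs
```

Under the descriptive reading, by contrast, nothing like load_segments would run at boot: the Phdrs would only report where the existing bzImage code is going to put things.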
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> I was thinking prescriptive, having the decompressor read the output stream and interpret it as ELF. I guess a descriptive approach could be made to work, too (I haven't really thought about that avenue of approach), but the prescriptive model seems more powerful, at least to me.

Certainly, but much harder to implement. The ELF parser needs to be prepared to move itself around to get out of the way of the ELF file. It's a fairly large change from how it works now.

I was thinking of making the ELF file entirely descriptive, since it's just a set of ELF headers inserted into the existing bzImage structure, and it still relies on the bzImage being built properly in the first place.

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Jeremy Fitzhardinge wrote:
> Certainly, but much harder to implement. The ELF parser needs to be prepared to move itself around to get out of the way of the ELF file. It's a fairly large change from how it works now.

It doesn't if we simply declare that a certain chunk of memory is available to it, for the case where it runs in the native configuration. Since it doesn't have to support *any* ELF file, just the kernel one, that's an option.

On the other hand, I guess with the decompressor/ELF parser being PIC, one would simply look for the highest used address, and relocate itself above that point. It's not really all that different from what the decompressor does today, except that it knows the address a priori.

> I was thinking of making the ELF file entirely descriptive, since its just a set of ELF headers inserted into the existing bzImage structure, and it still relies on the bzImage being build properly in the first place.

Again, it's an option. The downside is that you don't get the automatic test coverage of having it be exercised as often as possible.

	-hpa
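The "highest used address" mentioned above can be computed from the ELF header and program headers alone, without touching segment data. The following sketch shows one way to do it, assuming a little-endian ELF32 file and taking the end of each PT_LOAD segment as p_paddr + p_memsz. The page-aligned relocation_target policy and the make_headers helper are hypothetical, added only to make the example runnable.

```python
import struct

# Elf32_Ehdr and Elf32_Phdr layouts (little-endian, as on i386).
EHDR = struct.Struct("<16sHHIIIIIHHHHHH")
PHDR = struct.Struct("<8I")
PT_LOAD = 1


def highest_used_address(headers: bytes) -> int:
    """Return the end of the highest PT_LOAD segment, reading headers only."""
    fields = EHDR.unpack_from(headers, 0)
    e_phoff, e_phentsize, e_phnum = fields[5], fields[9], fields[10]
    top = 0
    for i in range(e_phnum):
        p_type, _off, _vaddr, p_paddr, _filesz, p_memsz, _flags, _align = \
            PHDR.unpack_from(headers, e_phoff + i * e_phentsize)
        if p_type == PT_LOAD:
            # Use p_memsz rather than p_filesz so that bss counts as used.
            top = max(top, p_paddr + p_memsz)
    return top


def relocation_target(headers: bytes, align: int = 4096) -> int:
    """Hypothetical policy: first `align`-aligned address above all segments."""
    top = highest_used_address(headers)
    return (top + align - 1) & ~(align - 1)


def make_headers(segments):
    """Build an ELF32 header plus Phdrs from (paddr, memsz) pairs (demo only)."""
    phdrs = b"".join(
        PHDR.pack(PT_LOAD, 0, paddr, paddr, 0, memsz, 7, 4096)
        for paddr, memsz in segments)
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    ehdr = EHDR.pack(ident, 2, 3, 1, 0, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, len(segments), 0, 0, 0)
    return ehdr + phdrs
```

A real implementation would also have to check the result against whatever memory limits the boot environment imposes, which this sketch does not attempt.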
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
H. Peter Anvin wrote:
> It doesn't if we simply declare that a certain chunk of memory is available to it, for the case where it runs in the native configuration. Since it doesn't have to support *any* ELF file, just the kernel one, that's an option.

I suppose. But given that it's always built at the same time as - and linked to - the kernel itself, it can have private knowledge about the kernel.

> On the other hand, I guess with the decompressor/ELF parser being PIC, one would simply look for the highest used address, and relocate itself above that point. It's not really all that different from what the decompressor does today, except that it knows the address a priori.

Yes, it would have to decompress the ELF file into a temp buffer, and then rearrange itself and the decompressed ELF file to make space for the ELF file's final location. Seems a bit more complex because it has to be done in the middle of execution rather than at start of day. But perhaps that doesn't matter very much.

>> I was thinking of making the ELF file entirely descriptive, since its just a set of ELF headers inserted into the existing bzImage structure, and it still relies on the bzImage being build properly in the first place.
>
> Again, it's an option. The downside is that you don't get the automatic test coverage of having it be exercised as often as possible.

I don't follow your argument at all.

I'm proposing the kernel take the same code path regardless of how it's booted, with the only two variations:

   1. boot all the way up from 16-bit mode, or
   2. start directly in 32-bit mode

which is essentially the current situation (setup vs code32_start). All I'm adding is a bit more metadata for the domain builder to work with. The code will get exercised on every boot in every environment, and the metadata will be tested by whichever environment cares about it.

You're proposing that we add a third booting variation, where the bootloader takes on the responsibility for decompressing and loading the kernel's ELF image. In addition, you're proposing changing the existing 32-bit portion of the boot to perform the same job as the third method, but in a way which is not reusable by a paravirtual domain builder. This means that the boot path is unique for each boot environment, and so will overall get less coverage.

Given that one axis of the test matrix - number of subarchitectures - is the same in both cases, and the other axis - number of ways of booting - is larger in your proposal, it seems to me that yours has the higher testing burden.

Anyway, I added an extra pointer in the boot_params so that you can implement it that way if you really want (no real reason you can't have ELF within ELF within bzImage, but it starts to look a bit engineering-by-compromise at that point). It isn't, however, the approach I want to take with Xen.

    J
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
On Wednesday 06 June 2007 7:41 pm, H. Peter Anvin wrote:
> This makes vmlinux (normally stripped) recoverable from the bzImage file and so anything that is currently booting vmlinux would be serviced by this scheme.

Would this make it sane to strip the initramfs image out of vmlinux with objdump and replace it with another one, or are there offsets resolved during the build that stop that for vmlinux?

Rob
-- 
The Google cluster became self-aware at 2:14am EDT August 29, 2007...
Re: [PATCH RFC 6/7] i386: make the bzImage payload an ELF file
Rob Landley wrote:
> On Wednesday 06 June 2007 7:41 pm, H. Peter Anvin wrote:
>> This makes vmlinux (normally stripped) recoverable from the bzImage file and so anything that is currently booting vmlinux would be serviced by this scheme.
>
> Would this make it sane to strip the initramfs image out of vmlinux with objdump and replace it with another one, or are there offsets resolved during the build that stop that for vmlinux?

There probably are offsets resolved during the build. However, that wouldn't be all that hard to fix.

Still, one can argue whether or not it is sane under any definition to do this kind of unpacking-repacking of ELF files.

	-hpa