Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 17/10/2009 07:17:04:

[SNIP]

Apologies if this is getting way off-topic for a simple boot loader, but this is information I have gathered from far and wide over the net. I am surprised that there isn't a web site out there on 'How to create a relocatable boot loader'...

:), now you can write one :)

OK, it's all starting to come together now - it helps when you look at the right files ;)

Firstly, u-boot.map:

                    0x380589a0                __rel_dyn_start = .
    .rel.dyn        0x380589a0     0x42b0
    *(.rel.dyn)
    .rel.got        0x00000000        0x0 cpu/i386/start.o
    .rel.plt        0x00000000        0x0 cpu/i386/start.o
    .rel.text       0x380589a0     0x2e28 cpu/i386/start.o
    .rel.start16    0x3805b7c8       0x10 cpu/i386/start.o
    .rel.data       0x3805b7d8      0xc18 cpu/i386/start.o
    .rel.rodata     0x3805c3f0      0x360 cpu/i386/start.o
    .rel.u_boot_cmd 0x3805c750      0x500 cpu/i386/start.o
                    0x3805cc50                __rel_dyn_end = .

And the output of readelf...

Section Headers:

    [Nr] Name              Type      Addr     Off    Size   ES Flg Lk Inf Al
    [ 0]                   NULL      00000000 000000 000000 00      0   0  0
    [ 1] .text             PROGBITS  38040000 001000 0118a4 00  AX  0   0  4
    [ 2] .rel.text         REL       00000000 066c68 005d00 08     40   1  4
    [ 3] .rodata           PROGBITS  380518a4 0128a4 005da5 00   A  0   0  4
    [ 4] .rel.rodata       REL       00000000 06c968 000360 08     40   3  4
    [ 5] .interp           PROGBITS  38057649 018649 000013 00   A  0   0  1
    [ 6] .dynstr           STRTAB    3805765c 01865c 0001ee 00   A  0   0  1
    [ 7] .hash             HASH      3805784c 01884c 0000cc 04   A 11   0  4
    [ 8] .data             PROGBITS  38057918 018918 000a3c 00  WA  0   0  4
    [ 9] .rel.data         REL       00000000 06ccc8 000c18 08     40   8  4
    [10] .got.plt          PROGBITS  38058354 019354 00000c 04  WA  0   0  4
    [11] .dynsym           DYNSYM    38058360 019360 000200 10   A  6   1  4
    [12] .dynamic          DYNAMIC   38058560 019560 000080 08  WA  6   0  4
    [13] .u_boot_cmd       PROGBITS  380585e0 0195e0 0003c0 00  WA  0   0  4
    [14] .rel.u_boot_cmd   REL       00000000 06d8e0 000500 08     40  13  4
    [15] .bss              NOBITS    3805cc50 01ec50 001a34 00  WA  0   0  4
    [16] .bios             PROGBITS  00000000 01e000 00053e 00  AX  0   0  1
    [17] .rel.bios         REL       00000000 06dde0 0000c0 08     40  16  4
    [18] .rel.dyn          REL       380589a0 0199a0 0042b0 08   A 11   0  4
    [19] .start16          PROGBITS  0000f800 01e800 000110 00  AX  0   0  1
    [20] .rel.start16      REL       00000000 06dea0 000038 08     40  19  4
    [21] .resetvec         PROGBITS  0000fff0 01eff0 000010 00  AX  0   0  1
    [22] .rel.resetvec     REL       00000000 06ded8 000008 08     40  21  4
    ...

Relocation section '.rel.text' at offset 0x66c68 contains 2976 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    38040010  00000101 R_386_32          38040000   .text
    3804001e  00000101 R_386_32          38040000   .text
    38040028  00000101 R_386_32          38040000   .text
    3804003f  00000101 R_386_32          38040000   .text
    38040051  00000101 R_386_32          38040000   .text
    38040075  00000101 R_386_32          38040000   .text
    38040085  00000101 R_386_32          38040000   .text
    3804009d  0003e602 R_386_PC32        380403fa   load_uboot
    380400a6  00000101 R_386_32          38040000   .text
    38040015  00029f02 R_386_PC32        3804bdd8   early_board_init
    38040023  0003f702 R_386_PC32        3804bdda   show_boot_progress_asm
    ...

Relocation section '.rel.rodata' at offset 0x6c968 contains 108 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    38051908  00000201 R_386_32          380518a4   .rodata
    38051938  00000201 R_386_32          380518a4   .rodata
    38051968  00000201 R_386_32          380518a4   .rodata
    38051998  00000201 R_386_32          380518a4   .rodata
    380519c8  00000201 R_386_32          380518a4   .rodata
    380519f8  00000201 R_386_32          380518a4   .rodata
    ...

Relocation section '.rel.dyn' at offset 0x199a0 contains 2134 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    0000f838  00000008 R_386_RELATIVE
    0000f846  00000008 R_386_RELATIVE
    38040010  00000008 R_386_RELATIVE
    3804001e  00000008 R_386_RELATIVE
    38040028  00000008 R_386_RELATIVE
    3804003f  00000008 R_386_RELATIVE
    38040051  00000008 R_386_RELATIVE
    38040075  00000008 R_386_RELATIVE
    38040085  00000008 R_386_RELATIVE

Notice that, apart from .rel.dyn, none of the .rel.* sections have the A (Allocated) flag set - they do not end up in the stripped binary image. .rel.dyn is
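For illustration, walking a table of R_386_RELATIVE entries like the .rel.dyn listing above could look roughly like the sketch below. This is not code from U-Boot: the function name, the `rel32_t` type and the idea of patching a copy held in a buffer are assumptions made for the example. It only relies on the ELF rule that a RELATIVE relocation adds the load-address difference to the word already stored at `r_offset`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same layout as Elf32_Rel in the ELF specification. */
typedef struct {
    uint32_t r_offset; /* link-time address of the word to patch */
    uint32_t r_info;   /* (symbol index << 8) | relocation type */
} rel32_t;

#define RELOC_TYPE(info)    ((info) & 0xffu)
#define TYPE_R_386_RELATIVE 8u

/*
 * Hypothetical helper: `image` holds a copy of an image linked at
 * link_base that will run at run_base. Every R_386_RELATIVE entry that
 * falls inside the image gets (run_base - link_base) added to its word.
 * Returns the number of words patched.
 */
size_t apply_relative_relocs(uint8_t *image, size_t image_size,
                             uint32_t link_base, uint32_t run_base,
                             const rel32_t *rel, size_t count)
{
    uint32_t delta = run_base - link_base; /* wraps modulo 2^32 on purpose */
    size_t patched = 0;

    for (size_t i = 0; i < count; i++) {
        if (RELOC_TYPE(rel[i].r_info) != TYPE_R_386_RELATIVE)
            continue;
        if (rel[i].r_offset < link_base || image_size < 4)
            continue; /* below the image, e.g. an entry at 0x0000f838 */

        uint32_t off = rel[i].r_offset - link_base;
        if (off > image_size - 4)
            continue; /* past the end of the image */

        uint32_t word;
        memcpy(&word, image + off, sizeof word); /* avoids unaligned access */
        word += delta;
        memcpy(image + off, &word, sizeof word);
        patched++;
    }
    return patched;
}
```

In a real boot loader the table bounds would presumably come from linker symbols such as `__rel_dyn_start` and `__rel_dyn_end` from the map file; here they are plain parameters so the routine can be tried on a host.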
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ wrote:
On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Joakim Tjernlund wrote:

megasnip

Apologies if this is getting way off-topic for a simple boot loader, but this is information I have gathered from far and wide over the net. I am surprised that there isn't a web site out there on 'How to create a relocatable boot loader'...

OK, it's all starting to come together now - it helps when you look at the right files ;)

Firstly, u-boot.map:

                    0x380589a0                __rel_dyn_start = .
    .rel.dyn        0x380589a0     0x42b0
    *(.rel.dyn)
    .rel.got        0x00000000        0x0 cpu/i386/start.o
    .rel.plt        0x00000000        0x0 cpu/i386/start.o
    .rel.text       0x380589a0     0x2e28 cpu/i386/start.o
    .rel.start16    0x3805b7c8       0x10 cpu/i386/start.o
    .rel.data       0x3805b7d8      0xc18 cpu/i386/start.o
    .rel.rodata     0x3805c3f0      0x360 cpu/i386/start.o
    .rel.u_boot_cmd 0x3805c750      0x500 cpu/i386/start.o
                    0x3805cc50                __rel_dyn_end = .

And the output of readelf...

Section Headers:

    [Nr] Name              Type      Addr     Off    Size   ES Flg Lk Inf Al
    [ 0]                   NULL      00000000 000000 000000 00      0   0  0
    [ 1] .text             PROGBITS  38040000 001000 0118a4 00  AX  0   0  4
    [ 2] .rel.text         REL       00000000 066c68 005d00 08     40   1  4
    [ 3] .rodata           PROGBITS  380518a4 0128a4 005da5 00   A  0   0  4
    [ 4] .rel.rodata       REL       00000000 06c968 000360 08     40   3  4
    [ 5] .interp           PROGBITS  38057649 018649 000013 00   A  0   0  1
    [ 6] .dynstr           STRTAB    3805765c 01865c 0001ee 00   A  0   0  1
    [ 7] .hash             HASH      3805784c 01884c 0000cc 04   A 11   0  4
    [ 8] .data             PROGBITS  38057918 018918 000a3c 00  WA  0   0  4
    [ 9] .rel.data         REL       00000000 06ccc8 000c18 08     40   8  4
    [10] .got.plt          PROGBITS  38058354 019354 00000c 04  WA  0   0  4
    [11] .dynsym           DYNSYM    38058360 019360 000200 10   A  6   1  4
    [12] .dynamic          DYNAMIC   38058560 019560 000080 08  WA  6   0  4
    [13] .u_boot_cmd       PROGBITS  380585e0 0195e0 0003c0 00  WA  0   0  4
    [14] .rel.u_boot_cmd   REL       00000000 06d8e0 000500 08     40  13  4
    [15] .bss              NOBITS    3805cc50 01ec50 001a34 00  WA  0   0  4
    [16] .bios             PROGBITS  00000000 01e000 00053e 00  AX  0   0  1
    [17] .rel.bios         REL       00000000 06dde0 0000c0 08     40  16  4
    [18] .rel.dyn          REL       380589a0 0199a0 0042b0 08   A 11   0  4
    [19] .start16          PROGBITS  0000f800 01e800 000110 00  AX  0   0  1
    [20] .rel.start16      REL       00000000 06dea0 000038 08     40  19  4
    [21] .resetvec         PROGBITS  0000fff0 01eff0 000010 00  AX  0   0  1
    [22] .rel.resetvec     REL       00000000 06ded8 000008 08     40  21  4
    ...

Relocation section '.rel.text' at offset 0x66c68 contains 2976 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    38040010  00000101 R_386_32          38040000   .text
    3804001e  00000101 R_386_32          38040000   .text
    38040028  00000101 R_386_32          38040000   .text
    3804003f  00000101 R_386_32          38040000   .text
    38040051  00000101 R_386_32          38040000   .text
    38040075  00000101 R_386_32          38040000   .text
    38040085  00000101 R_386_32          38040000   .text
    3804009d  0003e602 R_386_PC32        380403fa   load_uboot
    380400a6  00000101 R_386_32          38040000   .text
    38040015  00029f02 R_386_PC32        3804bdd8   early_board_init
    38040023  0003f702 R_386_PC32        3804bdda   show_boot_progress_asm
    ...

Relocation section '.rel.rodata' at offset 0x6c968 contains 108 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    38051908  00000201 R_386_32          380518a4   .rodata
    38051938  00000201 R_386_32          380518a4   .rodata
    38051968  00000201 R_386_32          380518a4   .rodata
    38051998  00000201 R_386_32          380518a4   .rodata
    380519c8  00000201 R_386_32          380518a4   .rodata
    380519f8  00000201 R_386_32          380518a4   .rodata
    ...

Relocation section '.rel.dyn' at offset 0x199a0 contains 2134 entries:

     Offset     Info    Type            Sym.Value  Sym. Name
    0000f838  00000008 R_386_RELATIVE
    0000f846  00000008 R_386_RELATIVE
    38040010  00000008 R_386_RELATIVE
    3804001e  00000008 R_386_RELATIVE
    38040028  00000008 R_386_RELATIVE
    3804003f  00000008 R_386_RELATIVE
    38040051  00000008 R_386_RELATIVE
    38040075  00000008 R_386_RELATIVE
    38040085  00000008 R_386_RELATIVE

Notice that, apart from .rel.dyn, none of the .rel.* sections have the A (Allocated) flag set - they do not end
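The observation about the A flag can be turned into a small size check. The sketch below is only an illustration with made-up names (`section_row_t`, `loaded_reloc_bytes`): it sums the sizes of the relocation sections that carry the allocated flag, which is the part of the relocation data that would end up in the loaded image. Fed with the readelf figures quoted above, only .rel.dyn should count, and its 0x42b0 bytes match 2134 entries of 8 bytes each.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SEC_ALLOC 0x2u /* same bit value as SHF_ALLOC in the ELF spec */

/* Hypothetical, trimmed-down view of one row of the section header table. */
typedef struct {
    const char *name;
    uint32_t    size;
    uint32_t    flags;
} section_row_t;

/* Sum the sizes of allocated sections whose name starts with ".rel". */
uint32_t loaded_reloc_bytes(const section_row_t *rows, size_t n)
{
    uint32_t total = 0;

    for (size_t i = 0; i < n; i++) {
        if ((rows[i].flags & SEC_ALLOC) &&
            strncmp(rows[i].name, ".rel", 4) == 0)
            total += rows[i].size;
    }
    return total;
}
```

A fuller accounting of the penalty would presumably also add the other allocated helper sections from the listing (.dynsym, .dynstr, .hash, .dynamic, .interp, .got.plt); that choice is left to the caller here.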
Re: [U-Boot] Relocation size penalty calculation
On Thu, Oct 15, 2009 at 3:45 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 14/10/2009 13:48:27:
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

So, all that is left are .dynsym and .dynamic ...

.dynsym
  - Contains 70 entries (16 bytes each, 1120 bytes)
  - 44 entries mimic those entries in .got which are not relocated
  - 21 entries are the remaining symbols exported from the linker script
  - 4 entries are labels defined in inline asm and used in C

Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.

Thanks - Now .dynsym contains only exports from the linker script :)

  - 1 entry is a NULL entry

.dynamic
  - 88 bytes
  - Array of Elf32_Dyn
  - typedef struct {
        Elf32_Sword d_tag;
        union {
            Elf32_Word d_val;
            Elf32_Addr d_ptr;
        } d_un;
    } Elf32_Dyn;
  - 11 entries
    [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
    [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
    [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
    [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
    [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
    [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
    [06] 0x00000015, 0x00000000 DT_DEBUG, ???
    [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
    [08] 0x00000012, 0x000014D8 DT_RELSZ, ???

How big DT_REL is

    [09] 0x00000013, 0x00000008 DT_RELENT, ???

hmm, cannot remember :)

How big an entry in DT_REL is

Right, how could I forget :)

    [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???

Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.

Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new
Re: [U-Boot] Relocation size penalty calculation
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

So, all that is left are .dynsym and .dynamic ...

.dynsym
  - Contains 70 entries (16 bytes each, 1120 bytes)
  - 44 entries mimic those entries in .got which are not relocated
  - 21 entries are the remaining symbols exported from the linker script
  - 4 entries are labels defined in inline asm and used in C

Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.

Thanks - Now .dynsym contains only exports from the linker script :)

  - 1 entry is a NULL entry

.dynamic
  - 88 bytes
  - Array of Elf32_Dyn
  - typedef struct {
        Elf32_Sword d_tag;
        union {
            Elf32_Word d_val;
            Elf32_Addr d_ptr;
        } d_un;
    } Elf32_Dyn;
  - 11 entries
    [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
    [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
    [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
    [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
    [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
    [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
    [06] 0x00000015, 0x00000000 DT_DEBUG, ???
    [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
    [08] 0x00000012, 0x000014D8 DT_RELSZ, ???

How big DT_REL is

    [09] 0x00000013, 0x00000008 DT_RELENT, ???

hmm, cannot remember :)

How big an entry in DT_REL is

Right, how could I forget :)

    [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???

Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.

Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add the offset, and write back the new address. Normally one does what you describe, but here we know that the whole image has moved, so we don't have
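To make the "just add an offset" argument concrete, here is a minimal sketch under the assumption that text, data and bss all move by the same `delta`. The ELF spec computes R_386_32 as S + A and R_386_PC32 as S + A - P; if both the symbol (S) and the patched place (P) move by `delta`, the absolute word grows by `delta` while the PC-relative word does not change at all. The function name and the enum are invented for the example.

```c
#include <stdint.h>

/* Relocation type numbers as assigned in the i386 ELF supplement. */
enum {
    TYPE_R_386_32   = 1, /* S + A     : absolute address */
    TYPE_R_386_PC32 = 2  /* S + A - P : PC-relative displacement */
};

/*
 * Hypothetical helper: given the word currently stored at a relocation
 * site in an image linked for one address, return the word needed after
 * the whole image has been moved by `delta` bytes.
 */
uint32_t reloc_word_after_move(uint32_t word, uint32_t r_info, uint32_t delta)
{
    switch (r_info & 0xffu) {
    case TYPE_R_386_32:
        return word + delta; /* target moved, so the absolute address moves */
    case TYPE_R_386_PC32:
        return word;         /* source and target moved together */
    default:
        return word;         /* other types are not handled in this sketch */
    }
}
```

No symbol table lookup is needed in this case, which is the point being argued; the sketch does not cover an image whose segments move by different amounts.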
Re: [U-Boot] Relocation size penalty calculation
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

So, all that is left are .dynsym and .dynamic ...

.dynsym
  - Contains 70 entries (16 bytes each, 1120 bytes)
  - 44 entries mimic those entries in .got which are not relocated
  - 21 entries are the remaining symbols exported from the linker script
  - 4 entries are labels defined in inline asm and used in C

Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.

Thanks - Now .dynsym contains only exports from the linker script :)

  - 1 entry is a NULL entry

.dynamic
  - 88 bytes
  - Array of Elf32_Dyn
  - typedef struct {
        Elf32_Sword d_tag;
        union {
            Elf32_Word d_val;
            Elf32_Addr d_ptr;
        } d_un;
    } Elf32_Dyn;
  - 11 entries
    [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
    [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
    [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
    [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
    [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
    [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
    [06] 0x00000015, 0x00000000 DT_DEBUG, ???
    [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
    [08] 0x00000012, 0x000014D8 DT_RELSZ, ???

How big DT_REL is

    [09] 0x00000013, 0x00000008 DT_RELENT, ???

hmm, cannot remember :)

How big an entry in DT_REL is

Right, how could I forget :)

    [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???

Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.

Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add the offset, and write back the new
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 14/10/2009 13:48:27:
On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

So, all that is left are .dynsym and .dynamic ...

.dynsym
  - Contains 70 entries (16 bytes each, 1120 bytes)
  - 44 entries mimic those entries in .got which are not relocated
  - 21 entries are the remaining symbols exported from the linker script
  - 4 entries are labels defined in inline asm and used in C

Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.

Thanks - Now .dynsym contains only exports from the linker script :)

  - 1 entry is a NULL entry

.dynamic
  - 88 bytes
  - Array of Elf32_Dyn
  - typedef struct {
        Elf32_Sword d_tag;
        union {
            Elf32_Word d_val;
            Elf32_Addr d_ptr;
        } d_un;
    } Elf32_Dyn;
  - 11 entries
    [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
    [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
    [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
    [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
    [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
    [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
    [06] 0x00000015, 0x00000000 DT_DEBUG, ???
    [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
    [08] 0x00000012, 0x000014D8 DT_RELSZ, ???

How big DT_REL is

    [09] 0x00000013, 0x00000008 DT_RELENT, ???

hmm, cannot remember :)

How big an entry in DT_REL is

Right, how could I forget :)

    [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???

Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.

Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote:
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

So, all that is left are .dynsym and .dynamic ...

.dynsym
  - Contains 70 entries (16 bytes each, 1120 bytes)
  - 44 entries mimic those entries in .got which are not relocated
  - 21 entries are the remaining symbols exported from the linker script
  - 4 entries are labels defined in inline asm and used in C

Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these.

Thanks - Now .dynsym contains only exports from the linker script :)

  - 1 entry is a NULL entry

.dynamic
  - 88 bytes
  - Array of Elf32_Dyn
  - typedef struct {
        Elf32_Sword d_tag;
        union {
            Elf32_Word d_val;
            Elf32_Addr d_ptr;
        } d_un;
    } Elf32_Dyn;
  - 11 entries
    [00] 0x00000010, 0x00000000 DT_SYMBOLIC, (ignored)
    [01] 0x00000004, 0x38059994 DT_HASH, points to .hash
    [02] 0x00000005, 0x380595AB DT_STRTAB, points to .dynstr
    [03] 0x00000006, 0x3805BDCC DT_SYMTAB, points to .dynsym
    [04] 0x0000000A, 0x000003E6 DT_STRSZ, size of .dynstr
    [05] 0x0000000B, 0x00000010 DT_SYMENT, ???
    [06] 0x00000015, 0x00000000 DT_DEBUG, ???
    [07] 0x00000011, 0x3805A8F4 DT_REL, points to .rel.text
    [08] 0x00000012, 0x000014D8 DT_RELSZ, ???

How big DT_REL is

    [09] 0x00000013, 0x00000008 DT_RELENT, ???

hmm, cannot remember :)

How big an entry in DT_REL is

Right, how could I forget :)

    [0a] 0x00000016, 0x00000000 DT_TEXTREL, ???

Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that isn't truly PIC, so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit.

Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just ADDING an offset. The image is linked to a specific address
Re: [U-Boot] Relocation size penalty calculation
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 17:35:44:
Joakim Tjernlund wrote:
J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52:
Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56:
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05:
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19:

[Massive Snip :)]

[Yet another SNIP :)]

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags

    PLATFORM_RELFLAGS += -fvisibility=hidden
    PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
    PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

BTW, how many relocs do you get compared with -fPIC? I suspect you get more now, but hopefully not that many more.

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully objcopy with --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load.

Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add the offset, and write back the new address. Normally one does what you describe, but here we know that the whole image has moved, so we don't have to calculate the new address from scratch.

If the addresses of the bss, text, and data segments change by the same value, I think you are correct. However, if the text and data/bss segments are moved by different offsets, naturally the relocations would be different. One reason to retain this capability would be to allow the u-boot copy to execute in place in NOR flash while relocating the read-write storage once memory has been sized. Having different relocation factors is not much worse than just one, and it may be just as easy to get working initially as a single relocation constant.

How do you figure that? You need to rewrite the insns to access the moved data/bss, and they are in flash. Did I miss something?

FWIW, the ultimate solution for minimum relocation size is a post-processing step that creates several arrays of relocation offsets as two-byte quantities. This reduces the cost of each relocation entry to just a bit more than two bytes (there is a small overhead for array size, MSB values and relocation offset selection). Naturally, this is much less than the ELF version of the same relocations, because we do not need to retain as much information and ELF doesn't worry about size that much. This may pacify users for whom the flash size of the image is critical, at the expense of an extra link step. Naturally, getting things to work with standard ELF is the most important step, and probably enough for most people.

That would save 2+4 bytes/reloc on REL arches and 2+4+4 on RELA (ppc), provided one can ignore r_addend. But yes, this is probably too fancy for the moment.

Jocke

___
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote: Graeme Russ graeme.r...@gmail.com wrote on 14/10/2009 13:48:27: On Wed, Oct 14, 2009 at 6:25 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52: Joakim Tjernlund wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56: On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. 
TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack hmm, same as ppc but that in it self should not cause a TEXREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it but it is not important ATM I guess. Evil idea, skip -fpic et. all and add the full reloc procedure to relocate by rewriting directly in TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think. With the following flags PLATFORM_RELFLAGS += -fvisibility=hidden PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them BTW, how many relocs do you get compared with -fPIC? I suspect you more now but hopefully not that many more. Possibly, but I think you only need to add an offset to all those relocs. Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. 
Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully the objcopy with the --strip-unneeded will remove
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote: J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 17:35:44: Joakim Tjernlund wrote: J. William Campbell jwilliamcampb...@comcast.net wrote on 14/10/2009 01:48:52: Joakim Tjernlund wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56: On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] [Yet another SNIP :)] Evil idea, skip -fpic et. all and add the full reloc procedure to relocate by rewriting directly in TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think. With the following flags PLATFORM_RELFLAGS += -fvisibility=hidden PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them BTW, how many relocs do you get compared with -fPIC? I suspect you more now but hopefully not that many more. Possibly, but I think you only need to add an offset to all those relocs. Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. 
Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 ad R_386_32 relocations. Hopefully the objcopy with the --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure. Note also that you can change the section flags of a section marked noload to load. Still think you can get away with just ADDING an offset. The image is linked to a specific address and then you move the whole image to a new address. Therefore you should be able to read the current address, add offset, write back the new address. Normally one do what you describe but here we know that the whole img has moved so we don't have to do calculate the new address from scratch. If the addresses of the bss, text, and data segments change by the same value, I think you are correct. However, if the text and data/bss segments are moved by different offsets, naturally the relocations would be different. One reason to retain this capability would be to allow the u-boot copy to execute in place in NOR flash while re-locating the read-write storage once memory has been sized. Having different relocation factors is not much worse than just one, and it may be just as easy to get working initially as a single relocation constant. How do figure that? You need to rewrite the insn to access the moved data/bss and they are in flash, did I miss something? No, I did. You are quite correct, there would be references in flash that couldn't be fixed. Sorry about that. Best Regards, Bill Campbell FWIW, the ultimate solution to minimum relocation size is a post-processing step that creates several arrays of relocation offsets as two byte quantities. This reduces the cost of each relocation entry to just a bit more than two bytes (there is a small overhead for array size, MSB values and relocation offset selection.) 
Naturally, this is much less than the ELF version of the same relocations, because we do not need to retain as much information and ELF doesn't worry about size that much. This may pacify users for whom the flash size of the image is critical, at the expense of an extra link step. Naturally, getting things to work with standard ELF is the most important step, and probably enough for most people.

That would save 2+4 bytes/reloc on REL arches and 2+4+4 on RELA (ppc), provided one can ignore r_addend. But yes, this is probably too fancy for the moment.

Jocke
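The two-byte relocation table discussed in this message could look roughly like the sketch below. The thread does not specify a format, so the layout (a per-group header carrying the shared upper 16 bits and an entry count, followed by the 16-bit lower halves) and the names `reloc_group` and `apply_compact_relocs` are assumptions made purely for illustration. The sketch also assumes the whole image moves by one delta, so every listed word only needs that delta added.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical compact table: offsets that share the same upper 16 bits
 * are grouped, so each relocation costs one uint16_t plus a small
 * per-group header. */
struct reloc_group {
    uint16_t msb;          /* upper 16 bits of every image offset in the group */
    uint16_t count;        /* number of entries in lsb[] */
    const uint16_t *lsb;   /* lower 16 bits of each image offset */
};

/* Add delta to every 32-bit word named by the table.
 * Returns the number of words patched. */
static size_t apply_compact_relocs(uint8_t *image, size_t image_size,
                                   const struct reloc_group *groups,
                                   size_t ngroups, uint32_t delta)
{
    size_t patched = 0;

    for (size_t g = 0; g < ngroups; g++) {
        for (uint16_t i = 0; i < groups[g].count; i++) {
            uint32_t off = ((uint32_t)groups[g].msb << 16) | groups[g].lsb[i];
            uint32_t word;

            /* Refuse to patch outside the image. */
            assert(image_size >= sizeof word && off <= image_size - sizeof word);
            memcpy(&word, image + off, sizeof word);   /* unaligned-safe read */
            word += delta;
            memcpy(image + off, &word, sizeof word);
            patched++;
        }
    }
    return patched;
}
```

For comparison, an Elf32_Rel entry is 8 bytes (r_offset plus r_info) and an Elf32_Rela entry is 12, so a 2-byte entry gives the 2+4 and 2+4+4 byte savings quoted above, minus the group headers.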
Re: [U-Boot] Relocation size penalty calculation
On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. 
The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

This is not a problem because this is very low-level init that is not called once relocated into RAM. These relocations can be safely ignored.

[0b] 0x6FFA, 0x0236 ???, Entries in .rel.dyn
[0c] 0x, 0x DT_NULL, End of Array
[0d] 0x, 0x DT_NULL, End of Array
[0e] 0x, 0x DT_NULL, End of Array
[0f] 0x, 0x DT_NULL, End of Array
[10] 0x, 0x DT_NULL, End of Array

I think some more investigation into the need for .dynsym and .dynamic is still required...

.dynsym may still be required, if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. It will look a little weird, but may mean not loading this section into RAM.

Other than that, .dynsym is now only needed to locate the sections during the relocation phase and can be kept in flash and not copied to RAM.

I don't think .dynamic is needed, due to the exporting of section addresses from the linker script.

Regards,
Graeme
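For readers following the .dynamic dump in this message, here is a minimal sketch of how relocation code might pull DT_REL, DT_RELSZ and DT_RELENT out of an Elf32_Dyn array. The structure and tag values follow the ELF specification; `struct rel_info` and `find_rel_table` are hypothetical names, not anything from the U-Boot tree.

```c
#include <stdint.h>

/* Same layout as the Elf32_Dyn typedef quoted in the thread. */
typedef struct {
    int32_t d_tag;
    union {
        uint32_t d_val;
        uint32_t d_ptr;
    } d_un;
} Elf32_Dyn;

/* Tag values from the ELF specification. */
#define DT_NULL    0   /* terminates the array */
#define DT_REL     17  /* 0x11: address of the REL table */
#define DT_RELSZ   18  /* 0x12: total size of the REL table in bytes */
#define DT_RELENT  19  /* 0x13: size of one REL entry in bytes */

/* Hypothetical result record. */
struct rel_info {
    uint32_t addr;     /* link-time address of the table */
    uint32_t size;     /* table size in bytes */
    uint32_t entsize;  /* bytes per entry */
};

/* Walk the array up to the first DT_NULL.
 * Returns 1 if all three tags were present, 0 otherwise. */
static int find_rel_table(const Elf32_Dyn *dyn, struct rel_info *out)
{
    int found = 0;

    for (; dyn->d_tag != DT_NULL; dyn++) {
        switch (dyn->d_tag) {
        case DT_REL:
            out->addr = dyn->d_un.d_ptr;
            found |= 1;
            break;
        case DT_RELSZ:
            out->size = dyn->d_un.d_val;
            found |= 2;
            break;
        case DT_RELENT:
            out->entsize = dyn->d_un.d_val;
            found |= 4;
            break;
        default:
            break;  /* DT_HASH, DT_STRTAB, DT_SYMTAB, ... are not needed here */
        }
    }
    return found == 7;
}
```

With the values listed in the dump (DT_RELSZ 0x14D8, DT_RELENT 0x0008), size / entsize gives 667 entries in the table that DT_REL points to.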
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. 
The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space, but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

This is not a problem because this is very low-level init that is not called once relocated into RAM. These relocations can be safely ignored.

[0b] 0x6FFA, 0x0236 ???, Entries in .rel.dyn
[0c] 0x, 0x DT_NULL, End of Array
[0d] 0x, 0x DT_NULL, End of Array
[0e] 0x, 0x DT_NULL, End of Array
[0f] 0x, 0x DT_NULL, End of Array
[10] 0x, 0x DT_NULL, End of Array

I think some more investigation into the need for .dynsym and .dynamic is still required...

.dynsym may still be required, if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. It will look a little weird, but may mean not loading this section into RAM.

Why do you need to muck around with u_boot_cmd at all? Now that relocation works, you should be able to drop all that code/linker stuff?

Other than that, .dynsym is now only needed to locate the sections during the relocation phase and can be kept in flash and not copied to RAM.

Still occupies space in the *bin image though.
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. 
The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space, but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

I think this approach will turn out to be a big win. At present, the problem with just using the relocs is that objcopy is stripping them out when u-boot.bin is created, as I understand it. It seems this can be solved by changing the command switches appropriately, like using --strip-unneeded. In any case, there is some combination of switches that will preserve the relocation data. The executable code will get smaller, there will be no .got, and the relocation data will be larger (than with -fpic). In total size, it probably will be slightly smaller, but that is a guess. The most important benefit of this approach is that it will work for all architectures, thereby solving the problem once and forever! Even if the result is a bit larger, the RAM footprint will be reduced by the smaller object code size (since the relocation data need not be copied into RAM). Having this approach as an option would be really nice, since it would always just work.
Best Regards, Bill Campbell This is not a problem because this is very low-level init that is not called once relocated into RAM - These relocations can be safely ignored [0b] 0x6FFA, 0x0236 ???, Entries in .rel.dyn [0c] 0x, 0x DT_NULL, End of Array [0d] 0x, 0x DT_NULL, End of Array [0e] 0x, 0x DT_NULL, End of Array [0f] 0x, 0x DT_NULL, End of Array [10] 0x, 0x DT_NULL, End of Array I think some more investigation into the need for .dynsym and .dynamic is still required... .dynsym may still be required if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. It will look a little weird, but may mean
Re: [U-Boot] Relocation size penalty calculation
J. William Campbell jwilliamcampb...@comcast.net wrote on 13/10/2009 18:30:43: Joakim Tjernlund wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. 
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack hmm, same as ppc but that in it self should not cause a TEXREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it but it is not important ATM I guess. Evil idea, skip -fpic et. all and add the full reloc procedure to relocate by rewriting directly in TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think. I think this approach will turn out to be a big win. At present, the problem with just using the relocs is that objcopy is stripping them out when u-boot.bin is created, as I understand it. It seems this can be solved by changing the command switches appropriately, like using --strip-unneeded. In any case, there is some combination of switches that will preserve the relocation data. The executable code will get smaller, there will be no .got, and the relocation data will be larger (than with -fpic). In total size, it probably will be slightly smaller, but that is a guess. The most important benefit of this approach is that it will work for all architectures, thereby solving the problem once and forever! Even if the result is a bit larger, the RAM footprint will be reduced by the smaller object code size (since the relocation data need not be copied into ram).Having this approach as an option would be real nice, since it would always just work. Yes, I had this in the back of my head. 
I do think some arch other than ppc will have to try this out though :) I am not 100% sure this will work with my end goal: true PIC, so I can load the same image anywhere in flash.

Jocke
Re: [U-Boot] Relocation size penalty calculation
On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. 
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack hmm, same as ppc but that in it self should not cause a TEXREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it but it is not important ATM I guess. Evil idea, skip -fpic et. all and add the full reloc procedure to relocate by rewriting directly in TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think. With the following flags PLATFORM_RELFLAGS += -fvisibility=hidden PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them This is not a problem because this is very low-level init that is not called once relocated into RAM - These relocations can be safely ignored [0b] 0x6FFA, 0x0236 ???, Entries in .rel.dyn [0c] 0x, 0x DT_NULL, End of Array [0d] 0x, 0x DT_NULL, End of Array [0e] 0x, 0x DT_NULL, End of Array [0f] 0x, 0x DT_NULL, End of Array [10] 0x, 0x DT_NULL, End of Array I think some more investigation into the need for .dynsym and .dynamic is still required... .dynsym may still be required if only for accessing the __u_boot_cmd structure. However, I may be able to hack that a little and not create a __u_boot_cmd symbol in the linker script (create some other temporary symbol) and populate __u_boot_cmd with a valid value after relocation. 
It will look a little weird, but may mean not loading this section into RAM.

Why do you need to muck around with u_boot_cmd at all? Now that relocation works, you should be able to drop all that code/linker stuff?

Other than that, .dynsym is now only needed to locate the sections during the relocation phase and can be kept in flash and not copied to RAM.

Still occupies space in the *bin image though.
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 22:06:56: On Tue, Oct 13, 2009 at 10:53 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 13/10/2009 13:21:05: On Sun, Oct 11, 2009 at 11:51 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 11/10/2009 12:47:19: [Massive Snip :)] So, all that is left are .dynsym and .dynamic ... .dynsym - Contains 70 entries (16 bytes each, 1120 bytes) - 44 entries mimic those entries in .got which are not relocated - 21 entries are the remaining symbols exported from the linker script - 4 entries are labels defined in inline asm and used in C Try adding proper asm declarations. Look at what gcc generates for a function/variable and mimic these. Thanks - Now .dynsym contains only exports from the linker script :) - 1 entry is a NULL entry .dynamic - 88 bytes - Array of Elf32_Dyn - typedef struct { Elf32_Sword d_tag; union { Elf32_Word d_val; Elf32_Addr d_ptr; } d_un; } Elf32_Dyn; - 0x11 entries [00] 0x0010, 0x DT_SYMBOLIC, (ignored) [01] 0x0004, 0x38059994 DT_HASH, points to .hash [02] 0x0005, 0x380595AB DT_STRTAB, points to .dynstr [03] 0x0006, 0x3805BDCC DT_SYMTAB, points to .dynsym [04] 0x000A, 0x03E6 DT_STRSZ, size of .dynstr [05] 0x000B, 0x0010 DT_SYMENT, ??? [06] 0x0015, 0x DT_DEBUG, ??? [07] 0x0011, 0x3805A8F4 DT_REL, points to .rel.text [08] 0x0012, 0x14D8 DT_RELSZ, ??? How big DT_REL is [09] 0x0013, 0x0008 DT_RELENT, ??? hmm, cannot remeber :) How big an entry in DT_REL is Right, how could I forget :) [0a] 0x0016, 0x DT_TEXTREL, ??? Oops, you got text relocations. This is generally a bad thing. TEXTREL is commonly caused by asm code that arent truly pic so it needs to modify the .text segment to adjust for relocation. You should get rid of this one. Look for DT_TEXTREL in .o files to find the culprit. 
Alas I cannot - the relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation, prior to the stack becoming available. The x86 does not allow direct access to the IP, so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack.

Hmm, same as ppc, but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it, but it is not important ATM.

Evil idea: skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space, but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much, I think.

With the following flags:

PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions

I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them.

Possibly, but I think you only need to add an offset to all those relocs.

Jocke
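The "only add an offset" idea can be sketched as below. This is an illustration under stated assumptions, not code from the thread: it assumes the image was fully linked at `link_base`, that text, data and bss all move by the same `delta`, and that each R_386_32 field therefore already holds its link-time absolute address. Under those assumptions R_386_PC32 fields need no change, since the referencing place and the target move together. The helper name `apply_delta` is hypothetical; the entry layout and relocation type numbers follow the ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ELF32 REL entry as defined by the ELF specification (8 bytes). */
typedef struct {
    uint32_t r_offset;  /* link-time address of the word to patch */
    uint32_t r_info;    /* (symbol index << 8) | relocation type */
} Elf32_Rel;

#define ELF32_R_TYPE(info) ((info) & 0xffu)
#define R_386_32    1   /* absolute address: must be adjusted */
#define R_386_PC32  2   /* PC-relative: unchanged when everything moves together */

/* Hypothetical helper: add delta to every R_386_32 word in the copied image.
 * Returns the number of words patched. */
static size_t apply_delta(uint8_t *image, uint32_t image_size,
                          uint32_t link_base,
                          const Elf32_Rel *rel, size_t count,
                          uint32_t delta)
{
    size_t patched = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t off, word;

        if (ELF32_R_TYPE(rel[i].r_info) != R_386_32)
            continue;                     /* PC-relative entries are skipped */

        off = rel[i].r_offset - link_base;
        /* Refuse to patch outside the image. */
        assert(image_size >= sizeof word && off <= image_size - sizeof word);

        memcpy(&word, image + off, sizeof word);   /* unaligned-safe read */
        word += delta;                    /* old address + offset = new address */
        memcpy(image + off, &word, sizeof word);
        patched++;
    }
    return patched;
}
```

Note that no symbol table lookup appears anywhere in the loop; the symbol index in r_info is simply ignored, which is the point of the single-offset argument.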
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote:
[snip]
Alas I cannot - The relocations are a result of loading a register with a return address when calling show_boot_progress in the very early stages of initialisation prior to the stack becoming available. The x86 does not allow direct access to the IP so the only way to find the 'current execution address' is to 'call' to the next instruction and pop the return address off the stack

hmm, same as ppc but that in itself should not cause a TEXTREL, should it? Ahh, the 'call' is absolute, not relative? I guess there is some way around it but it is not important ATM I guess.

Evil idea, skip -fpic et al. and add the full reloc procedure to relocate by rewriting directly in the TEXT segment. Then you save space but you need more relocation code. Something like dl_do_reloc from uClibc. Wonder how much extra code that would be? Not too much I think.

With the following flags
PLATFORM_RELFLAGS += -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pic --emit-relocs -Bsymbolic -Bsymbolic-functions
I get no .got, but a lot of R_386_PC32 and R_386_32 relocations. I think this might mean I need the symbol table in the binary in order to resolve them

Possibly, but I think you only need to add an offset to all those relocs.

Almost right. The relocations specify a symbol value that needs to be added to the data in memory to relocate the reference. The symbol values involved should be the start of the text section for program references, the start of the uninitialized data section for bss references, and the start of the data section for initialized data and constants. So there are about four symbols whose value you need to keep. Take a look at http://refspecs.freestandards.org/elf/elf.pdf (which you have probably already looked at) and it tells you what to do with R_386_PC32 and R_386_32 relocations. Hopefully the objcopy with the --strip-unneeded will remove all the symbols you don't actually need, but I don't know that for sure.

Note also that you can change the section flags of a section marked noload to load.

Best Regards,
Bill Campbell
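Editor's sketch: the "add an offset" idea above can be illustrated for the simplest case, where the whole linked image is copied as one block so that text, data and bss all move by the same amount (the per-section symbol values Bill mentions then collapse into a single delta). Under that assumption an R_386_32 field holds an absolute address and needs the delta added, while an R_386_PC32 field holds a distance between two places that moved together and can be left alone. The names rel_entry and fixup_image are hypothetical; this is an illustration, not the thread's actual relocation code, and a PC-relative reference to something that does not move with the image would need extra handling.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Same layout as Elf32_Rel. Hypothetical local name. */
typedef struct {
    uint32_t r_offset; /* link-time address of the 32-bit field to patch */
    uint32_t r_info;   /* symbol index in bits 8..31, type in bits 0..7 */
} rel_entry;

/* Standard i386 relocation type numbers. */
enum { R_386_32 = 1, R_386_PC32 = 2 };

/*
 * Patch a copy of the image that now lives 'delta' bytes away from the
 * address it was linked at ('link_base'). Assumes every section moved
 * by the same delta. Returns the number of fields patched, or -1 on an
 * entry this sketch cannot handle.
 */
int fixup_image(uint8_t *image, size_t image_size, uint32_t link_base,
                const rel_entry *rel, size_t count, uint32_t delta)
{
    int patched = 0;

    for (size_t i = 0; i < count; i++) {
        uint32_t type = rel[i].r_info & 0xff;
        uint32_t off = rel[i].r_offset - link_base;
        uint32_t word;

        if (image_size < sizeof word || off > image_size - sizeof word)
            return -1; /* field lies outside the image */

        switch (type) {
        case R_386_32:
            /* Absolute address: shift it by the load offset. */
            memcpy(&word, image + off, sizeof word);
            word += delta;
            memcpy(image + off, &word, sizeof word);
            patched++;
            break;
        case R_386_PC32:
            /* Relative distance inside the image: unchanged. */
            break;
        default:
            return -1; /* other types are outside this sketch */
        }
    }
    return patched;
}
```

The symbol index in r_info is ignored here on purpose: with a single uniform delta no symbol lookup is needed, which is the space saving the thread is after.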
Re: [U-Boot] Relocation size penalty calculation
On Sun, Oct 11, 2009 at 2:38 AM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 13:21:10:
On Sat, Oct 10, 2009 at 9:47 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19:
On Sat, Oct 10, 2009 at 8:27 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 10:46:52:
On Sat, Oct 10, 2009 at 7:07 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 06:43:52:
On Fri, Oct 9, 2009 at 10:12 AM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Graeme Russ wrote:
On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Graeme Russ wrote:

Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted):

Section               non-reloc   reloc
---------------------------------------------
.text                 000118c4    000137fc  - 0x1f38 bytes (~8kB) bigger
.rodata               5bad        59d0
.interp               n/a         0013
.dynstr               n/a         0648
.hash                 n/a         0428
.eh_frame             3268        34fc
.data                 0a6c        01dc
.data.rel             n/a         0098
.data.rel.ro.local    n/a         0178
.data.rel.local       n/a         07e4
.got                              01f0
.got.plt              n/a         000c
.rel.got              n/a         03e0
.rel.dyn              n/a         1228
.dynsym               n/a         0850
.dynamic              n/a         0080
.u_boot_cmd           03c0        03c0
.bss                  1a34        1a34
.realmode             0166        0166
.bios                 053e        053e
=============================================
Total                 0001d5dd    00022287  - 0x4caa bytes (~19kB) bigger

Its more than a 16% increase in size!!!

.text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes

Hi Graeme, I would be interested in a third option (column), the x86 build with just -mrelocateable but NOT -fpic.
It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting.

x86 does not have -mrelocatable. This is a PPC only option :(

Hi Graeme, You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot.

I don't think --emit-relocs is necessary with -pic. I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie create enough information to perform relocation on the x86 platform

Try -fvisibility=hidden

Thanks - Shaved another 2539 bytes off the binary
Also found out how to get rid of .eh_frame (crept in when I upgraded to gcc 4.4.1) with -fno-dwarf2-cfi-asm, so that shaves another 13452 bytes
Total saving of 15.6k

Great, so now you are back at just a few percent added I guess?

Not really - The .eh_frame saving applies to both relocated and non relocated builds

OK, so you didn't use PIC before at all? Anyway I think you can do more. Using -Bsymbolic you should get away with RELATIVE relocs only and be able to skip a lot of segments above. Have a look at uClibc ldso/ldso/dl-startup.c

My build options thus far are:
PLATFORM_RELFLAGS += -fpie -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pie

-fpic / -pic make no difference

not on x86, on ppc it is a big difference.

Interestingly, -Bsymbolic adds exactly 8 bytes to .dynamic, but doesn't change the size of any other section

Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic
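Editor's sketch: the totals and the "more than 16%" figure in the size table quoted in this message can be re-derived from the per-section numbers. The C below is only a worked check of that arithmetic. The names section_size, column_total, penalty_bytes and penalty_percent are hypothetical, "n/a" is entered as 0, and the single .got figure is assumed to belong to the reloc column (the column sums only match the quoted totals that way).

```c
#include <stdint.h>
#include <stddef.h>

/* One row of the size table, sizes in bytes. Hypothetical name. */
typedef struct {
    const char *name;
    uint32_t non_reloc;
    uint32_t reloc;
} section_size;

/* Figures copied from the table; "n/a" is entered as 0. */
static const section_size sections[] = {
    { ".text",              0x118c4, 0x137fc },
    { ".rodata",            0x5bad,  0x59d0  },
    { ".interp",            0,       0x0013  },
    { ".dynstr",            0,       0x0648  },
    { ".hash",              0,       0x0428  },
    { ".eh_frame",          0x3268,  0x34fc  },
    { ".data",              0x0a6c,  0x01dc  },
    { ".data.rel",          0,       0x0098  },
    { ".data.rel.ro.local", 0,       0x0178  },
    { ".data.rel.local",    0,       0x07e4  },
    { ".got",               0,       0x01f0  }, /* assumed reloc-only */
    { ".got.plt",           0,       0x000c  },
    { ".rel.got",           0,       0x03e0  },
    { ".rel.dyn",           0,       0x1228  },
    { ".dynsym",            0,       0x0850  },
    { ".dynamic",           0,       0x0080  },
    { ".u_boot_cmd",        0x03c0,  0x03c0  },
    { ".bss",               0x1a34,  0x1a34  },
    { ".realmode",          0x0166,  0x0166  },
    { ".bios",              0x053e,  0x053e  },
};

/* Sum one column: reloc_build != 0 selects the relocatable build. */
uint32_t column_total(int reloc_build)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < sizeof sections / sizeof sections[0]; i++)
        sum += reloc_build ? sections[i].reloc : sections[i].non_reloc;
    return sum;
}

/* Extra bytes paid for the relocatable build. */
uint32_t penalty_bytes(void)
{
    return column_total(1) - column_total(0);
}

/* The same penalty as a percentage of the non-relocatable size. */
double penalty_percent(void)
{
    return 100.0 * penalty_bytes() / column_total(0);
}
```

The columns sum to 0x1d5dd and 0x22287, the difference is 0x4caa (19626 bytes), and that is about 16.3% of the non-relocatable build. The .text growth of 0x1f38 (7992 bytes) is roughly 41% of the penalty, consistent with "a little under half".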
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 06:43:52:
[snip]
Try -fvisibility=hidden
Thanks - Shaved another 2539 bytes off the binary Also found out how to get rid of .eh_frame (crept in when I upgraded to gcc 4.4.1) with -fno-dwarf2-cfi-asm, so that shaves another 13452 bytes Total saving of 15.6k
Great, so now you are back at just a few percent added I guess?
Re: [U-Boot] Relocation size penalty calculation
On Sat, Oct 10, 2009 at 7:07 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
[snip]
Great, so now you are back at just a few percent added I guess?
Not really - The .eh_frame saving applies to both relocated and non relocated builds
Regards, Graeme
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 10:46:52:
[snip]
Not really - The .eh_frame saving applies to both relocated and non relocated builds
OK, so you didn't use PIC before at all? Anyway I think you can do more. Using -Bsymbolic you should get away with RELATIVE relocs only and be able to skip a lot of segments above. Have a look at uClibc ldso/ldso/dl-startup.c
Re: [U-Boot] Relocation size penalty calculation
On Sat, Oct 10, 2009 at 8:27 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
[snip]
OK, so you didn't use PIC before at all? Anyway I think you can do more. Using -Bsymbolic you should get away with RELATIVE relocs only and be able to skip a lot of segments above. Have a look at uClibc ldso/ldso/dl-startup.c
My build options thus far are:
PLATFORM_RELFLAGS += -fpie -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pie
-fpic / -pic make no difference
Interestingly, -Bsymbolic adds exactly 8 bytes to .dynamic, but doesn't change the size of any other section
Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19:
[snip]
My build options thus far are:
PLATFORM_RELFLAGS += -fpie -fvisibility=hidden
PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm
PLATFORM_LDFLAGS += -pie
-fpic / -pic make no difference
not on x86, on ppc it is a big difference.
Interestingly, -Bsymbolic adds exactly 8 bytes to .dynamic, but doesn't change the size of any other section
Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic
Ah, that is because you built an exe with -pie Then you should be able to drop everything but the RELATIVE from the linking, or almost in any case.
Jocke
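Editor's sketch: if only RELATIVE relocations remain, as suggested above, the boot-time fix-up loop becomes very small, because a RELATIVE entry needs no symbol lookup. The C below illustrates that loop on an in-memory copy of the image. The names rel_entry and apply_relative are hypothetical, the table bounds are passed in as plain pointers (in a real build they might come from linker-script symbols such as the __rel_dyn_start and __rel_dyn_end seen in the map file earlier in the thread), and the code assumes REL-style entries where the addend is stored in the patched field itself.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Same layout as Elf32_Rel. Hypothetical local name. */
typedef struct {
    uint32_t r_offset; /* link-time address of the field to patch */
    uint32_t r_info;   /* relocation type in the low 8 bits */
} rel_entry;

/* Standard i386 relocation type number for base-relative fix-ups. */
enum { R_386_RELATIVE = 8 };

/*
 * Apply every R_386_RELATIVE entry in [rel_start, rel_end) to an image
 * that was linked at 'link_base' and now sits 'delta' bytes away.
 * Entries of any other type, or pointing outside the image, are
 * skipped. Returns the number of fields patched.
 */
size_t apply_relative(uint8_t *image, size_t image_size, uint32_t link_base,
                      const rel_entry *rel_start, const rel_entry *rel_end,
                      uint32_t delta)
{
    size_t applied = 0;

    for (const rel_entry *r = rel_start; r < rel_end; r++) {
        uint32_t off = r->r_offset - link_base;
        uint32_t word;

        if ((r->r_info & 0xff) != R_386_RELATIVE)
            continue; /* not handled in this sketch */
        if (image_size < sizeof word || off > image_size - sizeof word)
            continue; /* field lies outside the image */

        /* The field holds a link-time address: add the load offset. */
        memcpy(&word, image + off, sizeof word);
        word += delta;
        memcpy(image + off, &word, sizeof word);
        applied++;
    }
    return applied;
}
```

Silently skipping unknown entry types keeps the sketch short; a real loader would more likely treat them as an error.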
Re: [U-Boot] Relocation size penalty calculation
On Sat, Oct 10, 2009 at 9:47 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
[snip]
Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic
Ah, that is because you built an exe with -pie Then you should be able to drop everything but the RELATIVE from the linking, or almost in any case.
Jocke
Hmm, so its seems I may have hit the limit. I tried:
PLATFORM_LDFLAGS += -r --emit-relocs
but there is not enough information left to complete the relocation. It
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 13:21:10:
[snip]
Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic
Ah, that is because you built an exe with -pie Then you should be able to drop everything
Re: [U-Boot] Relocation size penalty calculation
On Saturday 10 October 2009 06:47:42 Joakim Tjernlund wrote:
Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19:
-fpic / -pic make no difference
not on x86, on ppc it is a big difference.
i think you guys mean -fpic and -fPIC because there is no -pic flag ... while the two make a big diff on some arches like ppc, they make pretty much no different on x86 last i looked
-mike
Re: [U-Boot] Relocation size penalty calculation
Mike Frysinger vap...@gentoo.org wrote on 10/10/2009 18:52:29:
> On Saturday 10 October 2009 06:47:42 Joakim Tjernlund wrote:
> > Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19:
> > > -fpic / -pic make no difference
> >
> > not on x86, on ppc it is a big difference.
>
> i think you guys mean -fpic and -fPIC because there is no -pic flag ... while the two make a big diff on some arches like ppc, they make pretty much no difference on x86 last i looked

Yes, this was what I was thinking (-fpic vs. -fPIC). These will probably only make a difference on RISC-like arches.

 Jocke
Re: [U-Boot] Relocation size penalty calculation
On Sun, Oct 11, 2009 at 4:45 AM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote:
> Mike Frysinger vap...@gentoo.org wrote on 10/10/2009 18:52:29:
> > On Saturday 10 October 2009 06:47:42 Joakim Tjernlund wrote:
> > > Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19:
> > > > -fpic / -pic make no difference
> > >
> > > not on x86, on ppc it is a big difference.
> >
> > i think you guys mean -fpic and -fPIC because there is no -pic flag ... while the two make a big diff on some arches like ppc, they make pretty much no difference on x86 last i looked

Sorry for the confusion - by -fpic / -pic I was referring to -fpic (gcc) / -pic (ld) flags versus -fpie (gcc) / -pie (ld) flags.

> Yes, this was what I was thinking (-fpic vs. -fPIC). These will probably only make a difference on RISC-like arches.

There appears to be no difference (on x86) between pic, PIC, and pie. The big difference is when I drop ld's -pic and use ld's --emit-relocs instead

>  Jocke

Regards,

Graeme
Re: [U-Boot] Relocation size penalty calculation
On Sun, Oct 11, 2009 at 3:18 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: On Sat, Oct 10, 2009 at 9:47 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 12:38:19: On Sat, Oct 10, 2009 at 8:27 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 10:46:52: On Sat, Oct 10, 2009 at 7:07 PM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: Graeme Russ graeme.r...@gmail.com wrote on 10/10/2009 06:43:52: On Fri, Oct 9, 2009 at 10:12 AM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Hi Graeme, I would be interested in a third option (column), the x86 build with just -mrelocateable but NOT -fpic. 
It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting. x86 does not have -mrelocatable. This is a PPC only option :( Hi Graeme, You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot. I don't think --emit-relocs is necessary with -pic. I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie creates enough information to perform relocation on the x86 platform Try -fvisibility=hidden Thanks - Shaved another 2539 bytes off the binary Also found out how to get rid of .eh_frame (crept in when I upgraded to gcc 4.4.1) with -fno-dwarf2-cfi-asm, so that shaves another 13452 bytes Total saving of 15.6k Great, so now you are back at just a few percent added I guess? Not really - The .eh_frame saving applies to both relocated and non relocated builds OK, so you didn't use PIC before at all? Anyway I think you can do more. Using -Bsymbolic you should get away with RELATIVE relocs only and be able to skip a lot of segments above. Have a look at uClibc ldso/ldso/dl-startup.c My build options thus far are: PLATFORM_RELFLAGS += -fpie -fvisibility=hidden PLATFORM_CPPFLAGS += -fno-dwarf2-cfi-asm PLATFORM_LDFLAGS += -pie -fpic / -pic make no difference not on x86, on ppc it is a big difference. Interestingly, -Bsymbolic adds exactly 8 bytes to .dynamic, but doesn't change the size of any other section Pulling apart the relocation sections, it seems that all relocations are already RELATIVE even without -Bsymbolic Ah, that is because you built an exe with -pie Then you should be able to drop everything but the RELATIVE from the linking, or almost in any case. 
Jocke Hmm, so its seems I may have hit the limit. I tried: PLATFORM_LDFLAGS += -r --emit-relocs but there is not enough information left to complete the relocation. Hi Graeme, I am glad you tried this. It should work, -fpie should not be necessary. Did you also change PLATFORM_RELFLAGS to omit the -fpie? Without pie, and with no libraries linked in that are pie, there should BE no .got, AFIK. I wonder if
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 10:12 AM, Joakim Tjernlund joakim.tjernl...@transmode.se wrote: On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Hi Graeme, I would be interested in a third option (column), the x86 build with just -mrelocateable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting. x86 does not have -mrelocatable. This is a PPC only option :( Hi Graeme, You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot. I don't think --emit-relocs is necessary with -pic. 
> > I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie creates enough information to perform relocation on the x86 platform
>
> Try -fvisibility=hidden

Thanks - Shaved another 2539 bytes off the binary

Also found out how to get rid of .eh_frame (crept in when I upgraded to gcc 4.4.1) with -fno-dwarf2-cfi-asm, so that shaves another 13452 bytes

Total saving of 15.6k

>  Jocke

Regards,

Graeme
[U-Boot] Relocation size penalty calculation
Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - it's space, not tab, formatted):

Section              non-reloc     reloc
----------------------------------------
.text                 000118c4  000137fc  - 0x1f38 bytes (~8kB) bigger
.rodata                   5bad      59d0
.interp                    n/a      0013
.dynstr                    n/a      0648
.hash                      n/a      0428
.eh_frame                 3268      34fc
.data                     0a6c      01dc
.data.rel                  n/a      0098
.data.rel.ro.local         n/a      0178
.data.rel.local            n/a      07e4
.got                                01f0
.got.plt                   n/a      000c
.rel.got                   n/a      03e0
.rel.dyn                   n/a      1228
.dynsym                    n/a      0850
.dynamic                   n/a      0080
.u_boot_cmd               03c0      03c0
.bss                      1a34      1a34
.realmode                 0166      0166
.bios                     053e      053e
========================================
Total                 0001d5dd  00022287  - 0x4caa bytes (~19kB) bigger

It's more than a 16% increase in size!!!

.text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes

Have any metrics been done for PPC?

Regards,

Graeme
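The totals and percentages quoted in this message can be re-derived from the per-section sizes. The following is only a sketch of that arithmetic in Python, not part of the original post. It assumes that the single value listed for .got belongs to the reloc column, which is the reading under which both column totals match the posted figures; the names `SECTIONS` and `column_total` are made up for this illustration.

```python
# Section sizes in bytes (hex), transcribed from the table above as
# (non_reloc, reloc). None means the section is absent from that build.
# Assumption: the lone .got value belongs to the reloc column.
SECTIONS = {
    ".text":              (0x118c4, 0x137fc),
    ".rodata":            (0x5bad, 0x59d0),
    ".interp":            (None, 0x0013),
    ".dynstr":            (None, 0x0648),
    ".hash":              (None, 0x0428),
    ".eh_frame":          (0x3268, 0x34fc),
    ".data":              (0x0a6c, 0x01dc),
    ".data.rel":          (None, 0x0098),
    ".data.rel.ro.local": (None, 0x0178),
    ".data.rel.local":    (None, 0x07e4),
    ".got":               (None, 0x01f0),
    ".got.plt":           (None, 0x000c),
    ".rel.got":           (None, 0x03e0),
    ".rel.dyn":           (None, 0x1228),
    ".dynsym":            (None, 0x0850),
    ".dynamic":           (None, 0x0080),
    ".u_boot_cmd":        (0x03c0, 0x03c0),
    ".bss":               (0x1a34, 0x1a34),
    ".realmode":          (0x0166, 0x0166),
    ".bios":              (0x053e, 0x053e),
}


def column_total(index):
    """Sum one column of the table, counting absent sections as zero."""
    return sum(sizes[index] or 0 for sizes in SECTIONS.values())


non_reloc_total = column_total(0)
reloc_total = column_total(1)
growth = reloc_total - non_reloc_total
text_growth = SECTIONS[".text"][1] - SECTIONS[".text"][0]

print(f"non-reloc total : {non_reloc_total:#x}")
print(f"reloc total     : {reloc_total:#x}")
print(f"growth          : {growth:#x} ({growth} bytes, "
      f"{100 * growth / non_reloc_total:.1f}% of the non-reloc build)")
print(f".text share     : {100 * text_growth / growth:.1f}% of the growth")
```

Under that assumption the sums reproduce the posted totals (0x1d5dd and 0x22287), and .text contributes roughly 41% of the growth, which agrees with "a little under half".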
Re: [U-Boot] Relocation size penalty calculation
On Thu, 2009-10-08 at 22:54 +1100, Graeme Russ wrote:
> Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - it's space, not tab, formatted):
>
> Section              non-reloc     reloc
> ----------------------------------------
> .text                 000118c4  000137fc  - 0x1f38 bytes (~8kB) bigger
> .rodata                   5bad      59d0
> .interp                    n/a      0013
> .dynstr                    n/a      0648
> .hash                      n/a      0428
> .eh_frame                 3268      34fc
> .data                     0a6c      01dc
> .data.rel                  n/a      0098
> .data.rel.ro.local         n/a      0178
> .data.rel.local            n/a      07e4
> .got                                01f0
> .got.plt                   n/a      000c
> .rel.got                   n/a      03e0
> .rel.dyn                   n/a      1228
> .dynsym                    n/a      0850
> .dynamic                   n/a      0080
> .u_boot_cmd               03c0      03c0
> .bss                      1a34      1a34
> .realmode                 0166      0166
> .bios                     053e      053e
> ========================================
> Total                 0001d5dd  00022287  - 0x4caa bytes (~19kB) bigger
>
> It's more than a 16% increase in size!!!
>
> .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes
>
> Have any metrics been done for PPC?

Things actually improve a little bit when we use -mrelocatable and get rid of all the manual "+= gd->reloc_off" fixups:

1) Top of mainline on XPedite5370:
   text    data     bss     dec     hex filename
 308612   24488   33172  366272   596c0 u-boot

2) Top of reloc branch on XPedite5370 (ie -mrelocatable):
   text    data     bss     dec     hex filename
 303704   28644   33156  365504   593c0 u-boot

For fun:
3) #2 but with s/-mrelocatable/-fpic/ (probably doesn't boot):
   text    data     bss     dec     hex filename
 303704   24472   33156  361332   58374 u-boot

There may be some other changes that affect the size between mainline and reloc, but their sizes are in the same general ballpark.

Best,
Peter
Re: [U-Boot] Relocation size penalty calculation
Peter Tyser wrote: On Thu, 2009-10-08 at 22:54 +1100, Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Have any metrics been done for PPC? Things actually improve a little bit when we use -mrelocatable and get rid of all the manual += gd-reloc_off fixups: 1) Top of mainline on XPedite5370: text data bss dec hex filename 308612 24488 33172 366272 596c0 u-boot 2) Top of reloc branch on XPedite5370 (ie -mrelocatable): text data bss dec hex filename 303704 28644 33156 365504 593c0 u-boot Hi Peter, Just to be clear, the total text+data length of u-boot with the manual relocations (#1) is LARGER than the text+data length of u-boot with the manual relocations removed and the necessary centralized relocation code added, along with any additional data sections required by -mrelocateable (#2), by 768 (dec) bytes? And both cases (1 and 2) work equivalently? Best Regards, Bill Campbell. For fun: 3) #2 but with s/-mrelocatable/-fpic/ (probably doesn't boot): text data bss dec hex filename 303704 24472 33156 361332 58374 u-boot There may be some other changes that affect the size between mainline and reloc, but their sizes are in the same general ballpark. 
Best, Peter
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ wrote:
> Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - it's space, not tab, formatted):
>
> Section              non-reloc     reloc
> ----------------------------------------
> .text                 000118c4  000137fc  - 0x1f38 bytes (~8kB) bigger
> .rodata                   5bad      59d0
> .interp                    n/a      0013
> .dynstr                    n/a      0648
> .hash                      n/a      0428
> .eh_frame                 3268      34fc
> .data                     0a6c      01dc
> .data.rel                  n/a      0098
> .data.rel.ro.local         n/a      0178
> .data.rel.local            n/a      07e4
> .got                                01f0
> .got.plt                   n/a      000c
> .rel.got                   n/a      03e0
> .rel.dyn                   n/a      1228
> .dynsym                    n/a      0850
> .dynamic                   n/a      0080
> .u_boot_cmd               03c0      03c0
> .bss                      1a34      1a34
> .realmode                 0166      0166
> .bios                     053e      053e
> ========================================
> Total                 0001d5dd  00022287  - 0x4caa bytes (~19kB) bigger
>
> It's more than a 16% increase in size!!!
>
> .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes

Hi Graeme,
I would be interested in a third option (column), the x86 build with just -mrelocatable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting.

Best Regards,
Bill Campbell

> Have any metrics been done for PPC?
>
> Regards,
>
> Graeme
Re: [U-Boot] Relocation size penalty calculation
On Thu, 2009-10-08 at 08:53 -0700, J. William Campbell wrote: Peter Tyser wrote: On Thu, 2009-10-08 at 22:54 +1100, Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Have any metrics been done for PPC? Things actually improve a little bit when we use -mrelocatable and get rid of all the manual += gd-reloc_off fixups: 1) Top of mainline on XPedite5370: textdata bss dec hex filename 308612 24488 33172 366272 596c0 u-boot 2) Top of reloc branch on XPedite5370 (ie -mrelocatable): textdata bss dec hex filename 303704 28644 33156 365504 593c0 u-boot Hi Peter, Just to be clear, the total text+data length of u-boot with the manual relocations (#1) is LARGER than the text+data length of u-boot with the manual relocations removed and the necessary centralized relocation code added, along with any additional data sections required by -mrelocateable (#2), by 768 (dec) bytes? Hi Bill, Doah, looks like I chose a bad board as an example. The XPedite5370 already had -mrelocatable defined in its own board/xes/xpedite5370/config.mk in mainline, so the above comparison should be ignored as both builds used -mrelocatable. 
Here's some *real* results from the MPC8548CDS:

1) Top of mainline:
   text    data     bss     dec     hex filename
 219968   17052   22992  260012   3f7ac u-boot

2) Top of reloc branch (ie -mrelocatable)
   text    data     bss     dec     hex filename
 219192   20640   22980  262812   4029c u-boot

So the reloc branch is 2.7K bigger for the MPC8548CDS.

Best,
Peter
Re: [U-Boot] Relocation size penalty calculation
Peter Tyser wrote: On Thu, 2009-10-08 at 08:53 -0700, J. William Campbell wrote: Peter Tyser wrote: On Thu, 2009-10-08 at 22:54 +1100, Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Have any metrics been done for PPC? Things actually improve a little bit when we use -mrelocatable and get rid of all the manual += gd-reloc_off fixups: 1) Top of mainline on XPedite5370: textdata bss dec hex filename 308612 24488 33172 366272 596c0 u-boot 2) Top of reloc branch on XPedite5370 (ie -mrelocatable): textdata bss dec hex filename 303704 28644 33156 365504 593c0 u-boot Hi Peter, Just to be clear, the total text+data length of u-boot with the manual relocations (#1) is LARGER than the text+data length of u-boot with the manual relocations removed and the necessary centralized relocation code added, along with any additional data sections required by -mrelocateable (#2), by 768 (dec) bytes? Hi Bill, Doah, looks like I chose a bad board as an example. The XPedite5370 already had -mrelocatable defined in its own board/xes/xpedite5370/config.mk in mainline, so the above comparison should be ignored as both builds used -mrelocatable. 
> Here's some *real* results from the MPC8548CDS:
>
> 1) Top of mainline:
>    text    data     bss     dec     hex filename
>  219968   17052   22992  260012   3f7ac u-boot
>
> 2) Top of reloc branch (ie -mrelocatable)
>    text    data     bss     dec     hex filename
>  219192   20640   22980  262812   4029c u-boot
>
> So the reloc branch is 2.7K bigger for the MPC8548CDS.

Hi Peter,
OK, that's more like it! A 1.2% size increase in ROM seems like a very small price to pay for a truly relocatable u-boot image that will run on any size memory without the programmer having to actively worry about what may need relocating as code is written. Also, it should be noted that the size increase in 2) is mostly in relocation segments that do not need to be copied into ram, so the ram footprint should be smaller for 2) than 1). The relocation code itself could also be placed in a segment that is not copied into ram, although that may be more trouble than it is worth.

I am looking forward to Graeme's results with the 386. I expect that it will not be quite so favorable, perhaps a 4 or 5% size increase for -mrelocatable over an absolute build. However, -mrelocatable vs. -fpic may be comparable, with -mrelocatable actually winning. But then again, I could be totally wrong!

Best Regards,
Bill Campbell

> Best,
> Peter
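The 2.7K and 1.2% figures for the MPC8548CDS can be checked directly from the two `size` rows. A small Python sketch follows. It assumes that the ROM footprint is text + data (bss takes no flash space), and the names `MAINLINE`, `RELOC` and `rom_bytes` are invented for this illustration. It does not verify the claim that the data growth is mostly relocation information, since `size` output alone cannot show that.

```python
# `size` output for the two MPC8548CDS builds quoted above,
# as (text, data, bss) in decimal bytes.
MAINLINE = (219968, 17052, 22992)
RELOC = (219192, 20640, 22980)   # built with -mrelocatable


def rom_bytes(build):
    """Assumed flash footprint: text + data (bss is zero-filled at run time)."""
    text, data, _bss = build
    return text + data


def total_bytes(build):
    """The 'dec' column of `size`: text + data + bss."""
    return sum(build)


total_delta = total_bytes(RELOC) - total_bytes(MAINLINE)
rom_delta = rom_bytes(RELOC) - rom_bytes(MAINLINE)
rom_percent = 100 * rom_delta / rom_bytes(MAINLINE)
text_delta = RELOC[0] - MAINLINE[0]
data_delta = RELOC[1] - MAINLINE[1]

print(f"dec delta  : {total_delta} bytes ({total_delta / 1024:.1f}K)")
print(f"ROM delta  : {rom_delta} bytes ({rom_percent:.2f}%)")
print(f"text delta : {text_delta} bytes")
print(f"data delta : {data_delta} bytes")
```

Text actually shrinks by 776 bytes while data grows by 3588 bytes, so the net increase comes entirely from the data side.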
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
> Graeme Russ wrote:
> > Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - it's space, not tab, formatted):
> >
> > Section              non-reloc     reloc
> > ----------------------------------------
> > .text                 000118c4  000137fc  - 0x1f38 bytes (~8kB) bigger
> > .rodata                   5bad      59d0
> > .interp                    n/a      0013
> > .dynstr                    n/a      0648
> > .hash                      n/a      0428
> > .eh_frame                 3268      34fc
> > .data                     0a6c      01dc
> > .data.rel                  n/a      0098
> > .data.rel.ro.local         n/a      0178
> > .data.rel.local            n/a      07e4
> > .got                                01f0
> > .got.plt                   n/a      000c
> > .rel.got                   n/a      03e0
> > .rel.dyn                   n/a      1228
> > .dynsym                    n/a      0850
> > .dynamic                   n/a      0080
> > .u_boot_cmd               03c0      03c0
> > .bss                      1a34      1a34
> > .realmode                 0166      0166
> > .bios                     053e      053e
> > ========================================
> > Total                 0001d5dd  00022287  - 0x4caa bytes (~19kB) bigger
> >
> > It's more than a 16% increase in size!!!
> >
> > .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes
>
> Hi Graeme,
> I would be interested in a third option (column), the x86 build with just -mrelocatable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting.

x86 does not have -mrelocatable. This is a PPC only option :(

> Best Regards,
> Bill Campbell
>
> > Have any metrics been done for PPC?
> >
> > Regards,
> >
> > Graeme

Once the reloc branch has been merged, how many arches are left which do not support relocation?

Regards,

Graeme
Re: [U-Boot] Relocation size penalty calculation
Dear Graeme Russ,

In message d66caabb0910081358h5b013922tf7f9dce4cce41...@mail.gmail.com you wrote:
>
> Once the reloc branch has been merged, how many arches are left which do not support relocation?

All but PPC ?

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH, MD: Wolfgang Denk Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: w...@denx.de
There comes to all races an ultimate crisis which you have yet to face
One day our minds became so powerful we dared think of ourselves as gods.
-- Sargon, Return to Tomorrow, stardate 4768.3
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 8:23 AM, Wolfgang Denk w...@denx.de wrote:
> Dear Graeme Russ,
>
> In message d66caabb0910081358h5b013922tf7f9dce4cce41...@mail.gmail.com you wrote:
> >
> > Once the reloc branch has been merged, how many arches are left which do not support relocation?
>
> All but PPC ?

Hmm, so commit 0630535e2d062dd73c1ceca5c6125c86d1127a49 is all about removing code that is not used because these arches do not do any relocation at all?

So ultimately, what we are looking at is the complete and utter removal of any code which references a relocation adjustment in lieu of each arch either:

 a) Executing in Place from Flash, or;
 b) Setting a fixed TEXT_BASE at a known RAM location and copying the contents of Flash to RAM, or;
 c) Implementing full Relocation

> Best regards,
>
> Wolfgang Denk

Regards,

Graeme
Re: [U-Boot] Relocation size penalty calculation
On Fri, 2009-10-09 at 09:02 +1100, Graeme Russ wrote:
> On Fri, Oct 9, 2009 at 8:23 AM, Wolfgang Denk w...@denx.de wrote:
> > Dear Graeme Russ,
> >
> > In message d66caabb0910081358h5b013922tf7f9dce4cce41...@mail.gmail.com you wrote:
> > >
> > > Once the reloc branch has been merged, how many arches are left which do not support relocation?
> >
> > All but PPC ?
>
> Hmm, so commit 0630535e2d062dd73c1ceca5c6125c86d1127a49 is all about removing code that is not used because these arches do not do any relocation at all?

I sent that patch/RFC after noticing none of those architectures performed manual relocation fixups, thus they could save some code space by defining CONFIG_RELOC_FIXUP_WORKS. Similarly the gd->reloc_off field was no longer needed for them.

I'm not familiar with if or how those architectures are relocating, just that they didn't need relocation fixups. So that was the logic...

> So ultimately, what we are looking at is the complete and utter removal of any code which references a relocation adjustment in lieu of each arch either:
>
>  a) Execute in Place from Flash, or;
>  b) Setting a fixed TEXT_BASE at a known RAM location and copying the contents of Flash to RAM, or;
>  c) Implementing full Relocation

   d) Leaving those architectures the way they are now

Could be added if a,b,c won't work for some reason too.

I think it would be great to remove any manual relocation adjustments in the long run. This isn't strictly necessary though, as we can still have manual relocations littering the code - it's just a bit dirty and prone to issues in the long run.

So my vote would be to shoot for c) for all arches, but I have no idea what impact that would have on them:)

Best,
Peter
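For readers unfamiliar with what option c) ("full relocation") involves, the central step is a single loop over a relocation table that adds the load offset to every recorded word, instead of scattering manual "+= reloc_off" fixups through the code. The Python sketch below simulates that loop on a byte buffer. It is an illustration only and not U-Boot's implementation: it assumes 8-byte ELF32 REL entries (r_offset, r_info) and handles only R_386_RELATIVE, the type reported for the x86 -pie build elsewhere in this thread. The function name and both addresses are hypothetical.

```python
import struct

R_386_RELATIVE = 8                 # i386 ELF relocation type: word += load delta
REL_ENTRY = struct.Struct("<II")   # Elf32_Rel: r_offset, r_info (little-endian)


def apply_relative_relocs(image, rel_table, link_base, load_base):
    """Patch `image` in place as if it moved from link_base to load_base.

    image      -- bytearray holding the program as linked at link_base
    rel_table  -- bytes of packed Elf32_Rel entries (e.g. a .rel.dyn section)
    Returns the number of relocations applied.
    """
    delta = (load_base - link_base) & 0xFFFFFFFF
    for r_offset, r_info in REL_ENTRY.iter_unpack(rel_table):
        rel_type = r_info & 0xFF
        if rel_type != R_386_RELATIVE:
            raise ValueError(f"unsupported relocation type {rel_type}")
        pos = r_offset - link_base          # r_offset is a link-time address
        (word,) = struct.unpack_from("<I", image, pos)
        struct.pack_into("<I", image, pos, (word + delta) & 0xFFFFFFFF)
    return len(rel_table) // REL_ENTRY.size


# Hypothetical addresses, chosen only for the demonstration.
LINK_BASE = 0x38040000
LOAD_BASE = 0x03F00000

# A 16-byte "image" whose word at offset 4 is a pointer to offset 12.
image = bytearray(16)
struct.pack_into("<I", image, 4, LINK_BASE + 12)
rel_table = REL_ENTRY.pack(LINK_BASE + 4, R_386_RELATIVE)

count = apply_relative_relocs(image, rel_table, LINK_BASE, LOAD_BASE)
(patched,) = struct.unpack_from("<I", image, 4)
print(count, hex(patched))
```

After the call, the stored pointer refers to the same offset within the image at its new base, which is what a RELATIVE relocation is meant to achieve. A real loader would also have to copy the image and cope with any other relocation types the linker emits.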
Re: [U-Boot] Relocation size penalty calculation
Graeme Russ wrote: On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Hi Graeme, I would be interested in a third option (column), the x86 build with just -mrelocateable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting. x86 does not have -mrelocatable. This is a PPC only option :( Hi Graeme, You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot. Best Regards, Bill Campbell ** Best Regards, Bill Campbell Have any metrics been done for PPC? Regards, Graeme Once the reloc branch has been merged, how many arches are left which do not support relocation? Regards, Graeme ___ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote: Graeme Russ wrote: Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pic on my x86 u-boot build. Here are the results (fixed width font will help - its space, not tab, formatted): Section non-reloc reloc --- .text000118c4 000137fc - 0x1f38 bytes (~8kB) bigger .rodata 5bad 59d0 .interp n/a 0013 .dynstr n/a 0648 .hashn/a 0428 .eh_frame3268 34fc .data0a6c 01dc .data.reln/a 0098 .data.rel.ro.local n/a 0178 .data.rel.local n/a 07e4 .got 01f0 .got.plt n/a 000c .rel.got n/a 03e0 .rel.dyn n/a 1228 .dynsym n/a 0850 .dynamic n/a 0080 .u_boot_cmd 03c0 03c0 .bss 1a34 1a34 .realmode0166 0166 .bios053e 053e === Total0001d5dd 00022287 - 0x4caa bytes (~19kB) bigger Its more than a 16% increase in size!!! .text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes Hi Graeme, I would be interested in a third option (column), the x86 build with just -mrelocateable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting. x86 does not have -mrelocatable. This is a PPC only option :( Hi Graeme, You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot. I don't think --emit-relocs is necessary with -pic. 
I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie creates enough information to perform relocation on the x86 platform Regards, Graeme Best Regards, Bill Campbell ** Best Regards, Bill Campbell Have any metrics been done for PPC? Regards, Graeme Once the reloc branch has been merged, how many arches are left which do not support relocation? Regards, Graeme ___ U-Boot mailing list U-Boot@lists.denx.de http://lists.denx.de/mailman/listinfo/u-boot
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Graeme Russ wrote:
On Fri, Oct 9, 2009 at 2:58 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:
Graeme Russ wrote:

Out of curiosity, I wanted to see just how much of a size penalty I am incurring by using gcc -fpic / ld -pie on my x86 u-boot build. Here are the results (fixed width font will help - it's space, not tab, formatted):

Section              non-reloc  reloc
------------------------------------------
.text                000118c4   000137fc  - 0x1f38 bytes (~8kB) bigger
.rodata              5bad       59d0
.interp              n/a        0013
.dynstr              n/a        0648
.hash                n/a        0428
.eh_frame            3268       34fc
.data                0a6c       01dc
.data.rel            n/a        0098
.data.rel.ro.local   n/a        0178
.data.rel.local      n/a        07e4
.got                            01f0
.got.plt             n/a        000c
.rel.got             n/a        03e0
.rel.dyn             n/a        1228
.dynsym              n/a        0850
.dynamic             n/a        0080
.u_boot_cmd          03c0       03c0
.bss                 1a34       1a34
.realmode            0166       0166
.bios                053e       053e
==========================================
Total                0001d5dd   00022287  - 0x4caa bytes (~19kB) bigger

It's more than a 16% increase in size!!!

.text accounts for a little under half of the total bloat, and of that, the crude dynamic loader accounts for only 341 bytes

Hi Graeme,
I would be interested in a third option (column), the x86 build with just -mrelocatable but NOT -fpic. It will not be definitive because there will be extra code that references the GOT and missing code to do some of the relocation, but it would still be interesting.

x86 does not have -mrelocatable. This is a PPC only option :(

Hi Graeme,
You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot.

I don't think --emit-relocs is necessary with -pic. I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie create enough information to perform relocation on the x86 platform

Try -fvisibility=hidden

Jocke
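The totals in the quoted size table can be cross-checked with a few lines of arithmetic. The sketch below copies the section sizes from the table as posted; treating every n/a entry (and the missing non-reloc figure for .got) as zero bytes is an assumption of this sketch, not something stated in the thread.

```python
# Section sizes in bytes, (non-reloc, reloc), copied from the quoted table.
# None marks "n/a" or a value not shown in the table; it is counted as 0 here.
SIZES = {
    ".text": (0x118C4, 0x137FC),
    ".rodata": (0x5BAD, 0x59D0),
    ".interp": (None, 0x0013),
    ".dynstr": (None, 0x0648),
    ".hash": (None, 0x0428),
    ".eh_frame": (0x3268, 0x34FC),
    ".data": (0x0A6C, 0x01DC),
    ".data.rel": (None, 0x0098),
    ".data.rel.ro.local": (None, 0x0178),
    ".data.rel.local": (None, 0x07E4),
    ".got": (None, 0x01F0),
    ".got.plt": (None, 0x000C),
    ".rel.got": (None, 0x03E0),
    ".rel.dyn": (None, 0x1228),
    ".dynsym": (None, 0x0850),
    ".dynamic": (None, 0x0080),
    ".u_boot_cmd": (0x03C0, 0x03C0),
    ".bss": (0x1A34, 0x1A34),
    ".realmode": (0x0166, 0x0166),
    ".bios": (0x053E, 0x053E),
}


def column_total(index):
    """Sum one column of the table, counting missing entries as 0 bytes."""
    return sum(pair[index] or 0 for pair in SIZES.values())


non_reloc = column_total(0)
reloc = column_total(1)
penalty = reloc - non_reloc
text_growth = SIZES[".text"][1] - SIZES[".text"][0]

print(f"non-reloc total: {non_reloc:#x}")
print(f"reloc total:     {reloc:#x}")
print(f"penalty:         {penalty:#x} ({100 * penalty / non_reloc:.1f}% of non-reloc)")
print(f".text share of penalty: {100 * text_growth / penalty:.1f}%")
```

With these inputs the column sums reproduce the posted totals (0x1d5dd and 0x22287), which also confirms that the lone .got figure belongs in the reloc column.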
Re: [U-Boot] Relocation size penalty calculation
Joakim Tjernlund wrote:
On Fri, Oct 9, 2009 at 9:27 AM, J. William Campbell jwilliamcampb...@comcast.net wrote:

[SNIP]

Hi Graeme,
You are unfortunately correct. However, I wonder if we can get essentially the same result by executing the final ld step with the --emit-relocs switch included. This may also include some extra sections that we would want to strip out, but if it works, it could give all ELF-based systems a way to a relocatable u-boot.

I don't think --emit-relocs is necessary with -pic. I haven't gone through all the permutations to see if there is a smaller option, but gcc -fpic and ld -pie creates enough information to perform relocation on the x86 platform

It is true that --emit-relocs is not required when -pic and -pie are used instead. However, pic and pie are designed to allow shared code (libraries) to appear at different logical addresses in several programs without altering the text. This is grand overkill for what we need, which is the ability to relocate the code. The -pic and -pie code will be larger than the code without pic and pie. How much larger is a good question. On the PPC, it is larger but not much larger, because there are lots of registers available and one of them has almost surely got (no pun intended) the magic relocation constant(s) in it. On the 386, with many fewer registers, pic and pie will make the code larger, percentage-wise, than on the PPC. Thus avoiding pic and pie is a Good Thing in most cases.

Try -fvisibility=hidden

I assume -fvisibility=hidden is suggested in order to reduce (eliminate) the symbol table in the output, which we don't need because there are assumed to be no undefined symbols in our final ld. If that works, great! I was assuming we might need a custom strip program to delete any sections that we don't need, but this sounds easier if it gets them all.

Best Regards,
Bill Campbell

Jocke
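To make "enough information to perform relocation" concrete, here is a minimal simulation of what a boot-time loader does with a .rel.dyn table on i386. It is a sketch under stated assumptions, not the loader from the x86 port: the image is modelled as a Python bytearray, only R_386_RELATIVE entries are handled, and apply_rel_dyn and all addresses are hypothetical. What is standard is the entry layout (an Elf32_Rel is two little-endian 32-bit words, r_offset and r_info, 8 bytes in total) and the rule that a RELATIVE relocation adds the load offset to the word at r_offset.

```python
import struct

R_386_RELATIVE = 8  # i386 ELF relocation type: word at r_offset += load offset


def apply_rel_dyn(image, link_base, load_base, rel_start, rel_end):
    """Patch `image` in place as if it had moved from link_base to load_base.

    image     -- bytearray holding the linked image (hypothetical layout)
    rel_start -- link-time address of the first Elf32_Rel entry
    rel_end   -- link-time address just past the last entry
    Returns the number of words patched.
    """
    delta = (load_base - link_base) & 0xFFFFFFFF
    patched = 0
    for addr in range(rel_start, rel_end, 8):  # each Elf32_Rel is 8 bytes
        r_offset, r_info = struct.unpack_from("<II", image, addr - link_base)
        if (r_info & 0xFF) != R_386_RELATIVE:
            continue  # this sketch ignores every other relocation type
        pos = r_offset - link_base
        (value,) = struct.unpack_from("<I", image, pos)
        struct.pack_into("<I", image, pos, (value + delta) & 0xFFFFFFFF)
        patched += 1
    return patched


# Tiny demo image: one pointer at offset 0, one relocation entry at offset 8.
LINK_BASE = 0x38040000  # hypothetical link address
LOAD_BASE = 0x03F00000  # hypothetical run-time address
demo = bytearray(16)
struct.pack_into("<I", demo, 0, LINK_BASE + 4)              # pointer into the image
struct.pack_into("<II", demo, 8, LINK_BASE, R_386_RELATIVE)  # "fix the word at offset 0"

count = apply_rel_dyn(demo, LINK_BASE, LOAD_BASE, LINK_BASE + 8, LINK_BASE + 16)
print(count, hex(struct.unpack_from("<I", demo, 0)[0]))
```

The table costs 8 bytes per patched word, which is why .rel.dyn shows up as a visible line item in the size comparison.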
Re: [U-Boot] Relocation size penalty calculation
On Thursday 08 October 2009 18:20:18 Peter Tyser wrote:
On Fri, 2009-10-09 at 09:02 +1100, Graeme Russ wrote:
On Fri, Oct 9, 2009 at 8:23 AM, Wolfgang Denk w...@denx.de wrote:
Graeme Russ wrote:

Once the reloc branch has been merged, how many arches are left which do not support relocation?

All but PPC ?

Hmm, so commit 0630535e2d062dd73c1ceca5c6125c86d1127a49 is all about removing code that is not used because these arches do not do any relocation at all?

I sent that patch/RFC after noticing none of those architectures performed manual relocation fixups, thus they could save some code space by defining CONFIG_RELOC_FIXUP_WORKS. Similarly the gd->reloc_off field was no longer needed for them. I'm not familiar with if or how those architectures are relocating, just that they didn't need relocation fixups. So that was the logic...

the usage in the Blackfin port is most likely a copy & paste of existing code. deleting malloc_bin_reloc() from lib_blackfin/board.c and adding CONFIG_RELOC_FIXUP_WORKS results in a working boot. i've never really looked into relocation as no one has asked for it.
-mike
Re: [U-Boot] Relocation size penalty calculation
On Fri, Oct 9, 2009 at 9:20 AM, Peter Tyser pty...@xes-inc.com wrote:
On Fri, 2009-10-09 at 09:02 +1100, Graeme Russ wrote:
On Fri, Oct 9, 2009 at 8:23 AM, Wolfgang Denk w...@denx.de wrote:

Dear Graeme Russ,
In message d66caabb0910081358h5b013922tf7f9dce4cce41...@mail.gmail.com you wrote:

Once the reloc branch has been merged, how many arches are left which do not support relocation?

All but PPC ?

Hmm, so commit 0630535e2d062dd73c1ceca5c6125c86d1127a49 is all about removing code that is not used because these arches do not do any relocation at all?

I sent that patch/RFC after noticing none of those architectures performed manual relocation fixups, thus they could save some code space by defining CONFIG_RELOC_FIXUP_WORKS. Similarly the gd->reloc_off field was no longer needed for them.

Maybe CONFIG_RELOC_NOT_IMPLEMENTED would be more descriptive

I'm not familiar with if or how those architectures are relocating, just that they didn't need relocation fixups. So that was the logic...

So ultimately, what we are looking at is the complete and utter removal of any code which references a relocation adjustment in lieu of each arch either:
a) Executing in Place from Flash, or;
b) Setting a fixed TEXT_BASE at a known RAM location and copying the contents of Flash to RAM, or;
c) Implementing full Relocation

d) Leaving those architectures the way they are now
Could be added if a,b,c won't work for some reason too.

Which is essentially either a) or b) depending on which way the arch was implemented. For x86, it has been b) but it is going towards c)

I think it would be great to remove any manual relocation adjustments in the long run. This isn't strictly necessary though, as we can still have manual relocations littering the code - it's just a bit dirty and prone to issues in the long run. So my vote would be to shoot for c) for all arches, but I have no idea what impact that would have on them :)

So the big question now is - How many arches do partial relocation and really need gd->reloc_off?

Best,
Peter

Regards,
Graeme
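For readers new to the term, a "manual relocation adjustment" can be pictured roughly as follows. This is only an illustrative sketch: fixup_pointers and every address are hypothetical, reloc_off merely stands in for the gd->reloc_off field discussed above, and no actual U-Boot code is reproduced. The idea is that pointers stored in data tables still hold link-time addresses after the image has been copied elsewhere, so each one has to be shifted by hand.

```python
def fixup_pointers(table, reloc_off):
    """Return link-time addresses shifted by reloc_off (32-bit wraparound)."""
    return [(addr + reloc_off) & 0xFFFFFFFF for addr in table]


# Hypothetical link-time handler addresses (image linked to run from flash).
link_time = [0xFFF00100, 0xFFF00240, 0xFFF00388]

# Hypothetical offset between the flash copy and the RAM copy of the image.
reloc_off = (0x03F00000 - 0xFFF00000) & 0xFFFFFFFF

run_time = fixup_pointers(link_time, reloc_off)
print([hex(addr) for addr in run_time])
```

Every table that stores such pointers needs its own loop like this, which is the "littering" referred to above; with option c) the adjustment is driven by the relocation table instead, so the per-table loops and the stored offset are no longer needed.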