Re: [osv-dev] Shrinking kernel

2020-05-26 Thread Waldek Kozaczuk


On Tuesday, May 26, 2020 at 4:26:09 AM UTC-4, Nadav Har'El wrote:
>
> On Sun, May 17, 2020 at 7:59 AM Waldek Kozaczuk wrote:
>
>> One of the things I would like to tackle for the next release is making 
>> the loader.elf smaller. The primary motivation is to lower memory 
>> utilization. But at the same time, I would also like to keep the kernel 
>> (the release version) as debuggable as it is now (or close to). 
>>
>> There are already 2 issues that could help us with that:
>> - https://github.com/cloudius-systems/osv/issues/97 - Be more selective 
>> on symbols exported from the kernel 
>> - https://github.com/cloudius-systems/osv/issues/106 - Consider single 
>> instantiation for some templates (this may apply to more templates than 
>> just debug()).
>>
>
> It is very unclear why issue 106 makes the kernel smaller. It shouldn't... 
> Please make sure you look at the size of loader-stripped.elf - not 
> loader.elf.
>
> If you really want to work on the size of loader-stripped.elf you should 
> probably use objdump/readelf/nm to try to figure out what are the biggest 
> parts of this object. Do we have some big functions we need to fix? Do we 
> have big static arrays (BSS) we can allocate at runtime?
> Here are some example commits I did in the past making the kernel smaller 
> by using these ideas:
>
> 45f93e16f4727d506135101b51b9f2ea98e3a651 - zfs: smaller kernel by dropping 
> utf8 normalization support
> 8693761b737ad74e3d116f923bf0b4323d9df8b4 - build: don't put unnecessary 
> libraries in every image
> c50a090f086968fbfc24ff0a1f085ebd570aa77d - trace: only allocate trace_log 
> when tracepoints are enabled
>
>
>> Besides that, there are other things we can try:
>> 1. Use the compiler flags -ffunction-sections -fdata-sections and the 
>> linker flag --gc-sections.
>> 2. Remove RTTI using -fno-rtti (I think we have over 1000 typeinfo 
>> entries in the symbols tables) - I do not think we have many dynamic_cast 
>> and hopefully, these can be eliminated.
>> 3. Controversial - eliminate exception usage in the kernel (how critical is 
>> this?) and then use -fno-exceptions.
>>
>
> Yes, this would be controversial... I believe we do have some 
> written-for-C++ code in the kernel which
> does use exceptions, but I never tried to estimate how much, or how 
> difficult it would be to get rid of it.
>  
>
>> 4. Use LTO - given we have Travis enabled, there could be an option passed 
>> to the makefile/build that lets one build the kernel with LTO if we deem it 
>> to be a dangerous/experimental feature.
>> 5. Do not link the C++ std library whole-archive and effectively hide it. How 
>> would we support internal apps like cpiod, httpserver, cloud init, etc.? 
>> Create a C API for any symbols that are C++ right now and link those C++ 
>> apps statically against libstdc++?
>>
>> Are there other things we should try without sacrificing any functionality?
>>
>> As far as option 1 goes, I have already played with it a bit and applied 
>> this patch:
>>
>> diff --git a/Makefile b/Makefile
>> index db3c68cf..1b121fd8 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -279,7 +279,7 @@ gcc-sysroot = $(if $(CROSS_PREFIX), --sysroot 
>> $(aarch64_gccbase)) \
>>  #
>>  #   mydir/*.o EXTRA_FLAGS = 
>>  EXTRA_FLAGS = -D__OSV_CORE__ -DOSV_KERNEL_BASE=$(kernel_base) 
>> -DOSV_KERNEL_VM_BASE=$(kernel_vm_base) \
>> -   -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) 
>> -DOSV_LZKERNEL_BASE=$(lzkernel_base)
>> +   -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) 
>> -DOSV_LZKERNEL_BASE=$(lzkernel_base) -ffunction-sections
>>  EXTRA_LIBS =
>>  COMMON = $(autodepend) -g -Wall -Wno-pointer-arith $(CFLAGS_WERROR) 
>> -Wformat=0 -Wno-format-security \
>> -D __BSD_VISIBLE=1 -U _FORTIFY_SOURCE -fno-stack-protector 
>> $(INCLUDES) \
>> @@ -1859,7 +1859,7 @@ $(out)/loader.elf: $(stage1_targets) 
>> arch/$(arch)/loader.ld $(out)/bootfs.o $(lo
>> $(^:%.ld=-T %.ld) \
>> --whole-archive \
>>   $(libstdc++.a) $(libgcc_eh.a) \
>> - $(boost-libs) \
>> + $(boost-libs) --gc-sections \
>>
>
> I wonder what kind of size saving this might bring. Why should it bring 
> any saving at all? Just by using shorter (?) relative jumps in code?
>
Right now not much, as everything gets exported/included from all objects 
that are supplied to the linker. But once we start using a version script to 
export only what we really want (as per #97), the compiler flags 
"-ffunction-sections -fdata-sections" and the linker flag "--gc-sections" 
should automatically make unneeded code simply "fall off" (garbage collected) 
and leave only what is needed. No? Am I wrong about how it works?
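
To make that intuition concrete, here is a toy model of section garbage 
collection as a reachability walk from a set of root sections. It is only a 
sketch and not how the linker is actually implemented; the section names and 
the reference graph are hypothetical and are not taken from the OSv build.

```python
from collections import deque


def gc_sections(refs, roots):
    """Return the set of sections that survive a reachability walk.

    refs  -- dict mapping a section name to the sections it references
    roots -- sections that must be kept (entry point, exported symbols)
    """
    kept = set()
    queue = deque(r for r in roots if r in refs)
    while queue:
        sec = queue.popleft()
        if sec in kept:
            continue
        kept.add(sec)
        # Everything referenced from a kept section must be kept as well.
        queue.extend(refs.get(sec, ()))
    return kept


# Hypothetical per-function sections, one per function, which is the
# layout -ffunction-sections produces.
REFS = {
    ".text.start": [".text.sched_init"],
    ".text.sched_init": [".text.memcpy"],
    ".text.memcpy": [],
    ".text.unused_helper": [".text.utf8_normalize"],
    ".text.utf8_normalize": [],
}

# Case 1: every symbol is exported, so every section is a root.
kept_all_exported = gc_sections(REFS, roots=list(REFS))

# Case 2: only the entry point is a root (selective export).
kept_selective = gc_sections(REFS, roots=[".text.start"])

print(len(kept_all_exported), "sections kept when everything is exported")
print(sorted(kept_selective), "kept with selective export")
```

In this model nothing is collected while every section is a root, which 
matches the "right now not much" expectation; once the root set shrinks, the 
two unreferenced sections drop out.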

>  
>
>> --no-whole-archive $(libgcc.a), \
>> LINK loader.elf)
>> @# Build libosv.so matching this loader.elf. This is not a 
>> separate
>> @@ -1875,7 +1875,7 @@ $(out)/kernel.elf: $(stage1_targets) 
>> arch/$(arch)/loader.ld $(out)/empty_bootfs.
>> $(^:%.ld=-T %.ld) \
>> --whole-archive \
>>   

Re: [osv-dev] Shrinking kernel

2020-05-26 Thread Nadav Har'El
On Sun, May 17, 2020 at 7:59 AM Waldek Kozaczuk wrote:

> One of the things I would like to tackle for the next release is making
> the loader.elf smaller. The primary motivation is to lower memory
> utilization. But at the same time, I would also like to keep the kernel
> (the release version) as debuggable as it is now (or close to).
>
> There are already 2 issues that could help us with that:
> - https://github.com/cloudius-systems/osv/issues/97 - Be more selective
> on symbols exported from the kernel
> - https://github.com/cloudius-systems/osv/issues/106 - Consider single
> instantiation for some templates (this may apply to more templates than
> just debug()).
>

It is very unclear why issue 106 makes the kernel smaller. It shouldn't...
Please make sure you look at the size of loader-stripped.elf - not
loader.elf.

If you really want to work on the size of loader-stripped.elf you should
probably use objdump/readelf/nm to try to figure out what are the biggest
parts of this object. Do we have some big functions we need to fix? Do we
have big static arrays (BSS) we can allocate at runtime?
Here are some example commits I did in the past making the kernel smaller
by using these ideas:

45f93e16f4727d506135101b51b9f2ea98e3a651 - zfs: smaller kernel by dropping
utf8 normalization support
8693761b737ad74e3d116f923bf0b4323d9df8b4 - build: don't put unnecessary
libraries in every image
c50a090f086968fbfc24ff0a1f085ebd570aa77d - trace: only allocate trace_log
when tracepoints are enabled
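
A minimal sketch of the symbol-size inspection suggested above, assuming the 
input is text in the four-column layout that "nm -S --size-sort" prints 
(address, size, type, name, with address and size in hex). The sample lines 
and symbol names are made up for illustration and do not come from an actual 
OSv build.

```python
def largest_symbols(nm_output, top=3):
    """Rank symbols by size from `nm -S`-style lines.

    Each usable line is expected to look like:
        <address> <size> <type> <name>
    Lines without a size column are skipped.
    """
    symbols = []
    for line in nm_output.splitlines():
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # no size column (or an empty line)
        _addr, size_hex, sym_type, name = parts
        if len(sym_type) != 1:
            continue  # not the expected one-letter type column
        try:
            size = int(size_hex, 16)
        except ValueError:
            continue
        symbols.append((size, sym_type, name))
    symbols.sort(key=lambda s: s[0], reverse=True)
    return symbols[:top]


# Hypothetical sample input; real input would come from running nm.
SAMPLE = """\
0000000000401000 0000000000000040 T small_function
0000000000600000 0000000000100000 b big_static_array
0000000000402000 0000000000002000 T big_function
0000000000403000 t no_size_symbol
"""

for size, sym_type, name in largest_symbols(SAMPLE, top=2):
    print(f"{size:>10} {sym_type} {name}")
```

Symbols of type b/B live in BSS, so large entries there are candidates for 
runtime allocation, while large T/t entries point at big functions. Note that 
a stripped binary generally has no regular symbol table left, so the nm input 
would presumably have to come from the unstripped loader.elf.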


> Besides that, there are other things we can try:
> 1. Use the compiler flags -ffunction-sections -fdata-sections and the
> linker flag --gc-sections.
> 2. Remove RTTI using -fno-rtti (I think we have over 1000 typeinfo entries
> in the symbols tables) - I do not think we have many dynamic_cast and
> hopefully, these can be eliminated.
> 3. Controversial - eliminate exception usage in the kernel (how critical is
> this?) and then use -fno-exceptions.
>

Yes, this would be controversial... I believe we do have some
written-for-C++ code in the kernel which
does use exceptions, but I never tried to estimate how much, or how difficult
it would be to get rid of it.


> 4. Use LTO - given we have Travis enabled, there could be an option passed
> to the makefile/build that lets one build the kernel with LTO if we deem it
> to be a dangerous/experimental feature.
> 5. Do not link the C++ std library whole-archive and effectively hide it. How
> would we support internal apps like cpiod, httpserver, cloud init, etc.?
> Create a C API for any symbols that are C++ right now and link those C++
> apps statically against libstdc++?
>
> Are there other things we should try without sacrificing any functionality?
>
> As far as option 1 goes, I have already played with it a bit and applied
> this patch:
>
> diff --git a/Makefile b/Makefile
> index db3c68cf..1b121fd8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -279,7 +279,7 @@ gcc-sysroot = $(if $(CROSS_PREFIX), --sysroot
> $(aarch64_gccbase)) \
>  #
>  #   mydir/*.o EXTRA_FLAGS = 
>  EXTRA_FLAGS = -D__OSV_CORE__ -DOSV_KERNEL_BASE=$(kernel_base)
> -DOSV_KERNEL_VM_BASE=$(kernel_vm_base) \
> -   -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift)
> -DOSV_LZKERNEL_BASE=$(lzkernel_base)
> +   -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift)
> -DOSV_LZKERNEL_BASE=$(lzkernel_base) -ffunction-sections
>  EXTRA_LIBS =
>  COMMON = $(autodepend) -g -Wall -Wno-pointer-arith $(CFLAGS_WERROR)
> -Wformat=0 -Wno-format-security \
> -D __BSD_VISIBLE=1 -U _FORTIFY_SOURCE -fno-stack-protector
> $(INCLUDES) \
> @@ -1859,7 +1859,7 @@ $(out)/loader.elf: $(stage1_targets)
> arch/$(arch)/loader.ld $(out)/bootfs.o $(lo
> $(^:%.ld=-T %.ld) \
> --whole-archive \
>   $(libstdc++.a) $(libgcc_eh.a) \
> - $(boost-libs) \
> + $(boost-libs) --gc-sections \
>

I wonder what kind of size saving this might bring. Why should it bring any
saving at all? Just by using shorter (?) relative jumps in code?


> --no-whole-archive $(libgcc.a), \
> LINK loader.elf)
> @# Build libosv.so matching this loader.elf. This is not a separate
> @@ -1875,7 +1875,7 @@ $(out)/kernel.elf: $(stage1_targets)
> arch/$(arch)/loader.ld $(out)/empty_bootfs.
> $(^:%.ld=-T %.ld) \
> --whole-archive \
>   $(libstdc++.a) $(libgcc_eh.a) \
> - $(boost-libs) \
> + $(boost-libs) --gc-sections \
> --no-whole-archive $(libgcc.a), \
> LINK kernel.elf)
> $(call quiet, $(STRIP) $(out)/kernel.elf -o
> $(out)/kernel-stripped.elf, STRIP kernel.elf -> kernel-stripped.elf )
> diff --git a/arch/x64/loader.ld b/arch/x64/loader.ld
> index f981859d..ab5cf75b 100644
> --- a/arch/x64/loader.ld
> +++ b/arch/x64/loader.ld
> @@ -56,10 +56,10 @@ SECTIONS
>  memcpy_decode_end = .;
>  } :text
>
> -.eh_frame : AT(ADDR(.eh_frame) - OSV_KERNEL_VM_SHIFT) { *(.eh_frame)
> } : text
> +.eh_frame : AT(ADDR(.eh_frame)