Re: [osv-dev] Shrinking kernel
On Tuesday, May 26, 2020 at 4:26:09 AM UTC-4, Nadav Har'El wrote:
>
> On Sun, May 17, 2020 at 7:59 AM Waldek Kozaczuk wrote:
>
>> One of the things I would like to tackle for the next release is making
>> loader.elf smaller. The primary motivation is to lower memory
>> utilization. At the same time, I would also like to keep the kernel
>> (the release version) as debuggable as it is now (or close to it).
>>
>> There are already 2 issues that could help us with that:
>> - https://github.com/cloudius-systems/osv/issues/97 - Be more selective
>>   on symbols exported from the kernel
>> - https://github.com/cloudius-systems/osv/issues/106 - Consider single
>>   instantiation for some templates (this may apply to more templates than
>>   just debug()).
>>
>
> It is very unclear why issue 106 makes the kernel smaller. It shouldn't...
> Please make sure you look at the size of loader-stripped.elf - not
> loader.elf.
>
> If you really want to work on the size of loader-stripped.elf you should
> probably use objdump/readelf/nm to try to figure out which are the biggest
> parts of this object. Do we have some big functions we need to fix? Do we
> have big static arrays (BSS) we can allocate at runtime?
> Here are some example commits I did in the past making the kernel smaller
> by using these ideas:
>
> 45f93e16f4727d506135101b51b9f2ea98e3a651 - zfs: smaller kernel by dropping
> utf8 normalization support
> 8693761b737ad74e3d116f923bf0b4323d9df8b4 - build: don't put unnecessary
> libraries in every image
> c50a090f086968fbfc24ff0a1f085ebd570aa77d - trace: only allocate trace_log
> when tracepoints are enabled
>
>
>> Besides that, there are other things we can try:
>> 1. Use the compiler flags -ffunction-sections -fdata-sections and the
>>    linker flag --gc-sections.
>> 2. Remove RTTI using -fno-rtti (I think we have over 1000 typeinfo
>>    entries in the symbol tables) - I do not think we have many
>>    dynamic_cast uses and hopefully these can be eliminated.
>> 3. Controversial - eliminate exception usage in the kernel (how critical
>>    is this?) and then use -fno-exceptions.
>>
>
> Yes, this would be controversial... I believe we do have some
> written-for-C++ code in the kernel which does use exceptions, but I never
> tried to estimate how much, or how difficult it would be to get rid of it.
>
>
>> 4. Use LTO - given we have Travis enabled, there could be an option passed
>>    to the makefile/build that lets one build the kernel with LTO if we
>>    deem it to be a dangerous/experimental feature.
>> 5. Do not link the C++ std library whole-archive and effectively hide it.
>>    How would we support internal apps like cpiod, httpserver, cloud init,
>>    etc? Create a C API for any symbols that are C++ right now and link
>>    those C++ apps statically against libstdc++?
>>
>> Other things we should try without sacrificing any functionality?
>>
>> As far as option 1 goes, I have already played with it a bit and applied
>> this patch:
>>
>> diff --git a/Makefile b/Makefile
>> index db3c68cf..1b121fd8 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -279,7 +279,7 @@ gcc-sysroot = $(if $(CROSS_PREFIX), --sysroot $(aarch64_gccbase)) \
>> #
>> # mydir/*.o EXTRA_FLAGS =
>> EXTRA_FLAGS = -D__OSV_CORE__ -DOSV_KERNEL_BASE=$(kernel_base) -DOSV_KERNEL_VM_BASE=$(kernel_vm_base) \
>> - -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) -DOSV_LZKERNEL_BASE=$(lzkernel_base)
>> + -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) -DOSV_LZKERNEL_BASE=$(lzkernel_base) -ffunction-sections
>> EXTRA_LIBS =
>> COMMON = $(autodepend) -g -Wall -Wno-pointer-arith $(CFLAGS_WERROR) -Wformat=0 -Wno-format-security \
>> -D __BSD_VISIBLE=1 -U _FORTIFY_SOURCE -fno-stack-protector $(INCLUDES) \
>> @@ -1859,7 +1859,7 @@ $(out)/loader.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/bootfs.o $(lo
>> $(^:%.ld=-T %.ld) \
>> --whole-archive \
>> $(libstdc++.a) $(libgcc_eh.a) \
>> - $(boost-libs) \
>> + $(boost-libs) --gc-sections \
>>
>
> I wonder what kind of size saving this might bring. Why should it bring
> any saving at all? Just by using shorter (?) relative jumps in code?
>

Right now not much, as everything gets exported/included from all objects
that are supplied to the linker. But once we start using a version script to
export only what we really want (as per #97), the compiler flags
"-ffunction-sections -fdata-sections" and the linker flag "--gc-sections"
should automatically make unneeded code simply "fall off" (be garbage
collected) and leave only what is needed. No? Am I wrong in how it works?

>
>
>> --no-whole-archive $(libgcc.a), \
>> LINK loader.elf)
>> @# Build libosv.so matching this loader.elf. This is not a separate
>> @@ -1875,7 +1875,7 @@ $(out)/kernel.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/empty_bootfs.
>> $(^:%.ld=-T %.ld) \
>> --whole-archive \
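
The reasoning in the reply above (nothing is dropped while every symbol is exported, but unreferenced code can fall away once the export list shrinks) can be illustrated with a toy model. This is only a sketch of reachability-based section garbage collection, not the linker's actual implementation: the function name gc_sections, the section names and the reference graph are all made up for illustration. It assumes each function lives in its own section (the effect of -ffunction-sections) and that exported symbols act as roots.

```python
from collections import deque


def gc_sections(refs, roots):
    """Return the set of sections a reachability-based GC would keep.

    refs  maps a section name to the sections it references.
    roots are sections that must stay: in this toy model, the entry
    point plus every section that holds an exported symbol.
    """
    kept = set()
    queue = deque(roots)
    while queue:
        section = queue.popleft()
        if section in kept:
            continue
        kept.add(section)
        # Anything referenced from a kept section must be kept as well.
        queue.extend(refs.get(section, ()))
    return kept


# Hypothetical per-function sections and who references whom.
refs = {
    ".text.boot": [".text.sched"],
    ".text.sched": [".text.printf"],
    ".text.printf": [],
    ".text.unused_helper": [".text.printf"],
}

# Every symbol exported: every section is a root, so nothing is dropped.
everything_exported = gc_sections(refs, roots=list(refs))
# Only the boot entry is a root: the unreferenced helper becomes garbage.
selective = gc_sections(refs, roots=[".text.boot"])

print(sorted(set(refs) - everything_exported))
print(sorted(set(refs) - selective))
```

In this model the first call discards nothing and the second discards only .text.unused_helper, which matches the expectation that the flags pay off only after the set of exported symbols is reduced.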
Re: [osv-dev] Shrinking kernel
On Sun, May 17, 2020 at 7:59 AM Waldek Kozaczuk wrote:

> One of the things I would like to tackle for the next release is making
> loader.elf smaller. The primary motivation is to lower memory
> utilization. At the same time, I would also like to keep the kernel
> (the release version) as debuggable as it is now (or close to it).
>
> There are already 2 issues that could help us with that:
> - https://github.com/cloudius-systems/osv/issues/97 - Be more selective
>   on symbols exported from the kernel
> - https://github.com/cloudius-systems/osv/issues/106 - Consider single
>   instantiation for some templates (this may apply to more templates than
>   just debug()).
>

It is very unclear why issue 106 makes the kernel smaller. It shouldn't...
Please make sure you look at the size of loader-stripped.elf - not
loader.elf.

If you really want to work on the size of loader-stripped.elf you should
probably use objdump/readelf/nm to try to figure out which are the biggest
parts of this object. Do we have some big functions we need to fix? Do we
have big static arrays (BSS) we can allocate at runtime?
Here are some example commits I did in the past making the kernel smaller
by using these ideas:

45f93e16f4727d506135101b51b9f2ea98e3a651 - zfs: smaller kernel by dropping
utf8 normalization support
8693761b737ad74e3d116f923bf0b4323d9df8b4 - build: don't put unnecessary
libraries in every image
c50a090f086968fbfc24ff0a1f085ebd570aa77d - trace: only allocate trace_log
when tracepoints are enabled

> Besides that, there are other things we can try:
> 1. Use the compiler flags -ffunction-sections -fdata-sections and the
>    linker flag --gc-sections.
> 2. Remove RTTI using -fno-rtti (I think we have over 1000 typeinfo entries
>    in the symbol tables) - I do not think we have many dynamic_cast uses
>    and hopefully these can be eliminated.
> 3. Controversial - eliminate exception usage in the kernel (how critical
>    is this?) and then use -fno-exceptions.
>

Yes, this would be controversial... I believe we do have some
written-for-C++ code in the kernel which does use exceptions, but I never
tried to estimate how much, or how difficult it would be to get rid of it.

> 4. Use LTO - given we have Travis enabled, there could be an option passed
>    to the makefile/build that lets one build the kernel with LTO if we
>    deem it to be a dangerous/experimental feature.
> 5. Do not link the C++ std library whole-archive and effectively hide it.
>    How would we support internal apps like cpiod, httpserver, cloud init,
>    etc? Create a C API for any symbols that are C++ right now and link
>    those C++ apps statically against libstdc++?
>
> Other things we should try without sacrificing any functionality?
>
> As far as option 1 goes, I have already played with it a bit and applied
> this patch:
>
> diff --git a/Makefile b/Makefile
> index db3c68cf..1b121fd8 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -279,7 +279,7 @@ gcc-sysroot = $(if $(CROSS_PREFIX), --sysroot $(aarch64_gccbase)) \
> #
> # mydir/*.o EXTRA_FLAGS =
> EXTRA_FLAGS = -D__OSV_CORE__ -DOSV_KERNEL_BASE=$(kernel_base) -DOSV_KERNEL_VM_BASE=$(kernel_vm_base) \
> - -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) -DOSV_LZKERNEL_BASE=$(lzkernel_base)
> + -DOSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) -DOSV_LZKERNEL_BASE=$(lzkernel_base) -ffunction-sections
> EXTRA_LIBS =
> COMMON = $(autodepend) -g -Wall -Wno-pointer-arith $(CFLAGS_WERROR) -Wformat=0 -Wno-format-security \
> -D __BSD_VISIBLE=1 -U _FORTIFY_SOURCE -fno-stack-protector $(INCLUDES) \
> @@ -1859,7 +1859,7 @@ $(out)/loader.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/bootfs.o $(lo
> $(^:%.ld=-T %.ld) \
> --whole-archive \
> $(libstdc++.a) $(libgcc_eh.a) \
> - $(boost-libs) \
> + $(boost-libs) --gc-sections \
>

I wonder what kind of size saving this might bring. Why should it bring
any saving at all? Just by using shorter (?) relative jumps in code?

> --no-whole-archive $(libgcc.a), \
> LINK loader.elf)
> @# Build libosv.so matching this loader.elf. This is not a separate
> @@ -1875,7 +1875,7 @@ $(out)/kernel.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/empty_bootfs.
> $(^:%.ld=-T %.ld) \
> --whole-archive \
> $(libstdc++.a) $(libgcc_eh.a) \
> - $(boost-libs) \
> + $(boost-libs) --gc-sections \
> --no-whole-archive $(libgcc.a), \
> LINK kernel.elf)
> $(call quiet, $(STRIP) $(out)/kernel.elf -o $(out)/kernel-stripped.elf, STRIP kernel.elf -> kernel-stripped.elf )
> diff --git a/arch/x64/loader.ld b/arch/x64/loader.ld
> index f981859d..ab5cf75b 100644
> --- a/arch/x64/loader.ld
> +++ b/arch/x64/loader.ld
> @@ -56,10 +56,10 @@ SECTIONS
> memcpy_decode_end = .;
> } :text
>
> -.eh_frame : AT(ADDR(.eh_frame) - OSV_KERNEL_VM_SHIFT) { *(.eh_frame) } : text
> +.eh_frame : AT(ADDR(.eh_frame)
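
The suggestion earlier in this message, to use objdump/readelf/nm to find the biggest parts of the object, could be approached along these lines. This is a minimal sketch that assumes the symbol listing was produced with GNU nm's -S option (address, size, type, name per line), most likely from the unstripped loader.elf, since a stripped file may no longer carry a regular symbol table. The helper name largest_symbols and all symbol names and sizes in the sample are invented for illustration, not taken from a real OSv build.

```python
def largest_symbols(nm_output, top=3):
    """Parse `nm -S` style lines and return the `top` biggest symbols
    as (size_in_bytes, type, name) tuples, largest first."""
    symbols = []
    for line in nm_output.splitlines():
        fields = line.split(None, 3)
        # Expect address, size, one-letter type, name; lines for symbols
        # without a size do not fit this shape and are skipped.
        if len(fields) != 4 or len(fields[2]) != 1:
            continue
        _addr, size, kind, name = fields
        try:
            symbols.append((int(size, 16), kind, name))
        except ValueError:
            continue
    symbols.sort(key=lambda s: s[0], reverse=True)
    return symbols[:top]


# Made-up sample in the shape of `nm -S` output.
sample = """\
0000000000401000 0000000000000040 T small_fn
0000000000600000 0000000000100000 b big_static_table
0000000000402000 0000000000002000 T large_fn
0000000000403000 T no_size_symbol
"""

for size, kind, name in largest_symbols(sample, top=2):
    print(f"{size:>10} {kind} {name}")
```

Entries of type b or B sit in BSS, so large ones near the top of such a list are candidates for the "allocate at runtime" treatment mentioned above, while large T entries point at big functions.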