On Wed, Oct 10, 2018 at 10:57 PM Waldek Kozaczuk <jwkozac...@gmail.com> wrote:
> Last week I wrote a long post with suggestions on how to improve memory
> allocation and utilization and seeking some feedback. This time I would
> like to briefly update those interested on how we could drastically reduce
> kernel size (loader-stripped.elf) by 3MB (~ 33%).
>
> A month ago there was a post about ideas to reduce kernel size and some
> findings I discovered using bloaty. One of the things Nadav noted was the
> large (and unexplained) size of the .rodata section - 2.47MB. Since then I
> spent some time digging into it and trying to find anything in our code
> that defines some large static data (strings, numbers, etc) that would go
> into .rodata. No success until I looked closer at these lines in the
> makefile and noticed the *--whole-archive* option:
>
> $(out)/loader.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/bootfs.o
>     $(call quiet, $(LD) -o $@ --defsym=OSV_KERNEL_BASE=$(kernel_base) \
>         -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags \
>         $(^:%.ld=-T %.ld) \
>         --whole-archive \
>         $(libstdc++.a) $(libgcc.a) $(libgcc_eh.a) \
>         $(boost-libs) \
>         --no-whole-archive, \
>         LINK loader.elf)
>
> It turns out that the *--whole-archive* option forces the linker to link
> everything (in reality I am guessing only sections we have in our linker
> script loader.ld) from those 5 libraries whether our kernel code uses it
> or not. Once I disabled this option the kernel size dropped by 3MB. And
> many images/apps I tested (native-example, java, python) work just fine but
> others fail with missing symbol errors.
>

The story of "--whole-archive" is related to the story of library versions.

OSv uses pieces of libstdc++ (and a couple of Boost libraries I wish it
hadn't). Without --whole-archive, only the pieces that OSv really needs
would be included inside the kernel, so OSv would *not* include a complete
C++ library. Other C++ applications would then need to include their own
version of libstdc++, a complete one, which has all the symbols which
*they* need.

But then we'll have a problem if the two versions of libstdc++ - the one
linked inside the kernel and the shared library included with the
application - are not exactly the same. The application will use some
pieces from its own library and other pieces from OSv, and if those
internal libstdc++ pieces do not work together, nothing will work.

Another option (see https://github.com/cloudius-systems/osv/issues/821) is
to *hide* all the libstdc++ symbols in the OSv kernel from the application.
OSv will then include only the minimal C++ library functions it needs to
run, and if the application also needs the C++ library, it will need to
bring its own copy as a shared library and use symbols from it, NOT from
OSv. This option will make the OSv kernel smaller and solve some of our
version problems, but will make C++-based applications slightly bigger.

> the most striking of those is libgcc.a that has almost 2MB (!!!) of rodata:
>

It would be interesting to find out what this is. libgcc.a is pretty
small - it's not the large C++ library - so it's not nice that it has such
a large impact on our size. Note that libgcc.a is an ar archive: you can
open it up (with ar) and inspect the individual objects it contains.
Hopefully most of this "rodata" comes from a single object there, and we
can figure out why it is there.
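For what it's worth, here is a rough sketch of how that per-object
breakdown could be done. It assumes binutils "size" (whose -A output lists
every section of every archive member) plus awk and sort; the function
name rank_rodata is made up for this example and is not part of OSv.

```shell
# Rank archive members by the total size of their .rodata* sections.
# Reads the output of `size -A <archive>` on stdin.
# Assumption: each member starts with a header line ending in ":" and each
# section line has the form "<name> <size> <addr>".
rank_rodata() {
    awk '
        /:$/             { member = $1 }            # member header line
        $1 ~ /^\.rodata/ { total[member] += $2 }    # .rodata, .rodata.cst8, ...
        END { for (m in total) printf "%10d %s\n", total[m], m }
    ' | sort -rn
}
```

Usage would be something like:
size -A /usr/lib/gcc/x86_64-linux-gnu/5/libgcc.a | rank_rodata | head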
> ../bloaty/bloaty -d sections /usr/lib/gcc/x86_64-linux-gnu/5/libgcc.a
>      VM SIZE                             FILE SIZE
>  --------------                       --------------
>   78.5%  1.94Mi .rodata                1.94Mi  67.0%
>   18.8%   473Ki .text                   473Ki  16.0%
>    0.0%       0 [ELF Headers]           179Ki   6.1%
>    0.0%       0 .rela.text              101Ki   3.4%
>    0.0%       0 .symtab                67.8Ki   2.3%
>    1.4%  35.5Ki .data                  35.5Ki   1.2%
>    1.2%  30.9Ki .eh_frame              30.9Ki   1.0%
>    0.0%       0 .strtab                22.7Ki   0.8%
>    0.0%       0 .shstrtab              20.8Ki   0.7%
>    0.0%       0 [AR Headers]           13.5Ki   0.5%
>    0.0%       0 [AR Symbol Table]      11.8Ki   0.4%
>    0.0%       0 .rela.eh_frame         11.2Ki   0.4%
>    0.0%       0 .rela.rodata           2.60Ki   0.1%
>    0.0%       0 [Unmapped]             2.19Ki   0.1%
>    0.0%  1.16Ki .text.startup          1.16Ki   0.0%
>    0.0%       0 .rela.text.startup        792   0.0%
>    0.0%     368 .rodata.cst16             368   0.0%
>    0.0%     243 [11 Others]               347   0.0%
>    0.0%     248 .rodata.cst8              248   0.0%
>    0.0%     208 .tbss                       0   0.0%
>    0.0%     168 .bss                        0   0.0%
>  100.0%  2.46Mi TOTAL                  2.89Mi 100.0%
>
> Here are the examples of failure when whole-archive was disabled:
>

You may have more success disabling whole-archive on *just* -lgcc and not
on -lstdc++.

Also, disabling whole-archive on -lgcc should be safe (IIRC) because we
also put libgcc_s.so.1 in our image, and that would include any missing
libgcc symbols, I think.
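One way to check that "I think" before touching the Makefile would be to
compare the global symbols defined in libgcc.a with those exported by
libgcc_s.so.1. A rough sketch, assuming binutils nm; the helper names
defined_symbols and missing_from are invented for this example.

```shell
# Extract symbol names from `nm --defined-only` output read on stdin.
# Assumption: symbol lines have three fields ("<addr> <type> <name>");
# any "@VERSION" suffix printed for dynamic symbols is stripped.
defined_symbols() {
    awk 'NF == 3 { sub(/@.*/, "", $3); print $3 }' | sort -u
}

# Print every name listed in file $1 that does not appear in file $2
# (one symbol name per line in both files).
missing_from() {
    grep -vxFf "$2" "$1"
}
```

Usage would be something like:
nm -g --defined-only libgcc.a | defined_symbols > static.syms
nm -D --defined-only libgcc_s.so.1 | defined_symbols > shared.syms
missing_from static.syms shared.syms

Anything it prints is defined only in the static archive, so those are the
symbols worth looking at before dropping whole-archive for -lgcc.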
> 1) golang
> /go.so: failed looking up symbol _ZNSaIcEC1Ev
> (std::allocator<char>::allocator())
>
> [backtrace]
> 0x0000000000343d29 <elf::object::symbol(unsigned int, bool)+825>
> 0x0000000000343e7b <elf::object::resolve_pltgot(unsigned int)+139>
> 0x0000000000344065 <elf_resolve_pltgot+69>
> 0x000000000038b16f <???+3715439>
> 0x00002000001ffe4f <???+2096719>
> 0x00000000004198ec <osv::application::run_main()+60>
> 0x000000000020c298 <osv::application::main()+152>
> 0x0000000000419a98 <???+4299416>
> 0x000000000044ad85 <???+4500869>
> 0x00000000003e90d6 <thread_main_c+38>
> 0x000000000038c4b2 <???+3720370>
>
> 2) tst-async.so
> TEST tst-async.so
> OSv v0.51.0-37-g186779b
> eth0: 192.168.122.15
> /usr/lib/libboost_unit_test_framework.so.1.55.0: failed looking up symbol
> _ZTISt19basic_ostringstreamIcSt11char_traitsIcESaIcEE (typeinfo for
> std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >)
>
> [backtrace]
> 0x0000000000343d29 <elf::object::symbol(unsigned int, bool)+825>
> 0x000000000038ff06 <elf::object::arch_relocate_rela(unsigned int, unsigned int, void*, long)+166>
> 0x000000000033eb54 <elf::object::relocate_rela()+148>
> 0x00000000003416e7 <elf::object::relocate()+199>
> 0x0000000000345162 <elf::program::load_object(
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >,
>     std::vector<std::shared_ptr<elf::object>, std::allocator<std::shared_ptr<elf::object> > >&)+1602>
> 0x00000000003443b8 <elf::object::load_needed(
>     std::vector<std::shared_ptr<elf::object>, std::allocator<std::shared_ptr<elf::object> > >&)+520>
> 0x0000000000345156 <elf::program::load_object(
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >,
>     std::vector<std::shared_ptr<elf::object>, std::allocator<std::shared_ptr<elf::object> > >&)+1590>
> 0x00000000003459aa <elf::program::get_library(
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >,
>     bool)+330>
> 0x0000000000418e81 <osv::application::application(
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&,
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&,
>     bool,
>     std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const*,
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&,
>     std::function<
> 0x00000000004195c7 <osv::application::run(
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&,
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&,
>     bool,
>     std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > > const*,
>     std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&,
>     std::function<void ()>
> 0x000000000041982a <osv::application::run(
>     std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)+90>
> 0x00000000002131d9 <do_main_thread(void*)+2601>
> 0x000000000044ad85 <???+4500869>
> 0x00000000003e90d6 <thread_main_c+38>
> 0x000000000038c4b2 <???+3720370>
> Test tst-async.so FAILED
>
> I wonder if these are simply "missing symbol" scenarios that could be
> addressed by somehow forcing to link those into loader.elf.
>
> Relatedly I found this commit from 5 years ago by Avi that introduced
> --whole-archive option for good reasons -
> https://github.com/cloudius-systems/osv/commit/c9e61d4a45d88d8c8e79cd52fbcd38b91b291d5e.
> But I wonder if there is a better way to not use whole-archive and solve
> this problem in a different way (btw huge rodata is in libgcc.a not
> libstdc++.a). I found this article but not sure if it provides solution
> to different problem by using -u<symbol> workaround -
> http://www.lysium.de/blog/index.php?/archives/222-Lost-static-objects-in-static-libraries-with-GNU-linker-ld.html
>
> In either case I wanted to run it by gcc/linker gurus on this mailing list
> to see if they can think of other ways we can mitigate possible problems
> (and what these problems might be) of not using --whole-archive. Certainly
> it would be nice to make the kernel smaller by 3MB by simply removing 1
> line from the Makefile :-) I am also attaching 2 full bloaty reports as
> they also show statistics for other sections when we disable whole-archive.
>
> Finally I found this interesting presentation about ways to reduce code
> size -
> https://elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdf
>
> Regards,
> Waldek
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to osv-dev+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.