Last week I wrote a long post with suggestions on how to improve memory 
allocation and utilization, and asked for feedback. This time I would 
like to briefly update those interested on how we could drastically reduce 
the kernel size (loader-stripped.elf) by 3MB (~33%).

A month ago there was a post about ideas to reduce the kernel size, along with 
some findings I made using bloaty. One of the things Nadav noted was the large 
(and unexplained) size of the .rodata section: 2.47MB. Since then I have spent 
some time digging into it, trying to find anything in our code that defines 
large static data (strings, numbers, etc.) that would go into .rodata. 
I had no success until I looked closer at these lines in the Makefile and 
noticed the *--whole-archive* option:

$(out)/loader.elf: $(stage1_targets) arch/$(arch)/loader.ld $(out)/bootfs.o
        $(call quiet, $(LD) -o $@ --defsym=OSV_KERNEL_BASE=$(kernel_base) \
                -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags \
            $(^:%.ld=-T %.ld) \
            --whole-archive \
              $(libstdc++.a) $(libgcc.a) $(libgcc_eh.a) \
              $(boost-libs) \
            --no-whole-archive, \
                LINK loader.elf)

It turns out that the *--whole-archive* option forces the linker to link in 
everything from those 5 libraries (in reality, I am guessing, only the sections 
we list in our linker script loader.ld), whether our kernel code uses it or not. 
Once I disabled this option, the kernel size dropped by 3MB. Many of the 
images/apps I tested (native-example, java, python) work just fine, but 
others fail with missing symbol errors.

Here are some statistics about the .rodata section, first with --whole-archive 
enabled and then with it disabled:
....  
  26.0%  2.47Mi .rodata                                                                   2.47Mi  27.7%
      93.3%  2.30Mi [section .rodata]                                                     2.30Mi  93.3%
       2.7%  67.3Ki musl/src/locale/iconv.c                                               67.3Ki   2.7%
       1.6%  41.1Ki [407 Others]                                                          41.1Ki   1.6%
       0.4%  10.1Ki bsd/sys/crypto/rijndael/rijndael-alg-fst.c                            10.1Ki   0.4%
       0.3%  6.58Ki libc/crypt/encrypt.c                                                  6.58Ki   0.3%
       0.3%  6.58Ki musl/src/crypt/crypt_des.c                                            6.58Ki   0.3%
       0.2%  4.34Ki musl/src/crypt/crypt_blowfish.c                                       4.34Ki   0.2%
       0.2%  4.00Ki musl/src/math/exp2.c                                                  4.00Ki   0.2%
       0.1%  3.09Ki musl/src/ctype/iswpunct.c                                             3.09Ki   0.1%
       0.1%  2.91Ki musl/src/ctype/iswalpha.c                                             2.91Ki   0.1%
       0.1%  2.91Ki musl/src/ctype/wcwidth.c                                              2.91Ki   0.1%
       0.1%  2.77Ki musl/src/math/__rem_pio2_large.c                                      2.77Ki   0.1%
       0.1%  2.43Ki bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c        2.43Ki   0.1%
       0.1%  2.16Ki bsd/sys/cddl/contrib/opensolaris/uts/common/zmod/inflate.c            2.16Ki   0.1%
       0.1%  2.02Ki external/x64/acpica/source/components/parser/psopcode.c               2.02Ki   0.1%
       0.1%  2.00Ki bsd/sys/cddl/contrib/opensolaris/uts/common/zmod/opensolaris_crc32.c  2.00Ki   0.1%
       0.1%  2.00Ki musl/src/math/exp2l.c                                                 2.00Ki   0.1%
       0.1%  1.85Ki musl/src/errno/strerror.c                                             1.85Ki   0.1%
       0.1%  1.75Ki external/x64/acpica/source/components/namespace/nspredef.c            1.75Ki   0.1%
       0.1%  1.51Ki musl/src/ctype/__ctype_tolower_loc.c                                  1.51Ki   0.1%
       0.1%  1.51Ki musl/src/ctype/__ctype_toupper_loc.c                                  1.51Ki   0.1%


....
   7.8%   531Ki .rodata                                                                    531Ki   8.6%
      68.3%   363Ki [section .rodata]                                                      363Ki  68.3%
      12.6%  67.3Ki musl/src/locale/iconv.c                                               67.3Ki  12.6%
       7.7%  41.1Ki [407 Others]                                                          41.1Ki   7.7%
       1.9%  10.1Ki bsd/sys/crypto/rijndael/rijndael-alg-fst.c                            10.1Ki   1.9%
       1.2%  6.58Ki libc/crypt/encrypt.c                                                  6.58Ki   1.2%
       1.2%  6.58Ki musl/src/crypt/crypt_des.c                                            6.58Ki   1.2%
       0.8%  4.34Ki musl/src/crypt/crypt_blowfish.c                                       4.34Ki   0.8%
       0.8%  4.00Ki musl/src/math/exp2.c                                                  4.00Ki   0.8%
       0.6%  3.09Ki musl/src/ctype/iswpunct.c                                             3.09Ki   0.6%
       0.5%  2.91Ki musl/src/ctype/iswalpha.c                                             2.91Ki   0.5%
       0.5%  2.91Ki musl/src/ctype/wcwidth.c                                              2.91Ki   0.5%
       0.5%  2.77Ki musl/src/math/__rem_pio2_large.c                                      2.77Ki   0.5%
       0.5%  2.43Ki bsd/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c        2.43Ki   0.5%
       0.4%  2.16Ki bsd/sys/cddl/contrib/opensolaris/uts/common/zmod/inflate.c            2.16Ki   0.4%
       0.4%  2.02Ki external/x64/acpica/source/components/parser/psopcode.c               2.02Ki   0.4%
       0.4%  2.00Ki bsd/sys/cddl/contrib/opensolaris/uts/common/zmod/opensolaris_crc32.c  2.00Ki   0.4%
       0.4%  2.00Ki musl/src/math/exp2l.c                                                 2.00Ki   0.4%
       0.3%  1.85Ki musl/src/errno/strerror.c                                             1.85Ki   0.3%
       0.3%  1.75Ki external/x64/acpica/source/components/namespace/nspredef.c            1.75Ki   0.3%
       0.3%  1.51Ki musl/src/ctype/__ctype_tolower_loc.c                                  1.51Ki   0.3%
       0.3%  1.51Ki musl/src/ctype/__ctype_toupper_loc.c                                  1.51Ki   0.3%

I do not fully understand how bloaty attributes sizes, but I think this line is 
the key to interpreting the numbers:
93.3%  2.30Mi [section .rodata]  2.30Mi  93.3%
vs
68.3%   363Ki [section .rodata]   363Ki  68.3%
I guess it shows how much of .rodata comes from the 5 libraries we 
link against.
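
As a rough sanity check of that guess, the figures quoted in this post can be compared directly. This is only a sketch using the rounded numbers bloaty printed (2.30Mi and 363Ki for [section .rodata], 1.94Mi for .rodata in libgcc.a); the helper names mi_to_ki and ki_diff are made up for illustration:

```shell
#!/bin/sh
# Convert a MiB figure to KiB so both bloaty reports use the same unit.
mi_to_ki() {
    awk -v mi="$1" 'BEGIN { printf "%.1f\n", mi * 1024 }'
}

# Difference between two KiB figures.
ki_diff() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a - b }'
}

with_wa=$(mi_to_ki 2.30)        # [section .rodata] with --whole-archive
without_wa=363                  # [section .rodata] without --whole-archive, already in KiB
libgcc_rodata=$(mi_to_ki 1.94)  # .rodata reported for libgcc.a

echo "dropped from [section .rodata]: $(ki_diff "$with_wa" "$without_wa") KiB"
echo ".rodata in libgcc.a:            $libgcc_rodata KiB"
```

The two numbers differ by only a few KiB, which is within the rounding of the bloaty figures. That is consistent with the guess that the extra .rodata comes from libgcc.a, but it does not prove it.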

Some bloaty statistics about the 5 libraries we use:
../bloaty/bloaty -d sections \
/usr/lib/gcc/x86_64-linux-gnu/5/libstdc++.a \
/usr/lib/gcc/x86_64-linux-gnu/5/libgcc.a \
/usr/lib/gcc/x86_64-linux-gnu/5/libgcc_eh.a \
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu//libboost_program_options.a \
/usr/lib/gcc/x86_64-linux-gnu/5/../../../x86_64-linux-gnu//libboost_system.a

     VM SIZE                                                                                       FILE SIZE
 --------------                                                                                 --------------
  50.3%  1.94Mi .rodata                                                                          1.94Mi  20.4%
  25.6%  1013Ki [11335 Others]                                                                   1.71Mi  18.0%
   0.0%       0 [ELF Headers]                                                                    1.65Mi  17.4%
   0.0%       0 .shstrtab                                                                         934Ki   9.6%
   0.0%       0 .symtab                                                                           798Ki   8.2%
  16.2%   638Ki .text                                                                             638Ki   6.6%
   0.0%       0 .strtab                                                                           632Ki   6.5%
   0.0%       0 [AR Symbol Table]                                                                 466Ki   4.8%
   5.8%   229Ki .eh_frame                                                                         229Ki   2.4%
   0.0%       0 .rela.text                                                                        190Ki   2.0%
   0.0%       0 .rela.eh_frame                                                                    158Ki   1.6%
   0.0%       0 .group                                                                           55.7Ki   0.6%
   0.0%       0 [Unmapped]                                                                       44.5Ki   0.5%
   0.9%  35.5Ki .data                                                                            35.5Ki   0.4%
   0.0%       0 .rela.rodata                                                                     28.6Ki   0.3%
   0.0%       0 [AR Headers]                                                                     25.5Ki   0.3%
   0.4%  14.1Ki .gcc_except_table                                                                14.1Ki   0.1%
   0.3%  13.1Ki .text._ZN5boost15program_options17parse_config_fileIcEENS0_20basic_parsed_option 13.1Ki   0.1%
   0.3%  11.9Ki .text._ZN5boost15program_options17parse_config_fileIwEENS0_20basic_parsed_option 11.9Ki   0.1%
   0.0%       0 .rela.text._ZNSt6locale5_ImplC2Em                                                11.4Ki   0.1%
   0.3%  10.3Ki .rodata.str1.8                                                                   10.3Ki   0.1%
 100.0%  3.86Mi TOTAL                                                                            9.50Mi 100.0%

The most striking of those is libgcc.a, which has almost 2MB (!!!) of .rodata:
../bloaty/bloaty -d sections /usr/lib/gcc/x86_64-linux-gnu/5/libgcc.a
     VM SIZE                          FILE SIZE
 --------------                    --------------
  78.5%  1.94Mi .rodata             1.94Mi  67.0%
  18.8%   473Ki .text                473Ki  16.0%
   0.0%       0 [ELF Headers]        179Ki   6.1%
   0.0%       0 .rela.text           101Ki   3.4%
   0.0%       0 .symtab             67.8Ki   2.3%
   1.4%  35.5Ki .data               35.5Ki   1.2%
   1.2%  30.9Ki .eh_frame           30.9Ki   1.0%
   0.0%       0 .strtab             22.7Ki   0.8%
   0.0%       0 .shstrtab           20.8Ki   0.7%
   0.0%       0 [AR Headers]        13.5Ki   0.5%
   0.0%       0 [AR Symbol Table]   11.8Ki   0.4%
   0.0%       0 .rela.eh_frame      11.2Ki   0.4%
   0.0%       0 .rela.rodata        2.60Ki   0.1%
   0.0%       0 [Unmapped]          2.19Ki   0.1%
   0.0%  1.16Ki .text.startup       1.16Ki   0.0%
   0.0%       0 .rela.text.startup     792   0.0%
   0.0%     368 .rodata.cst16          368   0.0%
   0.0%     243 [11 Others]            347   0.0%
   0.0%     248 .rodata.cst8           248   0.0%
   0.0%     208 .tbss                    0   0.0%
   0.0%     168 .bss                     0   0.0%
 100.0%  2.46Mi TOTAL               2.89Mi 100.0%

Here are examples of failures when --whole-archive was disabled:
1) golang
/go.so: failed looking up symbol _ZNSaIcEC1Ev (std::allocator<char>::allocator())

[backtrace]
0x0000000000343d29 <elf::object::symbol(unsigned int, bool)+825>
0x0000000000343e7b <elf::object::resolve_pltgot(unsigned int)+139>
0x0000000000344065 <elf_resolve_pltgot+69>
0x000000000038b16f <???+3715439>
0x00002000001ffe4f <???+2096719>
0x00000000004198ec <osv::application::run_main()+60>
0x000000000020c298 <osv::application::main()+152>
0x0000000000419a98 <???+4299416>
0x000000000044ad85 <???+4500869>
0x00000000003e90d6 <thread_main_c+38>
0x000000000038c4b2 <???+3720370>

2) tst-async.so
  TEST tst-async.so
OSv v0.51.0-37-g186779b
eth0: 192.168.122.15
/usr/lib/libboost_unit_test_framework.so.1.55.0: failed looking up symbol _ZTISt19basic_ostringstreamIcSt11char_traitsIcESaIcEE (typeinfo for std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >)

[backtrace]
0x0000000000343d29 <elf::object::symbol(unsigned int, bool)+825>
0x000000000038ff06 <elf::object::arch_relocate_rela(unsigned int, unsigned 
int, void*, long)+166>
0x000000000033eb54 <elf::object::relocate_rela()+148>
0x00000000003416e7 <elf::object::relocate()+199>
0x0000000000345162 
<elf::program::load_object(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > >, 
std::vector<std::shared_ptr<elf::object>, 
std::allocator<std::shared_ptr<elf::object> > >&)+1602>
0x00000000003443b8 
<elf::object::load_needed(std::vector<std::shared_ptr<elf::object>, 
std::allocator<std::shared_ptr<elf::object> > >&)+520>
0x0000000000345156 
<elf::program::load_object(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > >, 
std::vector<std::shared_ptr<elf::object>, 
std::allocator<std::shared_ptr<elf::object> > >&)+1590>
0x00000000003459aa 
<elf::program::get_library(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > >, bool)+330>
0x0000000000418e81 
<osv::application::application(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, 
std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > > const&, bool, 
std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > > > const*, std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, 
std::function<
0x00000000004195c7 
<osv::application::run(std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, 
std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > > > const&, bool, 
std::unordered_map<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> >, std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::hash<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > >, std::equal_to<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > >, 
std::allocator<std::pair<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const, 
std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > > > const*, std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> > const&, std::function<void 
()>
0x000000000041982a 
<osv::application::run(std::vector<std::__cxx11::basic_string<char, 
std::char_traits<char>, std::allocator<char> >, 
std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, 
std::allocator<char> > > > const&)+90>
0x00000000002131d9 <do_main_thread(void*)+2601>
0x000000000044ad85 <???+4500869>
0x00000000003e90d6 <thread_main_c+38>
0x000000000038c4b2 <???+3720370>
Test tst-async.so FAILED

I wonder if these are simply "missing symbol" cases that could be 
addressed by somehow forcing the linker to keep those symbols in loader.elf.
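
To see which symbols are involved across many apps and tests, the "failed looking up symbol" lines could be collected from the console output. A minimal sketch, assuming each message sits on a single unwrapped line; missing_symbols is a hypothetical helper name, not something that exists in the OSv tree:

```shell
#!/bin/sh
# Read console output on stdin and print each distinct mangled symbol named in
# a "failed looking up symbol <name>" message, one per line.
# Assumes the message and the symbol are on the same (unwrapped) line.
missing_symbols() {
    sed -n 's/.*failed looking up symbol \([A-Za-z0-9_]\{1,\}\).*/\1/p' | sort -u
}

# Example input modeled on the two failures quoted in this post.
missing_symbols <<'EOF'
/go.so: failed looking up symbol _ZNSaIcEC1Ev (std::allocator<char>::allocator())
/usr/lib/libboost_unit_test_framework.so.1.55.0: failed looking up symbol _ZTISt19basic_ostringstreamIcSt11char_traitsIcESaIcEE (typeinfo for std::basic_ostringstream<char, std::char_traits<char>, std::allocator<char> >)
EOF
```

Both symbols seen so far are C++ standard library symbols (std::allocator, std::basic_ostringstream), so the resulting list would mainly show how much of libstdc++.a the apps really expect the kernel to export.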

On a related note, I found this commit from 5 years ago by Avi that introduced 
the --whole-archive option for good reasons:
https://github.com/cloudius-systems/osv/commit/c9e61d4a45d88d8c8e79cd52fbcd38b91b291d5e

But I wonder if there is a way to avoid --whole-archive and solve 
that problem differently (by the way, the huge .rodata is in libgcc.a, not 
libstdc++.a). I found this article, but I am not sure whether its -u<symbol> 
workaround addresses the same problem or a different one:
http://www.lysium.de/blog/index.php?/archives/222-Lost-static-objects-in-static-libraries-with-GNU-linker-ld.html
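
For reference, GNU ld spells that workaround -u <symbol> or --undefined=<symbol>. If we went that way, the flags could be generated from a list of symbols rather than typed by hand. This is a sketch only: undefined_flags is a hypothetical helper, and it is untested whether such flags alone would be enough to replace --whole-archive for loader.elf:

```shell
#!/bin/sh
# Hypothetical helper: read mangled symbol names (one per line) on stdin and
# print one GNU ld --undefined=<symbol> option per distinct symbol,
# separated by spaces. Blank lines are ignored.
undefined_flags() {
    sed -e '/^[[:space:]]*$/d' -e 's/^/--undefined=/' | sort -u | tr '\n' ' '
}

# The two symbols from the failures quoted in this post.
printf '%s\n' \
    _ZNSaIcEC1Ev \
    _ZTISt19basic_ostringstreamIcSt11char_traitsIcESaIcEE | undefined_flags
echo
```

Note that -u only pulls in the archive member that defines the named symbol, so every symbol an app might need would have to be listed explicitly.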

In either case I wanted to run it by the gcc/linker gurus on this mailing list to 
see if they can think of other ways to mitigate the possible problems of not 
using --whole-archive (and what these problems might be). Certainly it 
would be nice to make the kernel 3MB smaller by simply removing 1 line from 
the Makefile :-) I am also attaching 2 full bloaty reports, as they also show 
statistics for other sections when we disable --whole-archive.

Finally I found this interesting presentation about ways to reduce code 
size - https://elinux.org/images/2/2d/ELC2010-gc-sections_Denys_Vlasenko.pdf

Regards,
Waldek


Attachment: bloaty_no_whole_archive
Description: Binary data

Attachment: bloaty_whole_archive
Description: Binary data
