Sent from my iPhone

> On May 31, 2020, at 12:32, Waldek Kozaczuk <[email protected]> wrote:
> 
> 
> 
> 
>> On Sun, May 31, 2020 at 04:40 Nadav Har'El <[email protected]> wrote:
>>> 
>>>> On Sat, May 30, 2020 at 5:25 PM Waldek Kozaczuk <[email protected]> 
>>>> wrote:
>>>> On my Ubuntu 20.04 applying the patch, I am attaching below, shrinks the 
>>>> kernel.elf from 6.8MB down to 5MB (on Fedora 32 from 6.7MB down to 5MB, 
>>>> and older Ubuntu 19.04 [gcc 8.3] from 6.2MB down to 4.9MB). And this is 
>>>> before applying any GCC optimizations I was suggesting in 
>>>> https://groups.google.com/forum/#!topic/osv-dev/hCYGRzytaJ4 and 
>>>> modularization ideas in 
>>>> https://groups.google.com/forum/#!topic/osv-dev/BHwN6Stm3n4) (except 
>>>> hiding C++ library).
>>>> 
>>>> diff --git a/Makefile b/Makefile
>>>> index 20ddf3b1..8caac77b 100644
>>>> --- a/Makefile
>>>> +++ b/Makefile
>>>> @@ -1873,10 +1873,11 @@ $(out)/kernel.elf: $(stage1_targets) 
>>>> arch/$(arch)/loader.ld $(out)/empty_bootfs.
>>>>        --defsym=OSV_KERNEL_VM_BASE=$(kernel_vm_base) 
>>>> --defsym=OSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) \
>>>>            -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags 
>>>> -L$(out)/arch/$(arch) \
>>>>        $(^:%.ld=-T %.ld) \
>>>> +      --version-script=/tmp/version_script \
>>>>        --whole-archive \
>>>> -        $(libstdc++.a) $(libgcc_eh.a) \
>>>> +        $(libgcc_eh.a) \
>>>>          $(boost-libs) \
>>> 
>>> If you don't --whole-archive the C++ library, why do it for the even more 
>>> esoteric Boost libraries? And gcc_eh? Maybe you could save even more?
>> Sure. Do you know what is the purpose of libgcc_eh.a and what was the 
>> original reason of linking it ‘—whole-archive’? And if we don’t, would we 
>> have to add some other .so file to the image like with C++ standard library?
>> 
>>> -       --no-whole-archive $(libgcc.a), \
>>> +       --no-whole-archive $(libstdc++.a) $(libgcc.a), \
>>>             LINK kernel.elf)
>>>     $(call quiet, $(STRIP) $(out)/kernel.elf -o $(out)/kernel-stripped.elf, 
>>> STRIP kernel.elf -> kernel-stripped.elf )
>>>     $(call very-quiet, cp $(out)/kernel-stripped.elf $(out)/kernel.elf)
>>> diff --git a/core/elf.cc b/core/elf.cc
>>> index ffb16004..830b2577 100644
>>> --- a/core/elf.cc
>>> +++ b/core/elf.cc
>>> @@ -1310,7 +1310,6 @@ program::program(void* addr)
>>>            "libpthread.so.0",
>>>            "libdl.so.2",
>>>            "librt.so.1",
>>> -          "libstdc++.so.6",
>>>            "libaio.so.1",
>>>            "libxenstore.so.3.0",
>>>            "libcrypt.so.1",
>>> diff --git a/scripts/generate_version_script.sh 
>>> b/scripts/generate_version_script.sh
>>> new file mode 100755
>>> index 00000000..607c71fa
>>> --- /dev/null
>>> +++ b/scripts/generate_version_script.sh
>>> @@ -0,0 +1,42 @@
>>> +#!/bin/bash
>>> +
>>> +VERSION_SCRIPT_FILE=$1
>>> +
>>> +extract_symbols_from_library()
>>> +{
>>> +  local LIB_NAME=$1
>>> +  local LIB_PATH=$(gcc -print-file-name=$LIB_NAME)
>>> +  echo "/*------- $LIB_PATH */" >> $VERSION_SCRIPT_FILE
>>> +  nm -g -CD --defined-only $LIB_PATH | awk '// { printf("    %s;\n",$3) }' 
>>> >> $VERSION_SCRIPT_FILE
>>> +}
>>> +
>>> +
>>> +VERSION_SCRIPT_START=$(cat <<"EOF"
>>> +{
>>> +  global:
>>> +EOF
>>> +)
>>> +
>>> +VERSION_SCRIPT_END=$(cat <<"EOF"
>>> +  local:
>>> +    *;
>>> +};
>>> +EOF
>>> +)
>>> +
>>> +echo "$VERSION_SCRIPT_START" > $VERSION_SCRIPT_FILE
>>> +
>>> +extract_symbols_from_library "libresolv.so.2"
>>> +extract_symbols_from_library "libc.so.6"
>>> +extract_symbols_from_library "libm.so.6"
>>> +extract_symbols_from_library "ld-linux-x86-64.so.2"
>>> +extract_symbols_from_library "libpthread.so.0"
>>> +extract_symbols_from_library "libdl.so.2"
>>> +extract_symbols_from_library "librt.so.1"
>>> +extract_symbols_from_library "libcrypt.so.1"
>>> +extract_symbols_from_library "libaio.so.1"
>>> +extract_symbols_from_library "libxenstore.so.3.0"
>>> +#extract_symbols_from_library "libboost_system.so"
>>> +
>>> +echo "$VERSION_SCRIPT_END" >> $VERSION_SCRIPT_FILE
>>> +#-fvisibility=hidden -fvisibility-inlines-hidden
>>> diff --git a/scripts/manifest_from_host.sh b/scripts/manifest_from_host.sh
>>> index 71b3340d..82b47d1a 100755
>>> --- a/scripts/manifest_from_host.sh
>>> +++ b/scripts/manifest_from_host.sh
>>> @@ -61,7 +61,7 @@ output_manifest()
>>>     echo "# Dependencies" | tee -a $OUTPUT
>>>     echo "# --------------------" | tee -a $OUTPUT
>>>     lddtree $so_path | grep -v "not found" | grep -v "$so_path" | grep -v 
>>> 'ld-linux-x86-64' | \
>>> -           grep -Pv 
>>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
>>>  | \
>>> +           grep -Pv 
>>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
>>>  | \
>>>             sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | sort | uniq 
>>> | tee -a $OUTPUT
>>>  }
>>>  
>>> @@ -135,7 +135,7 @@ if [[ -d $NAME_OR_PATH ]]; then
>>>             echo "# Dependencies" | tee -a $OUTPUT
>>>             echo "# --------------------" | tee -a $OUTPUT
>>>             lddtree $SO_FILES | grep -v "not found" | grep -v 
>>> "$NAME_OR_PATH/$SUBDIRECTORY_PATH" | grep -v 'ld-linux-x86-64' | \
>>> -                   grep -Pv 
>>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
>>>  | \
>>> +                   grep -Pv 
>>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
>>>  | \
>>>                     sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | 
>>> sort | uniq | tee -a $OUTPUT
>>>     fi
>>>     exit 0
>>> 
>>> 
>>> In essence, this PoC patch hides most symbols as well as links libstdc++.a 
>>> with '--no-whole-archive' (please note it only applies to kernel.elf not 
>>> loader.elf for now). I have successfully tested both ZFS and ROFS images 
>>> with native-example, java-example (running '/usr/bin/java -cp /java-example 
>>> Hello !' so that it does not use our java wrapper), golang-pie-example and 
>>> redis-memonly. 
>>> 
>>> You will notice that it adds a new script - 
>>> scripts/generate_version_script.sh - to generate limited number of symbols 
>>> to be visible from kernel.elf. Feel free to offer suggestions for doing it 
>>> better (other options for nm or using readelf).
>> 
>> What does this "version script" do if it finds a symbol in the host library 
>> which we never implemented?
> From my experiments it looks like it simply ignores such missing symbol, 
> which is what we want. As a matter of fact the generated version-script 
> contains over 4000 symbols but we end up with only a little 1,300 ones 
> exported in kernel.elf. I also noticed that if you compile individual sources 
> with  '-fvisibility=hidden -fvisibility-inlines-hidden' and have relevant 
> symbols listed in the version script files, it is too late in the process and 
> they do not get exported. If we relied on  '-fvisibility=hidden 
> -fvisibility-inlines-hidden' and added proper annotation to keep what we want 
> as public it would be pretty painful to do so.
Ideally we should generate and then maintain a version script file that ONLy 
contains the list of symbols that OSv actually implements from these 7-8 
libraries I am running nm against. 
>> 
>> I'm a bit worried that this approach makes use depend even more on the setup 
>> on the host - if there's a function not defined there, it also won't be 
>> available from OSv. On the other hand, it may be a valid approach at least 
>> as an *option*, while we're still considering if this is a good idea or not. 
> In this demo patch (PoC) I am assuming we generate version script dynamically 
> as the part of the build process. But instead we could use the same or 
> similar build script generate_version_script.sh to create the version file 
> once and check it into source control. We would probably have to create some 
> sort of superset version file and run this tool against library files from 
> Ubuntu and Fedora host in case those libraries (like "libc.so.6) contain 
> different symbols. The downside would be that we would have to manually add 
> new symbols to this checked in version file (effectively maintain it), but 
> maybe it is not so bad given it would be single file. What do you think? 
>>  
>>  
>>> 
>>> Here is the output from bloaty before applying my patch:
>>> 
>>>     FILE SIZE        VM SIZE    
>>>  --------------  -------------- 
>>>   55.8%  3.74Mi  51.3%  3.74Mi    .text
>>>   10.4%   714Ki   9.6%   714Ki    .dynstr
>>>    0.0%       0   8.1%   605Ki    .bss
>>>    8.7%   597Ki   8.0%   597Ki    .eh_frame
>>>    6.8%   467Ki   6.3%   467Ki    .rodata
>>>    5.8%   397Ki   5.3%   397Ki    .dynsym
>>>    2.8%   192Ki   2.6%   192Ki    .percpu
>>>    2.1%   146Ki   2.0%   146Ki    .gnu.hash
>>>    1.9%   130Ki   1.7%   130Ki    .hash
>>>    1.9%   130Ki   1.7%   130Ki    .eh_frame_hdr
>>>    1.5%   102Ki   1.4%   102Ki    .gcc_except_table
>>>    1.2%  80.1Ki   1.1%  80.0Ki    .data.rel.ro
>>>    0.4%  26.7Ki   0.4%  26.7Ki    .data
>>>    0.2%  10.4Ki   0.2%  12.5Ki    [LOAD #0 [RWX]]
>>>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>>>    0.2%  10.6Ki   0.1%  10.6Ki    .data.rel.local
>>>    0.1%  6.98Ki   0.1%  3.95Ki    [47 Others]
>>>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>>>    0.0%  2.22Ki   0.0%  2.16Ki    .init_array
>>>    0.0%  1.99Ki   0.0%       0    .shstrtab
>>>    0.0%       0   0.0%  1.73Ki    .tbss
>>>  100.0%  6.71Mi 100.0%  7.30Mi    TOTAL
>>> 
>>> and number of exported symbols:
>>> nm -g -CD --defined-only build/release/kernel.elf | wc -l
>>> 16960
>>> 
>>> 
>>> After hiding most symbols but C++ library still whole-archive:
>>>     FILE SIZE        VM SIZE    
>>>  --------------  -------------- 
>>>   69.1%  3.74Mi  62.3%  3.74Mi    .text
>>>    0.0%       0   9.8%   605Ki    .bss
>>>   10.8%   597Ki   9.7%   597Ki    .eh_frame
>>>    8.4%   467Ki   7.6%   467Ki    .rodata
>>>    3.5%   192Ki   3.1%   192Ki    .percpu
>>>    2.3%   130Ki   2.1%   130Ki    .eh_frame_hdr
>>>    1.9%   102Ki   1.7%   102Ki    .gcc_except_table
>>>    1.4%  80.1Ki   1.3%  80.0Ki    .data.rel.ro
>>>    0.6%  31.6Ki   0.5%  31.6Ki    .dynsym
>>>    0.5%  26.7Ki   0.4%  26.7Ki    .data
>>>    0.2%  12.5Ki   0.2%  14.6Ki    [LOAD #0 [RWX]]
>>>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>>>    0.2%  12.3Ki   0.2%  12.2Ki    .dynstr
>> 
>> Unsurprisingly, most of the savings seem to come from .dynsym and .dynstr.
>> Which makes me think if we can't get most of this savings by just hiding 
>> specific very long symbol names (perhaps a lot of instances of some C++ 
>> template? Or what?), making more implementation details "static" or 
>> "visibility hidden" - without trying to hide almost everything except what 
>> we want to deliberately allow. 
> I am afraid it would be more painful. I think we still would want to compile 
> with visibility hidden as much as possible, but having central version file 
> that lists all symbols we export is really better because we would de facto 
> define the "interface" for the apps.
>> 
>>>    0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
>>>    0.2%  10.3Ki   0.2%  10.3Ki    .gnu.hash
>>>    0.2%  9.36Ki   0.2%  9.30Ki    .hash
>>>    0.1%  6.58Ki   0.1%  3.54Ki    [47 Others]
>>>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>>>    0.0%  2.22Ki   0.0%  2.16Ki    .init_array
>>>    0.0%  1.99Ki   0.0%       0    .shstrtab
>>>    0.0%       0   0.0%  1.73Ki    .tbss
>>>  100.0%  5.42Mi 100.0%  6.01Mi    TOTAL
>>> 
>>> nm -g -CD --defined-only build/release/kernel.elf | wc -l
>>> 1339
>>> 
>>> Full patch:
>>>     FILE SIZE        VM SIZE    
>>>  --------------  -------------- 
>>>   68.5%  3.42Mi  61.3%  3.42Mi    .text
>>>    0.0%       0  10.6%   604Ki    .bss
>>>   10.6%   541Ki   9.5%   541Ki    .eh_frame
>>>    8.9%   454Ki   7.9%   454Ki    .rodata
>>>    3.8%   192Ki   3.4%   192Ki    .percpu
>>>    2.3%   119Ki   2.1%   119Ki    .eh_frame_hdr
>>>    1.8%  90.1Ki   1.6%  90.0Ki    .gcc_except_table
>>>    1.5%  74.6Ki   1.3%  74.5Ki    .data.rel.ro
>>>    0.6%  31.6Ki   0.6%  31.6Ki    .dynsym
>>>    0.5%  26.7Ki   0.5%  26.7Ki    .data
>>>    0.2%  11.9Ki   0.2%  14.1Ki    [LOAD #0 [RWX]]
>>>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>>>    0.2%  12.3Ki   0.2%  12.2Ki    .dynstr
>>>    0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
>>>    0.2%  10.3Ki   0.2%  10.3Ki    .gnu.hash
>>>    0.2%  9.36Ki   0.2%  9.30Ki    .hash
>>>    0.1%  6.08Ki   0.1%  3.48Ki    [39 Others]
>>>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>>>    0.0%  2.20Ki   0.0%  2.13Ki    .init_array
>>>    0.0%       0   0.0%  1.73Ki    .tbss
>>>    0.0%  1.62Ki   0.0%       0    .shstrtab
>>>  100.0%  5.00Mi 100.0%  5.59Mi    TOTAL
>>> 
>>> nm -g -CD --defined-only build/release/kernel.elf | wc -l
>>> 1339
>>> 
>>> I wonder if it would be beneficial to compile as many sources as possible 
>>> with '-fvisibility=hidden -fvisibility-inlines-hidden' - it will not make 
>>> the kernel much smaller if any, but may produce more optimized code 
>>> sometimes, no?
>> 
>> Yes, this similar to what I said above. It will make the kernel smaller if 
>> we don't use this version file thing.
>> I don't know if it's something we should do at a per-source-file level or 
>> perhaps (but this will take a lot of work) as a default and just enable 
>> visibility of specific things. 
> We should try to do it as much as possible but still use single version file 
> to define all symbols exported.
>>  
>>> 
>>> We still need to figure out what exactly to do with internal apps like 
>>> cpiod, httpserver and tests that need some of the hidden symbols available. 
>>> Possibly expose those extra symbols as C.
>> 
>> Note that it's fine to expose C++ symbols as in <osv/power.hh> - which have 
>> mangled names and even namespaces. The problems start when such functions 
>> take C++ types (even something as simple as std::string) or throw 
>> exceptions. Those require the C++ on both kernel and application to be the 
>> same one :-(
>> 
>> Anyway, you're probably right that whatever new APIs add should probably be 
>> in C, not C++. In the beginning we thought we would have a lot of those, but 
>> it seems this will never happen. We already started converting - or at least 
>> just wrapping - some of the new APIs to C because we needed to use them in C 
>> code - see include/osv/osv_c_wrappers.h and I think there's even more I 
>> don't remember.
> Agreed. 
>> 
>> 
>>> 
>>> Waldek
>>> -- 
>>> You received this message because you are subscribed to the Google Groups 
>>> "OSv Development" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an 
>>> email to [email protected].
>>> To view this discussion on the web visit 
>>> https://groups.google.com/d/msgid/osv-dev/147d66a0-dcc3-41f8-b94f-71d761dbb98e%40googlegroups.com.

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/A5621EC1-A0A2-48DF-88CE-F44157F71B17%40gmail.com.

Reply via email to