On Sat, May 30, 2020 at 5:25 PM Waldek Kozaczuk <[email protected]>
wrote:

> On my Ubuntu 20.04 applying the patch, I am attaching below, shrinks the
> kernel.elf from 6.8MB down to 5MB (on Fedora 32 from 6.7MB down to 5MB, and
> older Ubuntu 19.04 [gcc 8.3] from 6.2MB down to 4.9MB). And this is before
> applying any GCC optimizations I was suggesting in
> https://groups.google.com/forum/#!topic/osv-dev/hCYGRzytaJ4 and
> modularization ideas in
> https://groups.google.com/forum/#!topic/osv-dev/BHwN6Stm3n4) (except
> hiding C++ library).
>
> diff --git a/Makefile b/Makefile
> index 20ddf3b1..8caac77b 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -1873,10 +1873,11 @@ $(out)/kernel.elf: $(stage1_targets)
> arch/$(arch)/loader.ld $(out)/empty_bootfs.
>       --defsym=OSV_KERNEL_VM_BASE=$(kernel_vm_base)
> --defsym=OSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) \
>   -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags
> -L$(out)/arch/$(arch) \
>       $(^:%.ld=-T %.ld) \
> +     --version-script=/tmp/version_script \
>       --whole-archive \
> -       $(libstdc++.a) $(libgcc_eh.a) \
> +       $(libgcc_eh.a) \
>         $(boost-libs) \
>

If you don't --whole-archive the C++ library, why do it for the even more
esoteric Boost libraries? And gcc_eh? Maybe you could save even more?

-     --no-whole-archive $(libgcc.a), \
> +     --no-whole-archive $(libstdc++.a) $(libgcc.a), \
>   LINK kernel.elf)
>   $(call quiet, $(STRIP) $(out)/kernel.elf -o $(out)/kernel-stripped.elf,
> STRIP kernel.elf -> kernel-stripped.elf )
>   $(call very-quiet, cp $(out)/kernel-stripped.elf $(out)/kernel.elf)
> diff --git a/core/elf.cc b/core/elf.cc
> index ffb16004..830b2577 100644
> --- a/core/elf.cc
> +++ b/core/elf.cc
> @@ -1310,7 +1310,6 @@ program::program(void* addr)
>            "libpthread.so.0",
>            "libdl.so.2",
>            "librt.so.1",
> -          "libstdc++.so.6",
>            "libaio.so.1",
>            "libxenstore.so.3.0",
>            "libcrypt.so.1",
> diff --git a/scripts/generate_version_script.sh
> b/scripts/generate_version_script.sh
> new file mode 100755
> index 00000000..607c71fa
> --- /dev/null
> +++ b/scripts/generate_version_script.sh
> @@ -0,0 +1,42 @@
> +#!/bin/bash
> +
> +VERSION_SCRIPT_FILE=$1
> +
> +extract_symbols_from_library()
> +{
> +  local LIB_NAME=$1
> +  local LIB_PATH=$(gcc -print-file-name=$LIB_NAME)
> +  echo "/*------- $LIB_PATH */" >> $VERSION_SCRIPT_FILE
> +  nm -g -CD --defined-only $LIB_PATH | awk '// { printf("    %s;\n",$3)
> }' >> $VERSION_SCRIPT_FILE
> +}
> +
> +
> +VERSION_SCRIPT_START=$(cat <<"EOF"
> +{
> +  global:
> +EOF
> +)
> +
> +VERSION_SCRIPT_END=$(cat <<"EOF"
> +  local:
> +    *;
> +};
> +EOF
> +)
> +
> +echo "$VERSION_SCRIPT_START" > $VERSION_SCRIPT_FILE
> +
> +extract_symbols_from_library "libresolv.so.2"
> +extract_symbols_from_library "libc.so.6"
> +extract_symbols_from_library "libm.so.6"
> +extract_symbols_from_library "ld-linux-x86-64.so.2"
> +extract_symbols_from_library "libpthread.so.0"
> +extract_symbols_from_library "libdl.so.2"
> +extract_symbols_from_library "librt.so.1"
> +extract_symbols_from_library "libcrypt.so.1"
> +extract_symbols_from_library "libaio.so.1"
> +extract_symbols_from_library "libxenstore.so.3.0"
> +#extract_symbols_from_library "libboost_system.so"
> +
> +echo "$VERSION_SCRIPT_END" >> $VERSION_SCRIPT_FILE
> +#-fvisibility=hidden -fvisibility-inlines-hidden
> diff --git a/scripts/manifest_from_host.sh b/scripts/manifest_from_host.sh
> index 71b3340d..82b47d1a 100755
> --- a/scripts/manifest_from_host.sh
> +++ b/scripts/manifest_from_host.sh
> @@ -61,7 +61,7 @@ output_manifest()
>   echo "# Dependencies" | tee -a $OUTPUT
>   echo "# --------------------" | tee -a $OUTPUT
>   lddtree $so_path | grep -v "not found" | grep -v "$so_path" | grep -v
> 'ld-linux-x86-64' | \
> - grep -Pv
> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
> | \
> + grep -Pv
> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
> | \
>   sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | sort | uniq | tee
> -a $OUTPUT
>  }
>
> @@ -135,7 +135,7 @@ if [[ -d $NAME_OR_PATH ]]; then
>   echo "# Dependencies" | tee -a $OUTPUT
>   echo "# --------------------" | tee -a $OUTPUT
>   lddtree $SO_FILES | grep -v "not found" | grep -v
> "$NAME_OR_PATH/$SUBDIRECTORY_PATH" | grep -v 'ld-linux-x86-64' | \
> - grep -Pv
> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
> | \
> + grep -Pv
> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?'
> | \
>   sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | sort | uniq | tee
> -a $OUTPUT
>   fi
>   exit 0
>
>
> In essence, this PoC patch hides most symbols as well as links libstdc++.a
> with '--no-whole-archive' (please note it only applies to kernel.elf not
> loader.elf for now). I have successfully tested both ZFS and ROFS images
> with native-example, java-example (running '/usr/bin/java -cp /java-example
> Hello !' so that it does not use our java wrapper), golang-pie-example and
> redis-memonly.
>
> You will notice that it adds a new script - scripts/generate_version_script.sh
> - to generate limited number of symbols to be visible from kernel.elf.
> Feel free to offer suggestions for doing it better (other options for nm or
> using readelf).
>

What does this "version script" do if it finds a symbol in the host library
which we never implemented?

I'm a bit worried that this approach makes use depend even more on the
setup on the host - if there's a function not defined there, it also won't
be available from OSv. On the other hand, it may be a valid approach at
least as an *option*, while we're still considering if this is a good idea
or not.


>
> Here is the output from bloaty before applying my patch:
>
>     FILE SIZE        VM SIZE
>  --------------  --------------
>   55.8%  3.74Mi  51.3%  3.74Mi    .text
>   10.4%   714Ki   9.6%   714Ki    .dynstr
>    0.0%       0   8.1%   605Ki    .bss
>    8.7%   597Ki   8.0%   597Ki    .eh_frame
>    6.8%   467Ki   6.3%   467Ki    .rodata
>    5.8%   397Ki   5.3%   397Ki    .dynsym
>    2.8%   192Ki   2.6%   192Ki    .percpu
>    2.1%   146Ki   2.0%   146Ki    .gnu.hash
>    1.9%   130Ki   1.7%   130Ki    .hash
>    1.9%   130Ki   1.7%   130Ki    .eh_frame_hdr
>    1.5%   102Ki   1.4%   102Ki    .gcc_except_table
>    1.2%  80.1Ki   1.1%  80.0Ki    .data.rel.ro
>    0.4%  26.7Ki   0.4%  26.7Ki    .data
>    0.2%  10.4Ki   0.2%  12.5Ki    [LOAD #0 [RWX]]
>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>    0.2%  10.6Ki   0.1%  10.6Ki    .data.rel.local
>    0.1%  6.98Ki   0.1%  3.95Ki    [47 Others]
>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>    0.0%  2.22Ki   0.0%  2.16Ki    .init_array
>    0.0%  1.99Ki   0.0%       0    .shstrtab
>    0.0%       0   0.0%  1.73Ki    .tbss
>  100.0%  6.71Mi 100.0%  7.30Mi    TOTAL
>
> and number of exported symbols:
> nm -g -CD --defined-only build/release/kernel.elf | wc -l
> 16960
>
>
> After hiding most symbols but C++ library still whole-archive:
>     FILE SIZE        VM SIZE
>  --------------  --------------
>   69.1%  3.74Mi  62.3%  3.74Mi    .text
>    0.0%       0   9.8%   605Ki    .bss
>   10.8%   597Ki   9.7%   597Ki    .eh_frame
>    8.4%   467Ki   7.6%   467Ki    .rodata
>    3.5%   192Ki   3.1%   192Ki    .percpu
>    2.3%   130Ki   2.1%   130Ki    .eh_frame_hdr
>    1.9%   102Ki   1.7%   102Ki    .gcc_except_table
>    1.4%  80.1Ki   1.3%  80.0Ki    .data.rel.ro
>    0.6%  31.6Ki   0.5%  31.6Ki    .dynsym
>    0.5%  26.7Ki   0.4%  26.7Ki    .data
>    0.2%  12.5Ki   0.2%  14.6Ki    [LOAD #0 [RWX]]
>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>    0.2%  12.3Ki   0.2%  12.2Ki    .dynstr
>

Unsurprisingly, most of the savings seem to come from .dynsym and .dynstr.
Which makes me think if we can't get most of this savings by just hiding
specific very long symbol names (perhaps a lot of instances of some C++
template? Or what?), making more implementation details "static" or
"visibility hidden" - without trying to hide almost everything except what
we want to deliberately allow.

   0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
>    0.2%  10.3Ki   0.2%  10.3Ki    .gnu.hash
>    0.2%  9.36Ki   0.2%  9.30Ki    .hash
>    0.1%  6.58Ki   0.1%  3.54Ki    [47 Others]
>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>    0.0%  2.22Ki   0.0%  2.16Ki    .init_array
>    0.0%  1.99Ki   0.0%       0    .shstrtab
>    0.0%       0   0.0%  1.73Ki    .tbss
>  100.0%  5.42Mi 100.0%  6.01Mi    TOTAL
>
> nm -g -CD --defined-only build/release/kernel.elf | wc -l
> 1339
>
> Full patch:
>     FILE SIZE        VM SIZE
>  --------------  --------------
>   68.5%  3.42Mi  61.3%  3.42Mi    .text
>    0.0%       0  10.6%   604Ki    .bss
>   10.6%   541Ki   9.5%   541Ki    .eh_frame
>    8.9%   454Ki   7.9%   454Ki    .rodata
>    3.8%   192Ki   3.4%   192Ki    .percpu
>    2.3%   119Ki   2.1%   119Ki    .eh_frame_hdr
>    1.8%  90.1Ki   1.6%  90.0Ki    .gcc_except_table
>    1.5%  74.6Ki   1.3%  74.5Ki    .data.rel.ro
>    0.6%  31.6Ki   0.6%  31.6Ki    .dynsym
>    0.5%  26.7Ki   0.5%  26.7Ki    .data
>    0.2%  11.9Ki   0.2%  14.1Ki    [LOAD #0 [RWX]]
>    0.2%  12.3Ki   0.2%  12.2Ki    .tracepoint_patch_sites
>    0.2%  12.3Ki   0.2%  12.2Ki    .dynstr
>    0.2%  10.6Ki   0.2%  10.6Ki    .data.rel.local
>    0.2%  10.3Ki   0.2%  10.3Ki    .gnu.hash
>    0.2%  9.36Ki   0.2%  9.30Ki    .hash
>    0.1%  6.08Ki   0.1%  3.48Ki    [39 Others]
>    0.1%  5.61Ki   0.1%  5.55Ki    .data.rel
>    0.0%  2.20Ki   0.0%  2.13Ki    .init_array
>    0.0%       0   0.0%  1.73Ki    .tbss
>    0.0%  1.62Ki   0.0%       0    .shstrtab
>  100.0%  5.00Mi 100.0%  5.59Mi    TOTAL
>
> nm -g -CD --defined-only build/release/kernel.elf | wc -l
> 1339
>
> I wonder if it would be beneficial to compile as many sources as possible
> with '-fvisibility=hidden -fvisibility-inlines-hidden' - it will not make
> the kernel much smaller if any, but may produce more optimized code
> sometimes, no?
>

Yes, this similar to what I said above. It will make the kernel smaller if
we don't use this version file thing.
I don't know if it's something we should do at a per-source-file level or
perhaps (but this will take a lot of work) as a default and just enable
visibility of specific things.


>
> We still need to figure out what exactly to do with internal apps like
> cpiod, httpserver and tests that need some of the hidden symbols available.
> Possibly expose those extra symbols as C.
>

Note that it's fine to expose C++ symbols as in <osv/power.hh> - which have
mangled names and even namespaces. The problems start when such functions
take C++ types (even something as simple as std::string) or throw
exceptions. Those require the C++ on both kernel and application to be the
same one :-(

Anyway, you're probably right that whatever new APIs add should probably be
in C, not C++. In the beginning we thought we would have a lot of those,
but it seems this will never happen. We already started converting - or at
least just wrapping - some of the new APIs to C because we needed to use
them in C code - see include/osv/osv_c_wrappers.h and I think there's even
more I don't remember.



> Waldek
>
> --
> You received this message because you are subscribed to the Google Groups
> "OSv Development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/osv-dev/147d66a0-dcc3-41f8-b94f-71d761dbb98e%40googlegroups.com
> <https://groups.google.com/d/msgid/osv-dev/147d66a0-dcc3-41f8-b94f-71d761dbb98e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/CANEVyjutsgm%2BCGVZ%3DrBo-rCm9%3D2sg4aFyBDGhh7V8PssA76y4A%40mail.gmail.com.

Reply via email to