Sent from my iPhone
> On May 31, 2020, at 12:32, Waldek Kozaczuk <[email protected]> wrote: > > > > >> On Sun, May 31, 2020 at 04:40 Nadav Har'El <[email protected]> wrote: >>> >>>> On Sat, May 30, 2020 at 5:25 PM Waldek Kozaczuk <[email protected]> >>>> wrote: >>>> On my Ubuntu 20.04 applying the patch, I am attaching below, shrinks the >>>> kernel.elf from 6.8MB down to 5MB (on Fedora 32 from 6.7MB down to 5MB, >>>> and older Ubuntu 19.04 [gcc 8.3] from 6.2MB down to 4.9MB). And this is >>>> before applying any GCC optimizations I was suggesting in >>>> https://groups.google.com/forum/#!topic/osv-dev/hCYGRzytaJ4 and >>>> modularization ideas in >>>> https://groups.google.com/forum/#!topic/osv-dev/BHwN6Stm3n4) (except >>>> hiding C++ library). >>>> >>>> diff --git a/Makefile b/Makefile >>>> index 20ddf3b1..8caac77b 100644 >>>> --- a/Makefile >>>> +++ b/Makefile >>>> @@ -1873,10 +1873,11 @@ $(out)/kernel.elf: $(stage1_targets) >>>> arch/$(arch)/loader.ld $(out)/empty_bootfs. >>>> --defsym=OSV_KERNEL_VM_BASE=$(kernel_vm_base) >>>> --defsym=OSV_KERNEL_VM_SHIFT=$(kernel_vm_shift) \ >>>> -Bdynamic --export-dynamic --eh-frame-hdr --enable-new-dtags >>>> -L$(out)/arch/$(arch) \ >>>> $(^:%.ld=-T %.ld) \ >>>> + --version-script=/tmp/version_script \ >>>> --whole-archive \ >>>> - $(libstdc++.a) $(libgcc_eh.a) \ >>>> + $(libgcc_eh.a) \ >>>> $(boost-libs) \ >>> >>> If you don't --whole-archive the C++ library, why do it for the even more >>> esoteric Boost libraries? And gcc_eh? Maybe you could save even more? >> Sure. Do you know what is the purpose of libgcc_eh.a and what was the >> original reason of linking it ‘—whole-archive’? And if we don’t, would we >> have to add some other .so file to the image like with C++ standard library? >> >>> - --no-whole-archive $(libgcc.a), \ >>> + --no-whole-archive $(libstdc++.a) $(libgcc.a), \ >>> LINK kernel.elf) >>> $(call quiet, $(STRIP) $(out)/kernel.elf -o $(out)/kernel-stripped.elf, >>> STRIP kernel.elf -> kernel-stripped.elf ) >>> $(call very-quiet, cp $(out)/kernel-stripped.elf $(out)/kernel.elf) >>> diff --git a/core/elf.cc b/core/elf.cc >>> index ffb16004..830b2577 100644 >>> --- a/core/elf.cc >>> +++ b/core/elf.cc >>> @@ -1310,7 +1310,6 @@ program::program(void* addr) >>> "libpthread.so.0", >>> "libdl.so.2", >>> "librt.so.1", >>> - "libstdc++.so.6", >>> "libaio.so.1", >>> "libxenstore.so.3.0", >>> "libcrypt.so.1", >>> diff --git a/scripts/generate_version_script.sh >>> b/scripts/generate_version_script.sh >>> new file mode 100755 >>> index 00000000..607c71fa >>> --- /dev/null >>> +++ b/scripts/generate_version_script.sh >>> @@ -0,0 +1,42 @@ >>> +#!/bin/bash >>> + >>> +VERSION_SCRIPT_FILE=$1 >>> + >>> +extract_symbols_from_library() >>> +{ >>> + local LIB_NAME=$1 >>> + local LIB_PATH=$(gcc -print-file-name=$LIB_NAME) >>> + echo "/*------- $LIB_PATH */" >> $VERSION_SCRIPT_FILE >>> + nm -g -CD --defined-only $LIB_PATH | awk '// { printf(" %s;\n",$3) }' >>> >> $VERSION_SCRIPT_FILE >>> +} >>> + >>> + >>> +VERSION_SCRIPT_START=$(cat <<"EOF" >>> +{ >>> + global: >>> +EOF >>> +) >>> + >>> +VERSION_SCRIPT_END=$(cat <<"EOF" >>> + local: >>> + *; >>> +}; >>> +EOF >>> +) >>> + >>> +echo "$VERSION_SCRIPT_START" > $VERSION_SCRIPT_FILE >>> + >>> +extract_symbols_from_library "libresolv.so.2" >>> +extract_symbols_from_library "libc.so.6" >>> +extract_symbols_from_library "libm.so.6" >>> +extract_symbols_from_library "ld-linux-x86-64.so.2" >>> +extract_symbols_from_library "libpthread.so.0" >>> +extract_symbols_from_library "libdl.so.2" >>> +extract_symbols_from_library "librt.so.1" >>> +extract_symbols_from_library "libcrypt.so.1" >>> +extract_symbols_from_library "libaio.so.1" >>> +extract_symbols_from_library "libxenstore.so.3.0" >>> +#extract_symbols_from_library "libboost_system.so" >>> + >>> +echo "$VERSION_SCRIPT_END" >> $VERSION_SCRIPT_FILE >>> +#-fvisibility=hidden -fvisibility-inlines-hidden >>> diff --git a/scripts/manifest_from_host.sh b/scripts/manifest_from_host.sh >>> index 71b3340d..82b47d1a 100755 >>> --- a/scripts/manifest_from_host.sh >>> +++ b/scripts/manifest_from_host.sh >>> @@ -61,7 +61,7 @@ output_manifest() >>> echo "# Dependencies" | tee -a $OUTPUT >>> echo "# --------------------" | tee -a $OUTPUT >>> lddtree $so_path | grep -v "not found" | grep -v "$so_path" | grep -v >>> 'ld-linux-x86-64' | \ >>> - grep -Pv >>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?' >>> | \ >>> + grep -Pv >>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?' >>> | \ >>> sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | sort | uniq >>> | tee -a $OUTPUT >>> } >>> >>> @@ -135,7 +135,7 @@ if [[ -d $NAME_OR_PATH ]]; then >>> echo "# Dependencies" | tee -a $OUTPUT >>> echo "# --------------------" | tee -a $OUTPUT >>> lddtree $SO_FILES | grep -v "not found" | grep -v >>> "$NAME_OR_PATH/$SUBDIRECTORY_PATH" | grep -v 'ld-linux-x86-64' | \ >>> - grep -Pv >>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|stdc\+\+|aio|xenstore|crypt|selinux)\.so([\d.]+)?' >>> | \ >>> + grep -Pv >>> 'lib(gcc_s|resolv|c|m|pthread|dl|rt|aio|xenstore|crypt|selinux)\.so([\d.]+)?' >>> | \ >>> sed 's/ =>/:/' | sed 's/^\s*lib/\/usr\/lib\/lib/' | >>> sort | uniq | tee -a $OUTPUT >>> fi >>> exit 0 >>> >>> >>> In essence, this PoC patch hides most symbols as well as links libstdc++.a >>> with '--no-whole-archive' (please note it only applies to kernel.elf not >>> loader.elf for now). I have successfully tested both ZFS and ROFS images >>> with native-example, java-example (running '/usr/bin/java -cp /java-example >>> Hello !' so that it does not use our java wrapper), golang-pie-example and >>> redis-memonly. >>> >>> You will notice that it adds a new script - >>> scripts/generate_version_script.sh - to generate limited number of symbols >>> to be visible from kernel.elf. Feel free to offer suggestions for doing it >>> better (other options for nm or using readelf). >> >> What does this "version script" do if it finds a symbol in the host library >> which we never implemented? > From my experiments it looks like it simply ignores such missing symbol, > which is what we want. As a matter of fact the generated version-script > contains over 4000 symbols but we end up with only a little 1,300 ones > exported in kernel.elf. I also noticed that if you compile individual sources > with '-fvisibility=hidden -fvisibility-inlines-hidden' and have relevant > symbols listed in the version script files, it is too late in the process and > they do not get exported. If we relied on '-fvisibility=hidden > -fvisibility-inlines-hidden' and added proper annotation to keep what we want > as public it would be pretty painful to do so. Ideally we should generate and then maintain a version script file that ONLy contains the list of symbols that OSv actually implements from these 7-8 libraries I am running nm against. >> >> I'm a bit worried that this approach makes use depend even more on the setup >> on the host - if there's a function not defined there, it also won't be >> available from OSv. On the other hand, it may be a valid approach at least >> as an *option*, while we're still considering if this is a good idea or not. > In this demo patch (PoC) I am assuming we generate version script dynamically > as the part of the build process. But instead we could use the same or > similar build script generate_version_script.sh to create the version file > once and check it into source control. We would probably have to create some > sort of superset version file and run this tool against library files from > Ubuntu and Fedora host in case those libraries (like "libc.so.6) contain > different symbols. The downside would be that we would have to manually add > new symbols to this checked in version file (effectively maintain it), but > maybe it is not so bad given it would be single file. What do you think? >> >> >>> >>> Here is the output from bloaty before applying my patch: >>> >>> FILE SIZE VM SIZE >>> -------------- -------------- >>> 55.8% 3.74Mi 51.3% 3.74Mi .text >>> 10.4% 714Ki 9.6% 714Ki .dynstr >>> 0.0% 0 8.1% 605Ki .bss >>> 8.7% 597Ki 8.0% 597Ki .eh_frame >>> 6.8% 467Ki 6.3% 467Ki .rodata >>> 5.8% 397Ki 5.3% 397Ki .dynsym >>> 2.8% 192Ki 2.6% 192Ki .percpu >>> 2.1% 146Ki 2.0% 146Ki .gnu.hash >>> 1.9% 130Ki 1.7% 130Ki .hash >>> 1.9% 130Ki 1.7% 130Ki .eh_frame_hdr >>> 1.5% 102Ki 1.4% 102Ki .gcc_except_table >>> 1.2% 80.1Ki 1.1% 80.0Ki .data.rel.ro >>> 0.4% 26.7Ki 0.4% 26.7Ki .data >>> 0.2% 10.4Ki 0.2% 12.5Ki [LOAD #0 [RWX]] >>> 0.2% 12.3Ki 0.2% 12.2Ki .tracepoint_patch_sites >>> 0.2% 10.6Ki 0.1% 10.6Ki .data.rel.local >>> 0.1% 6.98Ki 0.1% 3.95Ki [47 Others] >>> 0.1% 5.61Ki 0.1% 5.55Ki .data.rel >>> 0.0% 2.22Ki 0.0% 2.16Ki .init_array >>> 0.0% 1.99Ki 0.0% 0 .shstrtab >>> 0.0% 0 0.0% 1.73Ki .tbss >>> 100.0% 6.71Mi 100.0% 7.30Mi TOTAL >>> >>> and number of exported symbols: >>> nm -g -CD --defined-only build/release/kernel.elf | wc -l >>> 16960 >>> >>> >>> After hiding most symbols but C++ library still whole-archive: >>> FILE SIZE VM SIZE >>> -------------- -------------- >>> 69.1% 3.74Mi 62.3% 3.74Mi .text >>> 0.0% 0 9.8% 605Ki .bss >>> 10.8% 597Ki 9.7% 597Ki .eh_frame >>> 8.4% 467Ki 7.6% 467Ki .rodata >>> 3.5% 192Ki 3.1% 192Ki .percpu >>> 2.3% 130Ki 2.1% 130Ki .eh_frame_hdr >>> 1.9% 102Ki 1.7% 102Ki .gcc_except_table >>> 1.4% 80.1Ki 1.3% 80.0Ki .data.rel.ro >>> 0.6% 31.6Ki 0.5% 31.6Ki .dynsym >>> 0.5% 26.7Ki 0.4% 26.7Ki .data >>> 0.2% 12.5Ki 0.2% 14.6Ki [LOAD #0 [RWX]] >>> 0.2% 12.3Ki 0.2% 12.2Ki .tracepoint_patch_sites >>> 0.2% 12.3Ki 0.2% 12.2Ki .dynstr >> >> Unsurprisingly, most of the savings seem to come from .dynsym and .dynstr. >> Which makes me think if we can't get most of this savings by just hiding >> specific very long symbol names (perhaps a lot of instances of some C++ >> template? Or what?), making more implementation details "static" or >> "visibility hidden" - without trying to hide almost everything except what >> we want to deliberately allow. > I am afraid it would be more painful. I think we still would want to compile > with visibility hidden as much as possible, but having central version file > that lists all symbols we export is really better because we would de facto > define the "interface" for the apps. >> >>> 0.2% 10.6Ki 0.2% 10.6Ki .data.rel.local >>> 0.2% 10.3Ki 0.2% 10.3Ki .gnu.hash >>> 0.2% 9.36Ki 0.2% 9.30Ki .hash >>> 0.1% 6.58Ki 0.1% 3.54Ki [47 Others] >>> 0.1% 5.61Ki 0.1% 5.55Ki .data.rel >>> 0.0% 2.22Ki 0.0% 2.16Ki .init_array >>> 0.0% 1.99Ki 0.0% 0 .shstrtab >>> 0.0% 0 0.0% 1.73Ki .tbss >>> 100.0% 5.42Mi 100.0% 6.01Mi TOTAL >>> >>> nm -g -CD --defined-only build/release/kernel.elf | wc -l >>> 1339 >>> >>> Full patch: >>> FILE SIZE VM SIZE >>> -------------- -------------- >>> 68.5% 3.42Mi 61.3% 3.42Mi .text >>> 0.0% 0 10.6% 604Ki .bss >>> 10.6% 541Ki 9.5% 541Ki .eh_frame >>> 8.9% 454Ki 7.9% 454Ki .rodata >>> 3.8% 192Ki 3.4% 192Ki .percpu >>> 2.3% 119Ki 2.1% 119Ki .eh_frame_hdr >>> 1.8% 90.1Ki 1.6% 90.0Ki .gcc_except_table >>> 1.5% 74.6Ki 1.3% 74.5Ki .data.rel.ro >>> 0.6% 31.6Ki 0.6% 31.6Ki .dynsym >>> 0.5% 26.7Ki 0.5% 26.7Ki .data >>> 0.2% 11.9Ki 0.2% 14.1Ki [LOAD #0 [RWX]] >>> 0.2% 12.3Ki 0.2% 12.2Ki .tracepoint_patch_sites >>> 0.2% 12.3Ki 0.2% 12.2Ki .dynstr >>> 0.2% 10.6Ki 0.2% 10.6Ki .data.rel.local >>> 0.2% 10.3Ki 0.2% 10.3Ki .gnu.hash >>> 0.2% 9.36Ki 0.2% 9.30Ki .hash >>> 0.1% 6.08Ki 0.1% 3.48Ki [39 Others] >>> 0.1% 5.61Ki 0.1% 5.55Ki .data.rel >>> 0.0% 2.20Ki 0.0% 2.13Ki .init_array >>> 0.0% 0 0.0% 1.73Ki .tbss >>> 0.0% 1.62Ki 0.0% 0 .shstrtab >>> 100.0% 5.00Mi 100.0% 5.59Mi TOTAL >>> >>> nm -g -CD --defined-only build/release/kernel.elf | wc -l >>> 1339 >>> >>> I wonder if it would be beneficial to compile as many sources as possible >>> with '-fvisibility=hidden -fvisibility-inlines-hidden' - it will not make >>> the kernel much smaller if any, but may produce more optimized code >>> sometimes, no? >> >> Yes, this similar to what I said above. It will make the kernel smaller if >> we don't use this version file thing. >> I don't know if it's something we should do at a per-source-file level or >> perhaps (but this will take a lot of work) as a default and just enable >> visibility of specific things. > We should try to do it as much as possible but still use single version file > to define all symbols exported. >> >>> >>> We still need to figure out what exactly to do with internal apps like >>> cpiod, httpserver and tests that need some of the hidden symbols available. >>> Possibly expose those extra symbols as C. >> >> Note that it's fine to expose C++ symbols as in <osv/power.hh> - which have >> mangled names and even namespaces. The problems start when such functions >> take C++ types (even something as simple as std::string) or throw >> exceptions. Those require the C++ on both kernel and application to be the >> same one :-( >> >> Anyway, you're probably right that whatever new APIs add should probably be >> in C, not C++. In the beginning we thought we would have a lot of those, but >> it seems this will never happen. We already started converting - or at least >> just wrapping - some of the new APIs to C because we needed to use them in C >> code - see include/osv/osv_c_wrappers.h and I think there's even more I >> don't remember. > Agreed. >> >> >>> >>> Waldek >>> -- >>> You received this message because you are subscribed to the Google Groups >>> "OSv Development" group. >>> To unsubscribe from this group and stop receiving emails from it, send an >>> email to [email protected]. >>> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/osv-dev/147d66a0-dcc3-41f8-b94f-71d761dbb98e%40googlegroups.com. -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/A5621EC1-A0A2-48DF-88CE-F44157F71B17%40gmail.com.
