[osv-dev] Librarization/Modularization

Waldek Kozaczuk Mon, 25 May 2020 22:29:56 -0700

Hi,

I am going to be sending a proper "Next Release Proposal" email later this 
week (or next), and "Librarization/Modularization" will be a key part of it. 
Currently, the OSv kernel provides quite a significant subset of the 
functionality of some standard Linux libraries listed here - 
https://github.com/cloudius-systems/osv#kernel-size. In reality, many 
applications do not need all of this functionality, but they "get it" 
whether they need it or not. Even Java, which used to need lots of symbols 
from standard libraries, has become way more modular, and with the advent 
of GraalVM and other AOT-type technologies, OSv kernel does not need to 
provide all this functionality universally to every app. Worse, if you run 
an app on Firecracker, which needs only the console, non-PCI virtio-blk and 
virtio-net drivers, you still get all the other drivers, including the ones 
for VirtualBox, Xen, VMware, etc. This makes OSv barely a unikernel, or 
at best a "fat" one. It has some real negative consequences: higher 
memory utilization (the kernel needs to be loaded in memory), a larger kernel 
file (which makes decompression longer), poorer security because of the 
fairly vast number of exported symbols (at this moment everything 
non-static gets exported), and finally possibly less optimized code. On the 
other hand, because of this "universality", it is *quite easy*, compared 
to other unikernels, to run an arbitrary Linux app on OSv. And no matter 
what we do to make OSv more modular, we should preserve that "ease" and not 
make it harder, at least by default, to run an app on OSv.


So in general, what I am advocating for is an ability (and a mechanism) to 
create more "stripped-down" versions of the kernel *tailored to the needs of 
a specific app and/or the specific hypervisor* OSv will run on, while 
preserving the default universal kernel. I also propose shrinking the 
universal kernel by *extracting optional functionality* from it, where it 
makes sense and is relatively easy to do so, as shared libraries to be 
loaded during the boot process. The latter should also ideally involve the 
build process (compile/link) optimizations I proposed in another email I 
sent to the group a week ago - 
https://groups.google.com/d/msg/osv-dev/hCYGRzytaJ4/D23S_ibNAgAJ

In the end, what I am proposing could be organized in the following three 
categories:

   1. Tailor the kernel (and really the drivers) to a specific hypervisor - 
   this could be as simple as defining more granular sets of targets in the 
   main makefile, adding #ifdef in all relevant places, and possibly using 
   the existing ./conf/*.mk-based mechanism; for starters we could define a 
   build configuration for Firecracker and the QEMU microvm machine, which I 
   believe require the same small subset of drivers.
   2. Extract optional functionality into shared libraries - this is more 
   difficult than the above. One example of such functionality is ZFS and 
   there is already an open issue - 
   https://github.com/cloudius-systems/osv/issues/1009. Some drivers could 
   be extracted as libraries as well but it might be more difficult to do so. 
   The main difficulty here is that there needs to be a filesystem mounted 
   early enough in the boot process to load such a library from - bootfs (less 
   attractive as it is part of loader.elf/kernel.elf) or RoFS. 
   3. Create a mechanism to build a smaller kernel "tailored" to a specific 
   app. This would require some sort of ELF analyzer tool that would identify 
   all symbols needed by the given app and its dependencies and create a 
   version script file defining the specific set of symbols to be exported 
   from the kernel. To achieve that, we could start by addressing the issue 
   https://github.com/cloudius-systems/osv/issues/97 - "Be more selective 
   on symbols exported from the kernel" - which could deliver such a generic 
   solution.
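
To make category 3 a bit more concrete, here is a rough Python sketch of 
what the analyzer could do. It is only an illustration under a few 
assumptions: the input is text as printed by "nm -D --undefined-only" for 
the app and its libraries, the set of symbols the kernel can provide is 
given as a plain set (kernel_exports is a hypothetical input), and the 
function names are made up for this example. The output is an anonymous 
GNU ld version script that exports only what the app needs and hides the 
rest.

```python
def undefined_symbols(nm_output):
    """Collect undefined symbol names from `nm -D --undefined-only` text."""
    symbols = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # Undefined entries carry no address, just "<type> <name>",
        # where type is U (undefined) or w/v (weak undefined).
        if len(fields) == 2 and fields[0] in ("U", "w", "v"):
            # Drop a symbol-version suffix such as "@GLIBC_2.2.5".
            symbols.add(fields[1].split("@", 1)[0])
    return symbols


def version_script(needed, kernel_exports):
    """Build an anonymous ld version script exporting only needed symbols.

    kernel_exports is a hypothetical set of everything the kernel is able
    to provide; symbols outside it are assumed to come from other
    libraries in the image and are skipped here.
    """
    exported = sorted(needed & kernel_exports)
    lines = ["{"]
    if exported:
        lines.append("  global:")
        lines += ["    %s;" % name for name in exported]
    # Everything not listed above stays hidden.
    lines += ["  local:", "    *;", "};"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = (
        "                 U malloc@GLIBC_2.2.5\n"
        "                 w __gmon_start__\n"
        "                 U printf\n"
    )
    needed = undefined_symbols(sample)
    print(version_script(needed, {"malloc", "printf", "open"}), end="")
```

The resulting file could then be passed to the linker with 
--version-script when linking the kernel; how exactly that would plug into 
the OSv build is left open here.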

Addressing 3) could help us with another issue - 
https://github.com/cloudius-systems/osv/issues/821 - "Combining 
pre-compiled OSv kernel with pre-compiled executable". To that end, we 
could also consider creating a mechanism that would let us build a 
stripped-down version of the kernel with functionality exposed through the 
SYSCALL instruction only and no built-in musl (except for the dynamic linker 
functions such as dlopen) and libc, and let one mix in the original pre-built 
musl library, which would interact with the kernel through those SYSCALL 
calls. This would probably require exposing more functions as SYSCALL than 
we have now in linux.cc - at least brk and clone. I am not sure if that is 
even feasible, but I think at least one of the unikernels does just this 
- Hermitux.

I am also leaning more and more toward hiding the C++ library - this should 
help us with issue 821, and there is at least one case - dotnet apps - that 
requires an incompatible version of libstdc++.so. This would impact existing 
internal C++ apps like cpiod and httpserver, as we would have to add 
libstdc++.so to the manifest for any of these apps. So there are some space 
and memory trade-offs here.

What do you think?

Waldek

