Hello all, this is a report on work I started at the HelenOS camp regarding changes to libposix and the way we build other applications.
TL;DR: as an alternative, it is possible to use musl libc as the base library for ported software and build coastline packages with it. Internally, the actual (Linux) syscalls in musl are replaced with a HelenOS handler that translates them to the native HelenOS API. It seems to work (at least as a proof of concept) on ia32.

Longer version follows. I am sorry for the length, but I wanted to put all the information together so we can discuss whether we want to use this in HelenOS mainline. I have partially explained this already in my blog post [blog]; I repeat it here for completeness.

I will start with a brief overview of the issues we are facing with libposix.

Historically, libposix directly included files from libc to make them accessible to ported software. Some tweaks - usually to prevent clashes between names in libposix and libc - were done with special macros and various #ifdef tricks. That worked well, except that some packages define macros to wrap standard functions with their own replacements (and that broke when the identifier was already a macro). So we switched to renaming the symbols at the object level and keeping libc and libposix headers as separate as possible (the changes in the last few weeks that separated the headers even more helped a lot too). That works pretty well, yet our (libposix) headers are still somewhat different from glibc headers (a.k.a. "the libc" on GNU/Linux today) and configure scripts do not detect some functions correctly. That is, by the way, the reason why we define some "have-xy-header" macros in several harbour files.

We are adding functions to libposix on demand: when a new application is ported to HelenOS and some functionality is missing, we add it to libposix. That can have adverse effects of its own, as it may break existing applications (e.g. configure used to detect that some functions were missing and bypassed their usage; now they are there and the build may fail because it expects to find other functions too).
But the biggest issue is that we are now past the simple cases and we are adding non-trivial definitions that are difficult to maintain. Porting QEMU highlighted this. Some required headers are expected to define hundreds of macros describing the system. Writing such headers from scratch is difficult because they are often Linux-specific and there is not much documentation anyway. And taking these headers from existing libraries (e.g. glibc) introduces a maintenance problem (and licensing might be an issue too).

So maybe it is time to look for another solution. And that is to find a suitable libc that could be tweaked to become the new POSIX emulation layer. The reasoning is that this existing libc would take care of proper definitions and would provide OS-agnostic functionality (e.g. sprintf), and we would only provide the binding that is HelenOS-specific, such as file system operations. Ideally, there should be a well-defined level that separates what the library does on its own from where services from the OS are needed. Looking at existing libcs for Linux, this separation layer is at the level of system calls. Therefore, if HelenOS were able to emulate Linux system calls, any C library should work.

And now about the actual implementation. I decided to use musl libc [musl] as the potential replacement for libposix. The reasons were that it (1) is actively maintained, (2) has an MIT license and (3) is reasonably small yet reasonably complete [comparison].

The overall idea is that syscalls in musl are replaced by a normal function call that emulates them and makes the application believe it is running on GNU/Linux. Since syscalls in musl are defined through a macro, it is easy to redefine it to call a central dispatcher of the emulation layer. The actual emulation is implemented in libinux and currently supports (at least partially) about 15 system calls.
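
To illustrate the dispatcher idea, here is a minimal sketch in C. It is not the code from libinux: the names emu_syscall, emu_write and the EMU_SYS_* constants are made up for this example, the numbers are only illustrative, and an in-memory buffer stands in for the native HelenOS output API. The sketch assumes the usual Linux convention that a syscall returns a negative errno value on failure, which the C library then turns into errno and -1.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Illustrative syscall numbers (hypothetical names; the real values
 * would come from the architecture headers of the C library). */
enum { EMU_SYS_WRITE = 4, EMU_SYS_MADVISE = 219 };

/* Stand-in for a native output service: an in-memory "console". */
static char emu_console[256];
static size_t emu_console_len;

/* Emulates write(): only fd 1 and 2 are accepted in this sketch. */
static long emu_write(long fd, const void *buf, size_t len)
{
    if (fd != 1 && fd != 2)
        return -EBADF;
    /* Short write when the stand-in buffer is almost full. */
    if (len > sizeof(emu_console) - emu_console_len)
        len = sizeof(emu_console) - emu_console_len;
    memcpy(emu_console + emu_console_len, buf, len);
    emu_console_len += len;
    return (long) len;
}

/* Central dispatcher: the only entry point the C library would call. */
long emu_syscall(long nr, long a, long b, long c)
{
    switch (nr) {
    case EMU_SYS_WRITE:
        return emu_write(a, (const void *) b, (size_t) c);
    case EMU_SYS_MADVISE:
        /* Advisory only, so doing nothing is a valid emulation. */
        return 0;
    default:
        /* Unknown syscall: reported at run-time, not at link-time. */
        return -ENOSYS;
    }
}
```

With such a dispatcher, everything built on top of write (fwrite, printf and so on) ends up in the single EMU_SYS_WRITE case, and an unimplemented syscall shows up as -ENOSYS at run-time.
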
The important thing is that libinux basically exports only two symbols - one for initialization and one for the syscall handler. Everything else is separated (musl vs. libinux and HelenOS libc) and thus there should not be any naming clashes at all (it prefixes HelenOS libc functions as libposix does, but no shared headers are present). Ported software is then compiled against musl headers and linked with musl and libinux (together with our libc). That works very well, and for the packages I have tested so far I was able to remove all the hacks and reduce the number of patches basically to zero.

This new approach has some advantages and several disadvantages too.

The biggest advantage is that we compile the ported applications against a full-fledged libc that is known to work, and that greatly simplifies the compilation process. We also reduce the amount of code in HelenOS, as we do not need to care about headers but only about the implementation of a few syscalls. There is also less duplication in the code, as the syscalls are (almost) orthogonal.

And now the disadvantages. The obvious one is: instead of implementing plenty of POSIX functions (and we already have a lot of them), we would be implementing plenty of Linux syscalls (and starting from scratch). Well, there are not that many syscalls, we do not need to implement all of them, and in many cases they offer a nicer separation, which should result in cleaner code in general (e.g. instead of implementing write, fwrite, printf etc. we only need to implement a single syscall - write). Of course, some syscalls are very difficult to port to HelenOS (fork is a prominent example), but even with libposix these were handled by patching the original application (e.g. the PEX module in libiberty for GCC).

A serious drawback of musl is that it does not support all the architectures we do. For example, glibc is much better in this respect, but compiling glibc is a nightmare compared to musl.
This is a real blocker for a complete switch, but I believe that adding an architecture stub sufficient for our purposes might not be that hard. There is very little assembler in musl in general, and about one third of the architecture-specific code is actually just the syscall numbers.

We would also shift the testing/debugging from compile-time to run-time. So far, if a function was missing, the compilation failed with "undefined function". With libinux, we fail at run-time with "unknown syscall". That complicates the debugging process, but to some degree this happens now too, as many functions in libposix are defined but empty.

Another thing that is not yet solved is threading support. As most of the software in Coastline is single-threaded, I simply turned threads off for now (including any locking in libinux). I am still not sure how to properly map our fibrils to normal pthreads and which assumptions musl makes about threads.

As musl-on-HelenOS is still in the proof-of-concept phase, I focused on ia32 and amd64 only (I believe ia32 works well, amd64 is not 100% yet).

If you want to see the code or even try it out: my branches are on GitHub [helenos-branch,harbours-branch]. I have also put together a small script for quick testing [build]. After compilation, you can run the example zlib application via zlib-example or run the assembler from binutils. By the way, I was able to upgrade to the newest version of binutils after about 2 hours of work (the basic syscalls needed by zlib were already added though). There is still some bug in the linker, so complete compilation is not yet possible.

If you want to see more of what is going on, you can run the application with the libinux=debug parameter. This parameter is eaten by libinux and is not propagated to the application itself. For debugging purposes, adding libinux_st=1 causes the syscall dispatcher to print a stack trace too.
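
For illustration, "eating" such parameters could look roughly like the sketch below. This is not the actual libinux code: the function emu_filter_args and the struct emu_options are hypothetical names, and the sketch simply assumes that the parameters arrive as ordinary argv entries that are removed before the application's main sees them.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for the debugging switches. */
struct emu_options {
    bool debug;      /* set by "libinux=debug" */
    bool stacktrace; /* set by "libinux_st=1" */
};

/*
 * Removes the recognized parameters from argv in place and returns the
 * new argument count. args[0] (the program name) is always kept.
 * Assumes args has count + 1 slots, as argv passed to main does.
 */
int emu_filter_args(int count, char **args, struct emu_options *opts)
{
    int out = 0;

    for (int in = 0; in < count; in++) {
        if (in > 0 && strcmp(args[in], "libinux=debug") == 0) {
            opts->debug = true;
        } else if (in > 0 && strcmp(args[in], "libinux_st=1") == 0) {
            opts->stacktrace = true;
        } else {
            args[out++] = args[in];
        }
    }

    /* Keep the conventional NULL terminator after the last argument. */
    args[out] = NULL;
    return out;
}
```

The initialization routine would call something like this before handing control to the application, so the ported program never notices the extra parameters.
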
The overall status is that printf via musl works; syscalls for opening a file work for common cases; reading from and writing to a file seem to work too; malloc in musl works via the brk syscall; some special syscalls doing nothing (e.g. madvise) were added. It is possible to compile binutils, zlib, xz and GCC. I have not tried to compile Python yet. Of the mentioned packages, xz does not run because of the missing pipe syscall; ld does not work for reasons unknown, and GCC is missing a few syscalls that I plan to add soon.

The code is far from perfect - I focused mostly on getting things running on ia32 - but navigating through the sources should be easy. src/syscalls contains the actual emulation layer, libinux.c is the syscall dispatcher and main.c contains initialization routines.

I hope I have not forgotten anything.

I would welcome any comments on this. I believe a switch to a full-fledged libc is inevitable, as we do not have the manpower to make libposix functionally equivalent to a "normal" libc. If we decide to use musl, I would certainly welcome help - especially with ports to other architectures (both those that are present in musl, such as mips32, and those that are missing from musl completely).

Cheers,
- Vojtech

[blog] http://vh.alisma.cz/blog/2017/08/25/rethinking-libposix-in-helenos/
[build] https://github.com/vhotspur/helenos-harbours/wiki/libinux#building-harbours-to-use-musl-libc
[comparison] http://www.etalabs.net/compare_libcs.html
[harbours-branch] https://github.com/vhotspur/helenos-harbours/tree/libinux
[helenos-branch] https://github.com/vhotspur/helenos/tree/libinux
[musl] https://www.musl-libc.org/
