Added a zipped archive of diff files referenced by this email. On Sunday, August 2, 2020 at 1:32:49 AM UTC-4 Waldek Kozaczuk wrote:
> This is possibly one of the longest emails I have ever written. So bear > with me ... > > After all the recent patches, I have submitted that replaced many source > files under libc/ (~70) and headers under include/api/ (~25) with their > identical copies under musl/ folder, there is still a lot of stuff left to > do something about. And this is not going to be an easy walk but then after > some substantial research I have done, it seems to be doable. In general, > in order to ease the upgrade to the newer version of musl (probably > 1.1.24), we should first prepare (clean, prune) existing code under > include/api and libc/ and at the same time identify files that do not need > any change (should not be affected by upgrade). > > Before we agree on the exact steps I am proposing later (with some > outstanding questions), let me first do an inventory of what is left under > include/api and libc/. > > So at this point, we have: > > - 93 header files under include/api - (find include/api/ -type f -name > \*.h | wc -l) > - I think most of those originate from musl but were manually > changed. I hope most of them with some effort can be replaced with the > current or new version of musl - any other ideas? > - Out of those, 19 files under include/api/aarch64 for sure come > from newer version fo musl so until the upgrade they should stay as is > - under libc/ (total of 215 files) > - 15 C header files > - 153 C source files > - 6 C++ header files > - 41 C++ source files > > Now let me break those down into following buckets: > > - The files I believe *do NOT need to change *with musl upgrade of > because they originate from somewhere else or were written from scratch > (mostly in C++); see some of my comments/questions next to each one [total > of 56 files] > - > - misc/realpath.c > - vdso/vdso.c // Somehow get rid of it > - process/execle.c > - process/execve.cc > - process/waitpid.cc > - unistd/getppid.c > - unistd/getsid.c > - unistd/getpgid.c > - unistd/getpgrp.c > - unistd/sethostname.c > - math/finite.c > - math/finitef.c > - math/finitel.c > - math/aliases.c // SEE if there is a way to somehow use aliases > without mdifying musl or new musl has those aliases > - misc/uname.c > - linux/makedev.c // some relationship to musl macros > - errno/strerror.c // small file originally taken from musl but > does not look like the one in musl anymore > - libc/libc.hh // depends on "internal/libc.h" > - libc/pthread.hh > - libc/pipe_buffer.hh > - libc/network/__dns.hh > - libc/signal.hh > - libc/arch/x64/ucontext/ucontext.cc > - libc/misc/backtrace.cc > - libc/misc/mntent.cc > - libc/misc/mntent.cc > - libc/network/__dns.cc // Where did it come from? Does anything > have/should come/get upgraded from musl or other place? > - libc/process/execve.cc > - libc/process/waitpid.cc > - libc/stdio/__stdout_write.cc > - libc/stdio/printf-hooks.cc > - libc/unistd/setpgid.cc - just a stub > - libc/unistd/setsid.cc - just a stub > - libc/unistd/sync.cc - just a stub > - libc/af_local.cc > - libc/cxa_thread_atexit.cc > - libc/cpu_set.cc > - libc/dlfcn.cc > - libc/eventfd.cc > - libc/malloc_hooks.cc > - libc/mallopt.cc > - libc/mman.cc > - libc/mount.cc > - libc/notify.cc > - libc/pipe.cc > - libc/pipe_buffer.cc > - libc/pthread.cc > - libc/pthread_barrier.cc > - libc/resource.cc > - libc/sem.cc > - libc/shm.cc > - libc/signal.cc > - libc/time.cc > - libc/timerfd.cc > - libc/user.cc > - libc/arch/x64/setjmp/sigrtmin.c -> musl/src/signal/sigrtmin.c // > there does not seem to be aarch64 version, but OSv has > - The family of *_chk() function that provide small layer of glibc > compability and *should NOT need to change*. Should we move them to > libc/glibc-compat dir (there is already include/glibc-compat directory)?, > total of 17 > - > - __read_chk.c > - stdio/__fprintf_chk.c > - stdio/__fread_chk.c > - stdio/__vfprintf_chk.c > - string/__stpcpy_chk.c > - string/__strcpy_chk.c > - string/__wcscpy_chk.c > - string/__memcpy_chk.c > - string/__strcat_chk.c > - string/__memset_chk.c > - string/__strncat_chk.c > - string/__explicit_bzero_chk.c > - string/__memmove_chk.c > - string/__strncpy_chk.c > - __pread64_chk.cc -> why is it a C++ function? > - internal/_chk_fail.cc > - misc/__longjmp_chk.cc -> why is it a C++ function? > - Also, consider moving out __realpath_chk() out of misc/realpath.c > to its own file. > - Files that are different from current musl equivalent ones but > should most likely be replaced in Makefile with the musl ones (see the > point_to_musl_as_is.diff attachment): > - string/strsignal.c - the current musl one seems to be newer and > most up to date > - no signficant differences it seems > - internal/floatscan.h > - internal/floatscan.c > - internal/intscan.h > - internal/intscan.c > - internal/shgetc.c > - possibly can be replaced with musl ones > - prng/random.c > - network/gai_strerror.cc -> ../musl/src/network/gai_strerror.c > - misc/lockf.cc ../musl/src/misc/lockf.c // why is it C++? seems > to have minimal changes > - Files under arch (21 files) I have not idea if they are subject > to upgrade > - > - libc/arch/aarch64/setjmp/sigsetjmp.s > - libc/arch/aarch64/setjmp/sigrtmin.c > - libc/arch/aarch64/setjmp/block.c > - libc/arch/aarch64/setjmp/longjmp.s > - libc/arch/aarch64/setjmp/siglongjmp.c > - libc/arch/aarch64/setjmp/sigrtmax.c > - libc/arch/aarch64/setjmp/setjmp.s > - libc/arch/aarch64/atomic.h > - libc/arch/x64/ucontext/getcontext.s > - libc/arch/x64/ucontext/ucontext.cc > - libc/arch/x64/ucontext/start_context.s > - libc/arch/x64/ucontext/setcontext.s > - libc/arch/x64/setjmp/sigsetjmp.s > - libc/arch/x64/setjmp/sigrtmin.c > - libc/arch/x64/setjmp/block.c > - libc/arch/x64/setjmp/longjmp.s > - libc/arch/x64/setjmp/siglongjmp.c > - libc/arch/x64/setjmp/sigrtmax.c > - libc/arch/x64/setjmp/setjmp.s > - libc/arch/x64/atomic.h > - libc/arch/arm/src/__aeabi_atexit.c > - locale/ (26 files) - I think most of those come from older musl and > can be replaced with their newer copies in current musl; I think > libc/locale have two files for each family of functions - for example > libc/locale/strcoll.c, libc/locale/strcoll_l.c vs single > musl/src/locale/strcoll.c; see locale.diff attachment for details and this > musl commit as an example - > > https://git.musl-libc.org/cgit/musl/diff/?id=4b0306c83c8c3614afbaf18a18e22d24f335ea04 > - > - > - locale/strcoll_l.c > - locale/wcsftime_l.c > - locale/iswctype_l.c > - locale/freelocale.c > - locale/wcsxfrm_l.c > - locale/uselocale.c > - locale/setlocale.c > - locale/strtold_l.c > - locale/wcsxfrm.c > - locale/strfmon.c > - locale/strcoll.c > - locale/wcscoll.c > - locale/strxfrm.c > - locale/strtof_l.c > - locale/towlower_l.c > - locale/towupper_l.c > - locale/toupper_l.c > - locale/nl_langinfo_l.c > - locale/langinfo.c > - locale/strtod_l.c > - locale/strftime_l.c > - locale/wctype_l.c > - locale/wcscoll_l.c > - locale/duplocale.c > - locale/strxfrm_l.c > - time/ - time functios that I think come from older musl but were not > changed after; with little effort can be replaced (point to) musl > originals; see the time.diff attachment and this musl commit - > > https://git.musl-libc.org/cgit/musl/diff/?id=1cc81f5cb0df2b66a795ff0c26d7bbc4d16e13c6 > - > - time/__tm_to_time.c > - time/__time_to_tm.c > - time/timegm.c > - time/__asctime.c > - time/__time.h > - time/mktime.c > - time/strftime.c > - time/tzset.c > - time/wcsftime.c > - time/localtime_r.c > - time/gmtime_r.c > - time/localtime.c > - time/gmtime.c > - time/ftime.c > - network/ - the diffs to musl > - > - network/if_indextoname.c - SYS_close replaced with close() and > socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) replaced with > socket(AF_UNIX, > SOCK_DGRAM, 0) > - network/gai_strerror.cc - ? > - network/__ipparse.c - includes __dns.hh instead of __dns.h > - network/if_nameindex.c - socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, > 0) replaced with socket(AF_INET, SOCK_DGRAM|SOCK_CLOEXEC, 0); > - network/gethostbyname_r.c - I think it originates from musl and > is identical except for extra gethostbyname() that can probably be > moved to > a separate file? > - network/inet_addr.c - does NOT exist in musl > - network/getifaddrs.c - completely different from musl copy > - network/getnameinfo.c - seems to originate from muls but has > extra code and maybe has to stay as is just like getaddrinfo() > - network/res_init.c - only diff: weak_alias(res_init, __res_init) > - network/if_nametoindex.c - similar to if_indextoname (see above) > - network/__dns.hh/__dns.cc - should stay as is > - network/getaddrinfo.c - seems to originate from muls but quite > different from the current musl version so maybe should stay as is > - network/inet_ntop.c - originates from musl and has not changed > since; but different from current musl > - network/inet_aton.c - does not exist in musl but the only commit > says it was taken from musl-libc > - unistd/ - only files that possibly are subject to upgrade > - ttyname_r.c - implemented by Rean for node js and I do not think > it comes from musl > - ttyname.c is - almost identical to an existing file in musl (see > attached diff) and delegates to ttyname_r() maybe we should point to > musl > one > - I think we have a unit tests for both > - stdlib/ > - > - wcstol.c - besides including different headers it replaces f.lock > = -1 with f.no_locking = true (what is this exactly about?) > - strtol.c - similar as wcstol.c > - strtod.c - similar as wcstol.c > - qsort_r.c - does not exist in current musl (I checked in 1.1.24) > so maybe should stay as is > - string/ - (_chk.c files and others not subject to upgrade skipped) > - > - string/memset.c - seems identical but memset_base() vs memset() - > what is it about? > - string/memmove.c - this is OSv specific so probably should stay > as is > - string/strtok_r.c - seems identical except for '__restrict' vs > 'restrict' and #undef > - string/stresep.c - per commit seems to come from netbsd ( > ftp.tku.edu.tw/NetBSD/NetBSD-current/src/lib/libc/string/stresep.c) > -> should stay as is > - string/__strndup.c - not in musl , seems to come from somewhere > else > - string/rawmemchr.c - not from musl, where does it come from? > - string/strerror_r.c - originates from musl but seems to be > completely different from current musl copy; modified to be both POSIX > and > glibc compliant > - string/memcpy.c - implementation seems to be OSv specfic, so > maybe is not subject to musl upgrade > - string/strlcat.c - slightly different from current musl but has > not been modfied since original import -> probably we should use musl > version > - string/explicit_bzero.c - not from musl it seems; glibc? > - stdio - this one is the largest in terms of changes (but maybe > not) and biggest mess > - > - fmemopen.c - has extra 'if (!libc.threaded) f->lock = -1;' to > deal with some thread related issue - see commit > 3f053b8f950be4b89fcb928f956ddc31ba8ba68a > - ftrylockfile.c - pthread_self()->tid vs mutex_owned(&f->mutex) > - __stdio_read.c - SYS_readv vs readv() mostly > - open_wmemstream.c - extra 'if (!libc.threaded) f->lock = -1;' > just like fmemopen.c > - __fdopen.c - SYS_fcntl vs fcntl() and SYS_ioctl vs ioctl() mostly > - __stdio_close.c - SYS_close vs close() > - vfwscanf.c - quite different fromm current musl but seems to not > have been modified for OSv purposes > so maybe just needs to be > replaced in > makefile to musl copy? > - vdprintf.c - .lock = -1 vs .no_locking = true (what is this > about?) > - vfprintf.c - the only difference is handling of L prefix for long > long int > - __stdio_seek.c - SYS_lseek vs lseek() > - tmpnam.c - SYS_access vs access() and SYS_clock_gettime vs > clock_gettime() > - getc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if > (f->no_locking || !__lockfile(f)) > - fopen.c - SYS_open vs open() and SYS_close vs close() > - vsnprintf.c - .lock = -1 vs .no_locking = true plus extra > handling of some overflow condition > - flockfile.c - seems to have been rewritten for OSv, ftrylockfile > vs mutex_owned() etc > - vfscanf.c - seems to originate from musl but per couple of old > commits from 2013 certian features have been disabled; should we > repoint ot > tp current musl? > - vswscanf.c - '.lock = -1' vs '.no_locking = true' > - stdin.c - modified for all kinds of reasons > - freopen.c - SYS_fcntl vs fcntl() and __dup3() vs dup3() > - sscanf.c - extra weak_alias for __isoc99_sscanf > - tmpfile.c - SYS_open vs open() and SYS_close vs unlink() > - stdout.c - extra '.lock = -1' - what is this about? > - open_memstream.c - extra 'if (!libc.threaded) f->lock = -1;' > - putc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if > (f->no_locking || !__lockfile(f))' > - __lockfile.c - seems to have been heavily adapted for OSv purposes > - remove.c - SYS_unlink vs unlink() plus some POSIX compliance > changes > - fputc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if > (f->no_locking || !__lockfile(f))' > - __stdio_write.c - SYS_writev vs writev() > - stderr.c - extra '.lock = -1,' > - funlockfile.c - includes "pthread_impl.h" > - fgetc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if > (f->no_locking || !__lockfile(f))' > - vsscanf.c - '.read = do_read, .lock = -1' vs '.read = do_read, > .no_locking = true,' plus week_alias for __isoc99_vsscanf > - vswprintf.c - 'f.lock = -1;' vs 'f.no_locking = true;' > - __fopen_rb_ca.c - SYS_open vs open() plus 'f->lock = -1;' (what > is this about?) > > From above, It seems there are 3 categories of files under libc/: > > 1. Brand new implementations of libc functionality mostly in C++ that > will not need to change as part of musl upgrade > 2. Glibc compatibility or implementations not from musl; like above > they will not need to change as part of musl upgrade > 3. Files that originate from musl but need to be adapted for OSv > specific reasons > > > To help us with the upgrade we should add more unit tests. To that extent, > I have found musl unit test - https://wiki.musl-libc.org/libc-test.html and > possibly some unit tests from bionic ( > https://android.googlesource.com/platform/bionic/+/refs/heads/master/tests/stdio_test.cpp > or > > https://android.googlesource.com/platform/bionic/+/refs/heads/master/tests/stdlib_test.cpp). > > The bionic ones would need to be adapted but the musl ones could be used > as-is. > > To minimize effort for future upgrade I was wondering if the following > ideas could help us: > - use some preprocessor macros "magic" to replace for syscall() calls with > regular functions - it seems like many files under libc are original musl > source except for these type of changes > - in cases, we are forced to modify musl originals, maybe instead of > storing modified files under libc/ like we have now, we should store the > *.patch files instead and apply as part of new build process step; would it > be better? > > So I think the upgrade plan should be this: > 1. Add more unit tests > 2. Prune/clean libc to minimize the set of files that are part of category > 3 (see above) > 3. Do the upgrade (per Nadav's original plan): > > - Bring the latest Musl version into a "musl-1.1.24" subdirectory in > OSv. > - Leave the old "musl/" and "libc/" directories as well. > - In Makefile, make a new list of objects (say, "nmusl") which will > take code from musl-1.1.24/ instead of musl/. > - Start to switch individual files and directories from "musl += ..." > to "nmusl += ...". We can start with the math functions needed for > aarch64. Eventually, everything can be converted to nmusl and hopefully, > nothing or little will break and need to be fixed. > - Replace our include/api with musl-1.1.24/include. Would be even > better to drop include/api, and just use musl-1.1.5/include directly if we > can. I think, though, this is low priority, and might take quite a bit of > work ("git log include/api" shows we modified this quite a bit since we > took it from musl). I'm hoping that the old header files will work > correctly also for the newer musl. > - I think this might be tricky one at this point > > > What do you think? > > Waldek > -- You received this message because you are subscribed to the Google Groups "OSv Development" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/32333b6e-25e8-47a0-8be8-a6fb04d795bdn%40googlegroups.com.
categories_to_zip.tar.gz
Description: GNU Zip compressed data
