On Monday, August 3, 2020 at 5:29:57 AM UTC-4 Nadav Har'El wrote:

> On Sun, Aug 2, 2020 at 8:32 AM Waldek Kozaczuk <[email protected]> wrote:
>
>> This is possibly one of the longest emails I have ever written. So bear 
>> with me ...
>>
>
> I'll try :-) Replies inline:
>
Thanks for the reply. 

>
>
>> After all the recent patches, I have submitted that replaced many source 
>> files under libc/ (~70) and headers under include/api/ (~25) with their 
>> identical copies under musl/ folder, there is still a lot of stuff left to 
>> do something about. And this is not going to be an easy walk but then after 
>> some substantial research I have done, it seems to be doable. In general, 
>> in order to ease the upgrade to the newer version of musl (probably 
>> 1.1.24), we should first prepare (clean, prune) existing code under 
>> include/api and libc/ and at the same time identify files that do not need 
>> any change (should not be affected by upgrade).
>
> Before we agree on the exact steps I am proposing later (with some 
>> outstanding questions), let me first do an inventory of what is left under 
>> include/api and libc/.
>>
>> So at this point, we have:
>>
>>    - 93 header files under include/api - (find include/api/ -type f 
>>    -name \*.h | wc -l)
>>       - I think most of those originate from musl but were manually 
>>       changed.
>>    
>> I assume many if not most of these were never manually changed, but were 
> simply copied from an older version of musl, and musl itself
> changed. Just as a random example, I took a look at include/api/utmp.h. 
> "git blame" showed it was added in commit
> 75e72090a and never changed later. This commit is Avi's "libc: import 
> musl's include/ as include/api/". I know Avi's
> style - if this is the commit message it means he took all those Musl 
> files as-is, unmodified. If he wanted to modify files,
> he would have done this in a separate later commit.
>
> So you can revert many of these files (I guess you'll need a "git blame" 
> on each one individually) to links to musl.
> Then just confirm OSv still compiles and we don't rely the specific old 
> versions of musl.
>
>  
>
>>
>>    - 
>>       - I hope most of them with some effort can be replaced with the 
>>       current or new version of musl - any other ideas?
>>       - Out of those, 19 files under include/api/aarch64 for sure come 
>>       from newer version fo musl so until the upgrade they should stay as is
>>    
>> Yes.
>
>>
>>    - under libc/ (total of 215 files)
>>       - 15 C header files
>>       - 153 C source files 
>>       - 6 C++ header files
>>       - 41 C++ source files
>>    
>> Now let me break those down into following buckets:
>>
>>    - The files I believe *do NOT need to change *with musl upgrade of 
>>    because they originate from somewhere else or were written from scratch 
>>    (mostly in C++); see some of my comments/questions next to each one 
>> [total 
>>    of 56 files]
>>       - 
>>       - misc/realpath.c
>>       - vdso/vdso.c // Somehow get rid of it
>>       - process/execle.c
>>       - process/execve.cc
>>       - process/waitpid.cc
>>       - unistd/getppid.c
>>       - unistd/getsid.c
>>       - unistd/getpgid.c
>>       - unistd/getpgrp.c
>>       - unistd/sethostname.c
>>       - math/finite.c
>>    
>>
>>    - 
>>       - math/finitef.c
>>       - math/finitel.c
>>       - math/aliases.c // SEE if there is a way to somehow use aliases 
>>       without mdifying musl or new musl has those aliases
>>    
>>
> Right :-( I think the reason why we did this with function wrappers 
> instead of aliases, is that the weak_alias() macro did not work across 
> source
> files (this is documented in 
> https://gcc.gnu.org/onlinedocs/gcc-4.3.6/gcc/Function-Attributes.html).
> I see in 
> https://stackoverflow.com/questions/7649979/gcc-alias-to-function-outside-of-translation-unit-aka-is-this-even-the-right-t
>  
> there might be
> some workarounds to allow such aliases in the linker script or asm hacks 
> instead of such function wrappers, maybe that's what we should do.
> But it's not urgent now.
>
>>
>>    - 
>>       - linux/makedev.c // some relationship to musl macros
>>       - errno/strerror.c // small file originally taken from musl but 
>>       does not look like the one in musl anymore
>>    
>>
> Right, the problem was that the one in Musl failed to export sys_errlist 
> and sys_nerr, which glibc does.
> Maybe they changed this in a newer version. You can always go back after 
> the upgrade to look for more files which
> aren't needed any more.
>
>>
>>    - 
>>       - libc/libc.hh // depends on "internal/libc.h"
>>       - libc/pthread.hh
>>       - libc/pipe_buffer.hh
>>       - libc/network/__dns.hh
>>       - libc/signal.hh
>>       - libc/arch/x64/ucontext/ucontext.cc
>>       - libc/misc/backtrace.cc
>>       - libc/misc/mntent.cc
>>       - libc/misc/mntent.cc
>>       - libc/network/__dns.cc // Where did it come from? Does anything 
>>       have/should come/get upgraded from musl or other place?
>>       - libc/process/execve.cc
>>       - libc/process/waitpid.cc
>>       - libc/stdio/__stdout_write.cc
>>       - libc/stdio/printf-hooks.cc
>>       - libc/unistd/setpgid.cc - just a stub
>>       - libc/unistd/setsid.cc - just a stub
>>       - libc/unistd/sync.cc - just a stub
>>       - libc/af_local.cc
>>       - libc/cxa_thread_atexit.cc
>>       - libc/cpu_set.cc
>>       - libc/dlfcn.cc
>>       - libc/eventfd.cc
>>       - libc/malloc_hooks.cc
>>       - libc/mallopt.cc
>>       - libc/mman.cc
>>       - libc/mount.cc
>>       - libc/notify.cc
>>       - libc/pipe.cc
>>       - libc/pipe_buffer.cc
>>       - libc/pthread.cc
>>       - libc/pthread_barrier.cc
>>       - libc/resource.cc
>>       - libc/sem.cc
>>       - libc/shm.cc
>>       - libc/signal.cc
>>       - libc/time.cc
>>       - libc/timerfd.cc
>>       - libc/user.cc
>>       - libc/arch/x64/setjmp/sigrtmin.c -> musl/src/signal/sigrtmin.c // 
>>       there does not seem to be aarch64 version, but OSv has
>>       - The family of *_chk() function that provide small layer of glibc 
>>    compability and *should NOT need to change*. Should we move them to 
>>    libc/glibc-compat dir (there is already include/glibc-compat directory)?, 
>>    total of 17
>>
>>
> I don't know if we need a glibc-compact directory - our entire libc is 
> about glibc compatibility... 
>
I think this is a statement worth clarifying. Even though OSv's libc is 
intended to be glibc-compatible isn't is also musl compatible? These many 
Linux programs (including java, node, etc) are built for musl-based 
distributions like alpine. Is there something fundamental that prevents us 
from OSv libc being both musl and glibc compatible at the same time? Is 
musl functionality a subset of glibc one? Or some of the musl files were 
changed to provide glibc flavor very different (not-compatible) from musl? 
Any example of that, if any?

> Maybe it would make sense to put all those _chk() functions in one file 
> instead of 17 small files, but I don't think changing this is important now.
>
>
>>    - 
>>       - 
>>       - __read_chk.c
>>       - stdio/__fprintf_chk.c
>>       - stdio/__fread_chk.c
>>       - stdio/__vfprintf_chk.c
>>       - string/__stpcpy_chk.c
>>       - string/__strcpy_chk.c
>>       - string/__wcscpy_chk.c
>>       - string/__memcpy_chk.c
>>       - string/__strcat_chk.c
>>       - string/__memset_chk.c
>>       - string/__strncat_chk.c
>>       - string/__explicit_bzero_chk.c
>>       - string/__memmove_chk.c
>>       - string/__strncpy_chk.c
>>       - __pread64_chk.cc -> why is it a C++ function?
>>    
>>
> I definitely doesn't need to be. It only includes one "extern C" function, 
> so could have been a ".c" file. But more on this below:
>
>
>>    - 
>>       - internal/_chk_fail.cc
>>       - misc/__longjmp_chk.cc -> why is it a C++ function?
>>    
>> No real reason. I think my thinking at the time that I wrote this was 
> that any new code we write (not a patch to musl) should be a C++ file.
> This makes sense both as a marker (".cc" files are "ours", not Musl) and 
> also in case our implementation needs to use C++ stuff. Which in this case 
> didn't happen, but in other places, does. An obvious example is 
> libc/pipe.cc, which is full of C++ code to implement the C functions 
> pipe(), pipe2().
>
>  
>
>>
>>    - 
>>       - Also, consider moving out __realpath_chk() out of 
>>       misc/realpath.c to its own file.
>>       - Files that are different from current musl equivalent ones but 
>>    should most likely be replaced in Makefile with the musl ones (see the 
>>    point_to_musl_as_is.diff attachment):
>>       - string/strsignal.c - the current musl one seems to be newer and 
>>       most up to date
>>    
>> Right, I am guessing that this is another example of a file we took from 
> an older musl.
> Again, we can upgrade first and update this specific file (and similar to 
> it) later - although you can start with dropping it now and if it works, 
> we'll save time later.
>  
>
>>
>>    - 
>>       - no signficant differences it seems
>>          - internal/floatscan.h
>>          - internal/floatscan.c
>>          - internal/intscan.h
>>          - internal/intscan.c
>>          - internal/shgetc.c
>>       
>> Right. It seems there is an include path issue, but we could potentially 
> solve this in the makefile (if needed) without patching Musl? 
>  
>
>>
>>    - 
>>       - possibly can be replaced with musl ones
>>          - prng/random.c
>>       
>> This was also taken from 2013 musl (commit aab9cfd7f) but then improved 
> by Avi to add the reentrant version (52f214d4e) - I am guessing Musl has 
> today newer versions of both and we can replace ours.
>
>>
>>    - 
>>       - 
>>          - network/gai_strerror.cc -> ../musl/src/network/gai_strerror.c
>>       
>>
> Right. There is no difference - you can already, now, remove 
> network/gai_strerror.cc and add the musl one in the Makefile.
>  
>
>>
>>    - 
>>       - 
>>          - misc/lockf.cc ../musl/src/misc/lockf.c // why is it C++? 
>>          seems to have minimal changes
>>       
>> I think as noted above, Glauber wanted to make this C++ because he 
> thought it's a good idea - and then had to
> make changes to the syntax which is not allowed in C++. See commit 
> bef7533150f1f96a96c3cea7f3793eeaafe73ffd.
> I think you can remove our lockf.cc and change the Makefile to use musl 
> lockf.c.
>  
>
>>
>>    - Files under arch (21 files) I have not idea if they are subject to 
>>    upgrade
>>       - 
>>       - libc/arch/aarch64/setjmp/sigsetjmp.s
>>    
>>
> git blame libc/arch/x64/setjmp/setjmp.s, for example, shows it was just 
> copied from musl in 2013.
> diff libc/arch/x64/setjmp/setjmp.s musl/src/setjmp/x86_64/setjmp.s shows 
> it never changed.
> So you can already now drop this setjmp.s change the Makefile to build the 
> one from Musl.
>
> I don't know if the same is true for all the other files you listed.
>
>
>>    - 
>>       - libc/arch/aarch64/setjmp/sigrtmin.c
>>       - libc/arch/aarch64/setjmp/block.c
>>       - libc/arch/aarch64/setjmp/longjmp.s
>>       - libc/arch/aarch64/setjmp/siglongjmp.c
>>       - libc/arch/aarch64/setjmp/sigrtmax.c
>>       - libc/arch/aarch64/setjmp/setjmp.s
>>       - libc/arch/aarch64/atomic.h
>>       - libc/arch/x64/ucontext/getcontext.s
>>       - libc/arch/x64/ucontext/ucontext.cc
>>       - libc/arch/x64/ucontext/start_context.s
>>       - libc/arch/x64/ucontext/setcontext.s
>>       - libc/arch/x64/setjmp/sigsetjmp.s
>>       - libc/arch/x64/setjmp/sigrtmin.c
>>       - libc/arch/x64/setjmp/block.c
>>       - libc/arch/x64/setjmp/longjmp.s
>>       - libc/arch/x64/setjmp/siglongjmp.c
>>       - libc/arch/x64/setjmp/sigrtmax.c
>>       - libc/arch/x64/setjmp/setjmp.s
>>       - libc/arch/x64/atomic.h
>>       - libc/arch/arm/src/__aeabi_atexit.c
>>    - locale/ (26 files) - I think most of those come from older musl and 
>>    can be replaced with their newer copies in current musl; I think 
>>    libc/locale have two files for each family of functions - for example 
>>    libc/locale/strcoll.c, libc/locale/strcoll_l.c vs single 
>>    musl/src/locale/strcoll.c; see locale.diff attachment for details and 
>> this 
>>    musl commit as an example - 
>>    
>> https://git.musl-libc.org/cgit/musl/diff/?id=4b0306c83c8c3614afbaf18a18e22d24f335ea04
>>       - 
>>       - 
>>       - locale/strcoll_l.c
>>       - locale/wcsftime_l.c
>>       - locale/iswctype_l.c
>>       - locale/freelocale.c
>>       - locale/wcsxfrm_l.c
>>       - locale/uselocale.c
>>       - locale/setlocale.c
>>       - locale/strtold_l.c
>>       - locale/wcsxfrm.c
>>       - locale/strfmon.c
>>       - locale/strcoll.c
>>       - locale/wcscoll.c
>>       - locale/strxfrm.c
>>       - locale/strtof_l.c
>>       - locale/towlower_l.c
>>       - locale/towupper_l.c
>>       - locale/toupper_l.c
>>       - locale/nl_langinfo_l.c
>>       - locale/langinfo.c
>>       - locale/strtod_l.c
>>       - locale/strftime_l.c
>>       - locale/wctype_l.c
>>       - locale/wcscoll_l.c
>>       - locale/duplocale.c
>>       - locale/strxfrm_l.c
>>    - time/ - time functios that I think come from older musl but were 
>>    not changed after; with little effort can be replaced (point to) musl 
>>    originals; see the time.diff attachment and this musl commit - 
>>    
>> https://git.musl-libc.org/cgit/musl/diff/?id=1cc81f5cb0df2b66a795ff0c26d7bbc4d16e13c6
>>    - 
>>       - time/__tm_to_time.c
>>       - time/__time_to_tm.c
>>       - time/timegm.c
>>       - time/__asctime.c
>>       - time/__time.h
>>       - time/mktime.c
>>       - time/strftime.c
>>       - time/tzset.c
>>    
>>
> This file in particular changed a lot between the time we imported it and 
> the version in musl :-(
> I think you should consider upgrading musl before making decisions of each 
> individual file like this, leaving some of them as the old versions until 
> making such a decision.
>
>>
>>    - 
>>       - time/wcsftime.c
>>       - time/localtime_r.c
>>       - time/gmtime_r.c
>>       - time/localtime.c
>>       - time/gmtime.c
>>       - time/ftime.c
>>    - network/ - the diffs to musl
>>    - 
>>       - network/if_indextoname.c - SYS_close replaced with close() and 
>>       socket(AF_UNIX, SOCK_DGRAM|SOCK_CLOEXEC, 0) replaced with 
>> socket(AF_UNIX, 
>>       SOCK_DGRAM, 0)
>>       - network/gai_strerror.cc - ?
>>    
>>
> See discussion above.
>
>
>>    - 
>>       - network/__ipparse.c - includes __dns.hh instead of __dns.h
>>       - network/if_nameindex.c - socket(AF_UNIX, 
>>       SOCK_DGRAM|SOCK_CLOEXEC, 0) replaced with socket(AF_INET, 
>>       SOCK_DGRAM|SOCK_CLOEXEC, 0);
>>       - network/gethostbyname_r.c - I think it originates from musl and 
>>       is identical except for extra gethostbyname() that can probably be 
>> moved to 
>>       a separate file?
>>       - network/inet_addr.c - does NOT exist in musl
>>       - network/getifaddrs.c - completely different from musl copy
>>       - network/getnameinfo.c - seems to originate from muls but has 
>>       extra code and maybe has to stay as is just like getaddrinfo()
>>       - network/res_init.c - only diff: weak_alias(res_init, __res_init)
>>       - network/if_nametoindex.c - similar to if_indextoname (see above)
>>       - network/__dns.hh/__dns.cc - should stay as is 
>>       - network/getaddrinfo.c - seems to originate from muls but quite 
>>       different from the current musl version so maybe should stay as is
>>       - network/inet_ntop.c - originates from musl and has not changed 
>>       since; but different from current musl
>>       - network/inet_aton.c - does not exist in musl but the only commit 
>>       says it was taken from musl-libc
>>    - unistd/ - only files that possibly are subject to upgrade
>>       - ttyname_r.c - implemented by Rean for node js and I do not think 
>>       it comes from musl
>>    
>>
> Right, this is an OSv-only implementation, because it just returns 
> "/dev/console" - we don't have support for ptys.
>
>
>>    - 
>>       - ttyname.c is - almost identical to an existing file in musl (see 
>>       attached diff) and delegates to ttyname_r() maybe we should point to 
>> musl 
>>       one
>>    
>> Right. 
>
>
>>    - 
>>       - I think we have a unit tests for both
>>    - stdlib/
>>       - 
>>       - wcstol.c - besides including different headers it replaces 
>>       f.lock = -1 with f.no_locking = true (what is this exactly about?)
>>    
>> See commit 3f053b8f950be4b89fcb928f956ddc31ba8ba68a which explains this 
> well (and a lot of locking-related changes throughout the stdio code).
>
> Unfortunately, instead of finding a way to reuse musl's stdio locking 
> options, we just changed their code, which was very nice when we did it
> (the result was cleaner) but now will be a mess to upgrading musl :-(
> Maybe you should consider if we can, again, do the upgrade in pieces, and 
> don't upgrade stdio at the first stage.
>
>
>>    - 
>>       - strtol.c - similar as wcstol.c
>>       - strtod.c - similar as wcstol.c
>>       - qsort_r.c - does not exist in current musl (I checked in 1.1.24) 
>>       so maybe should stay as is
>>    - string/ - (_chk.c files and others not subject to upgrade skipped)
>>    - 
>>       - string/memset.c - seems identical but memset_base() vs memset() 
>>       - what is it about?
>>    
>> Two things:
>
>    1. We have an assembly-language memset() that picks the fastest one on 
>    the machine, so this C code isn't used for memset(). We use it as 
>    memset_base() for machines that don't have anything better.
>    2. We wanted to use memset_base() in C++ code (see commit 
>    86471776627d3625c481d7e8fff6614bccec95cc for the explanation) which meant 
>    we need some changes to please both C and C++ compilers.
>    
> So let's keep our modified version. You could, if you want, try to 
> upstream our changes to please the compiler, but I don't know what we can 
> do about the name memset(). I'm surprised, though, Musl don't have the same 
> problem - not wanting to use the C code implementation for memset().
>  
>
>>
>>    - 
>>       - string/memmove.c - this is OSv specific so probably should stay 
>>       as is
>>       - string/strtok_r.c - seems identical except for '__restrict' vs 
>>       'restrict' and #undef
>>       - string/stresep.c - per commit seems to come from netbsd (
>>       ftp.tku.edu.tw/NetBSD/NetBSD-current/src/lib/libc/string/stresep.c) 
>>       -> should stay as is
>>       - string/__strndup.c  - not in musl , seems to come from somewhere 
>>       else
>>    
>>
> Just one of those infernal aliases... As I commented above, I think it 
> would be nice to have them all in one file instead of all over the place. 
>
>
>>    - 
>>       - string/rawmemchr.c - not from musl, where does it come from?
>>    
>> I think  Cristoph wrote it from scratch. If new Musl now have it, we 
> should probably consider using it.
>
>
>>    - 
>>       - string/strerror_r.c - originates from musl but seems to be 
>>       completely different from current musl copy; modified to be both POSIX 
>> and 
>>       glibc compliant
>>    
>> 7053ac3ac021bf5e1508e1e6a68fa1dc7da13525 explains some of the mess there.
> Maybe it's safer not to touch it for now, although maybe the newer Musl 
> solved these issues too...
>
>
>>    - 
>>       - string/memcpy.c - implementation seems to be OSv specfic, so 
>>       maybe is not subject to musl upgrade
>>       - string/strlcat.c - slightly different from current musl but has 
>>       not been modfied since original import -> probably we should use musl 
>>       version
>>    
>> Yes. The only difference is an extern of strlcpy() - I guess at some 
> point (maybe still today?) it was missing from the header file?
>  
>
>>
>>    - 
>>       - string/explicit_bzero.c - not from musl it seems; glibc?
>>       
>> You wrote it - e5515cf0170358bd4588456182540720672f015c :-)
> If Musl never added this (they should, no? it's a glibc function and 
> needed for binary compatibility...) we should keep our own.
>
>
>>    - 
>>       - 
>>    - stdio - this one is the largest in terms of changes (but maybe not) 
>>    and biggest mess
>>    - 
>>       - fmemopen.c - has extra 'if (!libc.threaded) f->lock = -1;' to 
>>       deal with some thread related issue - see commit 
>>       3f053b8f950be4b89fcb928f956ddc31ba8ba68a
>>    
>>
> Yes, see my comments above about locking.
> If you can get away with *NOT* upgrading stdio yet, maybe you should 
> consider starting this way.
>
>>
>>    - 
>>       - ftrylockfile.c - pthread_self()->tid vs mutex_owned(&f->mutex)
>>       - __stdio_read.c - SYS_readv vs readv() mostly
>>    
>> Yes, this change of
> cnt = syscall(SYS_readv, f->fd, iov, 2);
> into
> cnt = readv(f->fd, iov, 2);
>
> Is the saddest, because we'll continue to need making these changes after 
> the upgrade.
>
> But we *can* solve this: we could macros and guaranteed-inlined functions, 
> change things so that
> syscal(SYS_readv, ...) be translated to readv(...).  We can even do this 
> without any changes to the source
> code by adding "--include ..." in the Makefile for the musl files! It will 
> be ugly (readers of the code will not
> understand how this syscall() stuff works), but it will save us a lot of 
> patches.
>
>>
>>    - 
>>       - open_wmemstream.c - extra 'if (!libc.threaded) f->lock = -1;' 
>>       just like fmemopen.c
>>       - __fdopen.c - SYS_fcntl vs fcntl() and SYS_ioctl vs ioctl() mostly
>>       - __stdio_close.c - SYS_close vs close()
>>       - vfwscanf.c - quite different fromm current musl but seems to not 
>>       have been modified for OSv purposes > so maybe just needs to be 
>> replaced in 
>>       makefile to musl copy?
>>       - vdprintf.c - .lock = -1 vs .no_locking = true (what is this 
>>       about?)
>>       - vfprintf.c - the only difference is handling of L prefix for 
>>       long long int
>>       - __stdio_seek.c - SYS_lseek vs lseek()
>>       - tmpnam.c - SYS_access vs access() and SYS_clock_gettime vs 
>>       clock_gettime()
>>       - getc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if 
>>       (f->no_locking || !__lockfile(f))
>>       - fopen.c - SYS_open vs open() and SYS_close vs close()
>>       - vsnprintf.c - .lock = -1 vs .no_locking = true plus extra 
>>       handling of some overflow condition
>>       - flockfile.c - seems to have been rewritten for OSv, ftrylockfile 
>>       vs mutex_owned() etc
>>       - vfscanf.c - seems to originate from musl but per couple of old 
>>       commits from 2013 certian features have been disabled; should we 
>> repoint ot 
>>       tp current musl?
>>       - vswscanf.c - '.lock = -1' vs '.no_locking = true'
>>       - stdin.c - modified for all kinds of reasons
>>    
>> 000ce9a2cb638d00ecafe77780e21ad57331f239 is especially important. If Musl 
> didn't fix this issue too, maybe we
> can propose this fix upstream. It's important for some applications that 
> just assume this.
>  
>
>>
>>    - 
>>       - freopen.c - SYS_fcntl vs fcntl() and __dup3() vs dup3()
>>       - sscanf.c - extra weak_alias for __isoc99_sscanf
>>    
>>
> As I mentioned above would be nice to be able to put those aliases in a 
> separate file (using linker script, asm, etc.) so we don't need to touch 
> existing files :-(
>
>
>>    - 
>>       - tmpfile.c - SYS_open vs open() and SYS_close vs unlink()
>>       - stdout.c - extra '.lock = -1' - what is this about?
>>       - open_memstream.c - extra 'if (!libc.threaded) f->lock = -1;'
>>       - putc.c - 'if (f->lock < 0 || !__lockfile(f))' vs 'if 
>>       (f->no_locking || !__lockfile(f))'
>>       - __lockfile.c - seems to have been heavily adapted for OSv 
>>       purposes
>>       - remove.c - SYS_unlink vs unlink() plus some POSIX compliance 
>>       changes
>>       - fputc.c -  'if (f->lock < 0 || !__lockfile(f))' vs 'if 
>>       (f->no_locking || !__lockfile(f))'
>>       - __stdio_write.c - SYS_writev vs writev()
>>       - stderr.c - extra '.lock = -1,'
>>       - funlockfile.c - includes "pthread_impl.h"
>>       - fgetc.c -  'if (f->lock < 0 || !__lockfile(f))' vs 'if 
>>       (f->no_locking || !__lockfile(f))'
>>       - vsscanf.c - '.read = do_read, .lock = -1' vs '.read = do_read, 
>>       .no_locking = true,' plus week_alias for __isoc99_vsscanf
>>       - vswprintf.c  - 'f.lock = -1;' vs 'f.no_locking = true;'
>>       - __fopen_rb_ca.c - SYS_open vs open() plus 'f->lock = -1;' (what 
>>       is this about?)
>>    
>> From above, It seems there are 3 categories of files under libc/:
>>
>>    1. Brand new implementations of libc functionality mostly in C++ that 
>>    will not need to change as part of musl upgrade
>>    2. Glibc compatibility or implementations not from musl; like above 
>>    they will not need to change as part of musl upgrade
>>    3. Files that originate from musl but need to be adapted for OSv 
>>    specific reasons
>>
>>
>> To help us with the upgrade we should add more unit tests. To that 
>> extent, I have found musl unit test - 
>> https://wiki.musl-libc.org/libc-test.html and possibly some unit tests 
>> from bionic (
>> https://android.googlesource.com/platform/bionic/+/refs/heads/master/tests/stdio_test.cpp
>>  or 
>>
>> https://android.googlesource.com/platform/bionic/+/refs/heads/master/tests/stdlib_test.cpp).
>>  
>> The bionic ones would need to be adapted but the musl ones could be used 
>> as-is.
>>
>
> The musl tests can be very useful to understand what we lose by not 
> upgrading a certain function to use the latest Musl (e.g., using our own 
> memset() instead of theirs), maybe it will find bugs.
>
> One thing the musl tests will not help us to do is to understand what we 
> possibly broke in Linux compatibility by switching old code to the new musl.
>
>>
>> To minimize effort for future upgrade I was wondering if the following 
>> ideas could help us:
>> - use some preprocessor macros "magic" to replace for syscall() calls 
>> with regular functions - it seems like many files under libc are original 
>> musl source except for these type of changes
>>
>
> Yes, this is doable - see ideas above.
>
> - in cases, we are forced to modify musl originals, maybe instead of 
>> storing modified files under libc/ like we have now, we should store the 
>> *.patch files instead and apply as part of new build process step; would it 
>> be better?
>>
>
> Yes, this is a good idea (it doesn't need to be a separate build step - 
> can be in the Makefile - takes musl/something.c, 
> libc/musl_patches/something.c.patch, and creates build/*/gen/something.c 
> and then compiles it.
>
> It will be hard to maintain these patches, though, so we should try to 
> make them as minimal as we can and rarely changing. In some cases, by using 
> "--include", "-D", in the makefile, we might not even need to do these 
> patches.
>  
>
>>
>> So I think the upgrade plan should be this: 
>> 1. Add more unit tests
>> 2. Prune/clean libc to minimize the set of files that are part of 
>> category 3 (see above)
>> 3. Do the upgrade (per Nadav's original plan):
>>
>>    - Bring the latest Musl version into a "musl-1.1.24" subdirectory in 
>>    OSv.
>>
>> Please make it a submodule, pointing to Musl's own github, as we do today 
> with "musl". 
>
>>
>>    - Leave the old "musl/" and "libc/" directories as well.
>>    - In Makefile, make a new list of objects (say, "nmusl") which will 
>>    take code from musl-1.1.24/ instead of musl/.
>>    - Start to switch individual files and directories from "musl += ..." 
>>    to "nmusl += ...".  We can start with the math functions needed for 
>>    aarch64. Eventually, everything can be converted to nmusl and hopefully, 
>>    nothing or little will break and need to be fixed.
>>
>>
>>    - Replace our include/api with musl-1.1.24/include. Would be even 
>>    better to drop include/api, and just use musl-1.1.5/include directly if 
>> we 
>>    can. I think, though, this is low priority, and might take quite a bit of 
>>    work ("git log include/api" shows we modified this quite a bit since we 
>>    took it from musl). I'm hoping that the old header files will work 
>>    correctly also for the newer musl.
>>       - I think this might be tricky one at this point
>>    
>> Some of the files in include/api are *not* links to musl - one random 
> example is include/api/printf.h. So you will not be able to just drop 
> include/api, and might still need to link the invidual files. Perhaps an 
> even better alternative is to put in include/api only our modified files, 
> and and the Makfile add "-I" for *both* musl/include and include/api so we 
> don't need to put them in the same directory.
>  
>
>>
>> What do you think?
>>
>> Waldek
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/osv-dev/ea8c0bb0-9e1b-47be-bc80-4c34f6874614n%40googlegroups.com
>>  
>> <https://groups.google.com/d/msgid/osv-dev/ea8c0bb0-9e1b-47be-bc80-4c34f6874614n%40googlegroups.com?utm_medium=email&utm_source=footer>
>> .
>>
>

-- 
You received this message because you are subscribed to the Google Groups "OSv 
Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/osv-dev/76953724-bb5d-4da4-abff-a77ddaefdf9dn%40googlegroups.com.

Reply via email to