Re: [osv-dev] Re: Pip packages/using Nix
I have some current work here: https://github.com/mkenigs/osv-apps/commits/nginx-nix. A few things are pretty hacky, but I think that's just a consequence of trying to use Make, module.py, and Nix all at the same time. The image this creates is significantly larger than the current OSv nginx one, but that could probably be reduced by disabling some of the configure flags Nix uses by default.

Building on Nix also requires some changes in OSv, which I'm still cleaning up: https://github.com/mkenigs/osv/commits/nix-fixes. Before I submit a patch I need to fix some of the hacks I made to get it working locally.

Would love any feedback!

On Monday, December 28, 2020 at 8:18:11 PM UTC-7 Roman Shaposhnik wrote:
> So is this getting published some place? ;-)
[quoted text hidden]
Re: [osv-dev] Re: Pip packages/using Nix
So is this getting published some place? ;-)

Thanks,
Roman.

On Mon, Dec 14, 2020 at 9:00 AM Matthew Kenigsberg <matthewkenigsb...@gmail.com> wrote:
> Thanks for putting so much effort into figuring that out! Really appreciate it, and glad to get it working!
[quoted text hidden]
Re: [osv-dev] Re: Pip packages/using Nix
Thanks for putting so much effort into figuring that out! Really appreciate it, and glad to get it working!

On Wednesday, December 9, 2020 at 3:45:30 PM UTC-7 Matthew Kenigsberg wrote:
> That worked!!! Had to set -z relro -z lazy
[quoted text hidden]
Re: [osv-dev] Re: Pip packages/using Nix
That worked!!! Had to set -z relro -z lazy

On Wednesday, December 9, 2020 at 12:30:55 PM UTC-7 jwkoz...@gmail.com wrote:
[quoted text hidden]
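[Editor's note] The effect of the '-z relro -z lazy' fix can be sanity-checked on any shared object. A minimal sketch, assuming gcc and binutils' readelf are on the PATH (demo.c and libdemo.so are invented names for the demo):

```shell
# Build a trivial shared library with Partial RELRO and lazy binding,
# mirroring the '-z relro -z lazy' flags that fixed the OSv build.
echo 'int answer(void) { return 42; }' > demo.c
gcc -fPIC -shared -Wl,-z,relro,-z,lazy -o libdemo.so demo.c

# Partial RELRO: a GNU_RELRO segment is present...
readelf -lW libdemo.so | grep GNU_RELRO

# ...but BIND_NOW is absent, so jump-slot relocations stay lazy.
readelf -d libdemo.so | grep BIND_NOW || echo 'no BIND_NOW: lazy binding kept'
```

Rebuilding with `-Wl,-z,now` instead should make the BIND_NOW grep match, which is the Full RELRO case that NixOS toolchains produce by default.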
Re: [osv-dev] Re: Pip packages/using Nix
Hi,

Thanks for uploading the files. It definitely helped me figure out the issue.

In essence, it looks like all the .so files such as libzfs.so are built with Full RELRO on NixOS (run 'readelf -a libzfs.so | grep BIND_NOW' to check). Relatedly, some Linux distributions are set up so that gcc effectively uses '-z now -z relro' when linking libraries. On many others, like Ubuntu or Fedora, libraries are built with Partial RELRO ('-z relro') by default.

As the libraries are loaded by the OSv dynamic linker, all jump slot relocations are resolved eagerly (even if they are never used by the code later) if those libraries are marked as Full RELRO (bind_now = true). In the non-Full-RELRO case, a jump slot relocation is resolved lazily the first time it is accessed, by 'void* object::resolve_pltgot(unsigned index)', which writes the resolved function symbol address into the GOT.

The problem with Full RELRO is that if we cannot find a symbol (for example because it is not implemented by OSv, or is not visible at *this point of linking*), we simply ignore it, hoping that it will never be used or will be resolved later. If it is used later, resolve_pltgot() is called, and if the symbol is found (because the library containing it has been loaded in the meantime), we crash, because we try to write to a part of memory (the GOT) that has since been protected read-only.

Why does this happen exactly?

Here is the symbol *bsd_getmntany* not being found at the address of your original fault (after adding extra debug statements):

ELF [tid:28, mod:3, /libzfs.so]: arch_relocate_jump_slot, addr:0x10040ca0
/libzfs.so: ignoring missing symbol bsd_getmntany  // would have been 0x10040ca8, which matches what the page fault reports
ELF [tid:28, mod:3, /libzfs.so]: arch_relocate_jump_slot, addr:0x10040cb0

Please note that both mkfs.so and zfs.so depend on libzfs.so. Per the command line, OSv loads and executes the apps sequentially. So mkfs.so runs first: the dynamic linker loads libzfs.so, relocates it, eagerly resolves all its symbols, and then fixes the permissions on libzfs.so. One of the symbols in libzfs.so is *bsd_getmntany*, which is actually part of zfs.so, so it is left unresolved (see the "ignoring missing symbol" warning above).

After mkfs.so, OSv gets to zfs.so, processes and executes it, and some code in zfs.so invokes *bsd_getmntany*, which resolve_pltgot() now resolves and finds, BUT when it tries to write the result to the GOT it gets a page fault.

Having said all that, I am not sure what exactly the problem is:
A) An invalid or abnormal dependency between libzfs.so, mkfs.so, and zfs.so, which effectively prevents them from functioning properly if built with Full RELRO (would that even work on Linux?); or
B) A true limitation of the OSv linker, which should handle such a scenario correctly.

For now, the easiest workaround (which might be all that is needed if A is the case) is to simply force building those libraries with Partial RELRO, as in this patch:

diff --git a/Makefile b/Makefile
index d1597263..d200dde8 100644
--- a/Makefile
+++ b/Makefile
@@ -345,7 +345,7 @@ $(out)/%.o: %.s
 	$(makedir)
 	$(call quiet, $(CXX) $(CXXFLAGS) $(ASFLAGS) -c -o $@ $<, AS $*.s)
 
-%.so: EXTRA_FLAGS = -fPIC -shared
+%.so: EXTRA_FLAGS = -fPIC -shared -z relro
 %.so: %.o
 	$(makedir)
 	$(q-build-so)

Please let me know if it works,
Waldek

PS. Also verify that running 'readelf -a libzfs.so | grep BIND_NOW' does not show anything anymore.

On Tuesday, December 8, 2020 at 5:08:18 PM UTC-5 Matthew Kenigsberg wrote:
[quoted text hidden]
Re: [osv-dev] Re: Pip packages/using Nix
Not completely sure where libgcc_s.so.1 is coming from, but I uploaded what I have in /nix/store/vran8acwir59772hj4vscr7zribvp7l5-gcc-9.3.0-lib/lib/libgcc_s.so.1:
https://drive.google.com/drive/folders/1rM6g-FrzwFpuHr2wX9-J21DzSjyQXGg2

I get a different error if I comment out core/elf.cc:1429:

(gdb) bt
#0  0x4039eef2 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:26
#3  0x4023c73f in abort (fmt=fmt@entry=0x40645aff "Aborted\n") at runtime.cc:132
#4  0x40202989 in abort () at runtime.cc:98
#5  0x40218943 in osv::generate_signal (siginfo=..., ef=0x8191c068) at libc/signal.cc:124
#6  0x404745ff in osv::handle_mmap_fault (addr=<optimized out>, sig=<optimized out>, ef=<optimized out>) at libc/signal.cc:139
#7  0x40347872 in mmu::vm_fault (addr=17592187039744, addr@entry=17592187040376, ef=ef@entry=0x8191c068) at core/mmu.cc:1336
#8  0x403992e3 in page_fault (ef=0x8191c068) at arch/x64/mmu.cc:42
#9
#10 0x101554c4 in usage (requested=requested@entry=false) at bsd/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c:424
#11 0x10152025 in main (argc=5, argv=0xa0f19400) at bsd/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c:6676
#12 0x4043c311 in osv::application::run_main (this=0xa0ee0c10) at /nix/store/h31cy7jm6g7cfqbhc5pm4rf9c53i3qfb-gcc-9.3.0/include/c++/9.3.0/bits/stl_vector.h:915
#13 0x4022452f in osv::application::main (this=0xa0ee0c10) at core/app.cc:320
#14 0x4043c539 in osv::application::operator() (__closure=0x0, app=<optimized out>) at core/app.cc:233
#15 osv::application_FUN(void *) () at core/app.cc:235
#16 0x40470d58 in pthread_private::pthread::operator() (__closure=0xa1067f00) at libc/pthread.cc:115
#17 std::_Function_handler<...>::_M_invoke(const std::_Any_data &) (__functor=...) at /nix/store/h31cy7jm6g7cfqbhc5pm4rf9c53i3qfb-gcc-9.3.0/include/c++/9.3.0/bits/std_function.h:300
#18 0x404074fd in sched::thread_main_c (t=0x81917040) at arch/x64/arch-switch.hh:325
#19 0x403990d3 in thread_main () at arch/x64/entry.S:113

On Tuesday, December 8, 2020 at 2:34:19 PM UTC-7 jwkoz...@gmail.com wrote:
> I wonder if we have a bug in core/elf.cc::fix_permissions() or logic around.
[quoted text hidden]
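[Editor's note] The backtrace above prints the faulting addresses in decimal (addr=17592187039744, addr@entry=17592187040376); converting them to hex makes them directly comparable with the hex addresses in readelf output and OSv's memory-map dumps. For example:

```shell
# Convert the decimal fault addresses from the gdb backtrace to hex so
# they can be compared against readelf / OSv memory-map output.
printf '0x%x\n' 17592187039744 17592187040376
# -> 0x1000000f3000
# -> 0x1000000f3278
```

These land well outside the 0x10040000 range discussed earlier in the thread, consistent with this being a different fault once fix_permissions() is disabled.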
Re: [osv-dev] Re: Pip packages/using Nix
I wonder if we have a bug in core/elf.cc::fix_permissions() or the logic around it, and we might be making the wrong part of the mapping read-only based on the GNU_RELRO header. I wonder if you are able to create the ZFS image by temporarily commenting out line 1429 of core/elf.cc:

ef->fix_permissions();

Also, would it be possible to get copies of these binaries:

/libenviron.so: libenviron.so
/libvdso.so: libvdso.so
/zpool.so: zpool.so
/libzfs.so: libzfs.so
/libuutil.so: libuutil.so
/zfs.so: zfs.so
/tools/mkfs.so: tools/mkfs/mkfs.so
/tools/cpiod.so: tools/cpiod/cpiod.so
/tools/mount-fs.so: tools/mount/mount-fs.so
/tools/umount.so: tools/mount/umount.so
/usr/lib/libgcc_s.so.1: %(libgcc_s_dir)s/libgcc_s.so.1

Ideally, the stripped versions. That would help me re-create the problem and investigate further.

On Tuesday, December 8, 2020 at 1:25:10 PM UTC-5 Matthew Kenigsberg wrote:
> [nix-shell:~/osv]$ readelf -l build/release/libzfs-stripped.so
[quoted text hidden]
Re: [osv-dev] Re: Pip packages/using Nix
[nix-shell:~/osv]$ readelf -l build/release/libzfs-stripped.so

Elf file type is DYN (Shared object file)
Entry point 0xc8f0
There are 8 program headers, starting at offset 64

Program Headers:
  Type           Offset     VirtAddr   PhysAddr   FileSiz    MemSiz     Flags Align
  LOAD           0x         0x         0x         0xa9b8     0xa9b8     R     0x1000
  LOAD           0xb000     0xb000     0xb000     0x0001e0a9 0x0001e0a9 R E   0x1000
  LOAD           0x0002a000 0x0002a000 0x0002a000 0x93a0     0x93a0     R     0x1000
  LOAD           0x00034010 0x00035010 0x00035010 0x1810     0x2c20     RW    0x1000
  DYNAMIC        0x000340e0 0x000350e0 0x000350e0 0x0210     0x0210     RW    0x8
  GNU_EH_FRAME   0x0002e768 0x0002e768 0x0002e768 0x0d04     0x0d04     R     0x4
  GNU_STACK      0x         0x         0x         0x         0x         RW    0x10
  GNU_RELRO      0x00034010 0x00035010 0x00035010 0x0ff0     0x0ff0     R     0x1

 Section to Segment mapping:
  Segment Sections...
   00     .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt
   01     .init .plt .plt.got .text .fini
   02     .rodata .eh_frame_hdr .eh_frame
   03     .init_array .fini_array .data.rel.ro .dynamic .got .data .bss
   04     .dynamic
   05     .eh_frame_hdr
   06
   07     .init_array .fini_array .data.rel.ro .dynamic .got

On Tuesday, December 8, 2020 at 11:17:46 AM UTC-7 jwkoz...@gmail.com wrote:

> Back to why it is failing.
>
> Based on what you sent us:
> ..
> 0x1000b000 0x10016000 [44.0 kB]  flags=fmF perm=r  offset=0x         path=/libzfs.so
> 0x10016000 0x10035000 [124.0 kB] flags=fmF perm=rx offset=0xb000     path=/libzfs.so
> 0x10035000 0x1003f000 [40.0 kB]  flags=fmF perm=r  offset=0x0002a000 path=/libzfs.so
> *0x1004 0x10041000 [4.0 kB]  flags=fmF perm=r  offset=0x00034000 path=/libzfs.so*
> 0x10041000 0x10042000 [4.0 kB]  flags=fmF perm=rw offset=0x00035000 path=/libzfs.so
> ..
>
> The page fault in arch_relocate_jump_slot() is caused by an attempt to write at the address 0x10040ca8, which falls into the 4th mapping range above, which can only be read from. So that is the permission fault. The question is why the address is in that range. The address should be somewhere in the GOT, in the 5th range (0x10041000 0x10042000 [4.0 kB] flags=fmF perm=rw offset=0x00035000 path=/libzfs.so), which has read/write permission.
>
> On an Ubuntu host, when I run the same command line and add extra debugging to print statements like this:
>
> ELF [tid:36, mod:5, /*libzfs*.so]: arch_relocate_jump_slot, addr:0x1007a8d0
>
> they all print addresses within the range 0x1007a000 - 0x1007b000, which are read-write permitted as they should be:
>
> 0x10044000 0x1004e000 [40.0 kB]  flags=fmF perm=r  offset=0x         path=/libzfs.so
> 0x1004e000 0x1006f000 [132.0 kB] flags=fmF perm=rx offset=0xa000     path=/libzfs.so
> 0x1006f000 0x10079000 [40.0 kB]  flags=fmF perm=r  offset=0x0002b000 path=/libzfs.so
> 0x10079000 0x1007a000 [4.0 kB]  flags=fmF perm=r  offset=0x00034000 path=/libzfs.so
> *0x1007a000 0x1007c000 [8.0 kB]  flags=fmF perm=rw offset=0x00035000 path=/libzfs.so*
>
> I wonder if we have a bug when calculating where each segment should be mapped:
>
> void file::load_segment(const Elf64_Phdr& phdr)
> {
>     ulong vstart = align_down(phdr.p_vaddr, mmu::page_size);
>     ulong filesz_unaligned = phdr.p_vaddr + phdr.p_filesz - vstart;
>     ulong filesz = align_up(filesz_unaligned, mmu::page_size);
>     ulong memsz = align_up(phdr.p_vaddr + phdr.p_memsz, mmu::page_size) - vstart;
>
>     unsigned perm = get_segment_mmap_permissions(phdr);
>
>     auto flag = mmu::mmap_fixed | (mlocked() ? mmu::mmap_populate : 0);
>     mmu::map_file(_base + vstart, filesz, flag, perm, _f, align_down(phdr.p_offset, mmu::page_size));
>     if (phdr.p_filesz != phdr.p_memsz) {
>         assert(perm & mmu::perm_write);
>         memset(_base + vstart + filesz_unaligned, 0, filesz - filesz_unaligned);
>         if (memsz != filesz) {
>             mmu::map_anon(_base + vstart + filesz, memsz - filesz, flag, perm);
>         }
>     }
>     elf_debug("Loaded and mapped PT_LOAD segment at: %018p of size: 0x%x\n", _base + vstart, filesz);
> }
>
> BTW, I am also interested in what the output of this readelf command for your libzfs.so is. Mine is this:
>
> readelf -l build/release/libzfs-stripped.so
>
> Elf file type is DYN (Shared object file)
> Entry point 0xd1d0
Re: [osv-dev] Re: Pip packages/using Nix
Back to why it is failing.

Based on what you sent us:
..
0x1000b000 0x10016000 [44.0 kB]  flags=fmF perm=r  offset=0x         path=/libzfs.so
0x10016000 0x10035000 [124.0 kB] flags=fmF perm=rx offset=0xb000     path=/libzfs.so
0x10035000 0x1003f000 [40.0 kB]  flags=fmF perm=r  offset=0x0002a000 path=/libzfs.so
*0x1004 0x10041000 [4.0 kB]  flags=fmF perm=r  offset=0x00034000 path=/libzfs.so*
0x10041000 0x10042000 [4.0 kB]  flags=fmF perm=rw offset=0x00035000 path=/libzfs.so
..

The page fault in arch_relocate_jump_slot() is caused by an attempt to write at the address 0x10040ca8, which falls into the 4th mapping range above, which can only be read from. So that is the permission fault. The question is why the address is in that range. The address should be somewhere in the GOT, in the 5th range (0x10041000 0x10042000 [4.0 kB] flags=fmF perm=rw offset=0x00035000 path=/libzfs.so), which has read/write permission.

On an Ubuntu host, when I run the same command line and add extra debugging to print statements like this:

ELF [tid:36, mod:5, /*libzfs*.so]: arch_relocate_jump_slot, addr:0x1007a8d0

they all print addresses within the range 0x1007a000 - 0x1007b000, which are read-write permitted as they should be:

0x10044000 0x1004e000 [40.0 kB]  flags=fmF perm=r  offset=0x         path=/libzfs.so
0x1004e000 0x1006f000 [132.0 kB] flags=fmF perm=rx offset=0xa000     path=/libzfs.so
0x1006f000 0x10079000 [40.0 kB]  flags=fmF perm=r  offset=0x0002b000 path=/libzfs.so
0x10079000 0x1007a000 [4.0 kB]  flags=fmF perm=r  offset=0x00034000 path=/libzfs.so
*0x1007a000 0x1007c000 [8.0 kB]  flags=fmF perm=rw offset=0x00035000 path=/libzfs.so*

I wonder if we have a bug when calculating where each segment should be mapped:

void file::load_segment(const Elf64_Phdr& phdr)
{
    ulong vstart = align_down(phdr.p_vaddr, mmu::page_size);
    ulong filesz_unaligned = phdr.p_vaddr + phdr.p_filesz - vstart;
    ulong filesz = align_up(filesz_unaligned, mmu::page_size);
    ulong memsz = align_up(phdr.p_vaddr + phdr.p_memsz, mmu::page_size) - vstart;

    unsigned perm = get_segment_mmap_permissions(phdr);

    auto flag = mmu::mmap_fixed | (mlocked() ? mmu::mmap_populate : 0);
    mmu::map_file(_base + vstart, filesz, flag, perm, _f, align_down(phdr.p_offset, mmu::page_size));
    if (phdr.p_filesz != phdr.p_memsz) {
        assert(perm & mmu::perm_write);
        memset(_base + vstart + filesz_unaligned, 0, filesz - filesz_unaligned);
        if (memsz != filesz) {
            mmu::map_anon(_base + vstart + filesz, memsz - filesz, flag, perm);
        }
    }
    elf_debug("Loaded and mapped PT_LOAD segment at: %018p of size: 0x%x\n", _base + vstart, filesz);
}

BTW, I am also interested in what the output of this readelf command for your libzfs.so is. Mine is this:

readelf -l build/release/libzfs-stripped.so

Elf file type is DYN (Shared object file)
Entry point 0xd1d0
There are 11 program headers, starting at offset 64

Program Headers:
  Type           Offset     VirtAddr   PhysAddr   FileSiz    MemSiz     Flags Align
  LOAD           0x         0x         0x         0x98e0     0x98e0     R     0x1000
  LOAD           0xa000     0xa000     0xa000     0x000201a1 0x000201a1 R E   0x1000
  LOAD           0x0002b000 0x0002b000 0x0002b000 0x9258     0x9258     R     0x1000
  LOAD           0x00034cb0 0x00035cb0 0x00035cb0 0x17f0     0x2c00     RW    0x1000
  DYNAMIC        0x00034d80 0x00035d80 0x00035d80 0x01d0     0x01d0     RW    0x8
  NOTE           0x02a8     0x02a8     0x02a8     0x0020     0x0020     R     0x8
  NOTE           0x02c8     0x02c8     0x02c8     0x0024     0x0024     R     0x4
  GNU_PROPERTY   0x02a8     0x02a8     0x02a8     0x0020     0x0020     R     0x8
  GNU_EH_FRAME   0x0002f768 0x0002f768 0x0002f768 0x0cec     0x0cec     R     0x4
  GNU_STACK      0x         0x         0x         0x         0x         RW    0x10
  GNU_RELRO      0x00034cb0 0x00035cb0 0x00035cb0
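As a sanity check of the arithmetic in load_segment() above, the same calculation can be applied by hand to the RW PT_LOAD header of the Nix-built libzfs.so quoted in this thread (VirtAddr 0x35010, FileSiz 0x1810, MemSiz 0x2c20). This is only a sketch of the alignment math, not the OSv implementation:

```python
PAGE = 0x1000  # mmu::page_size

def align_down(x, a):
    return x & ~(a - 1)

def align_up(x, a):
    return (x + a - 1) & ~(a - 1)

# RW PT_LOAD of the Nix-built libzfs.so (readelf values from this thread)
p_vaddr, p_filesz, p_memsz = 0x35010, 0x1810, 0x2c20

vstart = align_down(p_vaddr, PAGE)                  # start of the file-backed mapping
filesz_unaligned = p_vaddr + p_filesz - vstart      # bytes actually backed by the file
filesz = align_up(filesz_unaligned, PAGE)           # file-backed part, page-rounded
memsz = align_up(p_vaddr + p_memsz, PAGE) - vstart  # total size incl. anonymous .bss pages

print(hex(vstart), hex(filesz), hex(memsz))         # 0x35000 0x2000 0x3000
```

So the RW segment occupies pages 0x35000-0x38000 relative to the load base, with the first two pages file-backed and the last page anonymous.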
Re: [osv-dev] Re: Pip packages/using Nix
My gdb is not the strongest but if I hbreak on arch_relocate_jump_slot looks like _pathname is /libzfs.so, eventually /zpool.so, and then a single /libzfs.so before continue hangs On Tuesday, December 8, 2020 at 9:11:15 AM UTC-7 Matthew Kenigsberg wrote: > Nix is a package manager, and NixOS is an operating system built > completely around the package manager. So every library is stored somewhere > in /nix/store, like for example on Nix there is never anything like > /lib64/ld-linux-x86-64.so. > It would be /nix/store/.../ld-linux-x86-64.so. I could install the package > manager on a different OS, in which case I might have both /lib64 and > /nix/store, but on NixOS I'll just have the latter. Does that make sense? Not > sure if that's messing up something with linking. Guessing I can't > reproduce the error on any other OS, but happy to try. > On Tuesday, December 8, 2020 at 9:05:11 AM UTC-7 Matthew Kenigsberg wrote: > >> (gdb) connect >> abort (fmt=fmt@entry=0x40645bf0 "Assertion failed: %s (%s: %s: %d)\n") at >> runtime.cc:105 >> 105 do {} while (true); >> (gdb) osv syms >> manifest.find_file: path=/libvdso.so, found file=libvdso.so >> /home/matthew/osv/build/release.x64/libvdso.so 0x1000 >> add symbol table from file >> "/home/matthew/osv/build/release.x64/libvdso.so" at >> .text_addr = 0x10001040 >> .hash_addr = 0x11c8 >> .gnu.hash_addr = 0x1200 >> .dynsym_addr = 0x1238 >> .dynstr_addr = 0x12f8 >> .gnu.version_addr = 0x13be >> .gnu.version_d_addr = 0x13d0 >> .rela.plt_addr = 0x1408 >> .plt_addr = 0x10001000 >> .eh_frame_addr = 0x10002000 >> .dynamic_addr = 0x10003e60 >> .got_addr = 0x10003fd0 >> .comment_addr = 0x1000 >> .debug_aranges_addr = 0x1000 >> .debug_info_addr = 0x1000 >> .debug_abbrev_addr = 0x1000 >> .debug_line_addr = 0x1000 >> .debug_str_addr = 0x1000 >> .debug_loc_addr = 0x1000 >> .symtab_addr = 0x1000 >> .strtab_addr = 0x1000 >> warning: section .comment not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section 
.debug_aranges not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .debug_info not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .debug_abbrev not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .debug_line not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .debug_str not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .debug_loc not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .symtab not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> warning: section .strtab not found in >> /home/matthew/osv/build/release.x64/libvdso.so >> manifest.find_file: path=/tools/mkfs.so, found file=tools/mkfs/mkfs.so >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so 0x10004000 >> add symbol table from file >> "/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so" at >> .text_addr = 0x10006250 >> .hash_addr = 0x10004200 >> .gnu.hash_addr = 0x10004360 >> .dynsym_addr = 0x100043c0 >> .dynstr_addr = 0x10004840 >> .gnu.version_addr = 0x10005092 >> .gnu.version_r_addr = 0x100050f8 >> .rela.dyn_addr = 0x10005148 >> .rela.plt_addr = 0x10005298 >> .init_addr = 0x10006000 >> .plt_addr = 0x10006020 >> .plt.got_addr = 0x10006240 >> .fini_addr = 0x1000737c >> .rodata_addr = 0x10008000 >> .eh_frame_hdr_addr = 0x1000817c >> .eh_frame_addr = 0x10008210 >> .gcc_except_table_addr = 0x10008530 >> .init_array_addr = 0x10009c60 >> .fini_array_addr = 0x10009c70 >> .dynamic_addr = 0x10009c78 >> .got_addr = 0x10009e98 >> .data_addr = 0x1000a000 >> .bss_addr = 0x1000a010 >> .comment_addr = 0x10004000 >> .debug_aranges_addr = 0x10004000 >> .debug_info_addr = 0x10004000 >> .debug_abbrev_addr = 0x10004000 >> --Type for more, q to quit, c to continue without paging--c >> .debug_line_addr = 0x10004000 >> .debug_str_addr = 0x10004000 >> .debug_loc_addr = 0x10004000 >> .debug_ranges_addr = 0x10004000 >> 
.symtab_addr = 0x10004000 >> .strtab_addr = 0x10004000 >> warning: section .comment not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning: section .debug_aranges not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning: section .debug_info not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning: section .debug_abbrev not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning: section .debug_line not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning: section .debug_str not found in >> /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so >> warning:
Re: [osv-dev] Re: Pip packages/using Nix
Nix is a package manager, and NixOS is an operating system built completely around the package manager. So every library is stored somewhere in /nix/store, like for example on Nix there is never anything like /lib64/ld-linux-x86-64.so. It would be /nix/store/.../ld-linux-x86-64.so. I could install the package manager on a different OS, in which case I might have both /lib64 and /nix/store, but on NixOS I'll just have the latter. Does that make sense? Not sure if that's messing up something with linking. Guessing I can't reproduce the error on any other OS, but happy to try. On Tuesday, December 8, 2020 at 9:05:11 AM UTC-7 Matthew Kenigsberg wrote: > (gdb) connect > abort (fmt=fmt@entry=0x40645bf0 "Assertion failed: %s (%s: %s: %d)\n") at > runtime.cc:105 > 105 do {} while (true); > (gdb) osv syms > manifest.find_file: path=/libvdso.so, found file=libvdso.so > /home/matthew/osv/build/release.x64/libvdso.so 0x1000 > add symbol table from file > "/home/matthew/osv/build/release.x64/libvdso.so" at > .text_addr = 0x10001040 > .hash_addr = 0x11c8 > .gnu.hash_addr = 0x1200 > .dynsym_addr = 0x1238 > .dynstr_addr = 0x12f8 > .gnu.version_addr = 0x13be > .gnu.version_d_addr = 0x13d0 > .rela.plt_addr = 0x1408 > .plt_addr = 0x10001000 > .eh_frame_addr = 0x10002000 > .dynamic_addr = 0x10003e60 > .got_addr = 0x10003fd0 > .comment_addr = 0x1000 > .debug_aranges_addr = 0x1000 > .debug_info_addr = 0x1000 > .debug_abbrev_addr = 0x1000 > .debug_line_addr = 0x1000 > .debug_str_addr = 0x1000 > .debug_loc_addr = 0x1000 > .symtab_addr = 0x1000 > .strtab_addr = 0x1000 > warning: section .comment not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_aranges not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_info not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_abbrev not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_line not found in > 
/home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_str not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .debug_loc not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .symtab not found in > /home/matthew/osv/build/release.x64/libvdso.so > warning: section .strtab not found in > /home/matthew/osv/build/release.x64/libvdso.so > manifest.find_file: path=/tools/mkfs.so, found file=tools/mkfs/mkfs.so > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so 0x10004000 > add symbol table from file > "/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so" at > .text_addr = 0x10006250 > .hash_addr = 0x10004200 > .gnu.hash_addr = 0x10004360 > .dynsym_addr = 0x100043c0 > .dynstr_addr = 0x10004840 > .gnu.version_addr = 0x10005092 > .gnu.version_r_addr = 0x100050f8 > .rela.dyn_addr = 0x10005148 > .rela.plt_addr = 0x10005298 > .init_addr = 0x10006000 > .plt_addr = 0x10006020 > .plt.got_addr = 0x10006240 > .fini_addr = 0x1000737c > .rodata_addr = 0x10008000 > .eh_frame_hdr_addr = 0x1000817c > .eh_frame_addr = 0x10008210 > .gcc_except_table_addr = 0x10008530 > .init_array_addr = 0x10009c60 > .fini_array_addr = 0x10009c70 > .dynamic_addr = 0x10009c78 > .got_addr = 0x10009e98 > .data_addr = 0x1000a000 > .bss_addr = 0x1000a010 > .comment_addr = 0x10004000 > .debug_aranges_addr = 0x10004000 > .debug_info_addr = 0x10004000 > .debug_abbrev_addr = 0x10004000 > --Type for more, q to quit, c to continue without paging--c > .debug_line_addr = 0x10004000 > .debug_str_addr = 0x10004000 > .debug_loc_addr = 0x10004000 > .debug_ranges_addr = 0x10004000 > .symtab_addr = 0x10004000 > .strtab_addr = 0x10004000 > warning: section .comment not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_aranges not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_info not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > 
warning: section .debug_abbrev not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_line not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_str not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_loc not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .debug_ranges not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .symtab not found in > /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so > warning: section .strtab not found in >
Re: [osv-dev] Re: Pip packages/using Nix
(gdb) connect abort (fmt=fmt@entry=0x40645bf0 "Assertion failed: %s (%s: %s: %d)\n") at runtime.cc:105 105 do {} while (true); (gdb) osv syms manifest.find_file: path=/libvdso.so, found file=libvdso.so /home/matthew/osv/build/release.x64/libvdso.so 0x1000 add symbol table from file "/home/matthew/osv/build/release.x64/libvdso.so" at .text_addr = 0x10001040 .hash_addr = 0x11c8 .gnu.hash_addr = 0x1200 .dynsym_addr = 0x1238 .dynstr_addr = 0x12f8 .gnu.version_addr = 0x13be .gnu.version_d_addr = 0x13d0 .rela.plt_addr = 0x1408 .plt_addr = 0x10001000 .eh_frame_addr = 0x10002000 .dynamic_addr = 0x10003e60 .got_addr = 0x10003fd0 .comment_addr = 0x1000 .debug_aranges_addr = 0x1000 .debug_info_addr = 0x1000 .debug_abbrev_addr = 0x1000 .debug_line_addr = 0x1000 .debug_str_addr = 0x1000 .debug_loc_addr = 0x1000 .symtab_addr = 0x1000 .strtab_addr = 0x1000 warning: section .comment not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_aranges not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_info not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_abbrev not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_line not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_str not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .debug_loc not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .symtab not found in /home/matthew/osv/build/release.x64/libvdso.so warning: section .strtab not found in /home/matthew/osv/build/release.x64/libvdso.so manifest.find_file: path=/tools/mkfs.so, found file=tools/mkfs/mkfs.so /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so 0x10004000 add symbol table from file "/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so" at .text_addr = 0x10006250 .hash_addr = 0x10004200 .gnu.hash_addr = 0x10004360 .dynsym_addr = 0x100043c0 .dynstr_addr = 0x10004840 
.gnu.version_addr = 0x10005092 .gnu.version_r_addr = 0x100050f8 .rela.dyn_addr = 0x10005148 .rela.plt_addr = 0x10005298 .init_addr = 0x10006000 .plt_addr = 0x10006020 .plt.got_addr = 0x10006240 .fini_addr = 0x1000737c .rodata_addr = 0x10008000 .eh_frame_hdr_addr = 0x1000817c .eh_frame_addr = 0x10008210 .gcc_except_table_addr = 0x10008530 .init_array_addr = 0x10009c60 .fini_array_addr = 0x10009c70 .dynamic_addr = 0x10009c78 .got_addr = 0x10009e98 .data_addr = 0x1000a000 .bss_addr = 0x1000a010 .comment_addr = 0x10004000 .debug_aranges_addr = 0x10004000 .debug_info_addr = 0x10004000 .debug_abbrev_addr = 0x10004000 --Type for more, q to quit, c to continue without paging--c .debug_line_addr = 0x10004000 .debug_str_addr = 0x10004000 .debug_loc_addr = 0x10004000 .debug_ranges_addr = 0x10004000 .symtab_addr = 0x10004000 .strtab_addr = 0x10004000 warning: section .comment not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_aranges not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_info not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_abbrev not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_line not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_str not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_loc not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .debug_ranges not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .symtab not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so warning: section .strtab not found in /home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so manifest.find_file: path=/libzfs.so, found file=libzfs.so /home/matthew/osv/build/release.x64/libzfs.so 0x1000b000 add symbol table from file 
"/home/matthew/osv/build/release.x64/libzfs.so" at .text_addr = 0x100178f0 .hash_addr = 0x1000b200 .gnu.hash_addr = 0x1000c318 .dynsym_addr = 0x1000cd78 .dynstr_addr = 0x10010300 .gnu.version_addr = 0x100124e2 .gnu.version_r_addr = 0x10012958 .rela.dyn_addr = 0x100129b8 .rela.plt_addr = 0x100134b0 .init_addr = 0x10016000 .plt_addr = 0x10016020 .plt.got_addr = 0x100178e0 .fini_addr = 0x100340a0 .rodata_addr = 0x10035000 .eh_frame_hdr_addr = 0x10039768 .eh_frame_addr = 0x1003a470 .init_array_addr = 0x10040010 .fini_array_addr = 0x10040018 .data.rel.ro_addr = 0x10040020 .dynamic_addr = 0x100400e0 .got_addr = 0x100402f0 .data_addr =
Re: [osv-dev] Re: Pip packages/using Nix
It would also be nice to understand whether we are crashing on the 1st arch_relocate_jump_slot() for libzfs.so, or whether it is a specific JUMP_SLOT that causes this crash.

On Tuesday, December 8, 2020 at 10:39:06 AM UTC-5 Waldek Kozaczuk wrote:

> After you connect with gdb, can you run 'osv mmap' and send us the output? Make sure you run 'osv syms' before it and dump the backtrace after. Please see https://github.com/cloudius-systems/osv/wiki/Debugging-OSv for any details.
>
> BTW, can you build and run an OSv ZFS image on the host without Nix? As I understand it, Nix is really just a layer on top of any Linux distribution, no? I am afraid I still do not understand what exactly Nix is, I guess.
>
> On Monday, December 7, 2020 at 2:58:40 PM UTC-5 Matthew Kenigsberg wrote:
>
>> (gdb) frame 18
>> #18 0x4039c95a in elf::object::arch_relocate_jump_slot (this=this@entry=0xa110fa00, sym=..., addr=addr@entry=0x10040ca8, addend=addend@entry=0) at arch/x64/arch-elf.cc:172
>> 172 *static_cast(addr) = sym.relocated_addr();
>> (gdb) print _pathname
>> $14 = {static npos = 18446744073709551615, _M_dataplus = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {_M_local_buf = "/libzfs.so\000\000\000\000\000", _M_allocated_capacity = 3347131623889529903}}
>>
>> I have also been wondering if Nix using nonstandard paths is causing problems, like for libc:
>>
>> [nix-shell:~/osv/build/release]$ ldd libzfs.so
>>     linux-vdso.so.1 (0x7ffcedbb9000)
>>     libuutil.so => not found
>>     libc.so.6 => /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 (0x7f7594f38000)
>>     /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2 (0x7f7595131000)
>>
>> On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:
>>
>>> It might be easier to simply print the '_pathname' value if you switch to the right frame in gdb.
It would be nice to confirm that the problem we >>> have is with zpool.so and that might lead to understanding why this crash >>> happens. Maybe the is something wrong with building zpool.so. >>> >>> BTW based on this fragment of the stacktrace: >>> >>> #6 0x4035cb07 in elf::program::>> elf::program::modules_list&)>::operator() ( >>> __closure=, __closure=, >>> ml=...) at core/elf.cc:1620 >>> #7 elf::program::with_modules>> const*):: > >>> (f=..., this=0xa0097e70) at include/osv/elf.hh:702 >>> #8 elf::program::lookup_addr (this=0xa0097e70, >>> addr=addr@entry=0x100254ce) at core/elf.cc:1617 >>> #9 0x404357cc in osv::lookup_name_demangled >>> (addr=addr@entry=0x100254ce, >>> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) >>> at core/demangle.cc:47 >>> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85 >>> >>> It seems we have a bug (or need of improvement) in print_backtrace() to >>> make it NOT try to demangle names like "???+19630095" which causes >>> follow-up fault. >>> >>> At the same time, it is strange that we crash at line 983 which seems to >>> indicate something goes wrong when processing zpool.so. >>> >>> 981 if (dynamic_exists(DT_HASH)) { >>> >>> 982 auto hashtab = dynamic_ptr(DT_HASH); >>> >>> *983 return hashtab[1];* >>> >>> 984 } >>> >>> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote: >>> Can you run the ROFS image you built? Also as I understand it NIX is a package manager but what Linux distribution are you using? As far as ZFS goes could you enable ELF debugging - change this line: conf-debug_elf=0 To conf-debug_elf=1 In conf/base.mk, delete core/elf.o and force rebuild the kernel. I think you may also need to change the script upload_manifest.py to peeped ‘—verbose’ to the command line with cpiod.so It should show more info about elf loading. It may still be necessary to add extra printouts to capture which exact elf it is crashing on in arch_relocate_jump(). 
In worst case I would need a copy of your loader-stripped.elf and possibly all the other files like cpiod.so, zfs.so that go into the bootfs part of the image. Regards, Waldek On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg wrote: > After forcing it to use the right path for libz.so.1, it's working > with rofs, but still having the same issue when using zfs, even after I > correct the path for libz. > > On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg > wrote: > >> gcc version 9.3.0 (GCC) >> QEMU emulator version 5.1.0 >> >> Running with fs=rofs I get the error: >> Traceback (most recent call last): >> File "/home/matthew/osv/scripts/gen-rofs-img.py",
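Regarding the 'return hashtab[1]' crash site quoted in this message: in the SysV DT_HASH layout, the first two words of the table are nbucket and nchain, and nchain equals the number of entries in the dynamic symbol table, which is what that code reads. A toy illustration (all values made up):

```python
# SysV hash table layout: [nbucket, nchain, bucket[0..nbucket-1], chain[0..nchain-1]]
nbucket, nchain = 3, 5          # made-up sizes: 3 buckets, 5 dynamic symbols
hashtab = [nbucket, nchain] + [0] * nbucket + [0] * nchain

print(hashtab[1])               # the symbol count the crashing code tries to read
```

If the DT_HASH pointer is bad or the table is missing (as a corrupt or unusual zpool.so might cause), dereferencing hashtab[1] is exactly where a fault would surface.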
Re: [osv-dev] Re: Pip packages/using Nix
After you connect with gdb, can you run 'osv mmap' and send us the output? Make sure you run 'osv syms' before it and dump the backtrace after. Please see https://github.com/cloudius-systems/osv/wiki/Debugging-OSv for any details.

BTW, can you build and run an OSv ZFS image on the host without Nix? As I understand it, Nix is really just a layer on top of any Linux distribution, no? I am afraid I still do not understand what exactly Nix is, I guess.

On Monday, December 7, 2020 at 2:58:40 PM UTC-5 Matthew Kenigsberg wrote:

> (gdb) frame 18
> #18 0x4039c95a in elf::object::arch_relocate_jump_slot (this=this@entry=0xa110fa00, sym=..., addr=addr@entry=0x10040ca8, addend=addend@entry=0) at arch/x64/arch-elf.cc:172
> 172 *static_cast(addr) = sym.relocated_addr();
> (gdb) print _pathname
> $14 = {static npos = 18446744073709551615, _M_dataplus = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {_M_local_buf = "/libzfs.so\000\000\000\000\000", _M_allocated_capacity = 3347131623889529903}}
>
> I have also been wondering if Nix using nonstandard paths is causing problems, like for libc:
>
> [nix-shell:~/osv/build/release]$ ldd libzfs.so
>     linux-vdso.so.1 (0x7ffcedbb9000)
>     libuutil.so => not found
>     libc.so.6 => /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 (0x7f7594f38000)
>     /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2 (0x7f7595131000)
>
> On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:
>
>> It might be easier to simply print the '_pathname' value if you switch to the right frame in gdb. It would be nice to confirm that the problem we have is with zpool.so, and that might lead to understanding why this crash happens. Maybe there is something wrong with building zpool.so.
>> >> BTW based on this fragment of the stacktrace: >> >> #6 0x4035cb07 in elf::program::> elf::program::modules_list&)>::operator() ( >> __closure=, __closure=, ml=...) >> at core/elf.cc:1620 >> #7 elf::program::with_modules> const*):: > >> (f=..., this=0xa0097e70) at include/osv/elf.hh:702 >> #8 elf::program::lookup_addr (this=0xa0097e70, >> addr=addr@entry=0x100254ce) at core/elf.cc:1617 >> #9 0x404357cc in osv::lookup_name_demangled >> (addr=addr@entry=0x100254ce, >> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) >> at core/demangle.cc:47 >> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85 >> >> It seems we have a bug (or need of improvement) in print_backtrace() to >> make it NOT try to demangle names like "???+19630095" which causes >> follow-up fault. >> >> At the same time, it is strange that we crash at line 983 which seems to >> indicate something goes wrong when processing zpool.so. >> >> 981 if (dynamic_exists(DT_HASH)) { >> >> 982 auto hashtab = dynamic_ptr(DT_HASH); >> >> *983 return hashtab[1];* >> >> 984 } >> >> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote: >> >>> Can you run the ROFS image you built? Also as I understand it NIX is a >>> package manager but what Linux distribution are you using? >>> >>> As far as ZFS goes could you enable ELF debugging - change this line: >>> >>> conf-debug_elf=0 >>> >>> To >>> >>> conf-debug_elf=1 >>> >>> In conf/base.mk, delete core/elf.o and force rebuild the kernel. I >>> think you may also need to change the script upload_manifest.py to peeped >>> ‘—verbose’ to the command line with cpiod.so >>> >>> It should show more info about elf loading. It may still be necessary to >>> add extra printouts to capture which exact elf it is crashing on in >>> arch_relocate_jump(). >>> >>> In worst case I would need a copy of your loader-stripped.elf and >>> possibly all the other files like cpiod.so, zfs.so that go into the bootfs >>> part of the image. 
>>> >>> Regards, >>> Waldek >>> >>> >>> On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg >>> wrote: >>> After forcing it to use the right path for libz.so.1, it's working with rofs, but still having the same issue when using zfs, even after I correct the path for libz. On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg wrote: > gcc version 9.3.0 (GCC) > QEMU emulator version 5.1.0 > > Running with fs=rofs I get the error: > Traceback (most recent call last): > File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in > > main() > File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main > gen_image(outfile, manifest) > File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in > gen_image > system_structure_block, bytes_written = write_fs(fp, manifest) > File "/home/matthew/osv/scripts/gen-rofs-img.py",
Re: [osv-dev] Re: Pip packages/using Nix
(gdb) frame 18
#18 0x4039c95a in elf::object::arch_relocate_jump_slot (this=this@entry=0xa110fa00, sym=..., addr=addr@entry=0x10040ca8, addend=addend@entry=0) at arch/x64/arch-elf.cc:172
172 *static_cast(addr) = sym.relocated_addr();
(gdb) print _pathname
$14 = {static npos = 18446744073709551615, _M_dataplus = {> = {<__gnu_cxx::new_allocator> = {}, }, _M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {_M_local_buf = "/libzfs.so\000\000\000\000\000", _M_allocated_capacity = 3347131623889529903}}

I have also been wondering if Nix using nonstandard paths is causing problems, like for libc:

[nix-shell:~/osv/build/release]$ ldd libzfs.so
    linux-vdso.so.1 (0x7ffcedbb9000)
    libuutil.so => not found
    libc.so.6 => /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 (0x7f7594f38000)
    /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2 (0x7f7595131000)

On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:

> It might be easier to simply print the '_pathname' value if you switch to the right frame in gdb. It would be nice to confirm that the problem we have is with zpool.so, and that might lead to understanding why this crash happens. Maybe there is something wrong with building zpool.so.
>
> BTW, based on this fragment of the stacktrace:
>
> #6 0x4035cb07 in elf::program:: elf::program::modules_list&)>::operator() (__closure=, __closure=, ml=...) at core/elf.cc:1620
> #7 elf::program::with_modules const*):: > (f=..., this=0xa0097e70) at include/osv/elf.hh:702
> #8 elf::program::lookup_addr (this=0xa0097e70, addr=addr@entry=0x100254ce) at core/elf.cc:1617
> #9 0x404357cc in osv::lookup_name_demangled (addr=addr@entry=0x100254ce, buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) at core/demangle.cc:47
> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
>
> It seems we have a bug (or need of improvement) in print_backtrace() to make it NOT try to demangle names like "???+19630095", which causes the follow-up fault.
>
> At the same time, it is strange that we crash at line 983, which seems to indicate something goes wrong when processing zpool.so.
>
> 981 if (dynamic_exists(DT_HASH)) {
> 982     auto hashtab = dynamic_ptr(DT_HASH);
> *983     return hashtab[1];*
> 984 }
>
> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:
>
>> Can you run the ROFS image you built? Also, as I understand it, Nix is a package manager, but what Linux distribution are you using?
>>
>> As far as ZFS goes, could you enable ELF debugging - change this line:
>>
>> conf-debug_elf=0
>>
>> to
>>
>> conf-debug_elf=1
>>
>> in conf/base.mk, delete core/elf.o, and force rebuild the kernel. I think you may also need to change the script upload_manifest.py to pass '--verbose' on the command line with cpiod.so.
>>
>> It should show more info about ELF loading. It may still be necessary to add extra printouts to capture which exact ELF it is crashing on in arch_relocate_jump().
>>
>> In the worst case I would need a copy of your loader-stripped.elf and possibly all the other files like cpiod.so, zfs.so that go into the bootfs part of the image.
>> >> Regards, >> Waldek >> >> >> On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg >> wrote: >> >>> After forcing it to use the right path for libz.so.1, it's working with >>> rofs, but still having the same issue when using zfs, even after I correct >>> the path for libz. >>> >>> On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg >>> wrote: >>> gcc version 9.3.0 (GCC) QEMU emulator version 5.1.0 Running with fs=rofs I get the error: Traceback (most recent call last): File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in main() File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main gen_image(outfile, manifest) File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in gen_image system_structure_block, bytes_written = write_fs(fp, manifest) File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs count, directory_entries_index = write_dir(fp, manifest.get(''), '', manifest) File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir count, directory_entries_index = write_dir(fp, val, dirpath + '/' + entry, manifest) File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir count, directory_entries_index = write_dir(fp, val, dirpath + '/' + entry, manifest) File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in write_dir inode.count = write_file(fp, val) File
Re: [osv-dev] Re: Pip packages/using Nix
It might be easier to simply print '_pathname' value if you switch to the right frame in gdb. It would be nice to confirm that the problem we have is with zpool.so, and that might lead to understanding why this crash happens. Maybe there is something wrong with building zpool.so.

BTW, based on this fragment of the stacktrace:

#6  0x4035cb07 in elf::program::lookup_addr(const void*)::<lambda(const elf::program::modules_list&)>::operator() (__closure=<optimized out>, ml=...) at core/elf.cc:1620
#7  elf::program::with_modules<elf::program::lookup_addr(const void*)::<lambda(const elf::program::modules_list&)> > (f=..., this=0xa0097e70) at include/osv/elf.hh:702
#8  elf::program::lookup_addr (this=0xa0097e70, addr=addr@entry=0x100254ce) at core/elf.cc:1617
#9  0x404357cc in osv::lookup_name_demangled (addr=addr@entry=0x100254ce, buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) at core/demangle.cc:47
#10 0x4023c4e0 in print_backtrace () at runtime.cc:85

It seems we have a bug (or need of improvement) in print_backtrace() to make it NOT try to demangle names like "???+19630095", which causes a follow-up fault.

At the same time, it is strange that we crash at line 983, which seems to indicate something goes wrong when processing zpool.so:

981     if (dynamic_exists(DT_HASH)) {
982         auto hashtab = dynamic_ptr<Elf64_Word>(DT_HASH);
983         return hashtab[1];
984     }

On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:
> Can you run the ROFS image you built? Also as I understand it Nix is a package manager, but what Linux distribution are you using?
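The print_backtrace() improvement suggested here could be sketched roughly as follows. This is a hypothetical illustration in Python (OSv's actual code is C++ in core/demangle.cc); the function and demangler names are made up for the example:

```python
# Hypothetical sketch of the suggested guard: frames that were never
# resolved to a real symbol, reported as "???+offset", are passed through
# untouched instead of being fed to the demangler.
def demangle_frame(name, demangler):
    if name.startswith("???"):
        return name  # unresolved frame: nothing meaningful to demangle
    return demangler(name)

# Stand-in demangler for the example; a real one would choke on junk input.
def toy_demangler(mangled):
    return mangled.replace("_Z", "")

print(demangle_frame("???+19630095", toy_demangler))  # passes through unchanged
```

The point is only that the junk placeholder never reaches the demangler, which avoids the follow-up fault inside the abort path.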
Re: [osv-dev] Re: Pip packages/using Nix
Can you run the ROFS image you built? Also, as I understand it Nix is a package manager, but what Linux distribution are you using?

As far as ZFS goes, could you enable ELF debugging - change this line:

conf-debug_elf=0

to

conf-debug_elf=1

in conf/base.mk, delete core/elf.o, and force-rebuild the kernel. I think you may also need to change the script upload_manifest.py to append '--verbose' to the command line with cpiod.so.

It should show more info about ELF loading. It may still be necessary to add extra printouts to capture which exact ELF it is crashing on in arch_relocate_jump().

In the worst case I would need a copy of your loader-stripped.elf and possibly all the other files like cpiod.so, zfs.so that go into the bootfs part of the image.

Regards,
Waldek

On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg wrote:
> After forcing it to use the right path for libz.so.1, it's working with rofs, but still having the same issue when using zfs, even after I correct the path for libz.
Re: [osv-dev] Re: Pip packages/using Nix
After forcing it to use the right path for libz.so.1, it's working with rofs, but still having the same issue when using zfs, even after I correct the path for libz.

On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg wrote:
> gcc version 9.3.0 (GCC)
> QEMU emulator version 5.1.0
Re: [osv-dev] Re: Pip packages/using Nix
gcc version 9.3.0 (GCC)
QEMU emulator version 5.1.0

Running with fs=rofs I get the error:

Traceback (most recent call last):
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in <module>
    main()
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
    gen_image(outfile, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in gen_image
    system_structure_block, bytes_written = write_fs(fp, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs
    count, directory_entries_index = write_dir(fp, manifest.get(''), '', manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
    count, directory_entries_index = write_dir(fp, val, dirpath + '/' + entry, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
    count, directory_entries_index = write_dir(fp, val, dirpath + '/' + entry, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in write_dir
    inode.count = write_file(fp, val)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 164, in write_file
    with open(path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'libz.so.1'

I think that's from this line in usr.manifest?
/usr/lib/libz.so.1: libz.so.1

Don't have zlib in the manifest without fs=rofs, and I think zpool uses it?

Looking into it...

On Saturday, December 5, 2020 at 4:36:20 PM UTC-7 jwkoz...@gmail.com wrote:
> I cannot reproduce it on Ubuntu 20.20 nor on Fedora 33.
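For context on the FileNotFoundError above: each usr.manifest line maps a path inside the image to a path on the build host, and a bare host path like `libz.so.1` is opened relative to the build directory. A rough Python model of that lookup (a simplification for illustration, not the actual gen-rofs-img.py parser):

```python
import os

# Simplified model of a usr.manifest entry, "guest-path: host-path".
def parse_manifest_line(line):
    guest, _, host = line.partition(": ")
    return guest.strip(), host.strip()

guest, host = parse_manifest_line("/usr/lib/libz.so.1: libz.so.1")
# The image builder effectively does open(host, 'rb'): a relative host path
# like "libz.so.1" is only found if the file sits in the build directory,
# which is why a Nix system (whose libraries live under /nix/store/...)
# hits FileNotFoundError unless the entry carries the resolved path.
print(guest, host, os.path.isabs(host))
```

This matches the fix described above ("forcing it to use the right path for libz.so.1"): rewriting the host side of the entry to the real location makes the build proceed.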
Re: [osv-dev] Re: Pip packages/using Nix
I cannot reproduce it on Ubuntu 20.20 nor on Fedora 33. Here is the code fragment where it happens:

169 bool object::arch_relocate_jump_slot(symbol_module& sym, void *addr, Elf64_Sxword addend)
170 {
171     if (sym.symbol) {
172         *static_cast<void**>(addr) = sym.relocated_addr();
173         return true;
174     } else {
175         return false;
176     }
177 }

It looks like writing at the addr 0x10040ca8 in line 172 caused the fault. Why?

And then the 2nd page fault in the gdb backtrace as the 1st one was being handled (not sure if that is a bug or just a state of loading of a program):

981     if (dynamic_exists(DT_HASH)) {
982         auto hashtab = dynamic_ptr<Elf64_Word>(DT_HASH);
983         return hashtab[1];
984     }

Is something wrong with the ELF files cpiod.so, mkfs.so or zfs.so?

Can you try to do the same with ROFS?

fs=rofs

On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg wrote:
> Struggling to get scripts/build to run on NixOS because I'm getting a page fault. NixOS does keep shared libraries in nonstandard locations, not sure if that's breaking something.
Re: [osv-dev] Re: Pip packages/using Nix
Which version of GCC and QEMU/KVM are you using?

On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg wrote:
> Struggling to get scripts/build to run on NixOS because I'm getting a page fault. NixOS does keep shared libraries in nonstandard locations, not sure if that's breaking something.
Re: [osv-dev] Re: Pip packages/using Nix
Struggling to get scripts/build to run on NixOS because I'm getting a page fault. NixOS does keep shared libraries in nonstandard locations, not sure if that's breaking something. More details below, but any ideas?

As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
run_cmd("/zpool.so", zpool_args);

The error from scripts/build:

OSv v0.55.0-145-g97f17a7a
eth0: 192.168.122.15
Booted up in 154.38 ms
Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set compression=off osv
Running mkfs...
page fault outside application, addr: 0x10040ca8
[registers]
RIP: 0x4039c25a
RFL: 0x00010202  CS: 0x0008  SS: 0x0010
RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 0x1006abb0  RDX: 0x0002
RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 0x201f6f30  R8: 0xa0a68460
R9: 0xa0f18da0  R10: 0x  R11: 0x409dd380  R12: 0xa0f18c00
R13: 0xa0f18da0  R14: 0x  R15: 0x409dd380  RSP: 0x201f6f20
Aborted

[backtrace]
0x403458d3
0x403477ce
0x40398ba2
0x40397a16
0x40360a13
0x40360c38
0x4039764f
0xa12b880f

Trying to get a backtrace after connecting with gdb:
(gdb) bt
#0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: %d)\n") at runtime.cc:105
#1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 "ef->rflags & processor::rflags_if", file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, func=func@entry=0x40672d1a "page_fault") at runtime.cc:139
#2  0x40398c05 in page_fault (ef=0x80015048) at arch/x64/arch-cpu.hh:107
#3  <signal handler called>
#4  0x4035c879 in elf::object::symtab_len (this=0xa0f18c00) at core/elf.cc:983
#5  0x4035c938 in elf::object::lookup_addr (this=0xa0f18c00, addr=addr@entry=0x100254ce) at core/elf.cc:1015
#6  0x4035cb07 in elf::program::lookup_addr(const void*)::<lambda(const elf::program::modules_list&)>::operator() (__closure=<optimized out>, ml=...) at core/elf.cc:1620
#7  elf::program::with_modules<elf::program::lookup_addr(const void*)::<lambda(const elf::program::modules_list&)> > (f=..., this=0xa0097e70) at include/osv/elf.hh:702
#8  elf::program::lookup_addr (this=0xa0097e70, addr=addr@entry=0x100254ce) at core/elf.cc:1617
#9  0x404357cc in osv::lookup_name_demangled (addr=addr@entry=0x100254ce, buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) at core/demangle.cc:47
#10 0x4023c4e0 in print_backtrace () at runtime.cc:85
#11 0x4023c6b4 in abort (fmt=fmt@entry=0x40644a9f "Aborted\n") at runtime.cc:121
#12 0x40202989 in abort () at runtime.cc:98
#13 0x403458d4 in mmu::vm_sigsegv (ef=0x81215068, addr=<optimized out>) at core/mmu.cc:1314
#14 mmu::vm_sigsegv (addr=<optimized out>, ef=0x81215068) at core/mmu.cc:1308
#15 0x403477cf in mmu::vm_fault (addr=addr@entry=17592186309800, ef=ef@entry=0x81215068) at core/mmu.cc:1328
#16 0x40398ba3 in page_fault (ef=0x81215068) at arch/x64/mmu.cc:42
#17 <signal handler called>
#18 0x4039c25a in elf::object::arch_relocate_jump_slot (this=this@entry=0xa0f18c00, sym=..., addr=addr@entry=0x10040ca8, addend=addend@entry=0) at arch/x64/arch-elf.cc:172
#19 0x40360a14 in elf::object::resolve_pltgot (this=0xa0f18c00, index=<optimized out>) at core/elf.cc:843
#20 0x40360c39 in elf_resolve_pltgot (index=308, obj=0xa0f18c00) at core/elf.cc:1860
#21 0x40397650 in __elf_resolve_pltgot () at arch/x64/elf-dl.S:47
#22 0x100254cf in ?? ()
#23 0xa12b8800 in ?? ()
#24 0x201f74a0 in ?? ()
#25 0x100254cf in ?? ()
#26 0x201f7480 in ?? ()
#27 0x403f241c in calloc (nmemb=<optimized out>, size=<optimized out>) at core/mempool.cc:1811
#28 0x90a98000 in ?? ()
#29 0x in ?? ()

On Saturday, November 28, 2020 at 1:39:46 PM UTC-7 Matthew Kenigsberg wrote:
> Hi,
> I'll send something, might take a bit before I find time to work on it though.
Re: [osv-dev] Re: Pip packages/using Nix
Hi, I'll send something, might take a bit before I find time to work on it though. Thanks, Matthew On Saturday, November 28, 2020 at 1:11:11 PM UTC-7 Roman Shaposhnik wrote: > On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk > wrote: > > > > Hey, > > > > Send a patch with a new app that could demonstrate it, please, if you > can. I would like to see it. Sounds like a nice improvement. > > FWIW: I'd love to see it too -- been meaning to play with Nix and this > gives me a perfect excuse ;-) > > Thanks, > Roman. >
Re: [osv-dev] Re: Pip packages/using Nix
On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk wrote: > > Hey, > > Send a patch with a new app that could demonstrate it, please, if you can. I > would like to see it. Sounds like a nice improvement. FWIW: I'd love to see it too -- been meaning to play with Nix and this gives me a perfect excuse ;-) Thanks, Roman.
[osv-dev] Re: Pip packages/using Nix
Hey, Send a patch with a new app that could demonstrate it, please, if you can. I would like to see it. Sounds like a nice improvement. Waldek On Monday, November 23, 2020 at 7:36:49 PM UTC-5 Matthew Kenigsberg wrote: > Hi, > > That definitely helped, thanks for the response! > > Haven't had time to look at this in depth, so I feel like I'm not > qualified to know if my own suggestion is actually helpful. But looking at > osv-apps/python-from-host/Makefile, it seems like nix could do things a bit > more cleanly. Nix does manage both the shared libraries and any python > dependencies, so rather than having two runs of manifest_from_host.sh and > an rsync, I can just tell nix I need python and some pip packages. Running > a single command will then give me every path I need. Tested this with > python3.8 and Flask and it worked. Personally I find that workflow a little > simpler than having to figure out what directories pip packages are > installed in. > > Another advantage would be that nixpkgs has 60,000 packages, and although > I'm sure there's plenty of compatibility issues, I think at times it would > be a lot easier to take advantage of the work that's already been done to > create all those packages, like in this case Flask works without > modification. Haven't used mpm much, so maybe it does things nix can't, but > I would guess nix could solve a lot of the same problems. > > Sorry if I'm making suggestions about something I don't understand, just > thought I'd bounce my ideas off someone who does understand them. Happy to > explain anything in more depth or demo what I mean. > > Thanks, > Matthew > On Saturday, November 7, 2020 at 10:15:28 PM UTC-7 jwkoz...@gmail.com > wrote: > >> Hi Matthew, >> >> I am not familiar with nix and how exactly it would fit. If you look at >> the osv-apps repo there are many examples of python 2/3 apps. 
All of those >> are driven by module.py and optional makefiles to do the job of collecting >> relevant files into the final OSv image; scripts/build and scripts/module.py >> orchestrate it all. Alternatively, there is capstan with its package* >> command and *.mpm archives. I am not sure where and how nix would fit into >> this. >> >> Now the purpose of manifest_from_host.sh is quite simple - given a Linux >> shared library/-ies or executable or a directory with those, find all >> *dependent* shared libraries based on information in the DT_NEEDED ELF header. >> As you can see it is not specific to Python. On the other hand >> manifest_from_host.sh is not intended to and cannot find all the other >> dependencies (*.py, *.pyc, etc. files) the full Python runtime needs. My sense >> is that you would still need to run manifest_from_host.sh against files built >> by nix but I might be wrong. >> >> Another alternative to building OSv images could be using Docker images >> and unpacking them to create a corresponding OSv image. For an example look >> at >> https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker >> >> which uses the undocker tool. Another alternative would then be to use the Python >> Docker image in a similar way. And possibly combine it with capstan. >> >> I hope it helps a bit, >> Waldek >> >> >> >> On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg >> wrote: >> >>> >>> Hi, >>> >>> Is there a recommended way to add pip packages to osv images? >>> >>> Was trying to figure that out and I also have a suggestion: I started >>> using nix, which is a package manager, and it seems like it could be a >>> really good tool for the job. It keeps track of every dependency for a >>> piece of software, and copying all the files to run an image with python38 >>> and some pip packages only takes one command. I think using nix could >>> also make manifest_from_host.sh unnecessary (?) >>> >>> Anyways, is there an easier way to be using pip packages?
>>> >>> Thanks, >>> Matthew >>> >>
[osv-dev] Re: Pip packages/using Nix
Hi, That definitely helped, thanks for the response! Haven't had time to look at this in depth, so I feel like I'm not qualified to know if my own suggestion is actually helpful. But looking at osv-apps/python-from-host/Makefile, it seems like nix could do things a bit more cleanly. Nix does manage both the shared libraries and any python dependencies, so rather than having two runs of manifest_from_host.sh and an rsync, I can just tell nix I need python and some pip packages. Running a single command will then give me every path I need. Tested this with python3.8 and Flask and it worked. Personally I find that workflow a little simpler than having to figure out what directories pip packages are installed in. Another advantage would be that nixpkgs has 60,000 packages, and although I'm sure there's plenty of compatibility issues, I think at times it would be a lot easier to take advantage of the work that's already been done to create all those packages, like in this case Flask works without modification. Haven't used mpm much, so maybe it does things nix can't, but I would guess nix could solve a lot of the same problems. Sorry if I'm making suggestions about something I don't understand, just thought I'd bounce my ideas off someone who does understand them. Happy to explain anything in more depth or demo what I mean. Thanks, Matthew On Saturday, November 7, 2020 at 10:15:28 PM UTC-7 jwkoz...@gmail.com wrote: > Hi Matthew, > > I am not familiar with nix and how exactly it would fit. If you look at > the osv-apps repo there are many examples of python 2/3 apps. All of those > are driven by module.py and optional makefiles to do a job of collecting > relevant files to the final OSv image as scripts/build, scripts/module.py > orchestrates it all. Alternatively, there is capstan with its package* > command and *.mpm archives. I am not sure where and how nix would fit into > this. 
> > Now the purpose of manifest_from_host.sh is quite simple - given a Linux > shared library/-ies or executable or a directory with those, find all > *dependent* shared libraries based on information in the DT_NEEDED ELF header. > As you can see it is not specific to Python. On the other hand > manifest_from_host.sh is not intended to and cannot find all the other > dependencies (*.py, *.pyc, etc. files) the full Python runtime needs. My sense > is that you would still need to run manifest_from_host.sh against files built > by nix but I might be wrong. > > Another alternative to building OSv images could be using Docker images > and unpacking them to create a corresponding OSv image. For an example look > at > https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker > > which uses the undocker tool. Another alternative would then be to use the Python > Docker image in a similar way. And possibly combine it with capstan. > > I hope it helps a bit, > Waldek > > > > On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg wrote: > >> >> Hi, >> >> Is there a recommended way to add pip packages to osv images? >> >> Was trying to figure that out and I also have a suggestion: I started >> using nix, which is a package manager, and it seems like it could be a >> really good tool for the job. It keeps track of every dependency for a >> piece of software, and copying all the files to run an image with python38 >> and some pip packages only takes one command. I think using nix could >> also make manifest_from_host.sh unnecessary (?) >> >> Anyways, is there an easier way to be using pip packages? >> >> Thanks, >> Matthew >>
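For context, the "single command" workflow described in this message looks roughly like the following. This is a hedged sketch, not something from the thread: it assumes nix is installed, and the python38 / ps.flask attribute names are as found in nixpkgs around that era.

```shell
# Build a Python 3.8 environment that has Flask importable; ./result
# ends up as a symlink into the nix store.
nix-build -E 'with import <nixpkgs> {}; python38.withPackages (ps: [ ps.flask ])' -o result

# List the full runtime closure -- every store path (libc, libpython,
# Flask and its dependencies, ...) the environment needs. This set of
# paths is what an OSv image manifest would have to include.
nix-store -qR ./result
```

That one closure query replaces both the manifest_from_host.sh runs and the rsync of pip's site-packages directory.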
[osv-dev] Re: Pip packages/using Nix
Hi Matthew, I am not familiar with nix and how exactly it would fit. If you look at the osv-apps repo there are many examples of python 2/3 apps. All of those are driven by module.py and optional makefiles to do the job of collecting relevant files into the final OSv image; scripts/build and scripts/module.py orchestrate it all. Alternatively, there is capstan with its package* command and *.mpm archives. I am not sure where and how nix would fit into this. Now the purpose of manifest_from_host.sh is quite simple - given a Linux shared library/-ies or executable or a directory with those, find all *dependent* shared libraries based on information in the DT_NEEDED ELF header. As you can see it is not specific to Python. On the other hand manifest_from_host.sh is not intended to and cannot find all the other dependencies (*.py, *.pyc, etc. files) the full Python runtime needs. My sense is that you would still need to run manifest_from_host.sh against files built by nix but I might be wrong. Another alternative to building OSv images could be using Docker images and unpacking them to create a corresponding OSv image. For an example look at https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker which uses the undocker tool. Another alternative would then be to use the Python Docker image in a similar way. And possibly combine it with capstan. I hope it helps a bit, Waldek On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg wrote: > > Hi, > > Is there a recommended way to add pip packages to osv images? > > Was trying to figure that out and I also have a suggestion: I started > using nix, which is a package manager, and it seems like it could be a > really good tool for the job. It keeps track of every dependency for a > piece of software, and copying all the files to run an image with python38 > and some pip packages only takes one command. I think using nix could > also make manifest_from_host.sh unnecessary (?)
> > Anyways, is there an easier way to be using pip packages? > > Thanks, > Matthew >
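To make the DT_NEEDED point above concrete: at its core, finding a binary's dependent shared libraries means reading the NEEDED entries from its dynamic section. A minimal sketch (assuming binutils' readelf; the real manifest_from_host.sh also resolves these names to actual file paths and recurses through their dependencies):

```shell
# Print the DT_NEEDED entries -- the direct shared-library dependencies --
# recorded in an ELF file's dynamic section. /bin/sh here is just a
# stand-in for whatever library or executable you want to package.
needed() {
    readelf -d "$1" | awk -F'[][]' '/NEEDED/ { print $2 }'
}
needed /bin/sh
```

On a glibc system this prints at least libc.so.6; walking the same list transitively over each discovered library is what produces the manifest.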