Re: [osv-dev] Re: Pip packages/using Nix

2020-12-29 Thread Matthew Kenigsberg
I have some current work here: 
https://github.com/mkenigs/osv-apps/commits/nginx-nix. A few things are 
pretty hacky, but I think that's just because of trying to use 
Make, module.py, and Nix all at the same time.

The image that this creates is significantly larger than the current OSv nginx 
one, but I think that could be changed by disabling some of the configure 
flags Nix uses by default.

Building on Nix also requires some changes in OSv: 
https://github.com/mkenigs/osv/commits/nix-fixes
I'm still cleaning those up.

Would love any feedback! Before I submit a patch I need to fix some of the 
hacks I made to get it working locally.
On Monday, December 28, 2020 at 8:18:11 PM UTC-7 Roman Shaposhnik wrote:

> So is this getting published some place? ;-)
>
> Thanks,
> Roman.
>
> On Mon, Dec 14, 2020 at 9:00 AM Matthew Kenigsberg  
> wrote:
>
>> Thanks for putting so much effort into figuring that out! Really 
>> appreciate it, and glad to get it working!
>>
>> On Wednesday, December 9, 2020 at 3:45:30 PM UTC-7 Matthew Kenigsberg 
>> wrote:
>>
>>> That worked!!! Had to set -z relro -z lazy
>>>

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-28 Thread Roman Shaposhnik
So is this getting published some place? ;-)

Thanks,
Roman.

On Mon, Dec 14, 2020 at 9:00 AM Matthew Kenigsberg <
matthewkenigsb...@gmail.com> wrote:

> Thanks for putting so much effort into figuring that out! Really
> appreciate it, and glad to get it working!
>
> On Wednesday, December 9, 2020 at 3:45:30 PM UTC-7 Matthew Kenigsberg
> wrote:
>
>> That worked!!! Had to set -z relro -z lazy
>>

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-14 Thread Matthew Kenigsberg
Thanks for putting so much effort into figuring that out! Really appreciate 
it, and glad to get it working!

On Wednesday, December 9, 2020 at 3:45:30 PM UTC-7 Matthew Kenigsberg wrote:

> That worked!!! Had to set -z relro -z lazy
>

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-09 Thread Matthew Kenigsberg
That worked!!! Had to set -z relro -z lazy

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-09 Thread Waldek Kozaczuk
Hi,

Thanks for uploading the files. They have definitely helped me figure out the 
issue.

In essence, it looks like all the .so files, such as libzfs.so, are built with 
Full RELRO on NixOS (run 'readelf -a libzfs.so | grep BIND_NOW' to check). 
Relatedly, some Linux distributions are set up to make gcc effectively use 
'-z now -z relro' when linking libraries. On many others, like Ubuntu or 
Fedora, they are built with Partial RELRO ('-z relro') by default.

As the libraries are loaded by the OSv dynamic linker, all jump slot 
relocations are resolved eagerly (even if they are never used by the code 
later) if those libraries are marked as 'Full RELRO' (bind_now = true). In 
the non-'Full RELRO' case, the jump slot relocations are resolved lazily the 
first time they are accessed and are handled by 'void* 
object::resolve_pltgot(unsigned index)', which writes the resolved function 
symbol address into the GOT.

The problem with Full RELRO is that if we cannot find a symbol, because for 
example it is not implemented by OSv or is not visible at *this point of 
linking*, we simply ignore it, hoping that it will never be used or will be 
resolved later. If it is used later, resolve_pltgot() is called, and if the 
symbol is found (because the library containing the symbol has been loaded 
since), we crash because we are trying to write to a part of memory - the 
GOT - that has since been made read-only.
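
To make that failure mode concrete, here is a minimal standalone C++ sketch 
(my own illustration, not the OSv code) of the same sequence: a jump slot is 
written while its page is still writable, the page is then sealed read-only 
as Full RELRO requires, and a later lazy fix-up of the same page faults:

#include <sys/mman.h>
#include <cstdint>
#include <cstdio>

int main() {
    // Stand-in for the page holding the GOT jump slots of a loaded library.
    auto* got = static_cast<std::uintptr_t*>(mmap(nullptr, 4096,
        PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0));

    got[0] = 0x10040ca0;            // eager relocation while still writable: fine
    mprotect(got, 4096, PROT_READ); // Full RELRO: the range is sealed read-only

    std::printf("emulating a late resolve_pltgot() write into a sealed GOT...\n");
    got[1] = 0x10040ca8;            // late lazy fix-up: faults, like the OSv crash
    return 0;
}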

Why does this happen exactly?

So here is the symbol bsd_getmntany not being found at the address you were 
originally getting the fault at (after adding extra debug statements):
ELF [tid:28, mod:3, /libzfs.so]: arch_relocate_jump_slot, 
addr:0x10040ca0
/libzfs.so: ignoring missing symbol bsd_getmntany // would have been 0x10040ca8, 
which matches what the page fault reports
ELF [tid:28, mod:3, /libzfs.so]: arch_relocate_jump_slot, 
addr:0x10040cb0

Please note that both mkfs.so and zfs.so depend on libzfs.so. Per the command 
line, OSv loads and executes the apps sequentially. So mkfs.so goes first, and 
the dynamic linker will load libzfs.so, relocate and eagerly resolve all its 
symbols, and fix the permissions on libzfs.so. One of the symbols in 
libzfs.so is bsd_getmntany, which is actually part of zfs.so and is therefore 
left unresolved (see the 'ignoring missing symbol' warning above).

After mkfs.so, OSv gets to zfs.so, processes and executes it, and some of the 
code in zfs.so tries to invoke bsd_getmntany, which gets dynamically resolved 
and found by resolve_pltgot(), BUT when it tries to write to the GOT it gets a 
page fault.

Having said all that, I am not sure what exactly the problem is:
A) An invalid or abnormal dependency between libzfs.so, mkfs.so, and zfs.so 
which effectively prevents them from functioning properly if built with Full 
RELRO (would that even work on Linux?), or
B) A true limitation of the OSv linker, which should handle such a scenario 
correctly.

For now, the easiest solution (and perhaps the right one if A is true) is to 
simply force building those libraries with Partial RELRO, as in this patch:

diff --git a/Makefile b/Makefile
index d1597263..d200dde8 100644
--- a/Makefile
+++ b/Makefile
@@ -345,7 +345,7 @@ $(out)/%.o: %.s
$(makedir)
$(call quiet, $(CXX) $(CXXFLAGS) $(ASFLAGS) -c -o $@ $<, AS $*.s)
 
-%.so: EXTRA_FLAGS = -fPIC -shared
+%.so: EXTRA_FLAGS = -fPIC -shared -z relro
 %.so: %.o
$(makedir)
$(q-build-so)

Please let me know if it works,
Waldek

PS. Also verify that running 'readelf -a libzfs.so | grep BIND_NOW' does 
not show anything anymore.
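
If it helps, here is a small standalone C++ checker (my own sketch, not part 
of OSv or its build) that looks for the same thing that readelf/grep pipeline 
reports: whether the dynamic section requests eager binding via DT_BIND_NOW, 
DF_BIND_NOW in DT_FLAGS, or DF_1_NOW in DT_FLAGS_1. Error handling is omitted 
for brevity.

#include <elf.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <cstdio>

int main(int argc, char** argv) {
    if (argc != 2) { std::fprintf(stderr, "usage: %s lib.so\n", argv[0]); return 2; }
    int fd = open(argv[1], O_RDONLY);
    struct stat st;
    fstat(fd, &st);
    auto* base = static_cast<unsigned char*>(
        mmap(nullptr, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0));
    auto* ehdr = reinterpret_cast<Elf64_Ehdr*>(base);
    auto* phdr = reinterpret_cast<Elf64_Phdr*>(base + ehdr->e_phoff);
    bool bind_now = false;
    for (int i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type != PT_DYNAMIC)
            continue;
        // Walk the dynamic entries the same way a loader would.
        for (auto* dyn = reinterpret_cast<Elf64_Dyn*>(base + phdr[i].p_offset);
             dyn->d_tag != DT_NULL; dyn++) {
            if (dyn->d_tag == DT_BIND_NOW ||
                (dyn->d_tag == DT_FLAGS   && (dyn->d_un.d_val & DF_BIND_NOW)) ||
                (dyn->d_tag == DT_FLAGS_1 && (dyn->d_un.d_val & DF_1_NOW))) {
                bind_now = true;
            }
        }
    }
    std::printf("%s: %s\n", argv[1],
                bind_now ? "BIND_NOW set (Full RELRO style eager binding)"
                         : "no BIND_NOW (lazy binding allowed)");
    return 0;
}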

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Matthew Kenigsberg
I'm not completely sure where libgcc_s.so.1 is coming from, but I uploaded what 
I have in 
/nix/store/vran8acwir59772hj4vscr7zribvp7l5-gcc-9.3.0-lib/lib/libgcc_s.so.1:
https://drive.google.com/drive/folders/1rM6g-FrzwFpuHr2wX9-J21DzSjyQXGg2

I get a different error if I comment out core/elf.cc:1429:

(gdb) bt
#0 0x4039eef2 in processor::cli_hlt () at arch/x64/processor.hh:247
#1 arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2 osv::halt () at arch/x64/power.cc:26
#3 0x4023c73f in abort (fmt=fmt@entry=0x40645aff "Aborted\n") at 
runtime.cc:132
#4 0x40202989 in abort () at runtime.cc:98
#5 0x40218943 in osv::generate_signal (siginfo=..., 
ef=0x8191c068) at libc/signal.cc:124
#6 0x404745ff in osv::handle_mmap_fault (addr=, 
sig=, ef=)
at libc/signal.cc:139
#7 0x40347872 in mmu::vm_fault (addr=17592187039744, 
addr@entry=17592187040376,
ef=ef@entry=0x8191c068) at core/mmu.cc:1336
#8 0x403992e3 in page_fault (ef=0x8191c068) at 
arch/x64/mmu.cc:42
#9 
#10 0x101554c4 in usage (requested=requested@entry=false)
at bsd/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c:424
#11 0x10152025 in main (argc=5, argv=0xa0f19400)
at bsd/cddl/contrib/opensolaris/cmd/zfs/zfs_main.c:6676
#12 0x4043c311 in osv::application::run_main 
(this=0xa0ee0c10)
at 
/nix/store/h31cy7jm6g7cfqbhc5pm4rf9c53i3qfb-gcc-9.3.0/include/c++/9.3.0/bits/stl_vector.h:915
#13 0x4022452f in osv::application::main (this=0xa0ee0c10) 
at core/app.cc:320
#14 0x4043c539 in osv::applicationoperator() 
(__closure=0x0, app=)
at core/app.cc:233
#15 osv::application_FUN(void *) () at core/app.cc:235
#16 0x40470d58 in pthread_private::pthreadoperator() 
(__closure=0xa1067f00)
at libc/pthread.cc:115
#17 std::_Function_handler >::_M_invoke(const 
std::_Any_data &) (__functor=...)
at 
/nix/store/h31cy7jm6g7cfqbhc5pm4rf9c53i3qfb-gcc-9.3.0/include/c++/9.3.0/bits/std_function.h:300
#18 0x404074fd in sched::thread_main_c (t=0x81917040) at 
arch/x64/arch-switch.hh:325
#19 0x403990d3 in thread_main () at arch/x64/entry.S:113



Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Waldek Kozaczuk
I wonder if we have a bug in core/elf.cc::fix_permissions() or the logic 
around it, and we might be applying the protection to the wrong part of the 
mapping based on the GNU_RELRO header. I wonder if you are able to create a 
ZFS image after temporarily commenting out line 1429 of core/elf.cc:

ef->fix_permissions();
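
For reference, here is a rough standalone C++ sketch (my own illustration, not 
the actual fix_permissions() code) of what honoring a PT_GNU_RELRO header 
generally boils down to, with the page-rounding worked out for the values 
reported for this libzfs.so elsewhere in the thread:

#include <elf.h>
#include <sys/mman.h>
#include <cstdint>
#include <cstdio>

static const std::uintptr_t page_size = 4096;
static std::uintptr_t align_down(std::uintptr_t v) { return v & ~(page_size - 1); }
static std::uintptr_t align_up(std::uintptr_t v) { return align_down(v + page_size - 1); }

// base: load address of the object, phdr: its PT_GNU_RELRO program header.
// After relocations are done, the covered range is re-protected read-only;
// any jump slot that still needs a lazy write afterwards will fault.
void seal_relro(char* base, const Elf64_Phdr& phdr) {
    std::uintptr_t start = align_down(reinterpret_cast<std::uintptr_t>(base) + phdr.p_vaddr);
    std::uintptr_t end = align_up(reinterpret_cast<std::uintptr_t>(base) + phdr.p_vaddr + phdr.p_memsz);
    mprotect(reinterpret_cast<void*>(start), end - start, PROT_READ);
}

int main() {
    // GNU_RELRO header of the Nix-built libzfs.so and its reported load
    // address, taken from the readelf and 'osv mmap' output in this thread.
    Elf64_Phdr relro{};
    relro.p_type = PT_GNU_RELRO;
    relro.p_vaddr = 0x35010;
    relro.p_memsz = 0xff0;
    std::uintptr_t base = 0x1000b000;
    std::printf("would seal 0x%lx - 0x%lx read-only\n",
                (unsigned long)align_down(base + relro.p_vaddr),
                (unsigned long)align_up(base + relro.p_vaddr + relro.p_memsz));
    // Prints 0x10040000 - 0x10041000: exactly the read-only mapping that the
    // faulting GOT address 0x10040ca8 falls into.
    return 0;
}
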
Also, would it be possible to get copies of those binaries:
/libenviron.so: libenviron.so
/libvdso.so: libvdso.so
/zpool.so: zpool.so
/libzfs.so: libzfs.so
/libuutil.so: libuutil.so
/zfs.so: zfs.so
/tools/mkfs.so: tools/mkfs/mkfs.so
/tools/cpiod.so: tools/cpiod/cpiod.so
/tools/mount-fs.so: tools/mount/mount-fs.so
/tools/umount.so: tools/mount/umount.so
/usr/lib/libgcc_s.so.1: %(libgcc_s_dir)s/libgcc_s.so.1

Ideally, the stripped versions. That would help me to re-create the problem 
and investigate further.


Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Matthew Kenigsberg
[nix-shell:~/osv]$ readelf -l build/release/libzfs-stripped.so

Elf file type is DYN (Shared object file)
Entry point 0xc8f0
There are 8 program headers, starting at offset 64

Program Headers:
Type Offset VirtAddr PhysAddr
FileSiz MemSiz Flags Align
LOAD 0x 0x 0x
0xa9b8 0xa9b8 R 0x1000
LOAD 0xb000 0xb000 0xb000
0x0001e0a9 0x0001e0a9 R E 0x1000
LOAD 0x0002a000 0x0002a000 0x0002a000
0x93a0 0x93a0 R 0x1000
LOAD 0x00034010 0x00035010 0x00035010
0x1810 0x2c20 RW 0x1000
DYNAMIC 0x000340e0 0x000350e0 0x000350e0
0x0210 0x0210 RW 0x8
GNU_EH_FRAME 0x0002e768 0x0002e768 0x0002e768
0x0d04 0x0d04 R 0x4
GNU_STACK 0x 0x 0x
0x 0x RW 0x10
GNU_RELRO 0x00034010 0x00035010 0x00035010
0x0ff0 0x0ff0 R 0x1

Section to Segment mapping:
Segment Sections...
00 .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn 
.rela.plt 
01 .init .plt .plt.got .text .fini 
02 .rodata .eh_frame_hdr .eh_frame 
03 .init_array .fini_array .data.rel.ro .dynamic .got .data .bss 
04 .dynamic 
05 .eh_frame_hdr 
06 
07 .init_array .fini_array .data.rel.ro .dynamic .got


Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Waldek Kozaczuk
Back to why it is failing.

Based on what you sent us:
..
0x1000b000 0x10016000 [44.0 kB] flags=fmF perm=r 
offset=0x path=/libzfs.so
0x10016000 0x10035000 [124.0 kB] flags=fmF perm=rx 
offset=0xb000 path=/libzfs.so
0x10035000 0x1003f000 [40.0 kB] flags=fmF perm=r 
offset=0x0002a000 path=/libzfs.so

0x10040000 0x10041000 [4.0 kB] flags=fmF perm=r 
offset=0x00034000 path=/libzfs.so
0x10041000 0x10042000 [4.0 kB] flags=fmF perm=rw offset=0x00035000 path=/libzfs.so
..

The page fault in arch_relocate_jump_slot() is caused by an attempt to 
write to the address 0x10040ca8, which falls into the 4th mapping range 
above, which can only be read from. So that is a permission fault. The 
question is why the address is in that range: it should be somewhere in the 
GOT in the 5th range - 0x10041000 0x10042000 [4.0 kB] flags=fmF perm=rw 
offset=0x00035000 path=/libzfs.so - which has read/write permission.

On an Ubuntu host, when I run the same command line and add extra debugging 
to the print statements like this:

ELF [tid:36, mod:5, /libzfs.so]: arch_relocate_jump_slot, 
addr:0x1007a8d0
They all print addresses within the range 0x1007a000 - 0x1007b000 
which are read-write permitted as they should be:
0x10044000 0x1004e000 [40.0 kB]flags=fmF  
perm=roffset=0x path=/libzfs.so
0x1004e000 0x1006f000 [132.0 kB]   flags=fmF  
perm=rx   offset=0xa000 path=/libzfs.so
0x1006f000 0x10079000 [40.0 kB]flags=fmF  
perm=roffset=0x0002b000 path=/libzfs.so
0x10079000 0x1007a000 [4.0 kB] flags=fmF  
perm=roffset=0x00034000 path=/libzfs.so
0x1007a000 0x1007c000 [8.0 kB] flags=fmF  
perm=rw   offset=0x00035000 path=/libzfs.so
 
I wonder if we have a bug when calculating where each segment should be 
mapped:

400 void file::load_segment(const Elf64_Phdr& phdr)
 401 {
 402 ulong vstart = align_down(phdr.p_vaddr, mmu::page_size);
 403 ulong filesz_unaligned = phdr.p_vaddr + phdr.p_filesz - vstart;
 404 ulong filesz = align_up(filesz_unaligned, mmu::page_size);
 405 ulong memsz = align_up(phdr.p_vaddr + phdr.p_memsz, 
mmu::page_size) - vstart;
 406 
 407 unsigned perm = get_segment_mmap_permissions(phdr);
 408 
 409 auto flag = mmu::mmap_fixed | (mlocked() ? mmu::mmap_populate : 0);
 410 mmu::map_file(_base + vstart, filesz, flag, perm, _f, 
align_down(phdr.p_offset, mmu::page_size));
 411 if (phdr.p_filesz != phdr.p_memsz) {
 412 assert(perm & mmu::perm_write);
 413 memset(_base + vstart + filesz_unaligned, 0, filesz - 
filesz_unaligned);
 414 if (memsz != filesz) {
 415 mmu::map_anon(_base + vstart + filesz, memsz - filesz, 
flag, perm);
 416 }
 417 }
 418 elf_debug("Loaded and mapped PT_LOAD segment at: %018p of size: 
0x%x\n", _base + vstart, filesz);
 419 }
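
A quick way to sanity-check that arithmetic (my own sketch, not OSv code) is 
to plug in the RW LOAD header of the Nix-built libzfs.so from the readelf 
output above (offset 0x34010, vaddr 0x35010, filesz 0x1810, memsz 0x2c20) 
together with the load address 0x1000b000 reported for /libzfs.so:

#include <cstdio>

int main() {
    const unsigned long page_size = 4096;
    auto align_down = [&](unsigned long v) { return v & ~(page_size - 1); };
    auto align_up = [&](unsigned long v) { return align_down(v + page_size - 1); };

    // RW LOAD program header of the Nix-built libzfs.so and its load address.
    const unsigned long base = 0x1000b000, p_vaddr = 0x35010,
                        p_filesz = 0x1810, p_memsz = 0x2c20;

    unsigned long vstart = align_down(p_vaddr);                   // 0x35000
    unsigned long filesz = align_up(p_vaddr + p_filesz - vstart); // 0x2000
    unsigned long memsz = align_up(p_vaddr + p_memsz) - vstart;   // 0x3000

    std::printf("file-backed part: 0x%lx - 0x%lx\n", base + vstart, base + vstart + filesz);
    std::printf("anonymous tail:   0x%lx - 0x%lx\n", base + vstart + filesz, base + vstart + memsz);
    // Prints 0x10040000 - 0x10042000 and 0x10042000 - 0x10043000, so the faulting
    // address 0x10040ca8 does land inside this segment's first page; whether that
    // page is still writable then depends on what fix_permissions() does with the
    // GNU_RELRO range (vaddr 0x35010, memsz 0xff0) that covers it.
    return 0;
}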

BTW, I am also interested in what the output of this readelf command is for 
your libzfs.so. Mine is this:

 readelf -l build/release/libzfs-stripped.so 

Elf file type is DYN (Shared object file)
Entry point 0xd1d0
There are 11 program headers, starting at offset 64

Program Headers:
  Type   Offset VirtAddr   PhysAddr
 FileSizMemSiz  Flags  Align
  LOAD   0x 0x 0x
 0x98e0 0x98e0  R  0x1000
  LOAD   0xa000 0xa000 0xa000
 0x000201a1 0x000201a1  R E0x1000
  LOAD   0x0002b000 0x0002b000 0x0002b000
 0x9258 0x9258  R  0x1000
  LOAD   0x00034cb0 0x00035cb0 0x00035cb0
 0x17f0 0x2c00  RW 0x1000
  DYNAMIC0x00034d80 0x00035d80 0x00035d80
 0x01d0 0x01d0  RW 0x8
  NOTE   0x02a8 0x02a8 0x02a8
 0x0020 0x0020  R  0x8
  NOTE   0x02c8 0x02c8 0x02c8
 0x0024 0x0024  R  0x4
  GNU_PROPERTY   0x02a8 0x02a8 0x02a8
 0x0020 0x0020  R  0x8
  GNU_EH_FRAME   0x0002f768 0x0002f768 0x0002f768
 0x0cec 0x0cec  R  0x4
  GNU_STACK  0x 0x 0x
 0x 0x  RW 0x10
  GNU_RELRO  0x00034cb0 0x00035cb0 0x00035cb0
 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Matthew Kenigsberg
My gdb skills are not the strongest, but if I hbreak on arch_relocate_jump_slot, 
it looks like _pathname is /libzfs.so, eventually /zpool.so, and then a single 
/libzfs.so before continue hangs.


Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Matthew Kenigsberg
Nix is a package manager, and NixOS is an operating system built completely 
around the package manager. So every library is stored somewhere in 
/nix/store; for example, on NixOS there is never anything like 
/lib64/ld-linux-x86-64.so - it would be /nix/store/.../ld-linux-x86-64.so. I 
could install the package manager on a different OS, in which case I might 
have both /lib64 and /nix/store, but on NixOS I'll just have the latter. Does 
that make sense? I'm not sure if that's messing up something with linking. 
I'm guessing I can't reproduce the error on any other OS, but happy to try.

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Matthew Kenigsberg
(gdb) connect
abort (fmt=fmt@entry=0x40645bf0 "Assertion failed: %s (%s: %s: %d)\n") at 
runtime.cc:105
105 do {} while (true);
(gdb) osv syms
manifest.find_file: path=/libvdso.so, found file=libvdso.so
/home/matthew/osv/build/release.x64/libvdso.so 0x1000
add symbol table from file "/home/matthew/osv/build/release.x64/libvdso.so" 
at
.text_addr = 0x10001040
.hash_addr = 0x11c8
.gnu.hash_addr = 0x1200
.dynsym_addr = 0x1238
.dynstr_addr = 0x12f8
.gnu.version_addr = 0x13be
.gnu.version_d_addr = 0x13d0
.rela.plt_addr = 0x1408
.plt_addr = 0x10001000
.eh_frame_addr = 0x10002000
.dynamic_addr = 0x10003e60
.got_addr = 0x10003fd0
.comment_addr = 0x1000
.debug_aranges_addr = 0x1000
.debug_info_addr = 0x1000
.debug_abbrev_addr = 0x1000
.debug_line_addr = 0x1000
.debug_str_addr = 0x1000
.debug_loc_addr = 0x1000
.symtab_addr = 0x1000
.strtab_addr = 0x1000
warning: section .comment not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_aranges not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_info not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_abbrev not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_line not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_str not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .debug_loc not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .symtab not found in 
/home/matthew/osv/build/release.x64/libvdso.so
warning: section .strtab not found in 
/home/matthew/osv/build/release.x64/libvdso.so
manifest.find_file: path=/tools/mkfs.so, found file=tools/mkfs/mkfs.so
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so 0x10004000
add symbol table from file 
"/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so" at
.text_addr = 0x10006250
.hash_addr = 0x10004200
.gnu.hash_addr = 0x10004360
.dynsym_addr = 0x100043c0
.dynstr_addr = 0x10004840
.gnu.version_addr = 0x10005092
.gnu.version_r_addr = 0x100050f8
.rela.dyn_addr = 0x10005148
.rela.plt_addr = 0x10005298
.init_addr = 0x10006000
.plt_addr = 0x10006020
.plt.got_addr = 0x10006240
.fini_addr = 0x1000737c
.rodata_addr = 0x10008000
.eh_frame_hdr_addr = 0x1000817c
.eh_frame_addr = 0x10008210
.gcc_except_table_addr = 0x10008530
.init_array_addr = 0x10009c60
.fini_array_addr = 0x10009c70
.dynamic_addr = 0x10009c78
.got_addr = 0x10009e98
.data_addr = 0x1000a000
.bss_addr = 0x1000a010
.comment_addr = 0x10004000
.debug_aranges_addr = 0x10004000
.debug_info_addr = 0x10004000
.debug_abbrev_addr = 0x10004000
--Type  for more, q to quit, c to continue without paging--c
.debug_line_addr = 0x10004000
.debug_str_addr = 0x10004000
.debug_loc_addr = 0x10004000
.debug_ranges_addr = 0x10004000
.symtab_addr = 0x10004000
.strtab_addr = 0x10004000
warning: section .comment not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_aranges not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_info not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_abbrev not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_line not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_str not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_loc not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .debug_ranges not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .symtab not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
warning: section .strtab not found in 
/home/matthew/osv/build/release.x64/tools/mkfs/mkfs.so
manifest.find_file: path=/libzfs.so, found file=libzfs.so
/home/matthew/osv/build/release.x64/libzfs.so 0x1000b000
add symbol table from file "/home/matthew/osv/build/release.x64/libzfs.so" 
at
.text_addr = 0x100178f0
.hash_addr = 0x1000b200
.gnu.hash_addr = 0x1000c318
.dynsym_addr = 0x1000cd78
.dynstr_addr = 0x10010300
.gnu.version_addr = 0x100124e2
.gnu.version_r_addr = 0x10012958
.rela.dyn_addr = 0x100129b8
.rela.plt_addr = 0x100134b0
.init_addr = 0x10016000
.plt_addr = 0x10016020
.plt.got_addr = 0x100178e0
.fini_addr = 0x100340a0
.rodata_addr = 0x10035000
.eh_frame_hdr_addr = 0x10039768
.eh_frame_addr = 0x1003a470
.init_array_addr = 0x10040010
.fini_array_addr = 0x10040018
.data.rel.ro_addr = 0x10040020
.dynamic_addr = 0x100400e0
.got_addr = 0x100402f0
.data_addr = 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Waldek Kozaczuk
It would also be nice to understand whether we are crashing on the 1st 
arch_relocate_jump_slot() for libzfs.so or whether it is a specific JUMP_SLOT 
that causes this crash.

On Tuesday, December 8, 2020 at 10:39:06 AM UTC-5 Waldek Kozaczuk wrote:

> After you connect with gdb, can you run 'osv mmap' and send us the output? 
> Make sure you run 'osv syms' before it and dump the backtrace after. Please 
> see https://github.com/cloudius-systems/osv/wiki/Debugging-OSv for details.
>
> BTW, can you build and run an OSv ZFS image on the host without Nix? As I 
> understand it, Nix is really just a layer on top of any Linux distribution, 
> no? I am afraid I still do not understand what exactly Nix is, I guess.
>
>
> On Monday, December 7, 2020 at 2:58:40 PM UTC-5 Matthew Kenigsberg wrote:
>
>> (gdb) frame 18
>> #18 0x4039c95a in elf::object::arch_relocate_jump_slot 
>> (this=this@entry=0xa110fa00, sym=..., 
>> addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
>> arch/x64/arch-elf.cc:172
>> 172*static_cast(addr) = sym.relocated_addr();
>> (gdb) print _pathname
>> $14 = {static npos = 18446744073709551615, 
>>   _M_dataplus = {> = 
>> {<__gnu_cxx::new_allocator> = {}, }, 
>> _M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {
>> _M_local_buf = "/libzfs.so\000\000\000\000\000", 
>> _M_allocated_capacity = 3347131623889529903}}
>>
>> I've also been wondering if Nix using nonstandard paths is causing 
>> problems, like for libc:
>> [nix-shell:~/osv/build/release]$ ldd libzfs.so 
>> linux-vdso.so.1 (0x7ffcedbb9000)
>> libuutil.so => not found
>> libc.so.6 => 
>> /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 
>> (0x7f7594f38000)
>>
>>  
>> /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2
>>  
>> (0x7f7595131000)
>> On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:
>>
>>> It might be easier to simply print '_pathname' value if you switch to 
>>> the right frame in gdb. It would be nice to confirm that the problem we 
>>> have is with zpool.so and that might lead to understanding why this crash 
>>> happens. Maybe the is something wrong with building zpool.so.
>>>
>>> BTW based on this fragment of the stacktrace:
>>>
>>> #6  0x4035cb07 in elf::program::>> elf::program::modules_list&)>::operator() (
>>> __closure=, __closure=, 
>>> ml=...) at core/elf.cc:1620
>>> #7  elf::program::with_modules>> const*):: >
>>> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
>>> #8  elf::program::lookup_addr (this=0xa0097e70, 
>>> addr=addr@entry=0x100254ce) at core/elf.cc:1617
>>> #9  0x404357cc in osv::lookup_name_demangled 
>>> (addr=addr@entry=0x100254ce,
>>> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) 
>>> at core/demangle.cc:47
>>> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
>>>
>>> It seems we have a bug (or need of improvement) in print_backtrace() to 
>>> make it NOT try to demangle names like "???+19630095" which causes 
>>> follow-up fault.
>>>
>>> At the same time, it is strange that we crash at line 983 which seems to 
>>> indicate something goes wrong when processing zpool.so.
>>>
>>>  981 if (dynamic_exists(DT_HASH)) {
>>>
>>>  982 auto hashtab = dynamic_ptr(DT_HASH);
>>>
>>>  *983 return hashtab[1];*
>>>
>>>  984 }
>>>
>>> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:
>>>
 Can you run the ROFS image you built? Also as I understand it NIX is a 
 package manager but what Linux distribution are you using?

 As far as ZFS goes could you enable ELF debugging - change this line:

 conf-debug_elf=0

 To

 conf-debug_elf=1

 In conf/base.mk, delete core/elf.o and force rebuild the kernel. I 
 think you may also need to change the script upload_manifest.py to peeped 
 ‘—verbose’ to the command line with cpiod.so

 It should show more info about elf loading. It may still be necessary 
 to add extra printouts to capture which exact elf it is crashing on in 
 arch_relocate_jump(). 

 In worst case I would need a copy of your loader-stripped.elf and 
 possibly all the other files like cpiod.so, zfs.so that go into the bootfs 
 part of the image. 

 Regards,
 Waldek


 On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg  
 wrote:

> After forcing it to use the right path for libz.so.1, it's working 
> with rofs, but still having the same issue when using zfs, even after I 
> correct the path for libz.
>
> On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg 
> wrote:
>
>> gcc version 9.3.0 (GCC)
>> QEMU emulator version 5.1.0
>>
>> Running with fs=rofs I get the error:
>> Traceback (most recent call last):
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-08 Thread Waldek Kozaczuk
After you connect with gdb, can you run 'osv mmap' and send us the output? 
Make sure you run 'osv syms' before it and dump the backtrace after. Please 
see https://github.com/cloudius-systems/osv/wiki/Debugging-OSv for the 
details.

BTW, can you build and run an OSv ZFS image on the host without Nix? As I 
understand it, Nix is really just a layer on top of any Linux distribution, no? 
I am afraid I still do not quite understand what exactly Nix is.

On Monday, December 7, 2020 at 2:58:40 PM UTC-5 Matthew Kenigsberg wrote:

> (gdb) frame 18
> #18 0x4039c95a in elf::object::arch_relocate_jump_slot 
> (this=this@entry=0xa110fa00, sym=..., 
> addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
> arch/x64/arch-elf.cc:172
> 172*static_cast(addr) = sym.relocated_addr();
> (gdb) print _pathname
> $14 = {static npos = 18446744073709551615, 
>   _M_dataplus = {> = 
> {<__gnu_cxx::new_allocator> = {}, }, 
> _M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {
> _M_local_buf = "/libzfs.so\000\000\000\000\000", _M_allocated_capacity 
> = 3347131623889529903}}
>
> Also been wondering if nix using nonstandard paths is causing problems, 
> like for libc:
> [nix-shell:~/osv/build/release]$ ldd libzfs.so 
> linux-vdso.so.1 (0x7ffcedbb9000)
> libuutil.so => not found
> libc.so.6 => 
> /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 
> (0x7f7594f38000)
>
>  
> /nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2
>  
> (0x7f7595131000)
> On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:
>
>> It might be easier to simply print '_pathname' value if you switch to the 
>> right frame in gdb. It would be nice to confirm that the problem we have is 
>> with zpool.so and that might lead to understanding why this crash happens. 
>> Maybe the is something wrong with building zpool.so.
>>
>> BTW based on this fragment of the stacktrace:
>>
>> #6  0x4035cb07 in elf::program::> elf::program::modules_list&)>::operator() (
>> __closure=, __closure=, ml=...) 
>> at core/elf.cc:1620
>> #7  elf::program::with_modules> const*):: >
>> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
>> #8  elf::program::lookup_addr (this=0xa0097e70, 
>> addr=addr@entry=0x100254ce) at core/elf.cc:1617
>> #9  0x404357cc in osv::lookup_name_demangled 
>> (addr=addr@entry=0x100254ce,
>> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) 
>> at core/demangle.cc:47
>> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
>>
>> It seems we have a bug (or need of improvement) in print_backtrace() to 
>> make it NOT try to demangle names like "???+19630095" which causes 
>> follow-up fault.
>>
>> At the same time, it is strange that we crash at line 983 which seems to 
>> indicate something goes wrong when processing zpool.so.
>>
>>  981 if (dynamic_exists(DT_HASH)) {
>>
>>  982 auto hashtab = dynamic_ptr(DT_HASH);
>>
>>  *983 return hashtab[1];*
>>
>>  984 }
>>
>> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:
>>
>>> Can you run the ROFS image you built? Also as I understand it NIX is a 
>>> package manager but what Linux distribution are you using?
>>>
>>> As far as ZFS goes could you enable ELF debugging - change this line:
>>>
>>> conf-debug_elf=0
>>>
>>> To
>>>
>>> conf-debug_elf=1
>>>
>>> In conf/base.mk, delete core/elf.o and force rebuild the kernel. I 
>>> think you may also need to change the script upload_manifest.py to peeped 
>>> ‘—verbose’ to the command line with cpiod.so
>>>
>>> It should show more info about elf loading. It may still be necessary to 
>>> add extra printouts to capture which exact elf it is crashing on in 
>>> arch_relocate_jump(). 
>>>
>>> In worst case I would need a copy of your loader-stripped.elf and 
>>> possibly all the other files like cpiod.so, zfs.so that go into the bootfs 
>>> part of the image. 
>>>
>>> Regards,
>>> Waldek
>>>
>>>
>>> On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg  
>>> wrote:
>>>
 After forcing it to use the right path for libz.so.1, it's working with 
 rofs, but still having the same issue when using zfs, even after I correct 
 the path for libz.

 On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg 
 wrote:

> gcc version 9.3.0 (GCC)
> QEMU emulator version 5.1.0
>
> Running with fs=rofs I get the error:
> Traceback (most recent call last):
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
> 
> main()
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
> gen_image(outfile, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in 
> gen_image
> system_structure_block, bytes_written = write_fs(fp, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-07 Thread Matthew Kenigsberg
(gdb) frame 18
#18 0x4039c95a in elf::object::arch_relocate_jump_slot 
(this=this@entry=0xa110fa00, sym=..., 
addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
arch/x64/arch-elf.cc:172
172*static_cast(addr) = sym.relocated_addr();
(gdb) print _pathname
$14 = {static npos = 18446744073709551615, 
  _M_dataplus = {> = {<__gnu_cxx::new_allocator> 
= {}, }, 
_M_p = 0xa110fa30 "/libzfs.so"}, _M_string_length = 10, {
_M_local_buf = "/libzfs.so\000\000\000\000\000", _M_allocated_capacity 
= 3347131623889529903}}

Also been wondering if nix using nonstandard paths is causing problems, 
like for libc:
[nix-shell:~/osv/build/release]$ ldd libzfs.so 
linux-vdso.so.1 (0x7ffcedbb9000)
libuutil.so => not found
libc.so.6 => 
/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib/libc.so.6 
(0x7f7594f38000)
   
 
/nix/store/9df65igwjmf2wbw0gbrrgair6piqjgmi-glibc-2.31/lib64/ld-linux-x86-64.so.2
 
(0x7f7595131000)
On Sunday, December 6, 2020 at 8:43:10 AM UTC-7 jwkoz...@gmail.com wrote:

> It might be easier to simply print '_pathname' value if you switch to the 
> right frame in gdb. It would be nice to confirm that the problem we have is 
> with zpool.so and that might lead to understanding why this crash happens. 
> Maybe the is something wrong with building zpool.so.
>
> BTW based on this fragment of the stacktrace:
>
> #6  0x4035cb07 in elf::program:: elf::program::modules_list&)>::operator() (
> __closure=, __closure=, ml=...) 
> at core/elf.cc:1620
> #7  elf::program::with_modules const*):: >
> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
> #8  elf::program::lookup_addr (this=0xa0097e70, 
> addr=addr@entry=0x100254ce) at core/elf.cc:1617
> #9  0x404357cc in osv::lookup_name_demangled 
> (addr=addr@entry=0x100254ce,
> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) 
> at core/demangle.cc:47
> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
>
> It seems we have a bug (or need of improvement) in print_backtrace() to 
> make it NOT try to demangle names like "???+19630095" which causes 
> follow-up fault.
>
> At the same time, it is strange that we crash at line 983 which seems to 
> indicate something goes wrong when processing zpool.so.
>
>  981 if (dynamic_exists(DT_HASH)) {
>
>  982 auto hashtab = dynamic_ptr(DT_HASH);
>
>  *983 return hashtab[1];*
>
>  984 }
>
> On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:
>
>> Can you run the ROFS image you built? Also as I understand it NIX is a 
>> package manager but what Linux distribution are you using?
>>
>> As far as ZFS goes could you enable ELF debugging - change this line:
>>
>> conf-debug_elf=0
>>
>> To
>>
>> conf-debug_elf=1
>>
>> In conf/base.mk, delete core/elf.o and force rebuild the kernel. I think 
>> you may also need to change the script upload_manifest.py to peeped 
>> ‘—verbose’ to the command line with cpiod.so
>>
>> It should show more info about elf loading. It may still be necessary to 
>> add extra printouts to capture which exact elf it is crashing on in 
>> arch_relocate_jump(). 
>>
>> In worst case I would need a copy of your loader-stripped.elf and 
>> possibly all the other files like cpiod.so, zfs.so that go into the bootfs 
>> part of the image. 
>>
>> Regards,
>> Waldek
>>
>>
>> On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg  
>> wrote:
>>
>>> After forcing it to use the right path for libz.so.1, it's working with 
>>> rofs, but still having the same issue when using zfs, even after I correct 
>>> the path for libz.
>>>
>>> On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg 
>>> wrote:
>>>
 gcc version 9.3.0 (GCC)
 QEMU emulator version 5.1.0

 Running with fs=rofs I get the error:
 Traceback (most recent call last):
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
 
 main()
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
 gen_image(outfile, manifest)
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in 
 gen_image
 system_structure_block, bytes_written = write_fs(fp, manifest)
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in 
 write_fs
 count, directory_entries_index = write_dir(fp, manifest.get(''), 
 '', manifest)
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in 
 write_dir
 count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
 entry, manifest)
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in 
 write_dir
 count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
 entry, manifest)
   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in 
 write_dir
 inode.count = write_file(fp, val)
   File 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-06 Thread Waldek Kozaczuk
It might be easier to simply print the '_pathname' value if you switch to the 
right frame in gdb. It would be nice to confirm that the problem we have is 
with zpool.so, as that might lead to understanding why this crash happens. 
Maybe there is something wrong with building zpool.so.

BTW based on this fragment of the stacktrace:

#6  0x4035cb07 in elf::programoperator() (
__closure=, __closure=, ml=...) 
at core/elf.cc:1620
#7  elf::program::with_modules >
(f=..., this=0xa0097e70) at include/osv/elf.hh:702
#8  elf::program::lookup_addr (this=0xa0097e70, 
addr=addr@entry=0x100254ce) at core/elf.cc:1617
#9  0x404357cc in osv::lookup_name_demangled 
(addr=addr@entry=0x100254ce,
buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) at 
core/demangle.cc:47
#10 0x4023c4e0 in print_backtrace () at runtime.cc:85

It seems we have a bug (or room for improvement) in print_backtrace(): it 
should NOT try to demangle names like "???+19630095", which causes the 
follow-up fault.

At the same time, it is strange that we crash at line 983 which seems to 
indicate something goes wrong when processing zpool.so.

 981 if (dynamic_exists(DT_HASH)) {

 982 auto hashtab = dynamic_ptr(DT_HASH);

 *983 return hashtab[1];*

 984 }
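
For context on why hashtab[1] is used as the symbol table length: the section
pointed to by DT_HASH (the classic SysV hash table) starts with two 32-bit
words, nbucket and nchain, and nchain is by definition the number of entries
in the dynamic symbol table. A minimal sketch of that layout (plain ELF, not
OSv-specific code):

#include <elf.h>

// Header of the section DT_HASH points to:
//   word[0] = nbucket
//   word[1] = nchain == number of entries in .dynsym
// so reading hashtab[1] is the usual way to get the symbol table length.
// A page fault on that read means the DT_HASH pointer of this object
// resolves into memory that is not (or not yet) mapped.
struct sysv_hash_header {
    Elf64_Word nbucket;
    Elf64_Word nchain;
};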

On Sunday, December 6, 2020 at 10:06:21 AM UTC-5 Waldek Kozaczuk wrote:

> Can you run the ROFS image you built? Also as I understand it NIX is a 
> package manager but what Linux distribution are you using?
>
> As far as ZFS goes could you enable ELF debugging - change this line:
>
> conf-debug_elf=0
>
> To
>
> conf-debug_elf=1
>
> In conf/base.mk, delete core/elf.o and force rebuild the kernel. I think 
> you may also need to change the script upload_manifest.py to peeped 
> ‘—verbose’ to the command line with cpiod.so
>
> It should show more info about elf loading. It may still be necessary to 
> add extra printouts to capture which exact elf it is crashing on in 
> arch_relocate_jump(). 
>
> In worst case I would need a copy of your loader-stripped.elf and possibly 
> all the other files like cpiod.so, zfs.so that go into the bootfs part of 
> the image. 
>
> Regards,
> Waldek
>
>
> On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg  
> wrote:
>
>> After forcing it to use the right path for libz.so.1, it's working with 
>> rofs, but still having the same issue when using zfs, even after I correct 
>> the path for libz.
>>
>> On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg 
>> wrote:
>>
>>> gcc version 9.3.0 (GCC)
>>> QEMU emulator version 5.1.0
>>>
>>> Running with fs=rofs I get the error:
>>> Traceback (most recent call last):
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
>>> main()
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
>>> gen_image(outfile, manifest)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in 
>>> gen_image
>>> system_structure_block, bytes_written = write_fs(fp, manifest)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs
>>> count, directory_entries_index = write_dir(fp, manifest.get(''), '', 
>>> manifest)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in 
>>> write_dir
>>> count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
>>> entry, manifest)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in 
>>> write_dir
>>> count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
>>> entry, manifest)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in 
>>> write_dir
>>> inode.count = write_file(fp, val)
>>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 164, in 
>>> write_file
>>> with open(path, 'rb') as f:
>>> FileNotFoundError: [Errno 2] No such file or directory: 'libz.so.1'
>>>
>>> I think that's from this line in usr.manifest?
>>> /usr/lib/libz.so.1: libz.so.1
>>>
>>> Don't have zlib in the manifest without fs=rofs, and I think zpool uses 
>>> it?
>>>
>>> Looking into it...
>>> On Saturday, December 5, 2020 at 4:36:20 PM UTC-7 jwkoz...@gmail.com 
>>> wrote:
>>>
 I can not reproduce it on Ubuntu 20.20 neither Fedora 33. Here is the 
 code fragment where it happens:

 169 bool object::arch_relocate_jump_slot(symbol_module& sym, void 
 *addr, Elf64_Sxword addend)

 170 {

 171 if (sym.symbol) {

 172 *static_cast(addr) = sym.relocated_addr();

 173 return true;

 174 } else {

 175 return false;

 176 }

 177 }
 It looks like writing at the addr 0x10040ca8 in line 172 caused the 
 fault. Why?

 And then the 2nd page fault in the gdb backtrace as the 1st one was 
 being handled (not sure if that is a bug or just a state of loading of a 
 program).

 981 if (dynamic_exists(DT_HASH)) {

  982  

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-06 Thread Waldek Kozaczuk
Can you run the ROFS image you built? Also as I understand it NIX is a
package manager but what Linux distribution are you using?

As far as ZFS goes could you enable ELF debugging - change this line:

conf-debug_elf=0

To

conf-debug_elf=1

In conf/base.mk, delete core/elf.o and force a rebuild of the kernel. I think
you may also need to change the script upload_manifest.py to pass
'--verbose' on the command line to cpiod.so.

It should show more info about ELF loading. It may still be necessary to
add extra printouts to capture which exact ELF it is crashing on in
arch_relocate_jump_slot().

In the worst case I would need a copy of your loader-stripped.elf and possibly
all the other files like cpiod.so and zfs.so that go into the bootfs part of
the image.

Regards,
Waldek


On Sat, Dec 5, 2020 at 19:31 Matthew Kenigsberg 
wrote:

> After forcing it to use the right path for libz.so.1, it's working with
> rofs, but still having the same issue when using zfs, even after I correct
> the path for libz.
>
> On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg wrote:
>
>> gcc version 9.3.0 (GCC)
>> QEMU emulator version 5.1.0
>>
>> Running with fs=rofs I get the error:
>> Traceback (most recent call last):
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
>> main()
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
>> gen_image(outfile, manifest)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in gen_image
>> system_structure_block, bytes_written = write_fs(fp, manifest)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs
>> count, directory_entries_index = write_dir(fp, manifest.get(''), '',
>> manifest)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
>> count, directory_entries_index = write_dir(fp, val, dirpath + '/' +
>> entry, manifest)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
>> count, directory_entries_index = write_dir(fp, val, dirpath + '/' +
>> entry, manifest)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in write_dir
>> inode.count = write_file(fp, val)
>>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 164, in
>> write_file
>> with open(path, 'rb') as f:
>> FileNotFoundError: [Errno 2] No such file or directory: 'libz.so.1'
>>
>> I think that's from this line in usr.manifest?
>> /usr/lib/libz.so.1: libz.so.1
>>
>> Don't have zlib in the manifest without fs=rofs, and I think zpool uses
>> it?
>>
>> Looking into it...
>> On Saturday, December 5, 2020 at 4:36:20 PM UTC-7 jwkoz...@gmail.com
>> wrote:
>>
>>> I can not reproduce it on Ubuntu 20.20 neither Fedora 33. Here is the
>>> code fragment where it happens:
>>>
>>> 169 bool object::arch_relocate_jump_slot(symbol_module& sym, void *addr,
>>> Elf64_Sxword addend)
>>>
>>> 170 {
>>>
>>> 171 if (sym.symbol) {
>>>
>>> 172 *static_cast(addr) = sym.relocated_addr();
>>>
>>> 173 return true;
>>>
>>> 174 } else {
>>>
>>> 175 return false;
>>>
>>> 176 }
>>>
>>> 177 }
>>> It looks like writing at the addr 0x10040ca8 in line 172 caused the
>>> fault. Why?
>>>
>>> And then the 2nd page fault in the gdb backtrace as the 1st one was
>>> being handled (not sure if that is a bug or just a state of loading of a
>>> program).
>>>
>>> 981 if (dynamic_exists(DT_HASH)) {
>>>
>>>  982 auto hashtab = dynamic_ptr(DT_HASH);
>>>
>>>  983 return hashtab[1];
>>>
>>>  984 }
>>> Is something wrong with the elf files cpiod.so, mkfs.so or zfs.so or
>>> something?
>>>
>>> Can you try to do the same with ROFS?
>>>
>>> fs=rofs
>>> On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg
>>> wrote:
>>>
 Struggling to get scripts/build to run on NixOS because I'm getting a
 page fault. NixOS does keep shared libraries in nonstandard locations, not
 sure if that's breaking something. More details below, but any ideas?

 As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
 run_cmd("/zpool.so", zpool_args);

 The error from scripts/build:

 OSv v0.55.0-145-g97f17a7a
 eth0: 192.168.122.15
 Booted up in 154.38 ms
 Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so
 set compression=off osv
 Running mkfs...
 page fault outside application, addr: 0x10040ca8
 [registers]
 RIP: 0x4039c25a
 
 RFL: 0x00010202  CS:  0x0008  SS:
 0x0010
 RAX: 0x1007a340  RBX: 0x10040ca8  RCX:
 0x1006abb0  RDX: 0x0002
 RSI: 0x201f6f70  RDI: 0xa1058c00  RBP:
 0x201f6f30  R8:  0xa0a68460
 R9:  0xa0f18da0  R10: 0x  R11:
 0x409dd380  R12: 0xa0f18c00
 R13: 0xa0f18da0  R14: 0x  R15:
 0x409dd380  RSP: 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-05 Thread Matthew Kenigsberg
After forcing it to use the right path for libz.so.1, it's working with 
rofs, but still having the same issue when using zfs, even after I correct 
the path for libz.

On Saturday, December 5, 2020 at 5:18:37 PM UTC-7 Matthew Kenigsberg wrote:

> gcc version 9.3.0 (GCC)
> QEMU emulator version 5.1.0
>
> Running with fs=rofs I get the error:
> Traceback (most recent call last):
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
> main()
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
> gen_image(outfile, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in gen_image
> system_structure_block, bytes_written = write_fs(fp, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs
> count, directory_entries_index = write_dir(fp, manifest.get(''), '', 
> manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
> count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
> entry, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
> count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
> entry, manifest)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in write_dir
> inode.count = write_file(fp, val)
>   File "/home/matthew/osv/scripts/gen-rofs-img.py", line 164, in write_file
> with open(path, 'rb') as f:
> FileNotFoundError: [Errno 2] No such file or directory: 'libz.so.1'
>
> I think that's from this line in usr.manifest?
> /usr/lib/libz.so.1: libz.so.1
>
> Don't have zlib in the manifest without fs=rofs, and I think zpool uses it?
>
> Looking into it...
> On Saturday, December 5, 2020 at 4:36:20 PM UTC-7 jwkoz...@gmail.com 
> wrote:
>
>> I can not reproduce it on Ubuntu 20.20 neither Fedora 33. Here is the 
>> code fragment where it happens:
>>
>> 169 bool object::arch_relocate_jump_slot(symbol_module& sym, void *addr, 
>> Elf64_Sxword addend)
>>
>> 170 {
>>
>> 171 if (sym.symbol) {
>>
>> 172 *static_cast(addr) = sym.relocated_addr();
>>
>> 173 return true;
>>
>> 174 } else {
>>
>> 175 return false;
>>
>> 176 }
>>
>> 177 }
>> It looks like writing at the addr 0x10040ca8 in line 172 caused the 
>> fault. Why?
>>
>> And then the 2nd page fault in the gdb backtrace as the 1st one was being 
>> handled (not sure if that is a bug or just a state of loading of a program).
>>
>> 981 if (dynamic_exists(DT_HASH)) {
>>
>>  982 auto hashtab = dynamic_ptr(DT_HASH);
>>
>>  983 return hashtab[1];
>>
>>  984 }
>> Is something wrong with the elf files cpiod.so, mkfs.so or zfs.so or 
>> something?
>>
>> Can you try to do the same with ROFS?
>>
>> fs=rofs
>> On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg 
>> wrote:
>>
>>> Struggling to get scripts/build to run on NixOS because I'm getting a 
>>> page fault. NixOS does keep shared libraries in nonstandard locations, not 
>>> sure if that's breaking something. More details below, but any ideas?
>>>
>>> As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
>>> run_cmd("/zpool.so", zpool_args);
>>>
>>> The error from scripts/build:
>>>
>>> OSv v0.55.0-145-g97f17a7a
>>> eth0: 192.168.122.15
>>> Booted up in 154.38 ms
>>> Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set 
>>> compression=off osv
>>> Running mkfs...
>>> page fault outside application, addr: 0x10040ca8
>>> [registers]
>>> RIP: 0x4039c25a 
>>> 
>>> RFL: 0x00010202  CS:  0x0008  SS:  0x0010
>>> RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 
>>> 0x1006abb0  RDX: 0x0002
>>> RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 
>>> 0x201f6f30  R8:  0xa0a68460
>>> R9:  0xa0f18da0  R10: 0x  R11: 
>>> 0x409dd380  R12: 0xa0f18c00
>>> R13: 0xa0f18da0  R14: 0x  R15: 
>>> 0x409dd380  RSP: 0x201f6f20
>>> Aborted
>>>
>>> [backtrace]
>>> 0x403458d3 
>>> 0x403477ce 
>>> 0x40398ba2 
>>> 0x40397a16 
>>> 0x40360a13 
>>> 0x40360c38 
>>> 0x4039764f 
>>> 0xa12b880f 
>>>
>>> Trying to get a backtrace after connecting with gdb:
>>> (gdb) bt
>>> #0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: 
>>> %d)\n") at runtime.cc:105
>>> #1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 
>>> "ef->rflags & processor::rflags_if", 
>>> file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, 
>>> func=func@entry=0x40672d1a "page_fault")
>>> at runtime.cc:139
>>> #2  0x40398c05 in page_fault (ef=0x80015048) at 
>>> arch/x64/arch-cpu.hh:107
>>> #3  
>>> #4  0x4035c879 in elf::object::symtab_len 
>>> (this=0xa0f18c00) at core/elf.cc:983
>>> #5  0x4035c938 in 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-05 Thread Matthew Kenigsberg
gcc version 9.3.0 (GCC)
QEMU emulator version 5.1.0

Running with fs=rofs I get the error:
Traceback (most recent call last):
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 369, in 
main()
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 366, in main
gen_image(outfile, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 269, in gen_image
system_structure_block, bytes_written = write_fs(fp, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 246, in write_fs
count, directory_entries_index = write_dir(fp, manifest.get(''), '', 
manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
entry, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 207, in write_dir
count, directory_entries_index = write_dir(fp, val, dirpath + '/' + 
entry, manifest)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 222, in write_dir
inode.count = write_file(fp, val)
  File "/home/matthew/osv/scripts/gen-rofs-img.py", line 164, in write_file
with open(path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'libz.so.1'

I think that's from this line in usr.manifest?
/usr/lib/libz.so.1: libz.so.1

I don't have zlib in the manifest when not building with fs=rofs, and I think zpool uses it?

Looking into it...
On Saturday, December 5, 2020 at 4:36:20 PM UTC-7 jwkoz...@gmail.com wrote:

> I can not reproduce it on Ubuntu 20.20 neither Fedora 33. Here is the code 
> fragment where it happens:
>
> 169 bool object::arch_relocate_jump_slot(symbol_module& sym, void *addr, 
> Elf64_Sxword addend)
>
> 170 {
>
> 171 if (sym.symbol) {
>
> 172 *static_cast(addr) = sym.relocated_addr();
>
> 173 return true;
>
> 174 } else {
>
> 175 return false;
>
> 176 }
>
> 177 }
> It looks like writing at the addr 0x10040ca8 in line 172 caused the 
> fault. Why?
>
> And then the 2nd page fault in the gdb backtrace as the 1st one was being 
> handled (not sure if that is a bug or just a state of loading of a program).
>
> 981 if (dynamic_exists(DT_HASH)) {
>
>  982 auto hashtab = dynamic_ptr(DT_HASH);
>
>  983 return hashtab[1];
>
>  984 }
> Is something wrong with the elf files cpiod.so, mkfs.so or zfs.so or 
> something?
>
> Can you try to do the same with ROFS?
>
> fs=rofs
> On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg wrote:
>
>> Struggling to get scripts/build to run on NixOS because I'm getting a 
>> page fault. NixOS does keep shared libraries in nonstandard locations, not 
>> sure if that's breaking something. More details below, but any ideas?
>>
>> As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
>> run_cmd("/zpool.so", zpool_args);
>>
>> The error from scripts/build:
>>
>> OSv v0.55.0-145-g97f17a7a
>> eth0: 192.168.122.15
>> Booted up in 154.38 ms
>> Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set 
>> compression=off osv
>> Running mkfs...
>> page fault outside application, addr: 0x10040ca8
>> [registers]
>> RIP: 0x4039c25a 
>> 
>> RFL: 0x00010202  CS:  0x0008  SS:  0x0010
>> RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 
>> 0x1006abb0  RDX: 0x0002
>> RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 
>> 0x201f6f30  R8:  0xa0a68460
>> R9:  0xa0f18da0  R10: 0x  R11: 
>> 0x409dd380  R12: 0xa0f18c00
>> R13: 0xa0f18da0  R14: 0x  R15: 
>> 0x409dd380  RSP: 0x201f6f20
>> Aborted
>>
>> [backtrace]
>> 0x403458d3 
>> 0x403477ce 
>> 0x40398ba2 
>> 0x40397a16 
>> 0x40360a13 
>> 0x40360c38 
>> 0x4039764f 
>> 0xa12b880f 
>>
>> Trying to get a backtrace after connecting with gdb:
>> (gdb) bt
>> #0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: 
>> %d)\n") at runtime.cc:105
>> #1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 
>> "ef->rflags & processor::rflags_if", 
>> file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, 
>> func=func@entry=0x40672d1a "page_fault")
>> at runtime.cc:139
>> #2  0x40398c05 in page_fault (ef=0x80015048) at 
>> arch/x64/arch-cpu.hh:107
>> #3  
>> #4  0x4035c879 in elf::object::symtab_len 
>> (this=0xa0f18c00) at core/elf.cc:983
>> #5  0x4035c938 in elf::object::lookup_addr 
>> (this=0xa0f18c00, addr=addr@entry=0x100254ce)
>> at core/elf.cc:1015
>> #6  0x4035cb07 in elf::program::> elf::program::modules_list&)>::operator() (
>> __closure=, __closure=, ml=...) 
>> at core/elf.cc:1620
>> #7  elf::program::with_modules> const*):: >
>> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
>> #8  elf::program::lookup_addr 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-05 Thread Waldek Kozaczuk
I cannot reproduce it on Ubuntu 20.20 nor on Fedora 33. Here is the code 
fragment where it happens:

169 bool object::arch_relocate_jump_slot(symbol_module& sym, void *addr, 
Elf64_Sxword addend)

170 {

171 if (sym.symbol) {

172 *static_cast(addr) = sym.relocated_addr();

173 return true;

174 } else {

175 return false;

176 }

177 }
It looks like the write to address 0x10040ca8 in line 172 caused the 
fault. Why?
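
For what it is worth, for a JUMP_SLOT relocation the addr passed in is simply
a GOT entry of the object being relocated (object base + r_offset of the
relocation entry), so a fault on that store means the page holding that GOT
entry was not writable at that moment. A rough illustration of the mechanics
(not OSv code; per a gdb section dump elsewhere in this thread, libzfs.so has
.got at 0x100402f0, so 0x10040ca8 looks like one of its GOT slots):

#include <elf.h>

// Illustration only: what applying a jump-slot relocation boils down to.
// 'base' is where the library was mapped and rela->r_offset is the slot's
// offset, so base + r_offset yields an address inside .got/.got.plt.
static void apply_jump_slot(char* base, const Elf64_Rela* rela, void* resolved)
{
    void** got_entry = reinterpret_cast<void**>(base + rela->r_offset);
    *got_entry = resolved;   // this is the store that page-faults here
}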

And then there is a 2nd page fault in the gdb backtrace, raised while the 1st 
one was being handled (not sure if that is a bug or just an artifact of the 
state the program loading was in).

981 if (dynamic_exists(DT_HASH)) {

 982 auto hashtab = dynamic_ptr(DT_HASH);

 983 return hashtab[1];

 984 }
Is something wrong with the ELF files cpiod.so, mkfs.so or zfs.so?

Can you try to do the same with ROFS?

fs=rofs
On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg wrote:

> Struggling to get scripts/build to run on NixOS because I'm getting a page 
> fault. NixOS does keep shared libraries in nonstandard locations, not sure 
> if that's breaking something. More details below, but any ideas?
>
> As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
> run_cmd("/zpool.so", zpool_args);
>
> The error from scripts/build:
>
> OSv v0.55.0-145-g97f17a7a
> eth0: 192.168.122.15
> Booted up in 154.38 ms
> Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set 
> compression=off osv
> Running mkfs...
> page fault outside application, addr: 0x10040ca8
> [registers]
> RIP: 0x4039c25a 
> 
> RFL: 0x00010202  CS:  0x0008  SS:  0x0010
> RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 0x1006abb0  
> RDX: 0x0002
> RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 0x201f6f30  
> R8:  0xa0a68460
> R9:  0xa0f18da0  R10: 0x  R11: 0x409dd380  
> R12: 0xa0f18c00
> R13: 0xa0f18da0  R14: 0x  R15: 0x409dd380  
> RSP: 0x201f6f20
> Aborted
>
> [backtrace]
> 0x403458d3 
> 0x403477ce 
> 0x40398ba2 
> 0x40397a16 
> 0x40360a13 
> 0x40360c38 
> 0x4039764f 
> 0xa12b880f 
>
> Trying to get a backtrace after connecting with gdb:
> (gdb) bt
> #0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: %d)\n") 
> at runtime.cc:105
> #1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 
> "ef->rflags & processor::rflags_if", 
> file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, 
> func=func@entry=0x40672d1a "page_fault")
> at runtime.cc:139
> #2  0x40398c05 in page_fault (ef=0x80015048) at 
> arch/x64/arch-cpu.hh:107
> #3  
> #4  0x4035c879 in elf::object::symtab_len 
> (this=0xa0f18c00) at core/elf.cc:983
> #5  0x4035c938 in elf::object::lookup_addr 
> (this=0xa0f18c00, addr=addr@entry=0x100254ce)
> at core/elf.cc:1015
> #6  0x4035cb07 in elf::program:: elf::program::modules_list&)>::operator() (
> __closure=, __closure=, ml=...) 
> at core/elf.cc:1620
> #7  elf::program::with_modules const*):: >
> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
> #8  elf::program::lookup_addr (this=0xa0097e70, 
> addr=addr@entry=0x100254ce) at core/elf.cc:1617
> #9  0x404357cc in osv::lookup_name_demangled 
> (addr=addr@entry=0x100254ce, 
> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) 
> at core/demangle.cc:47
> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
> #11 0x4023c6b4 in abort (fmt=fmt@entry=0x40644a9f "Aborted\n") at 
> runtime.cc:121
> #12 0x40202989 in abort () at runtime.cc:98
> #13 0x403458d4 in mmu::vm_sigsegv (ef=0x81215068, 
> addr=) at core/mmu.cc:1314
> #14 mmu::vm_sigsegv (addr=, ef=0x81215068) at 
> core/mmu.cc:1308
> #15 0x403477cf in mmu::vm_fault (addr=addr@entry=17592186309800, 
> ef=ef@entry=0x81215068)
> at core/mmu.cc:1328
> #16 0x40398ba3 in page_fault (ef=0x81215068) at 
> arch/x64/mmu.cc:42
> #17 
> #18 0x4039c25a in elf::object::arch_relocate_jump_slot 
> (this=this@entry=0xa0f18c00, sym=..., 
> addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
> arch/x64/arch-elf.cc:172
> #19 0x40360a14 in elf::object::resolve_pltgot 
> (this=0xa0f18c00, index=)
> at core/elf.cc:843
> #20 0x40360c39 in elf_resolve_pltgot (index=308, 
> obj=0xa0f18c00) at core/elf.cc:1860
> #21 0x40397650 in __elf_resolve_pltgot () at arch/x64/elf-dl.S:47
> #22 0x100254cf in ?? ()
> #23 0xa12b8800 in ?? ()
> #24 0x201f74a0 in ?? ()
> #25 0x100254cf in ?? ()
> #26 0x201f7480 in ?? ()
> #27 0x403f241c in calloc (nmemb=, size= out>) at 

Re: [osv-dev] Re: Pip packages/using Nix

2020-12-05 Thread Waldek Kozaczuk
Which versions of GCC and QEMU/KVM are you using?

On Saturday, December 5, 2020 at 5:44:12 PM UTC-5 Matthew Kenigsberg wrote:

> Struggling to get scripts/build to run on NixOS because I'm getting a page 
> fault. NixOS does keep shared libraries in nonstandard locations, not sure 
> if that's breaking something. More details below, but any ideas?
>
> As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
> run_cmd("/zpool.so", zpool_args);
>
> The error from scripts/build:
>
> OSv v0.55.0-145-g97f17a7a
> eth0: 192.168.122.15
> Booted up in 154.38 ms
> Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set 
> compression=off osv
> Running mkfs...
> page fault outside application, addr: 0x10040ca8
> [registers]
> RIP: 0x4039c25a 
> 
> RFL: 0x00010202  CS:  0x0008  SS:  0x0010
> RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 0x1006abb0  
> RDX: 0x0002
> RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 0x201f6f30  
> R8:  0xa0a68460
> R9:  0xa0f18da0  R10: 0x  R11: 0x409dd380  
> R12: 0xa0f18c00
> R13: 0xa0f18da0  R14: 0x  R15: 0x409dd380  
> RSP: 0x201f6f20
> Aborted
>
> [backtrace]
> 0x403458d3 
> 0x403477ce 
> 0x40398ba2 
> 0x40397a16 
> 0x40360a13 
> 0x40360c38 
> 0x4039764f 
> 0xa12b880f 
>
> Trying to get a backtrace after connecting with gdb:
> (gdb) bt
> #0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: %d)\n") 
> at runtime.cc:105
> #1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 
> "ef->rflags & processor::rflags_if", 
> file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, 
> func=func@entry=0x40672d1a "page_fault")
> at runtime.cc:139
> #2  0x40398c05 in page_fault (ef=0x80015048) at 
> arch/x64/arch-cpu.hh:107
> #3  
> #4  0x4035c879 in elf::object::symtab_len 
> (this=0xa0f18c00) at core/elf.cc:983
> #5  0x4035c938 in elf::object::lookup_addr 
> (this=0xa0f18c00, addr=addr@entry=0x100254ce)
> at core/elf.cc:1015
> #6  0x4035cb07 in elf::program:: elf::program::modules_list&)>::operator() (
> __closure=, __closure=, ml=...) 
> at core/elf.cc:1620
> #7  elf::program::with_modules const*):: >
> (f=..., this=0xa0097e70) at include/osv/elf.hh:702
> #8  elf::program::lookup_addr (this=0xa0097e70, 
> addr=addr@entry=0x100254ce) at core/elf.cc:1617
> #9  0x404357cc in osv::lookup_name_demangled 
> (addr=addr@entry=0x100254ce, 
> buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) 
> at core/demangle.cc:47
> #10 0x4023c4e0 in print_backtrace () at runtime.cc:85
> #11 0x4023c6b4 in abort (fmt=fmt@entry=0x40644a9f "Aborted\n") at 
> runtime.cc:121
> #12 0x40202989 in abort () at runtime.cc:98
> #13 0x403458d4 in mmu::vm_sigsegv (ef=0x81215068, 
> addr=) at core/mmu.cc:1314
> #14 mmu::vm_sigsegv (addr=, ef=0x81215068) at 
> core/mmu.cc:1308
> #15 0x403477cf in mmu::vm_fault (addr=addr@entry=17592186309800, 
> ef=ef@entry=0x81215068)
> at core/mmu.cc:1328
> #16 0x40398ba3 in page_fault (ef=0x81215068) at 
> arch/x64/mmu.cc:42
> #17 
> #18 0x4039c25a in elf::object::arch_relocate_jump_slot 
> (this=this@entry=0xa0f18c00, sym=..., 
> addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
> arch/x64/arch-elf.cc:172
> #19 0x40360a14 in elf::object::resolve_pltgot 
> (this=0xa0f18c00, index=)
> at core/elf.cc:843
> #20 0x40360c39 in elf_resolve_pltgot (index=308, 
> obj=0xa0f18c00) at core/elf.cc:1860
> #21 0x40397650 in __elf_resolve_pltgot () at arch/x64/elf-dl.S:47
> #22 0x100254cf in ?? ()
> #23 0xa12b8800 in ?? ()
> #24 0x201f74a0 in ?? ()
> #25 0x100254cf in ?? ()
> #26 0x201f7480 in ?? ()
> #27 0x403f241c in calloc (nmemb=, size= out>) at core/mempool.cc:1811
> #28 0x90a98000 in ?? ()
> #29 0x in ?? ()
> On Saturday, November 28, 2020 at 1:39:46 PM UTC-7 Matthew Kenigsberg 
> wrote:
>
>> Hi,
>>
>> I'll send something, might take a bit before I find time to work on it 
>> though.
>>
>> Thanks,
>> Matthew
>>
>> On Saturday, November 28, 2020 at 1:11:11 PM UTC-7 Roman Shaposhnik wrote:
>>
>>> On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk  
>>> wrote: 
>>> > 
>>> > Hey, 
>>> > 
>>> > Send a patch with a new app that could demonstrate it, please, if you 
>>> can. I would like to see it. Sounds like a nice improvement. 
>>>
>>> FWIW: I'd love to see it too -- been meaning to play with Nix and this 
>>> gives me a perfect excuse ;-) 
>>>
>>> Thanks, 
>>> Roman. 
>>>
>>


Re: [osv-dev] Re: Pip packages/using Nix

2020-12-05 Thread Matthew Kenigsberg
Struggling to get scripts/build to run on NixOS because I'm getting a page 
fault. NixOS does keep shared libraries in nonstandard locations, not sure 
if that's breaking something. More details below, but any ideas?

As far as I can tell, the error is caused by tools/mkfs/mkfs.cc:71:
run_cmd("/zpool.so", zpool_args);

The error from scripts/build:

OSv v0.55.0-145-g97f17a7a
eth0: 192.168.122.15
Booted up in 154.38 ms
Cmdline: /tools/mkfs.so; /tools/cpiod.so --prefix /zfs/zfs/; /zfs.so set 
compression=off osv
Running mkfs...
page fault outside application, addr: 0x10040ca8
[registers]
RIP: 0x4039c25a 

RFL: 0x00010202  CS:  0x0008  SS:  0x0010
RAX: 0x1007a340  RBX: 0x10040ca8  RCX: 0x1006abb0  
RDX: 0x0002
RSI: 0x201f6f70  RDI: 0xa1058c00  RBP: 0x201f6f30  
R8:  0xa0a68460
R9:  0xa0f18da0  R10: 0x  R11: 0x409dd380  
R12: 0xa0f18c00
R13: 0xa0f18da0  R14: 0x  R15: 0x409dd380  
RSP: 0x201f6f20
Aborted

[backtrace]
0x403458d3 
0x403477ce 
0x40398ba2 
0x40397a16 
0x40360a13 
0x40360c38 
0x4039764f 
0xa12b880f 

Trying to get a backtrace after connecting with gdb:
(gdb) bt
#0  abort (fmt=fmt@entry=0x40644b90 "Assertion failed: %s (%s: %s: %d)\n") 
at runtime.cc:105
#1  0x4023c6fb in __assert_fail (expr=expr@entry=0x40672cf8 
"ef->rflags & processor::rflags_if", 
file=file@entry=0x40672d25 "arch/x64/mmu.cc", line=line@entry=38, 
func=func@entry=0x40672d1a "page_fault")
at runtime.cc:139
#2  0x40398c05 in page_fault (ef=0x80015048) at 
arch/x64/arch-cpu.hh:107
#3  
#4  0x4035c879 in elf::object::symtab_len (this=0xa0f18c00) 
at core/elf.cc:983
#5  0x4035c938 in elf::object::lookup_addr 
(this=0xa0f18c00, addr=addr@entry=0x100254ce)
at core/elf.cc:1015
#6  0x4035cb07 in elf::programoperator() (
__closure=, __closure=, ml=...) 
at core/elf.cc:1620
#7  elf::program::with_modules >
(f=..., this=0xa0097e70) at include/osv/elf.hh:702
#8  elf::program::lookup_addr (this=0xa0097e70, 
addr=addr@entry=0x100254ce) at core/elf.cc:1617
#9  0x404357cc in osv::lookup_name_demangled 
(addr=addr@entry=0x100254ce, 
buf=buf@entry=0x812146d0 "???+19630095", len=len@entry=1024) at 
core/demangle.cc:47
#10 0x4023c4e0 in print_backtrace () at runtime.cc:85
#11 0x4023c6b4 in abort (fmt=fmt@entry=0x40644a9f "Aborted\n") at 
runtime.cc:121
#12 0x40202989 in abort () at runtime.cc:98
#13 0x403458d4 in mmu::vm_sigsegv (ef=0x81215068, 
addr=) at core/mmu.cc:1314
#14 mmu::vm_sigsegv (addr=, ef=0x81215068) at 
core/mmu.cc:1308
#15 0x403477cf in mmu::vm_fault (addr=addr@entry=17592186309800, 
ef=ef@entry=0x81215068)
at core/mmu.cc:1328
#16 0x40398ba3 in page_fault (ef=0x81215068) at 
arch/x64/mmu.cc:42
#17 
#18 0x4039c25a in elf::object::arch_relocate_jump_slot 
(this=this@entry=0xa0f18c00, sym=..., 
addr=addr@entry=0x10040ca8, addend=addend@entry=0) at 
arch/x64/arch-elf.cc:172
#19 0x40360a14 in elf::object::resolve_pltgot 
(this=0xa0f18c00, index=)
at core/elf.cc:843
#20 0x40360c39 in elf_resolve_pltgot (index=308, 
obj=0xa0f18c00) at core/elf.cc:1860
#21 0x40397650 in __elf_resolve_pltgot () at arch/x64/elf-dl.S:47
#22 0x100254cf in ?? ()
#23 0xa12b8800 in ?? ()
#24 0x201f74a0 in ?? ()
#25 0x100254cf in ?? ()
#26 0x201f7480 in ?? ()
#27 0x403f241c in calloc (nmemb=, size=) at core/mempool.cc:1811
#28 0x90a98000 in ?? ()
#29 0x in ?? ()
On Saturday, November 28, 2020 at 1:39:46 PM UTC-7 Matthew Kenigsberg wrote:

> Hi,
>
> I'll send something, might take a bit before I find time to work on it 
> though.
>
> Thanks,
> Matthew
>
> On Saturday, November 28, 2020 at 1:11:11 PM UTC-7 Roman Shaposhnik wrote:
>
>> On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk  
>> wrote:
>> >
>> > Hey,
>> >
>> > Send a patch with a new app that could demonstrate it, please, if you 
>> can. I would like to see it. Sounds like a nice improvement.
>>
>> FWIW: I'd love to see it too -- been meaning to play with Nix and this
>> gives me a perfect excuse ;-)
>>
>> Thanks,
>> Roman.
>>
>



Re: [osv-dev] Re: Pip packages/using Nix

2020-11-28 Thread Matthew Kenigsberg
Hi,

I'll send something, might take a bit before I find time to work on it 
though.

Thanks,
Matthew

On Saturday, November 28, 2020 at 1:11:11 PM UTC-7 Roman Shaposhnik wrote:

> On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk  
> wrote:
> >
> > Hey,
> >
> > Send a patch with a new app that could demonstrate it, please, if you 
> can. I would like to see it. Sounds like a nice improvement.
>
> FWIW: I'd love to see it too -- been meaning to play with Nix and this
> gives me a perfect excuse ;-)
>
> Thanks,
> Roman.
>



Re: [osv-dev] Re: Pip packages/using Nix

2020-11-28 Thread Roman Shaposhnik
On Tue, Nov 24, 2020 at 8:03 AM Waldek Kozaczuk  wrote:
>
> Hey,
>
> Send a patch with a new app that could demonstrate it, please, if you can. I 
> would like to see it. Sounds like a nice improvement.

FWIW: I'd love to see it too -- been meaning to play with Nix and this
gives me a perfect excuse ;-)

Thanks,
Roman.



[osv-dev] Re: Pip packages/using Nix

2020-11-24 Thread Waldek Kozaczuk
Hey,

Send a patch with a new app that could demonstrate it, please, if you can. 
I would like to see it. Sounds like a nice improvement.

Waldek
On Monday, November 23, 2020 at 7:36:49 PM UTC-5 Matthew Kenigsberg wrote:

> Hi,
>
> That definitely helped, thanks for the response!
>
> Haven't had time to look at this in depth, so I feel like I'm not 
> qualified to know if my own suggestion is actually helpful. But looking at 
> osv-apps/python-from-host/Makefile, it seems like nix could do things a bit 
> more cleanly. Nix does manage both the shared libraries and any python 
> dependencies, so rather than having two runs of manifest_from_host.sh and 
> an rsync, I can just tell nix I need python and some pip packages. Running 
> a single command will then give me every path I need. Tested this with 
> python3.8 and Flask and it worked. Personally I find that workflow a little 
> simpler than having to figure out what directories pip packages are 
> installed in.
>
> Another advantage would be that nixpkgs has 60,000 packages, and although 
> I'm sure there's plenty of compatibility issues, I think at times it would 
> be a lot easier to take advantage of the work that's already been done to 
> create all those packages, like in this case Flask works without 
> modification. Haven't used mpm much, so maybe it does things nix can't, but 
> I would guess nix could solve a lot of the same problems.
>
> Sorry if I'm making suggestions about something I don't understand, just 
> thought I'd bounce my ideas off someone who does understand them. Happy to 
> explain anything in more depth or demo what I mean.
>
> Thanks,
> Matthew
> On Saturday, November 7, 2020 at 10:15:28 PM UTC-7 jwkoz...@gmail.com 
> wrote:
>
>> Hi Matthew,
>>
>> I am not familiar with nix and how exactly it would fit. If you look at 
>> the osv-apps repo there are many examples of python 2/3 apps. All of those 
>> are driven by module.py and optional makefiles to do a job of collecting 
>> relevant files to the final OSv image as scripts/build, scripts/module.py 
>> orchestrates it all. Alternatively, there is capstan with its package* 
>> command and *.mpm archives. I am not sure where and how nix would fit into 
>> this.
>>
>> Now the purpose of manifest_from_host.sh is quite simple - given a Linux 
>> shared library/-ies or executable or a directory with those, find all 
>> *dependant* shared libraries based on information in DT_NEEDED elf header. 
>> As you can see it is not specific to Python. On other hand 
>> manifest_from_host.sh is not indended and can not find all other 
>> dependencies (*.py., *.pyc, etc files) full Python runtime needs. My sense 
>> is that would still need to run manifest_from_host.sh against files built 
>> by nix but I might be wrong.
>>
>> Another alternative to building OSv images could be using Docker images 
>> and unpacking them to create corresponding OSv image. For an example look 
>> at 
>> https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker
>>  
>> which uses undocker tool. Another alternative would be then to use Python 
>> docker image in a similar way. And possibly combine it with capstan.
>>
>> I hope it helps a bit,
>> Waldek
>>
>>
>>
>> On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg 
>> wrote:
>>
>>>
>>> Hi,
>>>
>>> Is there a recommended way to add pip packages to osv images?
>>>
>>> Was trying to figure that out and I also have a suggestion: I started 
>>> using nix, which is a package manager, and it seems like it could be a 
>>> really good tool for the job. It keeps track of every dependency for a 
>>> piece of software, and copying all the files to run an image with python38 
>>> and some pip packages only takes one command. I think using nix could 
>>> also make manifest_from_host.sh unnecessary (?)
>>>
>>> Anyways, is there an easier way to be using pip packages?
>>>
>>> Thanks,
>>> Matthew
>>>
>>



[osv-dev] Re: Pip packages/using Nix

2020-11-23 Thread Matthew Kenigsberg
Hi,

That definitely helped, thanks for the response!

Haven't had time to look at this in depth, so I feel like I'm not qualified 
to know if my own suggestion is actually helpful. But looking at 
osv-apps/python-from-host/Makefile, it seems like nix could do things a bit 
more cleanly. Nix does manage both the shared libraries and any python 
dependencies, so rather than having two runs of manifest_from_host.sh and 
an rsync, I can just tell nix I need python and some pip packages. Running 
a single command will then give me every path I need. Tested this with 
python3.8 and Flask and it worked. Personally I find that workflow a little 
simpler than having to figure out what directories pip packages are 
installed in.

Another advantage would be that nixpkgs has 60,000 packages, and although 
I'm sure there's plenty of compatibility issues, I think at times it would 
be a lot easier to take advantage of the work that's already been done to 
create all those packages, like in this case Flask works without 
modification. Haven't used mpm much, so maybe it does things nix can't, but 
I would guess nix could solve a lot of the same problems.

Sorry if I'm making suggestions about something I don't understand, just 
thought I'd bounce my ideas off someone who does understand them. Happy to 
explain anything in more depth or demo what I mean.

Thanks,
Matthew
On Saturday, November 7, 2020 at 10:15:28 PM UTC-7 jwkoz...@gmail.com wrote:

> Hi Matthew,
>
> I am not familiar with nix and how exactly it would fit. If you look at 
> the osv-apps repo there are many examples of python 2/3 apps. All of those 
> are driven by module.py and optional makefiles to do a job of collecting 
> relevant files to the final OSv image as scripts/build, scripts/module.py 
> orchestrates it all. Alternatively, there is capstan with its package* 
> command and *.mpm archives. I am not sure where and how nix would fit into 
> this.
>
> Now the purpose of manifest_from_host.sh is quite simple - given a Linux 
> shared library/-ies or executable or a directory with those, find all 
> *dependant* shared libraries based on information in DT_NEEDED elf header. 
> As you can see it is not specific to Python. On other hand 
> manifest_from_host.sh is not indended and can not find all other 
> dependencies (*.py., *.pyc, etc files) full Python runtime needs. My sense 
> is that would still need to run manifest_from_host.sh against files built 
> by nix but I might be wrong.
>
> Another alternative to building OSv images could be using Docker images 
> and unpacking them to create corresponding OSv image. For an example look 
> at 
> https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker
>  
> which uses undocker tool. Another alternative would be then to use Python 
> docker image in a similar way. And possibly combine it with capstan.
>
> I hope it helps a bit,
> Waldek
>
>
>
> On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg wrote:
>
>>
>> Hi,
>>
>> Is there a recommended way to add pip packages to osv images?
>>
>> Was trying to figure that out and I also have a suggestion: I started 
>> using nix, which is a package manager, and it seems like it could be a 
>> really good tool for the job. It keeps track of every dependency for a 
>> piece of software, and copying all the files to run an image with python38 
>> and some pip packages only takes one command. I think using nix could 
>> also make manifest_from_host.sh unnecessary (?)
>>
>> Anyways, is there an easier way to be using pip packages?
>>
>> Thanks,
>> Matthew
>>
>



[osv-dev] Re: Pip packages/using Nix

2020-11-07 Thread Waldek Kozaczuk
Hi Matthew,

I am not familiar with nix and how exactly it would fit. If you look at the 
osv-apps repo there are many examples of python 2/3 apps. All of those are 
driven by module.py and optional makefiles to do the job of collecting the 
relevant files into the final OSv image, as scripts/build and scripts/module.py 
orchestrate it all. Alternatively, there is capstan with its package* 
command and *.mpm archives. I am not sure where and how nix would fit into 
this.

Now the purpose of manifest_from_host.sh is quite simple - given Linux 
shared libraries or an executable, or a directory with those, find all 
*dependent* shared libraries based on the information in the DT_NEEDED ELF 
header. As you can see, it is not specific to Python. On the other hand, 
manifest_from_host.sh is not intended to and cannot find all the other 
dependencies (*.py, *.pyc, etc. files) that the full Python runtime needs. My 
sense is that you would still need to run manifest_from_host.sh against files 
built by nix, but I might be wrong.
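
As a side note, the DT_NEEDED lookup that manifest_from_host.sh relies on is
easy to reproduce programmatically if anyone wants to experiment with it
against a nix build. A self-contained sketch (assumes a 64-bit ELF file read
on a little-endian host and does only minimal error handling):

#include <elf.h>
#include <cstdio>
#include <vector>

// List the DT_NEEDED entries (direct shared-library dependencies) of an ELF
// file - the same information manifest_from_host.sh walks to collect the
// dependent libraries.
int main(int argc, char** argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <elf-file>\n", argv[0]);
        return 1;
    }
    FILE* f = fopen(argv[1], "rb");
    if (!f) {
        perror("fopen");
        return 1;
    }
    fseek(f, 0, SEEK_END);
    long size = ftell(f);
    fseek(f, 0, SEEK_SET);
    std::vector<unsigned char> buf(size);
    if (fread(buf.data(), 1, size, f) != (size_t)size) {
        fclose(f);
        return 1;
    }
    fclose(f);

    auto* ehdr = reinterpret_cast<Elf64_Ehdr*>(buf.data());
    auto* shdrs = reinterpret_cast<Elf64_Shdr*>(buf.data() + ehdr->e_shoff);
    for (int i = 0; i < ehdr->e_shnum; i++) {
        if (shdrs[i].sh_type != SHT_DYNAMIC) {
            continue;
        }
        // .dynamic's sh_link points at the string table (.dynstr) its entries use
        const char* strtab = reinterpret_cast<const char*>(
            buf.data() + shdrs[shdrs[i].sh_link].sh_offset);
        auto* dyn = reinterpret_cast<Elf64_Dyn*>(buf.data() + shdrs[i].sh_offset);
        for (size_t n = 0; n < shdrs[i].sh_size / sizeof(Elf64_Dyn); n++) {
            if (dyn[n].d_tag == DT_NEEDED) {
                printf("NEEDED %s\n", strtab + dyn[n].d_un.d_val);
            }
        }
    }
    return 0;
}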

Another alternative for building OSv images could be to use Docker images and 
unpack them to create a corresponding OSv image. For an example, look at 
https://github.com/cloudius-systems/osv-apps/tree/master/openjdk12-jre-from-docker
which uses the undocker tool. Another alternative would then be to use the 
Python docker image in a similar way, and possibly combine it with capstan.

I hope it helps a bit,
Waldek



On Tuesday, November 3, 2020 at 10:03:20 AM UTC-5 Matthew Kenigsberg wrote:

>
> Hi,
>
> Is there a recommended way to add pip packages to osv images?
>
> Was trying to figure that out and I also have a suggestion: I started 
> using nix, which is a package manager, and it seems like it could be a 
> really good tool for the job. It keeps track of every dependency for a 
> piece of software, and copying all the files to run an image with python38 
> and some pip packages only takes one command. I think using nix could 
> also make manifest_from_host.sh unnecessary (?)
>
> Anyways, is there an easier way to be using pip packages?
>
> Thanks,
> Matthew
>
