I connected with gdb, and here is the stack trace I got for the main app thread:

#0  sched::thread::switch_to (this=this@entry=0xffff8000001d1040) at 
arch/x64/arch-switch.hh:108
#1  0x000000004040dace in sched::cpu::reschedule_from_interrupt 
(this=0xffff80000001e040, called_from_yield=called_from_yield@entry=false, 
    preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
#2  0x000000004040e800 in sched::cpu::schedule () at 
include/osv/sched.hh:1315
#3  0x000000004040e8e6 in sched::thread::wait 
(this=this@entry=0xffff800000f0a040) at core/sched.cc:1216
#4  0x000000004043ca86 in sched::thread::do_wait_for<lockfree::mutex, 
sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
#5  sched::thread::wait_for<waitqueue&> (mtx=...) at 
include/osv/sched.hh:1225
#6  waitqueue::wait (this=this@entry=0x408fa650 <mmu::vma_list_mutex+48>, 
mtx=...) at core/waitqueue.cc:56
#7  0x00000000403eb27b in rwlock::reader_wait_lockable (this=<optimized 
out>) at core/rwlock.cc:174
#8  rwlock::rlock (this=this@entry=0x408fa620 <mmu::vma_list_mutex>) at 
core/rwlock.cc:29
#9  0x000000004034b88c in rwlock_for_read::lock (this=0x408fa620 
<mmu::vma_list_mutex>) at include/osv/rwlock.h:113
#10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=..., this=<synthetic 
pointer>) at /usr/include/c++/9/bits/std_mutex.h:159
#11 lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock 
(lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
#12 mmu::vm_fault (addr=17592186081280, addr@entry=17592186083096, 
ef=ef@entry=0xffff800000f0f068) at core/mmu.cc:1333
#13 0x00000000403adf7c in page_fault (ef=0xffff800000f0f068) at 
arch/x64/mmu.cc:42
#14 <signal handler called>
#15 0x00000000405bf0cd in _Unwind_IteratePhdrCallback ()
#16 0x000000004047fd37 in <lambda(const 
elf::program::modules_list&)>::operator() (ml=..., __closure=<synthetic 
pointer>) at libc/dlfcn.cc:118
#17 elf::program::with_modules<dl_iterate_phdr(int (*)(dl_phdr_info*, 
size_t, void*), void*)::<lambda(const elf::program::modules_list&)> > 
(f=..., 
    this=0xffffa0000009cbb0) at include/osv/elf.hh:698
#18 dl_iterate_phdr (callback=0x405befa0 <_Unwind_IteratePhdrCallback>, 
data=0x200000700520) at libc/dlfcn.cc:99
#19 0x00000000405c0255 in _Unwind_Find_FDE ()
#20 0x00000000405bc693 in uw_frame_state_for ()
#21 0x00000000405be1da in _Unwind_RaiseException ()
#22 0x00000000404c4d1c in __cxa_throw ()
#23 0x0000000040205229 in mmu::find_hole (start=<optimized out>, 
size=<optimized out>) at include/osv/error.h:36
#24 0x000000004034ecea in mmu::allocate (v=v@entry=0xffffa00000cf2b80, 
start=35184372088832, start@entry=0, size=size@entry=9223372036854779904, 
    search=search@entry=true) at core/mmu.cc:1113
#25 0x000000004034fa97 in mmu::map_anon (addr=addr@entry=0x0, 
size=size@entry=9223372036854779904, flags=flags@entry=2, perm=perm@entry=3)
    at core/mmu.cc:1219
#26 0x00000000403f89a0 in memory::mapped_malloc_large (offset=64, 
size=9223372036854779904) at core/mempool.cc:919
#27 memory::malloc_large (size=9223372036854779904, alignment=16, 
block=true, contiguous=false) at core/mempool.cc:919
#28 0x00000000403fa272 in std_malloc (size=9223372036854775807, 
alignment=16) at core/mempool.cc:1795
#29 0x00000000403fa63b in malloc (size=9223372036854775807) at 
core/mempool.cc:2001
#30 0x00001000000075d5 in main ()
#31 0x0000000040444c11 in osv::application::run_main 
(this=0xffffa0007ffb4210) at /usr/include/c++/9/bits/stl_vector.h:915
#32 0x0000000040444d65 in __libc_start_main (main=0x100000007560 <main>) at 
core/app.cc:37
#33 0x000010000000801e in _start ()

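It is trying to allocate a huge amount of memory (malloc() is called with 
size=9223372036854775807, i.e. 2^63 - 1), and it looks like we fail in 
find_hole(), probably at throw make_error(ENOMEM); the top of the trace 
(frames #0-#14) then shows a page fault taken during exception unwinding, 
with the thread waiting for mmu::vma_list_mutex in vm_fault().

I wonder whether the app (https://github.com/jnovy/pxz/blob/master/pxz.c) is 
passing such a size, or whether there is some bug on our side.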
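
A minimal reproducer could help separate the two possibilities. The sketch below is not taken from pxz.c; the function name and the HUGE_REQUEST macro are made up for illustration. It issues the same request size seen in frame #29 and reports whether malloc() fails cleanly. On a 64-bit Linux host with glibc the expectation is a NULL return, so a hang on OSv with the same code would point at our side rather than at the app.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constant: on a 64-bit target SIZE_MAX / 2 is
 * 9223372036854775807, the size passed to malloc() in frame #29. */
#define HUGE_REQUEST ((size_t)SIZE_MAX / 2)

/* Hypothetical helper: returns 1 if malloc(size) fails cleanly by
 * returning NULL, 0 if the allocation succeeds. */
int huge_malloc_fails_cleanly(size_t size)
{
    /* volatile keeps the compiler from assuming malloc() succeeded
     * and optimizing the malloc()/free() pair away. */
    void *volatile p = malloc(size);

    if (p == NULL) {
        return 1;
    }
    free(p);
    return 0;
}
```

Calling huge_malloc_fails_cleanly(HUGE_REQUEST) from a small main(), built as a PIE and run both on the host and on OSv, should show whether the oversized request alone is enough to trigger the hang.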

(BTW, osv info threads fails like this; it would be nice to fix it:

(gdb) osv info threads
   1 (0xffff800000017040) reclaimer       cpu0 status::waiting 
condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43 
vruntime  6.07461e-25
Python Exception <class 'Exception'> Class does not extend list_base_hook: 
sched::timer_base: 
Error occurred in Python: Class does not extend list_base_hook: 
sched::timer_base
)

When I examined pxz.c, I saw that it eventually calls execve(), which will 
definitely NOT work in OSv. OSv does not support processes, so forking does 
not work (there is a research fork of OSv that does that; I sent a paper 
about it recently).

void __attribute__((noreturn)) run_xz( char **argv, char **envp ) {
        execve(XZ_BINARY, argv, envp);
        error(0, errno, "execution of "XZ_BINARY" binary failed");
        exit(EXIT_FAILURE);
}

xz seems to work fine (at least --help):

./scripts/manifest_from_host.sh -w xz && ./scripts/build --append-manifest fs=rofs
./scripts/firecracker.py 
OSv v0.55.0-9-gc13529d9
Booted up in 7.42 ms
Cmdline: /xz --help 
Usage: /xz [OPTION]... [FILE]...
Compress or decompress FILEs in the .xz format.

  -z, --compress      force compression
  -d, --decompress    force decompression
  -t, --test          test compressed file integrity
  -l, --list          list information about .xz files
  -k, --keep          keep (don't delete) input files
  -f, --force         force overwrite of output file and (de)compress links
  -c, --stdout        write to standard output and don't delete input files
  -0 ... -9           compression preset; default is 6; take compressor *and*
                      decompressor memory usage into account before using 7-9!
  -e, --extreme       try to improve compression ratio by using more CPU time;
                      does not affect decompressor memory requirements
  -T, --threads=NUM   use at most NUM threads; the default is 1; set to 0
                      to use as many threads as there are processor cores
  -q, --quiet         suppress warnings; specify twice to suppress errors too
  -v, --verbose       be verbose; specify twice for even more verbose
  -h, --help          display this short help and exit
  -H, --long-help     display the long help (lists also the advanced options)
  -V, --version       display the version number and exit

With no FILE, or when FILE is -, read standard input.

Report bugs to <[email protected]> (in English or Finnish).
XZ Utils home page: <https://tukaani.org/xz/>

Waldek

On Thursday, May 21, 2020 at 6:59:07 AM UTC-4, Nadav Har'El wrote:
>
> On Thu, May 21, 2020 at 12:46 PM De Vries <[email protected]> wrote:
>
>> Hi,
>>
>> Sorry if this is a bit of a newbie question. I'm trying to run a pretty 
>> simple application on OSv: pxz <https://github.com/jnovy/pxz>. I'm able 
>> to run other apps like mysql for example without any problem.
>> I have tried this the following way. First, I compiled the pxz executable 
>> with the -fPIE flag on the host machine, then put it in a new folder at 
>> osv/apps/pxz. I then ran the following:
>> ./scripts/manifest_from_host.sh -r ~/osv/apps/pxz/pxz > ./apps/pxz/usr.manifest
>> ./scripts/build image=pxz
>>
>> It generates the following usr.manifest
>> # (PIE) Position Independent Executable
>> /pxz: /home/user1/osv/apps/pxz/pxz
>> # --------------------
>> # Dependencies
>> # --------------------
>> /usr/lib/libgomp.so.1: /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> /usr/lib/liblzma.so.5: /lib/x86_64-linux-gnu/liblzma.so.5
>> # --------------------
>>
>> Running it with 
>> ./scripts/run.py -e "pxz --version"
>>
>> Results in 
>> OSv v0.55.0-6-g557251e1
>> eth0: 192.168.122.15
>> Booted up in 407.56 ms
>> Cmdline: pxz --version
>>
>> But it just hangs. No errors, but also no output. I have tried actually 
>> using pxz (not just --version) to compress a file but that also hangs 
>> indefinitely (while this works fine on the host machine).
>>
>
> It's hard to say. It seems like you did everything right. I assume that if 
> you run "pxz --version" on the host it works properly - prints a version 
> number and exits - right?
> During the "hang", does OSv do some busy loop ("top" will show you the OSv 
> vm taking 100% CPU) or waits for something?
>
> One thing you can do to figure out what is going on is to attach gdb to 
> the running VM, and inquire from it what threads are running, and what they 
> are waiting for.
> It's not trivial to do, but not particularly difficult either, and explained 
> well (I hope) here: 
> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb
> Note that you don't need to rebuild OSv specially for debugging to debug 
> it this way. 
>  
>
>>
>> Running ./scripts/run.py with the -V flag looks completely fine except 
>> maybe for the last line that is printed (after it prints Cmdline: pxz 
>> --version):
>> sysconf(): stubbed for parameter 0
>>
>>
> This is the _SC_ARG_MAX parameter to sysconf(). It is indeed not implemented 
> (and can be trivially implemented), but I doubt that this is the problem 
> causing the hang. (I also wonder why this program would need to check 
> _SC_ARG_MAX if it's just planning to print the version number, not exec() 
> anything. You can look at this software's source code to see what it does 
> with _SC_ARG_MAX.)
>
>  
>
>> I have also tried to run pxz using the way it's done in the native-example 
>> application, but that also results in it hanging indefinitely.
>> What could be the issue here?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/osv-dev/9ce2c259-c6e9-475d-aa73-e7e6d71cd722%40googlegroups.com.
>>
>
