Re: [osv-dev] New user: trouble running a simple program

Waldek Kozaczuk Thu, 21 May 2020 12:16:34 -0700

I connected with gdb and here is the stack trace I got for the main app thread:

#0  sched::thread::switch_to (this=this@entry=0xffff8000001d1040) at arch/x64/arch-switch.hh:108
#1  0x000000004040dace in sched::cpu::reschedule_from_interrupt (this=0xffff80000001e040, called_from_yield=called_from_yield@entry=false, preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
#2  0x000000004040e800 in sched::cpu::schedule () at include/osv/sched.hh:1315
#3  0x000000004040e8e6 in sched::thread::wait (this=this@entry=0xffff800000f0a040) at core/sched.cc:1216
#4  0x000000004043ca86 in sched::thread::do_wait_for<lockfree::mutex, sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
#5  sched::thread::wait_for<waitqueue&> (mtx=...) at include/osv/sched.hh:1225
#6  waitqueue::wait (this=this@entry=0x408fa650 <mmu::vma_list_mutex+48>, mtx=...) at core/waitqueue.cc:56
#7  0x00000000403eb27b in rwlock::reader_wait_lockable (this=<optimized out>) at core/rwlock.cc:174
#8  rwlock::rlock (this=this@entry=0x408fa620 <mmu::vma_list_mutex>) at core/rwlock.cc:29
#9  0x000000004034b88c in rwlock_for_read::lock (this=0x408fa620 <mmu::vma_list_mutex>) at include/osv/rwlock.h:113
#10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/9/bits/std_mutex.h:159
#11 lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock (lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
#12 mmu::vm_fault (addr=17592186081280, addr@entry=17592186083096, ef=ef@entry=0xffff800000f0f068) at core/mmu.cc:1333
#13 0x00000000403adf7c in page_fault (ef=0xffff800000f0f068) at arch/x64/mmu.cc:42
#14 <signal handler called>
#15 0x00000000405bf0cd in _Unwind_IteratePhdrCallback ()
#16 0x000000004047fd37 in <lambda(const elf::program::modules_list&)>::operator() (ml=..., __closure=<synthetic pointer>) at libc/dlfcn.cc:118
#17 elf::program::with_modules<dl_iterate_phdr(int (*)(dl_phdr_info*, size_t, void*), void*)::<lambda(const elf::program::modules_list&)> > (f=..., this=0xffffa0000009cbb0) at include/osv/elf.hh:698
#18 dl_iterate_phdr (callback=0x405befa0 <_Unwind_IteratePhdrCallback>, data=0x200000700520) at libc/dlfcn.cc:99
#19 0x00000000405c0255 in _Unwind_Find_FDE ()
#20 0x00000000405bc693 in uw_frame_state_for ()
#21 0x00000000405be1da in _Unwind_RaiseException ()
#22 0x00000000404c4d1c in __cxa_throw ()
#23 0x0000000040205229 in mmu::find_hole (start=<optimized out>, size=<optimized out>) at include/osv/error.h:36
#24 0x000000004034ecea in mmu::allocate (v=v@entry=0xffffa00000cf2b80, start=35184372088832, start@entry=0, size=size@entry=9223372036854779904, search=search@entry=true) at core/mmu.cc:1113
#25 0x000000004034fa97 in mmu::map_anon (addr=addr@entry=0x0, size=size@entry=9223372036854779904, flags=flags@entry=2, perm=perm@entry=3) at core/mmu.cc:1219
#26 0x00000000403f89a0 in memory::mapped_malloc_large (offset=64, size=9223372036854779904) at core/mempool.cc:919
#27 memory::malloc_large (size=9223372036854779904, alignment=16, block=true, contiguous=false) at core/mempool.cc:919
#28 0x00000000403fa272 in std_malloc (size=9223372036854775807, alignment=16) at core/mempool.cc:1795
#29 0x00000000403fa63b in malloc (size=9223372036854775807) at core/mempool.cc:2001
#30 0x00001000000075d5 in main ()
#31 0x0000000040444c11 in osv::application::run_main (this=0xffffa0007ffb4210) at /usr/include/c++/9/bits/stl_vector.h:915
#32 0x0000000040444d65 in __libc_start_main (main=0x100000007560 <main>) at core/app.cc:37
#33 0x000010000000801e in _start ()


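The app is trying to allocate a huge amount of memory: frame #29 shows 
malloc() called with size=9223372036854775807, i.e. 2^63 - 1. It looks like 
we fail in find_hole(), probably at throw make_error(ENOMEM);. The page 
fault taken while unwinding that exception (frames #12-#14) then waits on 
mmu::vma_list_mutex, which would explain the hang.

I wonder whether the app (https://github.com/jnovy/pxz/blob/master/pxz.c) 
is passing such a size itself, or whether there is some bug on our side.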

(BTW, "osv info threads" fails like this; it would be nice to fix it:

(gdb) osv info threads
   1 (0xffff800000017040) reclaimer       cpu0 status::waiting condvar::wait(lockfree::mutex*, sched::timer*) at core/condvar.cc:43 vruntime  6.07461e-25
Python Exception <class 'Exception'> Class does not extend list_base_hook: sched::timer_base: 
Error occurred in Python: Class does not extend list_base_hook: sched::timer_base
)

When I examined pxz.c, I saw that it eventually calls execve(), which will 
definitely NOT work in OSv (OSv does not support processes, so forking does 
not work; there is a research fork of OSv that does support it, and I 
recently sent a paper about it).

void __attribute__((noreturn)) run_xz( char **argv, char **envp ) {
        execve(XZ_BINARY, argv, envp);
        error(0, errno, "execution of "XZ_BINARY" binary failed");
        exit(EXIT_FAILURE);
}

xz itself seems to work fine (at least with --help):

./scripts/manifest_from_host.sh -w xz && ./scripts/build --append-manifest fs=rofs
./scripts/firecracker.py 
OSv v0.55.0-9-gc13529d9
Booted up in 7.42 ms
Cmdline: /xz --help 
Usage: /xz [OPTION]... [FILE]...
Compress or decompress FILEs in the .xz format.

  -z, --compress      force compression
  -d, --decompress    force decompression
  -t, --test          test compressed file integrity
  -l, --list          list information about .xz files
  -k, --keep          keep (don't delete) input files
  -f, --force         force overwrite of output file and (de)compress links
  -c, --stdout        write to standard output and don't delete input files
  -0 ... -9           compression preset; default is 6; take compressor *and*
                      decompressor memory usage into account before using 7-9!
  -e, --extreme       try to improve compression ratio by using more CPU time;
                      does not affect decompressor memory requirements
  -T, --threads=NUM   use at most NUM threads; the default is 1; set to 0
                      to use as many threads as there are processor cores
  -q, --quiet         suppress warnings; specify twice to suppress errors too
  -v, --verbose       be verbose; specify twice for even more verbose
  -h, --help          display this short help and exit
  -H, --long-help     display the long help (lists also the advanced options)
  -V, --version       display the version number and exit

With no FILE, or when FILE is -, read standard input.

Report bugs to <lasse.col...@tukaani.org> (in English or Finnish).
XZ Utils home page: <https://tukaani.org/xz/>

Waldek

On Thursday, May 21, 2020 at 6:59:07 AM UTC-4, Nadav Har'El wrote:
>
> On Thu, May 21, 2020 at 12:46 PM De Vries <f1r3fl...@gmail.com> wrote:
>
>> Hi,
>>
>> Sorry if this is a bit of a newbie question. I'm trying to run a pretty 
>> simple application on OSv: pxz <https://github.com/jnovy/pxz>. I'm able 
>> to run other apps like mysql for example without any problem.
>> I have tried this the following way. First, I compiled the pxz executable 
>> with the -fPIE flag on the host machine, then put it in a new folder at 
>> osv/apps/pxz. I then ran the following:
>> ./scripts/manifest_from_host.sh -r ~/osv/apps/pxz/pxz > ./apps/pxz/usr.manifest
>> ./scripts/build image=pxz
>>
>> It generates the following usr.manifest
>> # (PIE) Position Independent Executable
>> /pxz: /home/user1/osv/apps/pxz/pxz
>> # --------------------
>> # Dependencies
>> # --------------------
>> /usr/lib/libgomp.so.1: /usr/lib/x86_64-linux-gnu/libgomp.so.1
>> /usr/lib/liblzma.so.5: /lib/x86_64-linux-gnu/liblzma.so.5
>> # --------------------
>>
>> Running it with 
>> ./scripts/run.py -e "pxz --version"
>>
>> Results in 
>> OSv v0.55.0-6-g557251e1
>> eth0: 192.168.122.15
>> Booted up in 407.56 ms
>> Cmdline: pxz --version
>>
>> But it just hangs. No errors, but also no output. I have tried actually 
>> using pxz (not just --version) to compress a file but that also hangs 
>> indefinitely (while this works fine on the host machine).
>>
>
> It's hard to say. It seems like you did everything right. I assume that if 
> you run "pxz --version" on the host it works properly - prints a version 
> number and exits - right?
> During the "hang", is the OSv VM in a busy loop ("top" will show it taking 
> 100% CPU), or is it waiting for something?
>
> One thing you can do to figure out what is going on is to attach gdb to 
> the running VM, and inquire from it what threads are running, and what they 
> are waiting for.
> It's not trivial to do, but not particularly difficult either, and it is 
> explained well (I hope) here: 
> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv#debugging-osv-with-gdb
> Note that you don't need to rebuild OSv specially for debugging to debug 
> it this way. 
>  
>
>>
>> Running ./scripts/run.py with the -V flag looks completely fine except 
>> maybe for the last line that is printed (after it prints Cmdline: pxz 
>> --version):
>> sysconf(): stubbed for parameter 0
>>
>>
> This is the _SC_ARG_MAX parameter to sysconf(). It is indeed not implemented 
> (and can be trivially implemented), but I doubt that this is the problem 
> causing the hang. I also wonder why this program would need to check 
> _SC_ARG_MAX if it's just planning to print the version number, not exec() 
> anything. You can look at this software's source code to see what it does 
> with _SC_ARG_MAX.
>
>  
>
>> I have also tried to run pxz the way it's done in the native-example 
>> application, but that also results in it hanging indefinitely.
>> What could be the issue here?
>>
>> -- 
>> You received this message because you are subscribed to the Google Groups 
>> "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to osv...@googlegroups.com.
>> To view this discussion on the web visit 
>> https://groups.google.com/d/msgid/osv-dev/9ce2c259-c6e9-475d-aa73-e7e6d71cd722%40googlegroups.com.
>>
>
