Here is the backtrace I got from gdb:

#0  0x00000000403965f2 in processor::cli_hlt () at arch/x64/processor.hh:247
#1  arch::halt_no_interrupts () at arch/x64/arch.hh:48
#2  osv::halt () at arch/x64/power.cc:26
#3  0x0000000040221b14 in abort (fmt=fmt@entry=0x405ff34b "Aborted\n") at runtime.cc:132
#4  0x0000000040221b32 in abort () at runtime.cc:98
#5  0x0000000040463aec in osv::generate_signal (siginfo=..., ef=0xffff800102932068) at libc/signal.cc:124
#6  0x0000000040463b5b in osv::handle_mmap_fault (addr=<optimized out>, sig=<optimized out>, ef=<optimized out>) at libc/signal.cc:139
#7  0x000000004032f3ea in mmu::vm_fault (addr=<optimized out>, addr@entry=1, ef=ef@entry=0xffff800102932068) at core/mmu.cc:1336
#8  0x000000004038f787 in page_fault (ef=0xffff800102932068) at arch/x64/mmu.cc:42
#9  <signal handler called>
#10 0x000010003896052a in google::protobuf::internal::AddDescriptors(google::protobuf::internal::DescriptorTable const*) ()
#11 0x00001000388aceb6 in ?? ()
#12 0x000000004033ff9a in elf::object::run_init_funcs (this=0xffffa0011ad8c800, argc=argc@entry=0, argv=argv@entry=0x0) at core/elf.cc:1175
#13 0x000000004034102b in elf::program::init_library (this=this@entry=0xffffa000001fbdf0, argc=argc@entry=0, argv=argv@entry=0x0) at core/elf.cc:1486
#14 0x00000000403473ac in elf::program::get_library (this=this@entry=0xffffa000001fbdf0, name="/lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so", extra_path=std::vector of length 0, capacity 0, delay_init=delay_init@entry=false) at core/elf.cc:1465
#15 0x0000000040462f4a in dlopen (filename=0x200006f47520 "/lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so", flags=<optimized out>) at libc/dlfcn.cc:54
#16 0x0000100000937229 in _PyImport_FindSharedFuncptr ()
#17 0x000010000095dba7 in _PyImport_LoadDynamicModuleWithSpec ()
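
Frame #11 has no symbol, but its address can be compared with the init function addresses printed in the ELF log quoted further down. The C++ sketch below only does that subtraction. The constant and function names are illustrative, and treating frame #11 as init function 28 is an assumption suggested by the log ("Executing 28 init func 0x1000388ace80" is the last line before the abort), not something gdb confirmed.

```cpp
#include <cassert>
#include <cstdint>

// Addresses copied verbatim from the gdb backtrace and the ELF init log.
constexpr std::uint64_t kFrame11Addr = 0x00001000388aceb6ULL; // frame #11, "?? ()"
constexpr std::uint64_t kInitFunc27  = 0x1000388ace50ULL;     // "Executing 27 init func"
constexpr std::uint64_t kInitFunc28  = 0x1000388ace80ULL;     // "Executing 28 init func"

// Distance of an address from a function's entry point.
constexpr std::uint64_t offset_from(std::uint64_t addr, std::uint64_t entry) {
    return addr - entry;
}

// Frame #11 lies 0x36 bytes past the entry of init function 28, which is
// consistent with (but does not prove) that function calling AddDescriptors.
static_assert(offset_from(kFrame11Addr, kInitFunc28) == 0x36,
              "frame #11 should be init func 28 + 0x36");
```

If that assumption holds, disassembling from 0x1000388ace80 in gdb should show the call into AddDescriptors shortly before offset 0x36.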

It seems the problem is in this function:
google::protobuf::internal::AddDescriptors. Unfortunately, the tensorflow
build strips the debug info even though I add -g when I compile it.
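
Without debug info for the library, one way to stop just before the faulting call is the conditional debug statement Waldek suggested in the quoted message below. A minimal C++ sketch of that check follows. The helper name `is_target_init_func` and the constant `kTargetLib` are hypothetical, the names `pathname` and `index` mirror the `pathname` and `i` in his snippet, and where exactly this would be called from in core/elf.cc is left open.

```cpp
#include <cassert>
#include <cstring>

// Library path copied from the ELF log in this thread.
static const char* const kTargetLib =
    "/lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so";

// Hypothetical helper: true only for the given init function index of the
// target library, so a breakpoint placed on the "true" branch fires once.
bool is_target_init_func(const char* pathname, int index, int wanted_index = 28) {
    return pathname != nullptr
        && std::strcmp(kTargetLib, pathname) == 0  // exact match, no leading space
        && index == wanted_index;
}
```

The comparison is exact, so a stray leading space in the string literal would make it never match.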



On Thu, Jan 23, 2020 at 5:33 PM Waldek Kozaczuk <[email protected]>
wrote:

> Yeah, that is a pretty big image given that the default memory size is 4GB. I
> wonder if increasing the memory to something like 8GB would make rofs work.
> rofs is not very memory efficient - see
> https://github.com/cloudius-systems/osv/issues/979. This is one of the
> issues I would like to work on next.
>
>
> On Thursday, January 23, 2020 at 6:28:38 PM UTC-5, zhiting zhu wrote:
>>
>> The image is 2.5 G.
>>
>> On Thu, Jan 23, 2020 at 5:26 PM Waldek Kozaczuk <[email protected]>
>> wrote:
>>
>>> BTW how big is your image?
>>>
>>> I wonder if ROFS hangs because it runs out of memory when trying to
>>> load files into memory.
>>>
>>> On Thursday, January 23, 2020 at 6:24:21 PM UTC-5, zhiting zhu wrote:
>>>>
>>>> Yeah, the ZFS image doesn't hang. I don't know why the rofs image hangs.
>>>> I need to increase the qemu memory in upload_manifest.py, otherwise it
>>>> hangs while building ZFS images.
>>>>
>>>> On Thu, Jan 23, 2020 at 5:17 PM Waldek Kozaczuk <[email protected]>
>>>> wrote:
>>>>
>>>>> It seems like it got stuck while trying to mount the filesystem. The
>>>>> next boot message would normally be "VFS: mounting devfs at /dev".
>>>>>
>>>>> I wonder if the image (usr.img) is somehow locked or something. Have
>>>>> you tried to rebuild the image? Try zfs.
>>>>>
>>>>> Waldek
>>>>>
>>>>> On Thursday, January 23, 2020 at 5:54:08 PM UTC-5, zhiting zhu wrote:
>>>>>>
>>>>>> The native-example and python images work with qemu. It seems it only
>>>>>> hangs on my custom tensorflow image. I'm only passing
>>>>>> --verbose /python3 to run.py
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 23, 2020 at 4:34 PM Waldek Kozaczuk <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Are you passing any parameters to ./scripts/run.py? I would also
>>>>>>> suggest specifying just a single vCPU - it should make debugging easier.
>>>>>>>
>>>>>>> Also does it hang with this app only or others as well? Can you try
>>>>>>> this:
>>>>>>> ./scripts/build image=native-example
>>>>>>> ./scripts/run.py
>>>>>>>
>>>>>>> Waldek
>>>>>>>
>>>>>>> On Thursday, January 23, 2020 at 5:28:43 PM UTC-5, zhiting zhu wrote:
>>>>>>>>
>>>>>>>> Unfortunately, I can't boot the vm with qemu. It's hanging at the
>>>>>>>> beginning.
>>>>>>>>
>>>>>>>> I'm seeing this:
>>>>>>>> bsd: initializing - done
>>>>>>>> VFS: mounting ramfs at /
>>>>>>>> VFS: mounting devfs at /dev
>>>>>>>> net: initializing - done
>>>>>>>> vga: Add VGA device instance
>>>>>>>> eth0: ethernet address: 52:54:00:12:34:56
>>>>>>>> virtio-blk: Add blk device instances 0 as vblk0, devsize=1192516096
>>>>>>>> random: virtio-rng registered as a source.
>>>>>>>> random: intel drng, rdrand registered as a source.
>>>>>>>> random: <Software, Yarrow> initialized
>>>>>>>> VFS: unmounting /dev
>>>>>>>> VFS: mounting rofs at /rofs
>>>>>>>> random: device unblocked.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Jan 23, 2020 at 3:25 PM zhiting zhu <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Here's the output log. See the file attached.
>>>>>>>>>
>>>>>>>>> On Thu, Jan 23, 2020 at 7:19 AM Nadav Har'El <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Jan 23, 2020 at 2:23 PM Waldek Kozaczuk <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Can you send us the full output? I wonder if there are any
>>>>>>>>>>> warnings before it.
>>>>>>>>>>>
>>>>>>>>>>> I see you are running this on firecracker. Can you run it under
>>>>>>>>>>> qemu, connect to it with gdb, and see if you get a better
>>>>>>>>>>> stack trace?
>>>>>>>>>>>
>>>>>>>>>>> You can add another debug statement like this:
>>>>>>>>>>>
>>>>>>>>>>> if (strcmp("/lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so", pathname) == 0 && i == 28) {
>>>>>>>>>>> ...
>>>>>>>>>>> // Put breakpoint here
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> and try to see which statement causes the fault. Make sure to do
>>>>>>>>>>> "osv syms" to get as much debug info resolved as possible (see
>>>>>>>>>>> https://github.com/cloudius-systems/osv/wiki/Debugging-OSv).
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I hope that "osv syms" will find the
>>>>>>>>>> newly-loaded-but-not-yet-completely-loaded libraries. If it doesn't,
>>>>>>>>>> maybe we can fix the order in which the array that "osv syms" uses
>>>>>>>>>> gets written during loading.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I wonder if this has to do with the order of initializing the ELF
>>>>>>>>>>> objects when called by dlopen().
>>>>>>>>>>>
>>>>>>>>>>> Waldek
>>>>>>>>>>>
>>>>>>>>>>> PS. If there are no more clues, the next step would be to add an
>>>>>>>>>>> app so we can build it and reproduce the problem.
>>>>>>>>>>>
>>>>>>>>>>> On Wednesday, January 22, 2020 at 6:23:30 PM UTC-5, zhiting zhu
>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hey,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm hitting this error:
>>>>>>>>>>>>
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing DT_INIT function
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Finished executing DT_INIT function
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 56 DT_INIT_ARRAYSZ functions
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 0 init func 0x1000388ad500
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 1 init func 0x1000388ac8a0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 2 init func 0x1000388ac8d0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 3 init func 0x1000388ac900
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 4 init func 0x1000388ac930
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 5 init func 0x1000388ac960
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 6 init func 0x1000388ac990
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 7 init func 0x1000388ac9c0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 8 init func 0x1000388ac9f0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 9 init func 0x1000388aca20
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 10 init func 0x1000388aca50
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 11 init func 0x1000388aca80
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 12 init func 0x1000388acab0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 13 init func 0x1000388acae0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 14 init func 0x1000388acb10
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 15 init func 0x1000388acb40
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 16 init func 0x1000388acb70
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 17 init func 0x1000388acc50
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 18 init func 0x1000388acc80
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 19 init func 0x1000388accb0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 20 init func 0x1000388acce0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 21 init func 0x1000388acd10
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 22 init func 0x1000388acd40
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 23 init func 0x1000388acd70
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 24 init func 0x1000388acda0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 25 init func 0x1000388acdd0
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 26 init func 0x1000388ace00
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 27 init func 0x1000388ace50
>>>>>>>>>>>> ELF [tid:51, /lib/python3.6/google/protobuf/pyext/_message.cpython-36m-x86_64-linux-gnu.so]: Executing 28 init func 0x1000388ace80
>>>>>>>>>>>> Aborted
>>>>>>>>>>>>
>>>>>>>>>>>> [backtrace]
>>>>>>>>>>>> 0x0000000040463abb <osv::generate_signal(siginfo&, exception_frame*)+59>
>>>>>>>>>>>> 0x0000000040463b2a <osv::handle_mmap_fault(unsigned long, int, exception_frame*)+26>
>>>>>>>>>>>> 0x000000004032f3e9 <mmu::vm_fault(unsigned long, exception_frame*)+185>
>>>>>>>>>>>> 0x000000004038f7b6 <page_fault+166>
>>>>>>>>>>>> 0x000000004038e5f6 <???+1077470710>
>>>>>>>>>>>> 0x0000000040341042 <elf::program::init_library(int, char**)+402>
>>>>>>>>>>>> 0x00000000403473db <elf::program::get_library(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >, bool)+715>
>>>>>>>>>>>> 0x0000000040462f19 <dlopen+153>
>>>>>>>>>>>> 0x0000100000937228 <_PyImport_FindSharedFuncptr+376>
>>>>>>>>>>>> 0x006567617373656c <???+1936942444>
>>>>>>>>>>>> 2020-01-22T17:13:51.345740567 [anonymous-instance:ERROR:vmm/src/lib.rs:1658] Failed to log metrics: Logger was not initialized.
>>>>>>>>>>>>
>>>>>>>>>>>> Are there any clues on how to debug this? The function pointer
>>>>>>>>>>>> seems to point to a valid address, but I get a seg fault when
>>>>>>>>>>>> executing it.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Zhiting
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>> You received this message because you are subscribed to the
>>>>>>>>>>> Google Groups "OSv Development" group.
>>>>>>>>>>> To unsubscribe from this group and stop receiving emails from
>>>>>>>>>>> it, send an email to [email protected].
>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>> https://groups.google.com/d/msgid/osv-dev/ecdc93a7-2a7e-4d39-87e9-6de15578b7df%40googlegroups.com
>>>>>>>>>>> <https://groups.google.com/d/msgid/osv-dev/ecdc93a7-2a7e-4d39-87e9-6de15578b7df%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>> .
>>>>>>>>>>>
