Nadav, I have sent a patch "mempool: resize l1_pool_stats once" that should address this.
David, Could you also please test this patch to see if it fixes the crashes? Waldek On Friday, April 30, 2021 at 12:27:47 PM UTC-4 Nadav Har'El wrote: > On Fri, Apr 30, 2021 at 4:28 AM Waldek Kozaczuk <[email protected]> > wrote: > >> And I was also able to replicate it on QEMU running OSv in PVH mode on >> microvm machine (similar to how firecracker runs it) again in nested >> virtualization. >> >> ./scripts/run.py -k -p qemu_microvm --nics=0 -c 1 >> >> >> I have used the same recipe - 3 terminals running similar commands to >> start C native example on ROFS image. After I connected with gdb I saw this: >> >> >> (gdb) osv info threads >> >> id address name cpu status vruntime >> total_cpu_time location >> >> 1 (0xffff800000014040) reclaimer cpu0 waiting 6.51405e-24 >> 1229 condvar::wait(lockfree::mutex*, sched::timer*) at >> core/condvar.cc:43 >> >> 2 (0xffff80000004b040) kvm_wall_clock_ cpu0 waiting 4.76921e-24 >> 8226 std::_Function_handler<void(), >> kvmclock::kvmclock()::<lambda()> >::_M_invoke(const std::_Any_data &) at >> /usr/include/c++/10/bits/std_function.h:291 >> >> 3 (0xffff800000068040) page-access-sca cpu0 queued 1.92449e-23 >> 33196 pagecache::access_scanner::run() at core/pagecache.cc:695 >> >> 4 (0xffff800000080040) page_pool_l2 cpu0 queued 1.40362e-23 >> 24193 memory::page_pool::l2::fill_thread() at core/mempool.cc:1435 >> >> 5 (0xffff80000008d040) itimer-real cpu? unstarted 0 >> 0 ?? at arch/x64/entry.S:112 >> >> 6 (0xffff800000094040) itimer-virt cpu? unstarted 0 >> 0 ?? 
at arch/x64/entry.S:112 >> >> 7 (0xffff800000150040) balancer0 cpu0 queued 4.63251e-21 >> 8154897 memory::page_pool::l1::alloc_page_local() at >> core/mempool.cc:1403 >> >> 8 (0xffff800000168040) rcu0 cpu0 waiting 2.04856e-23 >> 25420 osv::rcu::cpu_quiescent_state_thread::do_work() at >> core/rcu.cc:178 >> >> 9 (0xffff800000180040) idle0 cpu0 queued inf >> 12481 std::_Function_handler<void(), >> sched::cpu::init_idle_thread()::<lambda()> >::_M_invoke(const >> std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291 >> >> 10 (0xffff800000197040) >init cpu0 waiting 7.75482e-24 >> 13338 std::_Function_handler<void(), >> sched::thread::reaper::reaper()::<lambda()> >::_M_invoke(const >> std::_Any_data &) at /usr/include/c++/10/bits/std_function.h:291 >> >> 11 (0xffff8000001b3040) thread taskq cpu0 waiting 1.16969e-23 >> 10158 synch_port::_msleep(void*, mtx*, int, char const*, int) at >> bsd/porting/synch.cc:101 >> >> 12 (0xffff8000001bf040) callout cpu0 waiting 1.15418e-23 >> 9891 _callout_thread() at bsd/porting/callout.cc:118 >> >> 13 (0xffff8000001d9040) page_pool_l1_0 cpu0 running -2.4551e-21 >> 709254 sched::thread::switch_to() at arch/x64/arch-switch.hh:108 >> >> Number of threads: 13 >> >> (gdb) bt >> >> #0 abort (fmt=fmt@entry=0x4063f860 "Assertion failed: %s (%s: %s: >> %d)\n") at runtime.cc:106 >> >> #1 0x0000000040237f29 in __assert_fail (expr=<optimized out>, >> file=<optimized out>, line=<optimized out>, func=<optimized out>) at >> runtime.cc:140 >> >> #2 0x0000000040390970 in page_fault (ef=0xffff80000094a048) at >> arch/x64/mmu.cc:37 >> >> #3 <signal handler called> >> >> #4 0x00000000403e6266 in memory::page_pool::l1::pop >> (this=0xffff8000001d7040) at core/mempool.cc:1191 >> >> #5 memory::page_pool::l1::alloc_page_local () at core/mempool.cc:1411 >> >> #6 0x00000000403e8cfa in memory::page_pool::l1::alloc_page () at >> core/mempool.cc:1174 >> >> #7 memory::untracked_alloc_page () at core/mempool.cc:1671 >> >> #8 memory::pool::add_page 
(this=0x4090c968 <memory::malloc_pools+72>) at >> core/mempool.cc:254 >> >> #9 0x00000000403e8e18 in memory::pool::alloc (this=0x4090c968 >> <memory::malloc_pools+72>) at core/mempool.cc:221 >> >> #10 0x00000000403e8f9c in std_malloc (size=8, alignment=8) at >> core/mempool.cc:1782 >> >> #11 0x00000000403e9457 in malloc (size=8) at core/mempool.cc:2001 >> >> #12 0x00000000404b0439 in operator new(unsigned long) () >> >> #13 0x0000000040354018 in >> __gnu_cxx::new_allocator<elf::object*>::allocate (__n=1, this=<optimized >> out>) at /usr/include/c++/10/ext/new_allocator.h:103 >> >> #14 std::allocator_traits<std::allocator<elf::object*> >::allocate >> (__a=..., __n=1) at /usr/include/c++/10/bits/alloc_traits.h:460 >> >> #15 std::_Vector_base<elf::object*, std::allocator<elf::object*> >> >::_M_allocate (__n=1, this=0xffff8000001dd2e0) >> >> at /usr/include/c++/10/bits/stl_vector.h:346 >> >> #16 std::vector<elf::object*, std::allocator<elf::object*> >::reserve >> (__n=1, this=0xffff8000001dd2e0) at /usr/include/c++/10/bits/vector.tcc:78 >> >> #17 elf::program::modules_get (this=0xffffa0000009adf0) at >> core/elf.cc:1810 >> >> #18 0x00000000403548d0 in >> elf::program::with_modules<elf::program::lookup_addr(void >> const*)::<lambda(const elf::program::modules_list&)> > (f=..., >> >> this=0xffffa0000009adf0) at include/osv/elf.hh:702 >> >> #19 elf::program::lookup_addr (this=0xffffa0000009adf0, >> >> addr=addr@entry=0x40237f28 <__assert_fail(char const*, char const*, >> unsigned int, char const*)+24>) at core/elf.cc:1621 >> >> #20 0x00000000404273ec in osv::lookup_name_demangled (addr=0x40237f28 >> <__assert_fail(char const*, char const*, unsigned int, char const*)+24>, >> >> buf=0xffff8000001dd7a0 "", len=1024) at core/demangle.cc:47 >> >> #21 0x0000000040237a1a in print_backtrace () at runtime.cc:86 >> >> #22 0x0000000040237edc in abort (fmt=fmt@entry=0x4063f860 "Assertion >> failed: %s (%s: %s: %d)\n") at runtime.cc:122 >> >> #23 0x0000000040237f29 in __assert_fail 
(expr=<optimized out>, >> file=<optimized out>, line=<optimized out>, func=<optimized out>) at >> runtime.cc:140 >> >> #24 0x0000000040390970 in page_fault (ef=0xffff8000001de068) at >> arch/x64/mmu.cc:37 >> >> #25 <signal handler called> >> >> #26 0x00000000403e860e in memory::page_pool::l1::push >> (page=0xffff800000a27000, this=0xffff8000001d7040) at core/mempool.cc:1198 >> >> #27 memory::page_pool::l1::refill () at core/mempool.cc:1379 >> >> #28 0x00000000403e88f5 in memory::page_pool::l1::fill_thread () at >> core/mempool.cc:1356 >> >> #29 0x00000000403eb579 in >> memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}::operator()() const >> (__closure=<optimized out>) at core/mempool.cc:1164 >> >> #30 std::__invoke_impl<void, >> memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}&>(std::__invoke_other, >> memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}&) (__f=...) at >> /usr/include/c++/10/bits/invoke.h:60 >> >> #31 std::__invoke_r<void, >> memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}&>(memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}&) >> >> (__fn=...) >> >> at /usr/include/c++/10/bits/invoke.h:153 >> >> #32 std::_Function_handler<void (), >> memory::page_pool::l1::l1(sched::cpu*)::{lambda()#1}>::_M_invoke(std::_Any_data >> >> const&) (__functor=...) 
>> >> at /usr/include/c++/10/bits/std_function.h:291 >> >> --Type <RET> for more, q to quit, c to continue without paging--c >> >> #33 0x00000000403fbc0a in sched::thread::main (this=0xffff8000001d9040) >> at core/sched.cc:1267 >> >> #34 sched::thread_main_c (t=0xffff8000001d9040) at >> arch/x64/arch-switch.hh:325 >> >> #35 0x0000000040390733 in thread_main () at arch/x64/entry.S:113 >> >> (gdb) frame 26 >> >> #26 0x00000000403e860e in memory::page_pool::l1::push >> (page=0xffff800000a27000, this=0xffff8000001d7040) at core/mempool.cc:1198 >> >> 1198 l1_pool_stats[cpu_id]._nr = nr; >> >> (gdb) p cpu_id >> >> $1 = 0 >> >> (gdb) p l1_pool_stats >> >> $2 = std::vector of length 0, capacity 0 >> >> (gdb) p l1_pool_stats[0] >> >> Cannot access memory at address 0x0 >> >> (gdb) p smp_allocator >> >> $3 = true >> The page fault happens at frame 26 - core/mempool.cc:1198. It seems that >> the l1_pool_stats global variable has address of 0. I do not know how it is >> possible and I do not see any obvious bug, yet (I have not had time to >> study it yet). In any case the l1_pool_stats is some vector that keeps >> statistics about L1 memory pool. >> > > I have a guess. > > I think that the vector is valid - it was simply never resize()ed to the > right size, so it has zero size. This is a vector, not a hash_map, so doing > l1_pool_stats[cpu_id]._nr = nr does *NOT* create the cpu_id'th element if > not already existing - it must have existed previously. > > I see the vector is resized in > > static sched::cpu::notifier _notifier([] () { > ... > l1_pool_stats.resize(sched::cpus.size()); > } > > One guess I have on a problem here is that it seems you call this > l1_pool_stats.resize(sched::cpus.size()) again and again for each of the > CPUs. Maybe it even happens in parallel on two CPUs and cause a big mess if > it happens (and explains why this problem happens rarely and not every > time)? 
Can you try doing it only once - perhaps in the same place you > decide to do smp_allocator = true, or using some other mutual exclusion > trick? > > If this doesn't help, please also take a look at the initialization order. > The l1_pool_stats initialization doesn't have a > __attribute__((init_priority(...))), and I don't know how it is ordered > compared to other initializations and to that notifier thing. Maybe you can > add printouts to debug if there's a problem there. > > >> So my hunch was to comment out 2 lines where we access this variable and >> see what happens: >> >> diff --git a/core/mempool.cc b/core/mempool.cc >> index bd8e2fcf..8f843e99 100644 >> --- a/core/mempool.cc >> +++ b/core/mempool.cc >> @@ -1188,14 +1188,14 @@ struct l1 { >> void* pop() >> { >> assert(nr); >> - l1_pool_stats[cpu_id]._nr = nr - 1; >> + //l1_pool_stats[cpu_id]._nr = nr - 1; >> return _pages[--nr]; >> } >> void push(void* page) >> { >> assert(nr < 512); >> _pages[nr++] = page; >> - l1_pool_stats[cpu_id]._nr = nr; >> + //l1_pool_stats[cpu_id]._nr = nr; >> >> } >> void* top() { return _pages[nr - 1]; } >> >> And miraculously the crash seems to go away. I ran the test with over 10K >> iterations in each terminal on QEMU and over 20K iterations on Firecracker. >> And I could not reproduce it after this patch. >> >> I wonder if you will see the same behavior. I wonder what the root cause >> of this crash is. And who knows if that is a symptom of something weirder >> and we will see other crashes even with these 2 lines commented out. >> >> Waldek >> >> PS. On a side note, the nested virtualization makes OSv boot time 7-9 >> times slower (~40ms vs 5-7ms). QEMU microvm is actually faster than >> firecracker as far as boot time goes - on average it takes ~ 22ms to boot. >> In both cases, there is no networking device which would add some time. 
>> On Thursday, April 29, 2021 at 7:04:02 PM UTC-4 Waldek Kozaczuk wrote: >> >>> I have installed VMWare Fusion Pro on my other Mac (OSX - 10.15.7, >>> 2.3GHz, 8-core, i9) with Fedora 33 server in it, enabled the 2 settings to >>> allow KVM in nested virtualization and I was able to replicate the crashes. >>> >>> So there is most likely something about nested virtualization that >>> possibly triggers some bug in OSv. >>> >>> Now that I can re-produce it, I will try to figure out what is wrong. >>> >>> Waldek >>> >>> On Wednesday, April 28, 2021 at 3:34:18 AM UTC-4 [email protected] >>> wrote: >>> >>>> That's disappointing, I'd hoped this would be reproducible on your >>>> setup. As you say, perhaps this is an indication that it is somehow >>>> connected with the fact that our environment is using nested >>>> virtualisation. >>>> >>>> The VMWare KB article you referenced isn't relevant to my setup, as it >>>> relates to MacOS 11.x and I'm running MacOS 10.15.7 (as the article >>>> indicates, there were some significant MacOS changes in 11.x which >>>> impacted >>>> VMWare's virtualisation capabilities, so I'm avoiding the upgrade for the >>>> moment). >>>> >>>> Our target environment will be using nested virtualisation (Firecracker >>>> launched OSv VMs running on an Alpine Linux VM under a VMWare ESXi host) >>>> and as indicated previously, we see this same failure in our test >>>> environment which replicates the target environment. >>>> >>>> I have not tried to reproduce this on a non-nested virtualisation >>>> setup. I'll look into this. 
>>>> >>>> VMWare Fusion Pro is available as a fully-functional 30-day free trial >>>> from https://www.vmware.com/products/fusion/fusion-evaluation.html >>>> If you do go down the route of trying it under Fusion, to enable nested >>>> virtualisation you will need to modify the following settings for the >>>> Ubuntu/Fedora VM: >>>> >>>> Settings->Processors & Memory->Advanced: >>>> Enable hypervisor applications in this virtual machine >>>> Enable code profiling application in this virtual machine (can be >>>> enabled optionally for Ubuntu, but seems to be required for Fedora) >>>> >>>> David >>>> >>>> On Wednesday, April 28, 2021 at 12:34:43 AM UTC+1 [email protected] >>>> wrote: >>>> >>>>> On Thursday, April 22, 2021 at 11:42:54 AM UTC-4 [email protected] >>>>> wrote: >>>>> >>>>>> Further to my previous reply, I've now been able to reproduce the >>>>>> original bsd: initializing Assert failure on my Fedora 33 system. >>>>>> It did require some specific actions, which seem to invoke the >>>>>> failure relatively quickly. These more closely align the firecracker >>>>>> instance launch method with our test system, where we launch a few >>>>>> different OSv VMs under firecracker (on Alpine) in quick succession. On >>>>>> our >>>>>> test system we see a much higher failure rate, as high as 1 in 20 >>>>>> launches. >>>>>> >>>>>> Using latest OSv source (pulled today), with no modifications (i.e. >>>>>> none of the new instrumenting changes you've provided in relation to >>>>>> this >>>>>> issue applied): >>>>>> >>>>>> build -j4 fs=rofs image=native-example >>>>>> firecracker.py -V (just to prove that it runs and to obtain the >>>>>> converted raw ROFS image file needed by firecracker). 
>>>>>> >>>>>> I then duplicated the kernel.elf and usr.raw files to: >>>>>> kernel1.elf >>>>>> kernel2.elf >>>>>> kernel3.elf >>>>>> usr1.raw >>>>>> usr2.raw >>>>>> usr3.raw >>>>>> >>>>>> This probably isn't necessary, but for the avoidance of doubt, I used >>>>>> this approach rather than sharing the same files across multiple >>>>>> firecracker instances. >>>>>> >>>>>> I then extracted the firecracker json config generated by the >>>>>> firecracker.py -V command into 3 separate json config files. In each >>>>>> file, >>>>>> I modified the paths for the kernel.elf and usr.raw to reference one of >>>>>> the >>>>>> numbered versions of the same file. Here's an example of the resulting >>>>>> content: >>>>>> >>>>>> { >>>>>> "machine-config": { >>>>>> "vcpu_count": 1, >>>>>> "mem_size_mib": 128, >>>>>> "ht_enabled": false >>>>>> }, >>>>>> "drives": [ >>>>>> { >>>>>> "drive_id": "rootfs", >>>>>> "path_on_host": "/home/test/usr1.raw", >>>>>> "is_root_device": false, >>>>>> "is_read_only": false >>>>>> } >>>>>> ], >>>>>> "boot-source": { >>>>>> "kernel_image_path": "/home/test/kernel1.elf", >>>>>> "boot_args": "--verbose --nopci --rootfs=rofs /hello" >>>>>> } >>>>>> } >>>>>> >>>>>> Finally, in three different terminal windows, I ran a script which >>>>>> performs the usual while forever loop launching the firecracker >>>>>> instance, >>>>>> using a different numbered json config. in each terminal window and >>>>>> logging >>>>>> to a different log file in each terminal window. 
Here's an example of >>>>>> the >>>>>> script content: >>>>>> >>>>>> *Script 1* >>>>>> #!/bin/bash >>>>>> rm test1.log; while true; do firecracker --no-api --config-file >>>>>> firecracker1.json >> test1.log ; done >>>>>> >>>>>> *Script 2* >>>>>> #!/bin/bash >>>>>> rm test2.log; while true; do firecracker --no-api --config-file >>>>>> firecracker2.json >> test2.log ; done >>>>>> >>>>>> *Script 3* >>>>>> #!/bin/bash >>>>>> rm test3.log; while true; do firecracker --no-api --config-file >>>>>> firecracker3.json >> test3.log ; done >>>>>> >>>>>> >>>>>> On my system, running these three scripts concurrently invokes the >>>>>> Assert condition within 30 seconds of the start of the 3rd script. >>>>>> Interestingly, when one fails, all 3 seem to fail, though I can't be >>>>>> sure this is happening at exactly the same time. >>>>>> >>>>>> Hopefully this provides a mechanism by which you can reproduce the >>>>>> issue. >>>>>> >>>>> >>>>> I have run the exact same test - 3 terminals each firing the command >>>>> as you have it above. And I could NOT reproduce it after 50K runs or so >>>>> in >>>>> each terminal. >>>>> >>>>> Now the difference is that my setup does NOT involve nested >>>>> virtualization. Both of my MBPros have Ubuntu and Fedora installed >>>>> natively >>>>> (dual or triple boot mode). >>>>> >>>>> I have never run firecracker nor OSv on firecracker in such a setup. >>>>> Until this point, I was not even aware that such a possibility exists on >>>>> Mac (this page - https://kb.vmware.com/s/article/80785 - seems to >>>>> point there are some limitations). Given all that it could be that this >>>>> is >>>>> a bug in OSv or some other layer involved in this nested virtualization >>>>> setup. 
>>>>> I know that Firecracker is used in production on the so-called >>>>> 'bare-metal' EC2 instances (i3, etc) that in reality involve nested >>>>> virtualization (Nitro hypervisor) but I am not sure how well firecracker >>>>> is >>>>> supported in nested virtualization setup in general besides that. >>>>> >>>>> Do your experiments target nested virtualization by design or is it >>>>> just a matter of convenience? >>>>> >>>>> Have you tried to reproduce this problem in not nested virtualization >>>>> setup? >>>>> >>>>> Waldek >>>>> >>>>> PS. I do not have VMware Fusion handy on any of my MBPros machines but >>>>> I guess I could install it and see if I can reproduce your issue. Is >>>>> VMware >>>>> Fusion free? >>>>> >>>>> >>>>> >>>>>> On Wednesday, April 21, 2021 at 3:15:08 PM UTC+1 David Smith wrote: >>>>>> >>>>>>> Some info. for my host (Intel CPU based) Mac system (on which I'm >>>>>>> running MacOS 10.15.7): >>>>>>> >>>>>>> Darwin 19.6.0 Darwin Kernel Version 19.6.0: Tue Jan 12 22:13:05 PST >>>>>>> 2021; root:xnu-6153.141.16~1/RELEASE_X86_64 x86_64 >>>>>>> >>>>>>> machdep.cpu.max_basic: 27 >>>>>>> machdep.cpu.max_ext: 2147483656 >>>>>>> machdep.cpu.vendor: GenuineIntel >>>>>>> machdep.cpu.brand_string: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz >>>>>>> machdep.cpu.family: 6 >>>>>>> machdep.cpu.model: 126 >>>>>>> machdep.cpu.extmodel: 7 >>>>>>> machdep.cpu.extfamily: 0 >>>>>>> machdep.cpu.stepping: 5 >>>>>>> machdep.cpu.feature_bits: 9221959987971750911 >>>>>>> machdep.cpu.leaf7_feature_bits: 4072613871 >>>>>>> 1077960526 >>>>>>> machdep.cpu.leaf7_feature_bits_edx: 3154117648 >>>>>>> machdep.cpu.extfeature_bits: 1241984796928 >>>>>>> machdep.cpu.signature: 460517 >>>>>>> machdep.cpu.brand: 0 >>>>>>> machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP >>>>>>> MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM >>>>>>> PBE >>>>>>> SSE3 PCLMULQDQ DTES64 MON DSCPL VMX 
EST TM2 SSSE3 FMA CX16 TPR PDCM >>>>>>> SSE4.1 >>>>>>> SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR >>>>>>> AVX1.0 >>>>>>> RDRAND F16C >>>>>>> machdep.cpu.leaf7_features: RDWRFSGS TSC_THREAD_OFFSET SGX BMI1 AVX2 >>>>>>> FDPEO SMEP BMI2 ERMS INVPCID FPU_CSDS AVX512F AVX512DQ RDSEED ADX SMAP >>>>>>> AVX512IFMA CLFSOPT IPT AVX512CD SHA AVX512BW AVX512VL AVX512VBMI UMIP >>>>>>> PKU >>>>>>> GFNI VAES VPCLMULQDQ AVX512VNNI AVX512BITALG AVX512VPOPCNTDQ RDPID >>>>>>> SGXLC >>>>>>> FSREPMOV MDCLEAR IBRS STIBP L1DF ACAPMSR SSBD >>>>>>> machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT >>>>>>> PREFETCHW RDTSCP TSCI >>>>>>> machdep.cpu.logical_per_package: 16 >>>>>>> machdep.cpu.cores_per_package: 8 >>>>>>> machdep.cpu.microcode_version: 160 >>>>>>> machdep.cpu.processor_flag: 7 >>>>>>> machdep.cpu.mwait.linesize_min: 64 >>>>>>> machdep.cpu.mwait.linesize_max: 64 >>>>>>> machdep.cpu.mwait.extensions: 3 >>>>>>> machdep.cpu.mwait.sub_Cstates: 286396448 >>>>>>> machdep.cpu.thermal.sensor: 1 >>>>>>> machdep.cpu.thermal.dynamic_acceleration: 1 >>>>>>> machdep.cpu.thermal.invariant_APIC_timer: 1 >>>>>>> machdep.cpu.thermal.thresholds: 2 >>>>>>> machdep.cpu.thermal.ACNT_MCNT: 1 >>>>>>> machdep.cpu.thermal.core_power_limits: 1 >>>>>>> machdep.cpu.thermal.fine_grain_clock_mod: 1 >>>>>>> machdep.cpu.thermal.package_thermal_intr: 1 >>>>>>> machdep.cpu.thermal.hardware_feedback: 0 >>>>>>> machdep.cpu.thermal.energy_policy: 1 >>>>>>> machdep.cpu.xsave.extended_state: 743 2688 2696 0 >>>>>>> machdep.cpu.xsave.extended_state1: 15 2432 8448 0 >>>>>>> machdep.cpu.arch_perf.version: 5 >>>>>>> machdep.cpu.arch_perf.number: 8 >>>>>>> machdep.cpu.arch_perf.width: 48 >>>>>>> machdep.cpu.arch_perf.events_number: 8 >>>>>>> machdep.cpu.arch_perf.events: 0 >>>>>>> machdep.cpu.arch_perf.fixed_number: 4 >>>>>>> machdep.cpu.arch_perf.fixed_width: 48 >>>>>>> machdep.cpu.cache.linesize: 64 >>>>>>> machdep.cpu.cache.L2_associativity: 8 >>>>>>> machdep.cpu.cache.size: 256 
>>>>>>> machdep.cpu.address_bits.physical: 39 >>>>>>> machdep.cpu.address_bits.virtual: 48 >>>>>>> machdep.cpu.core_count: 4 >>>>>>> machdep.cpu.thread_count: 8 >>>>>>> machdep.cpu.tsc_ccc.numerator: 104 >>>>>>> machdep.cpu.tsc_ccc.denominator: 2 >>>>>>> >>>>>>> I would add that this is my development system. We have a test >>>>>>> system which utilises VMWare ESXi and runs the OSv VM under firecracker >>>>>>> on >>>>>>> an Alpine Linux VMWare VM (so nested virtualisation again) and we also >>>>>>> see >>>>>>> the "bsd initializing" crash on startup on this configuration >>>>>>> occasionally. >>>>>>> >>>>>>> I added the patch to dump the physical memory region information as >>>>>>> you requested. Built again with build fs=rofs image=native-example >>>>>>> then tested with while true; do ./scripts/firecracker.py >> >>>>>>> /tmp/running; done (my firecracker.py script is slightly modified >>>>>>> to add in the '--power-off-on-abort' cmd line argument, which I think >>>>>>> is >>>>>>> not important, but I mention it just in case). >>>>>>> On the 996'th iteration, I hit the missing symbol failure. Below is >>>>>>> the output for the last successful invocation followed by the failed >>>>>>> invocation. The physical memory information dumped is consistent across >>>>>>> all >>>>>>> 996 invocations. 
>>>>>>> >>>>>>> OSv v0.55.0-237-g7cd33926 >>>>>>> mempool: add range at: ffff800000956434 >>>>>>> mempool: add range of size: 00000000076a9bcc >>>>>>> mempool: add range at: ffff800000000001 >>>>>>> mempool: add range of size: 000000000009fbff >>>>>>> mempool: add range at: ffff800000100000 >>>>>>> mempool: add range of size: 0000000000100000 >>>>>>> 1 CPUs detected >>>>>>> Firmware vendor: Unknown >>>>>>> >>>>>>> *percpu_l1 = 0x175040 >>>>>>> bsd: initializing - done >>>>>>> VFS: mounting ramfs at / >>>>>>> VFS: mounting devfs at /dev >>>>>>> net: initializing - done >>>>>>> Detected virtio-mmio device: (2,0) >>>>>>> virtio-blk: Add blk device instances 0 as vblk0, devsize=6474752 >>>>>>> random: intel drng, rdrand registered as a source. >>>>>>> random: <Software, Yarrow> initialized >>>>>>> VFS: unmounting /dev >>>>>>> VFS: mounting rofs at /rofs >>>>>>> VFS: mounting devfs at /dev >>>>>>> VFS: mounting procfs at /proc >>>>>>> VFS: mounting sysfs at /sys >>>>>>> VFS: mounting ramfs at /tmp >>>>>>> Booted up in 121.29 ms >>>>>>> Cmdline: /hello >>>>>>> Hello from C code >>>>>>> VFS: unmounting /dev >>>>>>> VFS: unmounting /proc >>>>>>> VFS: unmounting / >>>>>>> ROFS: spent 1.78 ms reading from disk >>>>>>> ROFS: read 64 512-byte blocks from disk >>>>>>> ROFS: allocated 61 512-byte blocks of cache memory >>>>>>> ROFS: hit ratio is 93.88% >>>>>>> Powering off. 
>>>>>>> 2021-04-21T13:22:17.382768: Start >>>>>>> 2021-04-21T13:22:17.382851: API socket-less: True >>>>>>> 2021-04-21T13:22:17.382855: Firecracker ready >>>>>>> 2021-04-21T13:22:17.382940: Configured VM >>>>>>> 2021-04-21T13:22:17.382945: Added disk >>>>>>> 2021-04-21T13:22:17.382948: Created OSv VM with cmdline: >>>>>>> --power-off-on-abort --verbose --nopci --rootfs=rofs /hello >>>>>>> { >>>>>>> "machine-config": { >>>>>>> "vcpu_count": 1, >>>>>>> "mem_size_mib": 128, >>>>>>> "ht_enabled": false >>>>>>> }, >>>>>>> "drives": [ >>>>>>> { >>>>>>> "drive_id": "rootfs", >>>>>>> "path_on_host": >>>>>>> "/home/david-griffin/projects/firecracker/osv/scripts/../build/last/usr.raw", >>>>>>> "is_root_device": false, >>>>>>> "is_read_only": false >>>>>>> } >>>>>>> ], >>>>>>> "boot-source": { >>>>>>> "kernel_image_path": >>>>>>> "/home/david-griffin/projects/firecracker/osv/scripts/../build/last/kernel.elf", >>>>>>> "boot_args": "--power-off-on-abort --verbose --nopci >>>>>>> --rootfs=rofs /hello" >>>>>>> } >>>>>>> } >>>>>>> 2021-04-21T13:22:17.385972: Waiting for firecracker process to >>>>>>> terminate >>>>>>> 2021-04-21T13:22:17.586150: End >>>>>>> OSv v0.55.0-237-g7cd33926 >>>>>>> mempool: add range at: ffff800000956434 >>>>>>> mempool: add range of size: 00000000076a9bcc >>>>>>> mempool: add range at: ffff800000000001 >>>>>>> mempool: add range of size: 000000000009fbff >>>>>>> mempool: add range at: ffff800000100000 >>>>>>> mempool: add range of size: 0000000000100000 >>>>>>> 1 CPUs detected >>>>>>> Firmware vendor: Unknown >>>>>>> >>>>>>> *percpu_l1 = 0x175040 >>>>>>> addr: 0x0000000000000008 mmu::page_size: 4096 elf:mspa: >>>>>>> 0x0000000000000000 >>>>>>> trying to execute or access missing symbol >>>>>>> [backtrace] >>>>>>> addr: 0x0000000000000008 mmu::page_size: 4096 elf:mspa: >>>>>>> 0x0000000000000000 >>>>>>> >>>>>>> Prior to adding the physical memory info. 
patch, I setup a Fedora 33 >>>>>>> system (again under my VMWare Fusion) environment, installed the same >>>>>>> OSv >>>>>>> sources, built the same image with the same commands and ran the same >>>>>>> test. >>>>>>> I was unable to reproduce either startup failure over two attempts each >>>>>>> running approx. 50K iterations. >>>>>>> Fedora 33 is running a later kernel (5.11.11) compared to both my >>>>>>> Ubuntu 20.04 (5.8.0) and Alpine (5.10.13). Could there be some subtle >>>>>>> kernel interaction? >>>>>>> >>>>>>> On Tuesday, April 20, 2021 at 5:08:23 PM UTC+1 David Smith wrote: >>>>>>> >>>>>>>> Again, I appreciate you continuing to make suggestions. I was also >>>>>>>> on vacation yesterday and catching up today. I will respond to your >>>>>>>> various >>>>>>>> points in more detail as soon as possible. >>>>>>>> >>>>>>>> On the question of nested virtualisation, the answer is yes, my dev >>>>>>>> environment is Ubuntu 20.04 under VMWare Fusion, with Intel VT-x/EPT >>>>>>>> and >>>>>>>> Code Profiling CPU options enabled in the VM settings. >>>>>>>> >>>>>>>> On Thursday, April 15, 2021 at 11:06:37 PM UTC+1 [email protected] >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Another related possibility is that we have a bug somewhere where >>>>>>>>> we read a configuration value - some memory address, size parameter, >>>>>>>>> etc. >>>>>>>>> And we read it incorrectly - instead of say 4 bytes we read 8 bytes >>>>>>>>> and we >>>>>>>>> get garbage value which is then used to set up things incorrectly. >>>>>>>>> And we >>>>>>>>> end up reading from or writing to some wrong places. >>>>>>>>> >>>>>>>>> On Thursday, April 15, 2021 at 5:04:47 PM UTC-4 Waldek Kozaczuk >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Sorry I was away on a vacation so responding somewhat late. >>>>>>>>>> >>>>>>>>>> Yes, I did see your April 6th reply, which seems to prove that >>>>>>>>>> the issue is unlikely related to the GCC version nor other build >>>>>>>>>> specifics. 
>>>>>>>>>> >>>>>>>>>> It is still most likely a bug in OSv that is triggered by some >>>>>>>>>> runtime-specific element that we do not know. My hunch is that it is >>>>>>>>>> memory-related, possibly setup, given we see two types of crashes - >>>>>>>>>> "bsd >>>>>>>>>> .." page fault and missing symbol page error. The second one is very >>>>>>>>>> weird >>>>>>>>>> as the symbols from the kernel should always be found, so most >>>>>>>>>> likely >>>>>>>>>> something is reading from the wrong part of memory. >>>>>>>>>> >>>>>>>>>> Having said that I do not know where to go from there especially >>>>>>>>>> given I cannot reproduce it. >>>>>>>>>> >>>>>>>>>> Today I tried to re-produce on my Ubuntu 20.10 machine to no >>>>>>>>>> avail. I looked at your email from Feb 26 where you wrote: >>>>>>>>>> >>>>>>>>>> "On the crash issue, I built an OSv image with the standard >>>>>>>>>> native example: *./scripts/build image=native-example* >>>>>>>>>> Then ran continuous launches under firecracker using a bash loop: >>>>>>>>>> *while >>>>>>>>>> true; do ./scripts/firecracker.py; done* >>>>>>>>>> After 1249 iterations, I got the same Assert at startup." >>>>>>>>>> >>>>>>>>>> which seems to prove that this error is not app-specific and I >>>>>>>>>> used exactly some commands to build and run the example: >>>>>>>>>> >>>>>>>>>> With the original command: >>>>>>>>>> >>>>>>>>>> while true; do ./scripts/firecracker.py >> /tmp/running; done >>>>>>>>>> >>>>>>>>>> I could run it 71700 (over 71K) with no errors. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> With verbose enabled: >>>>>>>>>> >>>>>>>>>> while true; do ./scripts/firecracker.py -V >> /tmp/running2; done >>>>>>>>>> >>>>>>>>>> I could run it 46691(over 46K) with no errors. 
>>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Let me share some of my host machine-specific (5-6 year old >>>>>>>>>> MacBook Pro) info: >>>>>>>>>> >>>>>>>>>> *more /proc/cpuinfo * >>>>>>>>>> >>>>>>>>>> *processor : 0* >>>>>>>>>> >>>>>>>>>> *vendor_id : GenuineIntel* >>>>>>>>>> >>>>>>>>>> *cpu family : 6* >>>>>>>>>> >>>>>>>>>> *model : 70* >>>>>>>>>> >>>>>>>>>> *model name : Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz* >>>>>>>>>> >>>>>>>>>> *stepping : 1* >>>>>>>>>> >>>>>>>>>> *microcode : 0x1c* >>>>>>>>>> >>>>>>>>>> *cpu MHz : 876.525* >>>>>>>>>> >>>>>>>>>> *cache size : 6144 KB* >>>>>>>>>> >>>>>>>>>> *physical id : 0* >>>>>>>>>> >>>>>>>>>> *siblings : 8* >>>>>>>>>> >>>>>>>>>> *core id : 0* >>>>>>>>>> >>>>>>>>>> *cpu cores : 4* >>>>>>>>>> >>>>>>>>>> *apicid : 0* >>>>>>>>>> >>>>>>>>>> *initial apicid : 0* >>>>>>>>>> >>>>>>>>>> *fpu : yes* >>>>>>>>>> >>>>>>>>>> *fpu_exception : yes* >>>>>>>>>> >>>>>>>>>> *cpuid level : 13* >>>>>>>>>> >>>>>>>>>> *wp : yes* >>>>>>>>>> >>>>>>>>>> *flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca >>>>>>>>>> cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe >>>>>>>>>> syscall nx >>>>>>>>>> pdpe1* >>>>>>>>>> >>>>>>>>>> *gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl >>>>>>>>>> xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor >>>>>>>>>> ds_cpl >>>>>>>>>> vmx smx est tm2 s* >>>>>>>>>> >>>>>>>>>> *sse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe >>>>>>>>>> popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm >>>>>>>>>> cpuid_fault >>>>>>>>>> epb invpcid_singl* >>>>>>>>>> >>>>>>>>>> *e pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid >>>>>>>>>> ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt >>>>>>>>>> dtherm >>>>>>>>>> ida arat pln * >>>>>>>>>> >>>>>>>>>> *pts md_clear flush_l1d* >>>>>>>>>> >>>>>>>>>> *vmx flags : vnmi preemption_timer invvpid ept_x_only ept_ad >>>>>>>>>> ept_1gb flexpriority tsc_offset vtpr mtf vapic ept vpid >>>>>>>>>> 
>>>>>>>>>> unrestricted_guest ple shadow_vmcs
>>>>>>>>>>
>>>>>>>>>> bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
>>>>>>>>>> bogomips        : 4589.68
>>>>>>>>>> clflush size    : 64
>>>>>>>>>> cache_alignment : 64
>>>>>>>>>> address sizes   : 39 bits physical, 48 bits virtual
>>>>>>>>>> power management:
>>>>>>>>>>
>>>>>>>>>> uname -a
>>>>>>>>>> Linux wkoMacBookPro 5.8.0-49-generic #55-Ubuntu SMP Wed Mar 24 14:45:45 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>>>>>
>>>>>>>>>> So my machine comes with an Intel CPU.
>>>>>>>>>>
>>>>>>>>>> Is yours Intel as well, or AMD? I know that Firecracker had to add some special support for AMD.
>>>>>>>>>>
>>>>>>>>>> Finally, given it might be a memory-setup-related issue, I wonder what physical memory ranges you see.
>>>>>>>>>>
>>>>>>>>>> This patch:
>>>>>>>>>>
>>>>>>>>>> git diff core/
>>>>>>>>>> diff --git a/core/mempool.cc b/core/mempool.cc
>>>>>>>>>> index bd8e2fcf..52f94774 100644
>>>>>>>>>> --- a/core/mempool.cc
>>>>>>>>>> +++ b/core/mempool.cc
>>>>>>>>>> @@ -1732,6 +1732,8 @@ void free_initial_memory_range(void* addr, size_t size)
>>>>>>>>>>      if (!size) {
>>>>>>>>>>          return;
>>>>>>>>>>      }
>>>>>>>>>> +    debug_early_u64("mempool: add range at: ", (u64)addr);
>>>>>>>>>> +    debug_early_u64("mempool: add range of size: ", size);
>>>>>>>>>>      auto a = reinterpret_cast<uintptr_t>(addr);
>>>>>>>>>>      auto delta = align_up(a, page_size) - a;
>>>>>>>>>>      if (delta > size) {
>>>>>>>>>>
>>>>>>>>>> produces this extra info with verbose output:
>>>>>>>>>>
>>>>>>>>>> 2021-04-15T13:40:50.493125: Start
>>>>>>>>>> 2021-04-15T13:40:50.493200: API socket-less: True
>>>>>>>>>> 2021-04-15T13:40:50.493204: Firecracker ready
>>>>>>>>>> 2021-04-15T13:40:50.493293: Configured VM
>>>>>>>>>> 2021-04-15T13:40:50.493298: Added disk
>>>>>>>>>> 2021-04-15T13:40:50.493301: Created OSv VM with cmdline: --verbose --nopci --rootfs=zfs /hello
>>>>>>>>>> {
>>>>>>>>>>   "machine-config": {
>>>>>>>>>>     "vcpu_count": 1,
>>>>>>>>>>     "mem_size_mib": 128,
>>>>>>>>>>     "ht_enabled": false
>>>>>>>>>>   },
>>>>>>>>>>   "drives": [
>>>>>>>>>>     {
>>>>>>>>>>       "drive_id": "rootfs",
>>>>>>>>>>       "path_on_host": "/home/wkozaczuk/projects/osv-master/scripts/../build/last/usr.raw",
>>>>>>>>>>       "is_root_device": false,
>>>>>>>>>>       "is_read_only": false
>>>>>>>>>>     }
>>>>>>>>>>   ],
>>>>>>>>>>   "boot-source": {
>>>>>>>>>>     "kernel_image_path": "/home/wkozaczuk/projects/osv-master/scripts/../build/last/kernel.elf",
>>>>>>>>>>     "boot_args": "--verbose --nopci --rootfs=zfs /hello"
>>>>>>>>>>   }
>>>>>>>>>> }
>>>>>>>>>> OSv v0.55.0-240-g9d1f5111
>>>>>>>>>> mempool: add range at: ffff800000957434
>>>>>>>>>> mempool: add range of size: 00000000076a8bcc
>>>>>>>>>> mempool: add range at: ffff800000000001
>>>>>>>>>> mempool: add range of size: 000000000009fbff
>>>>>>>>>> mempool: add range at: ffff800000100000
>>>>>>>>>> mempool: add range of size: 0000000000100000
>>>>>>>>>> 1 CPUs detected
>>>>>>>>>> Firmware vendor: Unknown
>>>>>>>>>> bsd: initializing - done
>>>>>>>>>> VFS: mounting ramfs at /
>>>>>>>>>> VFS: mounting devfs at /dev
>>>>>>>>>> net: initializing - done
>>>>>>>>>> Detected virtio-mmio device: (2,0)
>>>>>>>>>> virtio-blk: Add blk device instances 0 as vblk0, devsize=268435456
>>>>>>>>>> random: intel drng, rdrand registered as a source.
>>>>>>>>>> random: <Software, Yarrow> initialized
>>>>>>>>>> VFS: unmounting /dev
>>>>>>>>>> VFS: mounting zfs at /zfs
>>>>>>>>>> zfs: mounting osv/zfs from device /dev/vblk0.1
>>>>>>>>>> VFS: mounting devfs at /dev
>>>>>>>>>> VFS: mounting procfs at /proc
>>>>>>>>>> VFS: mounting sysfs at /sys
>>>>>>>>>> BSD shrinker: event handler list found: 0xffffa000009ad080
>>>>>>>>>> BSD shrinker found: 1
>>>>>>>>>> BSD shrinker: unlocked, running
>>>>>>>>>> Booted up in 47.39 ms
>>>>>>>>>> Cmdline: /hello
>>>>>>>>>> Hello from C code
>>>>>>>>>> VFS: unmounting /dev
>>>>>>>>>> VFS: unmounting /proc
>>>>>>>>>> VFS: unmounting /
>>>>>>>>>> Powering off.
>>>>>>>>>> 2021-04-15T13:40:50.495752: Waiting for firecracker process to terminate
>>>>>>>>>> 2021-04-15T13:40:50.594645: End
>>>>>>>>>>
>>>>>>>>>> Also, in all 46691 runs with verbose output I would always see the same memory ranges.
>>>>>>>>>>
>>>>>>>>>> I wonder what the ranges are in your case. I know we used to have some edge-case bugs in the memory setup around parsing e820 ranges.
>>>>>>>>>>
>>>>>>>>>> Maybe we have a bug where we read the e820 map incorrectly and then create a wrong mapping, or read less memory than provided.
>>>>>>>>>>
>>>>>>>>>> Are the ranges the same in your case, or do they fluctuate and, in the case of the crash, look somewhat different?
>>>>>>>>>>
>>>>>>>>>> Otherwise, I am out of any more ideas.
>>>>>>>>>>
>>>>>>>>>> Waldek
>>>>>>>>>>
>>>>>>>>>> PS. Do you possibly run your tests in a nested virtualization setup, or possibly on EC2 i3.metal instances?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sunday, April 11, 2021 at 12:43:17 PM UTC-4 [email protected] wrote:
>>>>>>>>>>
>>>>>>>>>>> The gcc 10.2.0 build already included the missing-symbol change to disable setting up the missing symbols page.
>>>>>>>>>>>
>>>>>>>>>>> Did you see my April 6th reply?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sunday, April 11, 2021 at 4:45:59 AM UTC+1 [email protected] wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Interesting. What happens if you apply that other change to disable setting up that missing symbols page and compile with gcc 10.2.0?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wednesday, April 7, 2021 at 8:30:10 AM UTC-4 [email protected] wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Re: your earlier point that it could be due to my gcc version revealing a gcc/OSv bug, I thought I would try upgrading to the latest gcc/g++ tools. So I'm now running:
>>>>>>>>>>>>>
>>>>>>>>>>>>> gcc (Ubuntu 10.2.0-5ubuntu1~20.04) 10.2.0
>>>>>>>>>>>>>
>>>>>>>>>>>>> I rebuilt my OSv image as follows:
>>>>>>>>>>>>>
>>>>>>>>>>>>> ./scripts/build clean
>>>>>>>>>>>>> ./scripts/build -j4 image=golang-example
>>>>>>>>>>>>>
>>>>>>>>>>>>> Then re-ran my test. At iterations 591, 688 and 9326, I saw the missing-symbol failure but, so far, no sign of the assert failure.
>>>>>>>>>>>>>
>> --
>> You received this message because you are subscribed to the Google Groups "OSv Development" group.
>> To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
>> To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/f820b2d4-a641-4c39-9c9b-2e8b8a0a0d05n%40googlegroups.com.
--
You received this message because you are subscribed to the Google Groups "OSv Development" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/osv-dev/ae40d15f-3e16-4818-9d87-8715dcaed238n%40googlegroups.com.
