Thanks for this - I was hitting a weird page fault issue in our
application since we recently moved from 0.52 to the latest OSv.
Something like this, which occurs early on in startup:

Assertion failed: ef->rflags & processor::rflags_if (arch/x64/mmu.cc:
page_fault: 34)

[backtrace]
0x00000000402298ea <__assert_fail+26>
0x000000004039aa30 <page_fault+240>
0x0000000040399826 <???+1077516326>
0x000000004039c0a8 <interrupt+232>
0x000000004039a779 <???+1077520249>
0x0000000040214ca1 <main_cont(int, char**)+193>
0x00000000403f9646 <thread_main_c+38>
0x000000004039a7a2 <???+1077520290>

Setting the memory to 2GB in virsh made the problem go away. Applying
your patch also seems to fix it.

Rick

On Tue, 2019-08-20 at 20:04 -0700, Waldek Kozaczuk wrote:
> This patch definitely fixes an apparent bug I introduced myself in
> the past. I have tested that issue #1048 goes away with 4, 5, 6, 7 or
> 8GB of memory. I have also verified using cli module that free memory
> is reported properly now.
> 
> However, there is still 1 question and 1 issue outstanding:
> 1. I do not understand how this bug in arch_setup_free_memory() would
> lead to the page fault reported in issue #1048 or other "read errors"
> with more memory (8GB and so on). I would expect this bug to make OSv
> fail to use the memory above 1GB in the e820 block but still be able
> to operate properly without the page fault. Is there another
> underlying bug that this patch actually covers up?
> 
> 2. After this patch, tst-huge.so does not pass: it hangs or never
> completes. I have played with it a bit and discovered that it passes
> if I run it with the right amount of memory (128M < m <= 1G) but
> fails with anything above 1GB (the default is 2GB). It could be that
> the test is flaky and needs the right amount of free memory to pass
> (?).
> 
> Here is the stacktrace of where it was stuck:
> 
> sched::thread::switch_to (this=this@entry=0xffff8000001ba040) at
> arch/x64/arch-switch.hh:108
> #1  0x00000000403ff794 in sched::cpu::reschedule_from_interrupt
> (this=0xffff80000001d040, called_from_yield=called_from_yield@entry=f
> alse, 
>     preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
> #2  0x00000000403ffc8c in sched::cpu::schedule () at
> include/osv/sched.hh:1310
> #3  0x0000000040400372 in sched::thread::wait (this=this@entry=0xffff
> 8000014a1040) at core/sched.cc:1214
> #4  0x0000000040428072 in sched::thread::do_wait_for<lockfree::mutex,
> sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
> #5  sched::thread::wait_for<waitqueue&> (mtx=...) at
> include/osv/sched.hh:1220
> #6  waitqueue::wait (this=this@entry=0x408ec550
> <mmu::vma_list_mutex+48>, mtx=...) at core/waitqueue.cc:56
> #7  0x00000000403e2d83 in rwlock::reader_wait_lockable
> (this=<optimized out>) at core/rwlock.cc:174
> #8  rwlock::rlock (this=this@entry=0x408ec520 <mmu::vma_list_mutex>)
> at core/rwlock.cc:29
> #9  0x000000004034ad98 in rwlock_for_read::lock (this=0x408ec520
> <mmu::vma_list_mutex>) at include/osv/rwlock.h:113
> #10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=...,
> this=<synthetic pointer>) at /usr/include/c++/8/bits/std_mutex.h:162
> #11
> lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock
> (lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
> #12 mmu::vm_fault (addr=18446603337326391296, addr@entry=184466033373
> 26395384, ef=ef@entry=0xffff8000014a6068) at core/mmu.cc:1334
> #13 0x00000000403a746e in page_fault (ef=0xffff8000014a6068) at
> arch/x64/mmu.cc:38
> #14 <signal handler called>
> #15 0x00000000403f2114 in memory::page_range_allocator::insert<true>
> (this=this@entry=0x40904300 <memory::free_page_ranges>, pr=...)
>     at core/mempool.cc:575
> #16 0x00000000403ef83c in
> memory::page_range_allocator::<lambda(memory::page_range&)>::operator
> () (header=..., __closure=<synthetic pointer>)
>     at core/mempool.cc:751
> #17
> memory::page_range_allocator::<lambda(memory::page_range&)>::operator
> () (header=..., __closure=<synthetic pointer>) at core/mempool.cc:736
> #18
> memory::page_range_allocator::for_each<memory::page_range_allocator::
> alloc_aligned(size_t, size_t, size_t,
> bool)::<lambda(memory::page_range&)> > (f=..., min_order=<optimized
> out>, this=0x40904300 <memory::free_page_ranges>) at
> core/mempool.cc:809
> #19 memory::page_range_allocator::alloc_aligned (this=this@entry=0x40
> 904300 <memory::free_page_ranges>, size=size@entry=2097152, 
>     offset=offset@entry=0, alignment=alignment@entry=2097152, 
> fill=fill@entry=true) at core/mempool.cc:736
> #20 0x00000000403f0164 in memory::alloc_huge_page (N=N@entry=2097152)
> at core/mempool.cc:1601
> #21 0x000000004035030e in
> mmu::uninitialized_anonymous_page_provider::map (this=0x40873150
> <mmu::page_allocator_init>, offset=83886080, 
>     ptep=..., pte=..., write=<optimized out>) at include/osv/mmu-
> defs.hh:219
> #22 0x0000000040355b94 in mmu::populate<(mmu::account_opt)1>::page<1>
> (offset=83886080, ptep=..., this=0x2000001ffd70)
>     at include/osv/mmu-defs.hh:235
> #23 mmu::page<mmu::populate<>, 1> (ptep=..., offset=83886080,
> pops=...) at core/mmu.cc:311
> #24 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 2>::operator()
> (base_virt=35185397596160, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:437
> #25 mmu::map_level<mmu::populate<(mmu::account_opt)1>,
> 3>::map_range<2> (this=<synthetic pointer>, ptep=...,
> base_virt=35184372088832, 
>     slop=4096, page_mapper=..., size=132120576, vcur=<optimized out>)
> at core/mmu.cc:399
> #26 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 3>::operator()
> (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #27 mmu::map_level<mmu::populate<(mmu::account_opt)1>,
> 4>::map_range<3> (this=<synthetic pointer>, ptep=...,
> base_virt=35184372088832, 
>     slop=4096, page_mapper=..., size=134217728, vcur=<optimized out>)
> at core/mmu.cc:399
> #28 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 4>::operator()
> (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #29 mmu::map_range<mmu::populate<(mmu::account_opt)1> > (
> vma_start=vma_start@entry=35185313710080, vstart=vstart@entry=3518531
> 3710080, 
>     size=<optimized out>, page_mapper=..., slop=slop@entry=4096) at
> core/mmu.cc:354
> #30 0x0000000040356385 in
> mmu::operate_range<mmu::populate<(mmu::account_opt)1> >
> (size=<optimized out>, start=0x200038200000, 
>     vma_start=<optimized out>, mapper=...) at core/mmu.cc:801
> #31 mmu::vma::operate_range<mmu::populate<(mmu::account_opt)1> >
> (size=3, addr=0x200038200000, mapper=..., this=0xffffa000012deb00)
>     at core/mmu.cc:1412
> #32 mmu::populate_vma<(mmu::account_opt)1> (vma=vma@entry=0xffffa0000
> 12deb00, v=v@entry=0x200038200000, size=size@entry=134217728, 
>     write=write@entry=false) at core/mmu.cc:1206
> #33 0x000000004034e8d2 in mmu::map_anon (addr=addr@entry=0x0, 
> size=size@entry=134217728, flags=flags@entry=2, perm=perm@entry=3)
>     at core/mmu.cc:1222
> #34 0x000000004046503d in mmap (addr=addr@entry=0x0, 
> length=length@entry=134217728, prot=prot@entry=3, flags=flags@entry=3
> 2802, 
>     fd=fd@entry=-1, offset=offset@entry=0) at libc/mman.cc:156
> #35 0x0000100000006624 in exhaust_memory (size=size@entry=134217728)
> at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:31
> #36 0x000010000000621e in main (argc=<optimized out>, argv=<optimized
> out>) at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:99
> #37 0x000000004043090d in osv::application::run_main
> (this=0xffffa00001130e10) at /usr/include/c++/8/bits/stl_vector.h:805
> #38 0x0000000040226b51 in osv::application::main
> (this=0xffffa00001130e10) at core/app.cc:320
> #39 0x0000000040430ab9 in
> osv::application::<lambda(void*)>::operator() (__closure=0x0,
> app=<optimized out>) at core/app.cc:233
> #40 osv::application::<lambda(void*)>::_FUN(void *) () at
> core/app.cc:235
> #41 0x000000004045eec6 in
> pthread_private::pthread::<lambda()>::operator()
> (__closure=0xffffa0000149c200) at libc/pthread.cc:114
> #42 std::_Function_handler<void(),
> pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t,
> const pthread_private::thread_attr*)::<lambda()> >::_M_invoke(const
> std::_Any_data &) (__functor=...) at
> /usr/include/c++/8/bits/std_function.h:297
> #43 0x0000000040401117 in sched::thread_main_c (t=0xffff8000014a1040)
> at arch/x64/arch-switch.hh:271
> #44 0x00000000403a7263 in thread_main () at arch/x64/entry.S:113
> 
> Waldek
> 
> On Tuesday, August 20, 2019 at 10:53:30 PM UTC-4, Waldek Kozaczuk
> wrote:
> > The commit 97fe8aa3d2d8f2c938fcaa379c44ae5a80dfbf33 adjusted logic 
> > in arch_setup_free_memory() to improve memory utilization 
> > by making OSv use memory below kernel (<= 2MB). 
> > 
> > Ironically, the new logic introduced a new bug which led to a much
> > bigger waste of memory. Specifically, it did not take into account
> > the case of a memory region starting below 2MB and ending
> > above 1GB at the same time, and made it skip the part above 1GB
> > altogether.
> >
> > This patch fixes this bug and makes the issue reported below go away.
> > 
> > Fixes #1048 
> > 
> > Signed-off-by: Waldemar Kozaczuk <[email protected]> 
> > --- 
> >  arch/x64/arch-setup.cc | 12 ++++++++---- 
> >  1 file changed, 8 insertions(+), 4 deletions(-) 
> > 
> > diff --git a/arch/x64/arch-setup.cc b/arch/x64/arch-setup.cc 
> > index e5fb7a6e..986a0928 100644 
> > --- a/arch/x64/arch-setup.cc 
> > +++ b/arch/x64/arch-setup.cc 
> > @@ -175,11 +175,15 @@ void arch_setup_free_memory() 
> >          // 
> >          // Free the memory below elf_phys_start which we could not
> > before 
> >          if (ent.addr < (u64)elf_phys_start) { 
> > +            auto ent_below_kernel = ent; 
> >              if (ent.addr + ent.size >= (u64)elf_phys_start) { 
> > -                ent = truncate_above(ent, (u64) elf_phys_start); 
> > +                ent_below_kernel = truncate_above(ent, (u64)
> > elf_phys_start); 
> > +            } 
> > +            mmu::free_initial_memory_range(ent_below_kernel.addr,
> > ent_below_kernel.size); 
> > +            // If there is nothing left below elf_phys_start
> > return 
> > +            if (ent.addr + ent.size <= (u64)elf_phys_start) { 
> > +               return; 
> >              } 
> > -            mmu::free_initial_memory_range(ent.addr, ent.size); 
> > -            return; 
> >          } 
> >          // 
> >          // Ignore memory already freed above 
> > @@ -331,4 +335,4 @@ void reset_bootchart(osv_multiboot_info_type*
> > mb_info) 
> >   
> >      mb_info->tsc_uncompress_done_hi = now_high; 
> >      mb_info->tsc_uncompress_done = now_low; 
> > -} 
> > \ No newline at end of file 
> > +} 
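
For anyone following the thread, here is a minimal sketch of the control-flow change in the diff above, reduced to plain C++ outside of OSv. It only models the branch taken when an entry starts below elf_phys_start. The names region, step_result, clip_to_below, old_logic and patched_logic are hypothetical stand-ins, not OSv types or functions, and clip_to_below is only assumed to behave the way truncate_above is used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an e820 entry (not the OSv type).
struct region {
    std::uint64_t addr;
    std::uint64_t size;
};

// What one pass over an entry starting below the kernel does.
struct step_result {
    region freed_below;    // range handed to the allocator in this branch
    bool keep_processing;  // true if the rest of the loop body still runs
};

// Keep only the part of r that lies below limit.
// Assumed to mirror how truncate_above is used in the patch.
region clip_to_below(region r, std::uint64_t limit) {
    assert(r.addr < limit);
    std::uint64_t end = r.addr + r.size;
    if (end > limit) {
        end = limit;
    }
    return {r.addr, end - r.addr};
}

// Before the patch: the entry itself is truncated and the branch always
// returns, so anything above kernel_start is never looked at.
step_result old_logic(region ent, std::uint64_t kernel_start) {
    if (ent.addr + ent.size >= kernel_start) {
        ent = clip_to_below(ent, kernel_start);
    }
    return {ent, false};
}

// After the patch: a copy is truncated and freed, and the branch returns
// only when the original entry ends at or below kernel_start.
step_result patched_logic(region ent, std::uint64_t kernel_start) {
    region below = ent;
    if (ent.addr + ent.size >= kernel_start) {
        below = clip_to_below(ent, kernel_start);
    }
    bool nothing_left = ent.addr + ent.size <= kernel_start;
    return {below, !nothing_left};
}
```

With a single entry that starts at 1MB and spans 4GB, and a kernel start of 2MB, both versions free the same 1MB below the kernel, but only the patched version leaves the entry intact so that the remainder of the loop can deal with the part above.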
