Thanks for this - I was hitting a weird page fault issue in our
application after we recently moved from 0.52 to the latest OSv.
It looks something like this and occurs early on in startup:

Assertion failed: ef->rflags & processor::rflags_if (arch/x64/mmu.cc:
page_fault: 34)

[backtrace]
0x00000000402298ea <__assert_fail+26>
0x000000004039aa30 <page_fault+240>
0x0000000040399826 <???+1077516326>
0x000000004039c0a8 <interrupt+232>
0x000000004039a779 <???+1077520249>
0x0000000040214ca1 <main_cont(int, char**)+193>
0x00000000403f9646 <thread_main_c+38>
0x000000004039a7a2 <???+1077520290>

Setting the memory to 2GB in virsh made the problem go away. Applying
your patch also seems to fix it.

Rick

On Tue, 2019-08-20 at 20:04 -0700, Waldek Kozaczuk wrote:
> This patch definitely fixes an apparent bug I introduced myself in
> the past. I have tested that issue #1048 goes away with 4, 5, 6, 7 or
> 8GB of memory. I have also verified using the cli module that free
> memory is reported properly now.
> 
> However, there is still 1 question and 1 issue outstanding:
> 1. I do not understand how this bug in arch_setup_free_memory() would
> lead to the page fault reported by issue 1048 or other "read errors"
> with higher memory (8GB and so on). I would expect this bug to lead
> to OSv failing to use the memory above 1GB in the e820 block but still
> being able to operate properly without the page fault. Is there
> another underlying bug that this patch actually covers?
> 
> 2. After this patch tst-huge.so does not pass - it actually hangs or
> never completes. I have played with it a bit and discovered that it
> passes if I run it with the right amount of memory - 128M < m <= 1G,
> but fails with anything above 1GB (the default is 2GB). It could be
> that the test is flaky and has to have the right amount of free memory
> to pass (?).
> 
> Here is the stacktrace of where it was stuck:
> 
> sched::thread::switch_to (this=this@entry=0xffff8000001ba040) at
> arch/x64/arch-switch.hh:108
> #1  0x00000000403ff794 in sched::cpu::reschedule_from_interrupt
> (this=0xffff80000001d040, called_from_yield=called_from_yield@entry=f
> alse, 
>     preempt_after=..., preempt_after@entry=...) at core/sched.cc:339
> #2  0x00000000403ffc8c in sched::cpu::schedule () at
> include/osv/sched.hh:1310
> #3  0x0000000040400372 in sched::thread::wait (this=this@entry=0xffff
> 8000014a1040) at core/sched.cc:1214
> #4  0x0000000040428072 in sched::thread::do_wait_for<lockfree::mutex,
> sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
> #5  sched::thread::wait_for<waitqueue&> (mtx=...) at
> include/osv/sched.hh:1220
> #6  waitqueue::wait (this=this@entry=0x408ec550
> <mmu::vma_list_mutex+48>, mtx=...) at core/waitqueue.cc:56
> #7  0x00000000403e2d83 in rwlock::reader_wait_lockable
> (this=<optimized out>) at core/rwlock.cc:174
> #8  rwlock::rlock (this=this@entry=0x408ec520 <mmu::vma_list_mutex>)
> at core/rwlock.cc:29
> #9  0x000000004034ad98 in rwlock_for_read::lock (this=0x408ec520
> <mmu::vma_list_mutex>) at include/osv/rwlock.h:113
> #10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=...,
> this=<synthetic pointer>) at /usr/include/c++/8/bits/std_mutex.h:162
> #11
> lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock
> (lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
> #12 mmu::vm_fault (addr=18446603337326391296, addr@entry=184466033373
> 26395384, ef=ef@entry=0xffff8000014a6068) at core/mmu.cc:1334
> #13 0x00000000403a746e in page_fault (ef=0xffff8000014a6068) at
> arch/x64/mmu.cc:38
> #14 <signal handler called>
> #15 0x00000000403f2114 in memory::page_range_allocator::insert<true>
> (this=this@entry=0x40904300 <memory::free_page_ranges>, pr=...)
>     at core/mempool.cc:575
> #16 0x00000000403ef83c in
> memory::page_range_allocator::<lambda(memory::page_range&)>::operator
> () (header=..., __closure=<synthetic pointer>)
>     at core/mempool.cc:751
> #17
> memory::page_range_allocator::<lambda(memory::page_range&)>::operator
> () (header=..., __closure=<synthetic pointer>) at core/mempool.cc:736
> #18
> memory::page_range_allocator::for_each<memory::page_range_allocator::
> alloc_aligned(size_t, size_t, size_t,
> bool)::<lambda(memory::page_range&)> > (f=..., min_order=<optimized
> out>, this=0x40904300 <memory::free_page_ranges>) at
> core/mempool.cc:809
> #19 memory::page_range_allocator::alloc_aligned (this=this@entry=0x40
> 904300 <memory::free_page_ranges>, size=size@entry=2097152, 
>     offset=offset@entry=0, alignment=alignment@entry=2097152, 
> fill=fill@entry=true) at core/mempool.cc:736
> #20 0x00000000403f0164 in memory::alloc_huge_page (N=N@entry=2097152)
> at core/mempool.cc:1601
> #21 0x000000004035030e in
> mmu::uninitialized_anonymous_page_provider::map (this=0x40873150
> <mmu::page_allocator_init>, offset=83886080, 
>     ptep=..., pte=..., write=<optimized out>) at include/osv/mmu-
> defs.hh:219
> #22 0x0000000040355b94 in mmu::populate<(mmu::account_opt)1>::page<1>
> (offset=83886080, ptep=..., this=0x2000001ffd70)
>     at include/osv/mmu-defs.hh:235
> #23 mmu::page<mmu::populate<>, 1> (ptep=..., offset=83886080,
> pops=...) at core/mmu.cc:311
> #24 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 2>::operator()
> (base_virt=35185397596160, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:437
> #25 mmu::map_level<mmu::populate<(mmu::account_opt)1>,
> 3>::map_range<2> (this=<synthetic pointer>, ptep=...,
> base_virt=35184372088832, 
>     slop=4096, page_mapper=..., size=132120576, vcur=<optimized out>)
> at core/mmu.cc:399
> #26 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 3>::operator()
> (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #27 mmu::map_level<mmu::populate<(mmu::account_opt)1>,
> 4>::map_range<3> (this=<synthetic pointer>, ptep=...,
> base_virt=35184372088832, 
>     slop=4096, page_mapper=..., size=134217728, vcur=<optimized out>)
> at core/mmu.cc:399
> #28 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 4>::operator()
> (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #29 mmu::map_range<mmu::populate<(mmu::account_opt)1> > (
> vma_start=vma_start@entry=35185313710080, vstart=vstart@entry=3518531
> 3710080, 
>     size=<optimized out>, page_mapper=..., slop=slop@entry=4096) at
> core/mmu.cc:354
> #30 0x0000000040356385 in
> mmu::operate_range<mmu::populate<(mmu::account_opt)1> >
> (size=<optimized out>, start=0x200038200000, 
>     vma_start=<optimized out>, mapper=...) at core/mmu.cc:801
> #31 mmu::vma::operate_range<mmu::populate<(mmu::account_opt)1> >
> (size=3, addr=0x200038200000, mapper=..., this=0xffffa000012deb00)
>     at core/mmu.cc:1412
> #32 mmu::populate_vma<(mmu::account_opt)1> (vma=vma@entry=0xffffa0000
> 12deb00, v=v@entry=0x200038200000, size=size@entry=134217728, 
>     write=write@entry=false) at core/mmu.cc:1206
> #33 0x000000004034e8d2 in mmu::map_anon (addr=addr@entry=0x0, 
> size=size@entry=134217728, flags=flags@entry=2, perm=perm@entry=3)
>     at core/mmu.cc:1222
> #34 0x000000004046503d in mmap (addr=addr@entry=0x0, 
> length=length@entry=134217728, prot=prot@entry=3, flags=flags@entry=3
> 2802, 
>     fd=fd@entry=-1, offset=offset@entry=0) at libc/mman.cc:156
> #35 0x0000100000006624 in exhaust_memory (size=size@entry=134217728)
> at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:31
> #36 0x000010000000621e in main (argc=<optimized out>, argv=<optimized
> out>) at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:99
> #37 0x000000004043090d in osv::application::run_main
> (this=0xffffa00001130e10) at /usr/include/c++/8/bits/stl_vector.h:805
> #38 0x0000000040226b51 in osv::application::main
> (this=0xffffa00001130e10) at core/app.cc:320
> #39 0x0000000040430ab9 in
> osv::application::<lambda(void*)>::operator() (__closure=0x0,
> app=<optimized out>) at core/app.cc:233
> #40 osv::application::<lambda(void*)>::_FUN(void *) () at
> core/app.cc:235
> #41 0x000000004045eec6 in
> pthread_private::pthread::<lambda()>::operator()
> (__closure=0xffffa0000149c200) at libc/pthread.cc:114
> #42 std::_Function_handler<void(),
> pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t,
> const pthread_private::thread_attr*)::<lambda()> >::_M_invoke(const
> std::_Any_data &) (__functor=...) at
> /usr/include/c++/8/bits/std_function.h:297
> #43 0x0000000040401117 in sched::thread_main_c (t=0xffff8000014a1040)
> at arch/x64/arch-switch.hh:271
> #44 0x00000000403a7263 in thread_main () at arch/x64/entry.S:113
> 
> Waldek
> 
> On Tuesday, August 20, 2019 at 10:53:30 PM UTC-4, Waldek Kozaczuk
> wrote:
> > The commit 97fe8aa3d2d8f2c938fcaa379c44ae5a80dfbf33 adjusted logic 
> > in arch_setup_free_memory() to improve memory utilization 
> > by making OSv use memory below kernel (<= 2MB). 
> > 
> > Ironically, the new logic introduced a new bug which led to a much
> > bigger waste of memory. Specifically, it did not take into account
> > the case of a memory region starting below 2MB and ending
> > above 1GB at the same time, and it skipped the part above 1GB
> > altogether.
> > 
> > This patch fixes this bug and makes the issue reported below go away.
> > 
> > Fixes #1048 
> > 
> > Signed-off-by: Waldemar Kozaczuk <jwkozac...@gmail.com> 
> > --- 
> >  arch/x64/arch-setup.cc | 12 ++++++++---- 
> >  1 file changed, 8 insertions(+), 4 deletions(-) 
> > 
> > diff --git a/arch/x64/arch-setup.cc b/arch/x64/arch-setup.cc 
> > index e5fb7a6e..986a0928 100644 
> > --- a/arch/x64/arch-setup.cc 
> > +++ b/arch/x64/arch-setup.cc 
> > @@ -175,11 +175,15 @@ void arch_setup_free_memory() 
> >          // 
> >          // Free the memory below elf_phys_start which we could not
> > before 
> >          if (ent.addr < (u64)elf_phys_start) { 
> > +            auto ent_below_kernel = ent; 
> >              if (ent.addr + ent.size >= (u64)elf_phys_start) { 
> > -                ent = truncate_above(ent, (u64) elf_phys_start); 
> > +                ent_below_kernel = truncate_above(ent, (u64)
> > elf_phys_start); 
> > +            } 
> > +            mmu::free_initial_memory_range(ent_below_kernel.addr,
> > ent_below_kernel.size); 
> > +            // If there is nothing left below elf_phys_start
> > return 
> > +            if (ent.addr + ent.size <= (u64)elf_phys_start) { 
> > +               return; 
> >              } 
> > -            mmu::free_initial_memory_range(ent.addr, ent.size); 
> > -            return; 
> >          } 
> >          // 
> >          // Ignore memory already freed above 
> > @@ -331,4 +335,4 @@ void reset_bootchart(osv_multiboot_info_type*
> > mb_info) 
> >   
> >      mb_info->tsc_uncompress_done_hi = now_high; 
> >      mb_info->tsc_uncompress_done = now_low; 
> > -} 
> > \ No newline at end of file 
> > +} 
