Thanks for this - I was hitting a weird page fault issue in our application after we recently moved from 0.52 to the latest OSv. It looks something like this and occurs early on in startup:
Assertion failed: ef->rflags & processor::rflags_if (arch/x64/mmu.cc: page_fault: 34)

[backtrace]
0x00000000402298ea <__assert_fail+26>
0x000000004039aa30 <page_fault+240>
0x0000000040399826 <???+1077516326>
0x000000004039c0a8 <interrupt+232>
0x000000004039a779 <???+1077520249>
0x0000000040214ca1 <main_cont(int, char**)+193>
0x00000000403f9646 <thread_main_c+38>
0x000000004039a7a2 <???+1077520290>

After fixing the memory at 2GB in virsh, the problem went away. Applying your patch also seems to fix it.

Rick

On Tue, 2019-08-20 at 20:04 -0700, Waldek Kozaczuk wrote:
> This patch definitely fixes an apparent bug I introduced myself in
> the past. I have tested that issue #1048 goes away with 4, 5, 6, 7 or
> 8GB of memory. I have also verified using the cli module that free
> memory is reported properly now.
>
> However, there is still 1 question and 1 issue outstanding:
> 1. I do not understand how this bug in arch_setup_free_memory() would
> lead to the page fault reported by issue 1048 or other "read errors"
> with higher memory (8GB and so on). I would expect this bug to lead
> to OSv failing to use the memory above 1GB in the e820 block but
> still being able to operate properly without the page fault. Is
> there another underlying bug that this patch actually covers?
>
> 2. After this patch the tst-huge.so does not pass - it actually hangs
> or never completes. I have played with it a bit and discovered that
> it passes if I run it with the right amount of memory,
> 128M < m <= 1G, but fails with anything above 1GB (the default is
> 2GB). It could be that the test is flaky and has to have the right
> amount of free memory to pass (?).
>
> Here is the stacktrace of where it was stuck:
>
> sched::thread::switch_to (this=this@entry=0xffff8000001ba040) at arch/x64/arch-switch.hh:108
> #1 0x00000000403ff794 in sched::cpu::reschedule_from_interrupt (this=0xffff80000001d040, called_from_yield=called_from_yield@entry=false, preempt_after=..., preempt_after@entry=...)
>     at core/sched.cc:339
> #2 0x00000000403ffc8c in sched::cpu::schedule () at include/osv/sched.hh:1310
> #3 0x0000000040400372 in sched::thread::wait (this=this@entry=0xffff8000014a1040) at core/sched.cc:1214
> #4 0x0000000040428072 in sched::thread::do_wait_for<lockfree::mutex, sched::wait_object<waitqueue> > (mtx=...) at include/osv/mutex.h:41
> #5 sched::thread::wait_for<waitqueue&> (mtx=...) at include/osv/sched.hh:1220
> #6 waitqueue::wait (this=this@entry=0x408ec550 <mmu::vma_list_mutex+48>, mtx=...) at core/waitqueue.cc:56
> #7 0x00000000403e2d83 in rwlock::reader_wait_lockable (this=<optimized out>) at core/rwlock.cc:174
> #8 rwlock::rlock (this=this@entry=0x408ec520 <mmu::vma_list_mutex>) at core/rwlock.cc:29
> #9 0x000000004034ad98 in rwlock_for_read::lock (this=0x408ec520 <mmu::vma_list_mutex>) at include/osv/rwlock.h:113
> #10 std::lock_guard<rwlock_for_read&>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/8/bits/std_mutex.h:162
> #11 lock_guard_for_with_lock<rwlock_for_read&>::lock_guard_for_with_lock (lock=..., this=<synthetic pointer>) at include/osv/mutex.h:89
> #12 mmu::vm_fault (addr=18446603337326391296, addr@entry=18446603337326395384, ef=ef@entry=0xffff8000014a6068) at core/mmu.cc:1334
> #13 0x00000000403a746e in page_fault (ef=0xffff8000014a6068) at arch/x64/mmu.cc:38
> #14 <signal handler called>
> #15 0x00000000403f2114 in memory::page_range_allocator::insert<true> (this=this@entry=0x40904300 <memory::free_page_ranges>, pr=...)
>     at core/mempool.cc:575
> #16 0x00000000403ef83c in memory::page_range_allocator::<lambda(memory::page_range&)>::operator() (header=..., __closure=<synthetic pointer>)
>     at core/mempool.cc:751
> #17 memory::page_range_allocator::<lambda(memory::page_range&)>::operator() (header=..., __closure=<synthetic pointer>) at core/mempool.cc:736
> #18 memory::page_range_allocator::for_each<memory::page_range_allocator::alloc_aligned(size_t, size_t, size_t, bool)::<lambda(memory::page_range&)> > (f=..., min_order=<optimized out>, this=0x40904300 <memory::free_page_ranges>) at core/mempool.cc:809
> #19 memory::page_range_allocator::alloc_aligned (this=this@entry=0x40904300 <memory::free_page_ranges>, size=size@entry=2097152, offset=offset@entry=0, alignment=alignment@entry=2097152, fill=fill@entry=true) at core/mempool.cc:736
> #20 0x00000000403f0164 in memory::alloc_huge_page (N=N@entry=2097152) at core/mempool.cc:1601
> #21 0x000000004035030e in mmu::uninitialized_anonymous_page_provider::map (this=0x40873150 <mmu::page_allocator_init>, offset=83886080, ptep=..., pte=..., write=<optimized out>) at include/osv/mmu-defs.hh:219
> #22 0x0000000040355b94 in mmu::populate<(mmu::account_opt)1>::page<1> (offset=83886080, ptep=..., this=0x2000001ffd70)
>     at include/osv/mmu-defs.hh:235
> #23 mmu::page<mmu::populate<>, 1> (ptep=..., offset=83886080, pops=...)
>     at core/mmu.cc:311
> #24 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 2>::operator() (base_virt=35185397596160, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:437
> #25 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 3>::map_range<2> (this=<synthetic pointer>, ptep=..., base_virt=35184372088832, slop=4096, page_mapper=..., size=132120576, vcur=<optimized out>)
>     at core/mmu.cc:399
> #26 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 3>::operator() (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #27 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 4>::map_range<3> (this=<synthetic pointer>, ptep=..., base_virt=35184372088832, slop=4096, page_mapper=..., size=134217728, vcur=<optimized out>)
>     at core/mmu.cc:399
> #28 mmu::map_level<mmu::populate<(mmu::account_opt)1>, 4>::operator() (base_virt=35184372088832, parent=..., this=<synthetic pointer>)
>     at core/mmu.cc:449
> #29 mmu::map_range<mmu::populate<(mmu::account_opt)1> > (vma_start=vma_start@entry=35185313710080, vstart=vstart@entry=35185313710080, size=<optimized out>, page_mapper=..., slop=slop@entry=4096) at core/mmu.cc:354
> #30 0x0000000040356385 in mmu::operate_range<mmu::populate<(mmu::account_opt)1> > (size=<optimized out>, start=0x200038200000, vma_start=<optimized out>, mapper=...)
>     at core/mmu.cc:801
> #31 mmu::vma::operate_range<mmu::populate<(mmu::account_opt)1> > (size=3, addr=0x200038200000, mapper=..., this=0xffffa000012deb00)
>     at core/mmu.cc:1412
> #32 mmu::populate_vma<(mmu::account_opt)1> (vma=vma@entry=0xffffa000012deb00, v=v@entry=0x200038200000, size=size@entry=134217728, write=write@entry=false) at core/mmu.cc:1206
> #33 0x000000004034e8d2 in mmu::map_anon (addr=addr@entry=0x0, size=size@entry=134217728, flags=flags@entry=2, perm=perm@entry=3)
>     at core/mmu.cc:1222
> #34 0x000000004046503d in mmap (addr=addr@entry=0x0, length=length@entry=134217728, prot=prot@entry=3, flags=flags@entry=32802, fd=fd@entry=-1, offset=offset@entry=0) at libc/mman.cc:156
> #35 0x0000100000006624 in exhaust_memory (size=size@entry=134217728) at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:31
> #36 0x000010000000621e in main (argc=<optimized out>, argv=<optimized out>) at /home/wkozaczuk/projects/osv/tests/tst-huge.cc:99
> #37 0x000000004043090d in osv::application::run_main (this=0xffffa00001130e10) at /usr/include/c++/8/bits/stl_vector.h:805
> #38 0x0000000040226b51 in osv::application::main (this=0xffffa00001130e10) at core/app.cc:320
> #39 0x0000000040430ab9 in osv::application::<lambda(void*)>::operator() (__closure=0x0, app=<optimized out>) at core/app.cc:233
> #40 osv::application::<lambda(void*)>::_FUN(void *) () at core/app.cc:235
> #41 0x000000004045eec6 in pthread_private::pthread::<lambda()>::operator() (__closure=0xffffa0000149c200) at libc/pthread.cc:114
> #42 std::_Function_handler<void(), pthread_private::pthread::pthread(void* (*)(void*), void*, sigset_t, const pthread_private::thread_attr*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...)
>     at /usr/include/c++/8/bits/std_function.h:297
> #43 0x0000000040401117 in sched::thread_main_c (t=0xffff8000014a1040) at arch/x64/arch-switch.hh:271
> #44 0x00000000403a7263 in thread_main () at arch/x64/entry.S:113
>
> Waldek
>
> On Tuesday, August 20, 2019 at 10:53:30 PM UTC-4, Waldek Kozaczuk wrote:
> > The commit 97fe8aa3d2d8f2c938fcaa379c44ae5a80dfbf33 adjusted logic
> > in arch_setup_free_memory() to improve memory utilization
> > by making OSv use memory below kernel (<= 2MB).
> >
> > Ironically the new logic introduced a new bug which led to much
> > bigger waste of memory. Specifically it did not take into account
> > the case of a memory region starting below 2MB and ending
> > above 1GB at the same time, and made it skip the part above 1GB
> > altogether.
> >
> > This patch fixes this bug and makes the issue reported below go away.
> >
> > Fixes #1048
> >
> > Signed-off-by: Waldemar Kozaczuk <[email protected]>
> > ---
> >  arch/x64/arch-setup.cc | 12 ++++++++----
> >  1 file changed, 8 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x64/arch-setup.cc b/arch/x64/arch-setup.cc
> > index e5fb7a6e..986a0928 100644
> > --- a/arch/x64/arch-setup.cc
> > +++ b/arch/x64/arch-setup.cc
> > @@ -175,11 +175,15 @@ void arch_setup_free_memory()
> >          //
> >          // Free the memory below elf_phys_start which we could not before
> >          if (ent.addr < (u64)elf_phys_start) {
> > +            auto ent_below_kernel = ent;
> >              if (ent.addr + ent.size >= (u64)elf_phys_start) {
> > -                ent = truncate_above(ent, (u64) elf_phys_start);
> > +                ent_below_kernel = truncate_above(ent, (u64) elf_phys_start);
> > +            }
> > +            mmu::free_initial_memory_range(ent_below_kernel.addr, ent_below_kernel.size);
> > +            // If there is nothing left below elf_phys_start return
> > +            if (ent.addr + ent.size <= (u64)elf_phys_start) {
> > +                return;
> >              }
> > -            mmu::free_initial_memory_range(ent.addr, ent.size);
> > -            return;
> >          }
> >          //
> >          // Ignore memory already freed above
> > @@ -331,4 +335,4 @@ void reset_bootchart(osv_multiboot_info_type* mb_info)
> >
> >      mb_info->tsc_uncompress_done_hi = now_high;
> >      mb_info->tsc_uncompress_done = now_low;
> > -}
> > \ No newline at end of file
> > +}
