On 14/10/19(Mon) 16:17, Alexander Bluhm wrote:
> On Fri, Oct 11, 2019 at 01:19:02PM +0000, Lévai, Dániel wrote:
> > uvm_fault(0xfffffd8124d90960, 0x7f884cecdcf8, 0, 2) -> e
                                  ^^^^^^^^^^^^^^
Do I understand correctly that the faulting page is 0x7f884cecd000?
PTE_BASE corresponds to 0x7f8000000000, so the VA in the fault above should
be 0x84cecdcf8000, and in bluhm@'s report 0x27ea48908000.
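
The arithmetic can be double-checked with a small sketch. It assumes
(none of this is stated in the thread) amd64 with 4 KiB pages, 8-byte
PTEs, and a recursive page-table mapping based at PTE_BASE; the helper
names are made up for illustration. Under those assumptions the mapped
VA comes out as 0x1099d9b9f000, which also appears as the last argument
of the Xsyscall frame in the quoted trace. That differs from the
0x84cecdcf8000 figure, which is what one gets by shifting the offset
left by 12 without first dividing by the PTE size.

```python
# Assumptions (not from the thread): 4 KiB pages, 8-byte PTEs, and a
# recursive mapping where the PTE for a user VA lives at
# PTE_BASE + (VA >> PAGE_SHIFT) * PTE_SIZE.
PAGE_SHIFT = 12
PTE_SIZE = 8
PTE_BASE = 0x7f8000000000  # value quoted in the message


def page_base(addr):
    """Round an address down to the start of its 4 KiB page."""
    return addr & ~((1 << PAGE_SHIFT) - 1)


def va_for_pte(pte_addr):
    """Return the VA whose PTE sits at pte_addr (hypothetical helper)."""
    index = (pte_addr - PTE_BASE) // PTE_SIZE  # which PTE slot was hit
    return index << PAGE_SHIFT                 # each slot maps one page


fault = 0x7f884cecdcf8  # second argument of uvm_fault() in the report
print(hex(page_base(fault)))
print(hex(va_for_pte(fault)))
```

The same helper could be applied to the fault address from bluhm@'s
report, which is not quoted here.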
Both reports involve multi-threaded programs.
Alexander, what is the CPU of the machine on which you can reproduce the
bug?
Are we trying to understand how a page storing PTEs can generate a
fault? Is that what the traces say, or am I completely on the wrong track?
> > kernel: page fault trap, code=0
> > Stopped at pmap_page_remove+0x210: xchgq %rax,0(%rcx,%rdx,1)
>
> > ddb{3}> trace
> > pmap_page_remove(fffffd800975d480) at pmap_page_remove+0x210
> > uvm_anfree(fffffd8125d62b10) at uvm_anfree+0x36
> > amap_wipeout(fffffd8123d95170) at amap_wipeout+0xe5
> > uvm_unmap_detach(ffff800022420fe8,0) at uvm_unmap_detach+0x90
> > sys_munmap(ffff800022233cb8,ffff800022421060,ffff8000224210d0) at sys_munmap+0x11d
> > syscall(ffff800022421140) at syscall+0x305
> > Xsyscall(6,49,109a8d931e10,49,109a58e72150,1099d9b9f000) at Xsyscall+0x128
> > end of kernel
> > end trace frame: 0x109a82dffa50, count: -7
>
> I have been seeing this bug for a while now.
>
> https://marc.info/?l=openbsd-bugs&m=156399483018833&w=2
>
> I can trigger it by running /usr/src/regress/lib/libpthread/malloc_duel
> for some hours. Moritz Buhl has tried to bisect the problem and
> it appears to have existed since January 2019. But it is hard to be sure
> as reproducing takes a while. It is also unclear whether the change
> in behavior is caused by compiler, kernel, libc, libpthread or
> malloc_duel. We could not trigger it with OpenBSD 6.4.
>
> bluhm
>