Re: [uml-devel] Endless page fault for the same miss address in my UML

Terry Hsu Fri, 12 Apr 2013 20:01:20 -0700

On Fri, Apr 12, 2013 at 1:14 AM, Terry Hsu <terry.sh...@gmail.com> wrote:


> okay so I looked into the faultinfo structure and was able to obtain the
> faulting address, error code, and trap number(?). From my understanding the
> error code is the bottom 3 bits of the exception code. But I see error code
> "20" sometimes and do not know what it means.
>

According to p. 6-55 of the Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 3: System Programming Guide
<http://download.intel.com/design/processor/manuals/253668.pdf>,
the lower 5 bits of the error code are, from bit 0 upward, Present,
Read/Write, User/Supervisor, RSVD, and Instruction/Data. Error code 20
(binary 10100) therefore has User/Supervisor and Instruction/Data set and
Present clear, which means the fault was caused by an instruction fetch
from a non-present page in user mode.
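
To double-check that reading of the error code, here is a minimal Python
sketch that decodes the five low bits in the order listed above. The
constant and function names are made up for illustration (they are not
taken from the UML or host kernel sources), and it assumes the "20" in my
log is a decimal value.

```python
# Bit positions of the x86 page-fault error code, in the order listed above.
# The names are illustrative only, not copied from any kernel header.
PF_PRESENT = 1 << 0  # 0: page not present, 1: protection violation
PF_WRITE = 1 << 1    # 0: read access, 1: write access
PF_USER = 1 << 2     # 0: supervisor mode, 1: user mode
PF_RSVD = 1 << 3     # 1: reserved bit set in a paging-structure entry
PF_INSTR = 1 << 4    # 1: fault caused by an instruction fetch


def decode_pf_error(code):
    """Return the five low flag bits of a page-fault error code as booleans."""
    return {
        "present": bool(code & PF_PRESENT),
        "write": bool(code & PF_WRITE),
        "user": bool(code & PF_USER),
        "reserved_bit": bool(code & PF_RSVD),
        "instruction_fetch": bool(code & PF_INSTR),
    }


def describe_pf_error(code):
    """Summarize a page-fault error code in one human-readable phrase."""
    flags = decode_pf_error(code)
    if flags["instruction_fetch"]:
        access = "instruction fetch"
    elif flags["write"]:
        access = "write"
    else:
        access = "read"
    if flags["present"]:
        cause = "a present page (protection violation)"
    else:
        cause = "a non-present page"
    mode = "user" if flags["user"] else "supervisor"
    return f"{access} on {cause} in {mode} mode"


if __name__ == "__main__":
    # Assumes the logged value 20 is decimal, i.e. binary 10100.
    print(describe_pf_error(20))
```

If the value in the log were hexadecimal instead, the decoding would be
different, so it is worth checking the format string of the print.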

I found the reason why the fault cannot be fixed by UML. It is probably
because UML puts the faultinfo in the wrong stub. Since I changed the vm
area pointers of the child process, UML incorrectly finds the parent
process's stub pages when the fault happens and puts the faultinfo there.
Therefore, when the child process tries to access its own skas stub to fix
the fault, it still cannot find the correct instruction pointers, and the
fault happens endlessly.

Why does every process that runs in UML need its own stub for page fault
handling? It seems to me they could've shared the SIGSEGV signal handler
and the function that invokes mmap, munmap, mprotect. In this way only two
pages are needed for all the processes.

I am not sure if I understand the whole thing correctly. Please correct me
if it's not right.

Thanks!

> I am now looking at how the special mapping works with the host kernel. I
> think this might lead me to the solution of my problem. It sounds like the
> special mapping is not installed correctly so that the UML was not able to
> fix the fault.
>
>
>
>
> On Thu, Apr 11, 2013 at 7:00 PM, Terry Hsu <terry.sh...@gmail.com> wrote:
>
>> In the unmodified kernel, I did not see the kernel call mmap (which in
>> turn calls mmap_region) to install the mapping for the faulting page in
>> the child task. The child task never has UML invoke mmap to install a
>> mapping, so I could examine neither the parameters passed to mmap nor
>> its return value.
>>
>> Thanks for the explanation of the special mapping. After reading your
>> comment I went to Jeff Dike's website to find out more about skas:
>> http://user-mode-linux.sourceforge.net/old/skas.html
>>
>> handle_pte_fault() calls __do_fault(), which in turn invokes
>> filemap_fault() through vma->vm_ops->fault(vma, &vmf). How do I find out
>> exactly what the miss address is for? I am posting the log I printed out
>> here. This is from the unmodified kernel, so the page is faulted in
>> correctly without calling mmap for the forked child task.
>>
>> *Note: this is the correct version of page fault in the unmodified
>> kernel.*
>> [segv_handler] Caller is userspace+0x25d/0x44c, pid 598 a.out
>> [segv] Caller is segv_handler+0xb1/0xbb, pid 598 a.out
>> [handle_page_fault] Caller is segv+0xfa/0x324, pid 598 a.out
>> [handle_page_fault] fault address: 0x400e9cc8
>> [handle_page_fault] page walk for 0x400e9cc8
>> [handle_page_fault] pte does not exist!
>> [handle_page_fault] before handle_page_fault
>> [print_mm_rss_stat] mm->rss_stat for mm id: 673
>> [print_mm_rss_stat] mm->rss_stat.count[0] = 0
>> [print_mm_rss_stat] mm->rss_stat.count[1] = 27
>> [print_mm_rss_stat] mm->rss_stat.count[2] = 0
>> [find_vma] Caller is handle_page_fault+0x1ca/0x957, pid 598 a.out
>> [handle_mm_fault] Caller is handle_page_fault+0x50d/0x957, pid 598 a.out
>> [handle_mm_fault] pgd: 295944192
>> [handle_mm_fault] pud: 295944192
>> [handle_mm_fault] pmd: 294746112
>> [*handle_mm_fault*] pte: 295581512
>> [*handle_pte_fault*] calling do_linear_fault
>> [*__do_fault*] __do_fault for 0x400e9cc8
>> [__do_fault] line 3292 of file mm/memory.c, pid 598
>> [*filemap_fault*] line 1604 of file mm/filemap.c, pid 598
>> [filemap_fault] line 1622 of file mm/filemap.c, pid 598
>> [filemap_fault] line 1654 of file mm/filemap.c, pid 598
>> [filemap_fault] line 1680 of file mm/filemap.c, pid 598
>> [__do_fault] line 3312 of file mm/memory.c, pid 598
>> [__do_fault] line 3367 of file mm/memory.c, pid 598
>> [__do_fault] line 3395 of file mm/memory.c, pid 598
>> [__do_fault] line 3408 of file mm/memory.c, pid 598
>> [__do_fault] line 3425 of file mm/memory.c, pid 598
>> [__do_fault] line 3458 of file mm/memory.c, pid 598
>> [__do_fault] __do_fault for 0x400e9cc8 returning 512
>> [handle_page_fault] line 205 of file arch/um/kernel/trap.c, pid 598
>> [handle_page_fault] mm->mm_id: 673
>> [flush_tlb_page] Caller is handle_page_fault+0x7f5/0x957, pid 598 a.out
>> [flush_tlb_page] mm->mm_id: 673
>> [handle_page_fault] page walk for 0x400e9cc8
>> [handle_page_fault] pte for 0x400e9cc8: 0x119e3748
>> [handle_page_fault] after handle_page_fault
>> [print_mm_rss_stat] mm->rss_stat for mm id: 673
>> [print_mm_rss_stat] mm->rss_stat.count[0] = 1
>> [print_mm_rss_stat] mm->rss_stat.count[1] = 27
>> [print_mm_rss_stat] mm->rss_stat.count[2] = 0
>>
>>
>>
>>
>>
>> On Thu, Apr 11, 2013 at 5:19 PM, richard -rw- weinberger <
>> richard.weinber...@gmail.com> wrote:
>>
>>> On Thu, Apr 11, 2013 at 10:14 PM, Terry Hsu <terry.sh...@gmail.com>
>>> wrote:
>>> > The page fault loop for the same address happens in my UML. But for
>>> > both my UML and the mainline (I am using 3.7.1) kernel, the addresses
>>> > that trigger the page fault (in the child thread) are covered by
>>> > certain vm areas. I use gdb to trace the function call and notice that
>>> > mmap_region() is never called during the execution of the child task.
>>> > I am guessing it's because the child task does not use large enough
>>> > memory space to have the UML installed mapping for it.
>>>
>>> Okay, let's try to figure out what happens here.
>>> The UML _guest_ process has some vmas installed. Upon access, the host
>>> kernel finds out that there is no memory mapping installed on the
>>> _host_ side of UML and sends SIGSEGV to the process. UML's host part
>>> catches the SIGSEGV and tries to fix it.
>>> Usually it does so by mmap()'ing the faulting page into the UML guest
>>> process.
>>> This is where the SKAS stub magic happens. It writes the to-be-fixed
>>> address into STUB_DATA and sets EIP/RIP to STUB_CODE such that the
>>> process itself calls mmap().
>>> After the stub has finished, it traps itself and the UML emulation
>>> continues.
>>>
>>> Now we need to figure out: a) What address is faulting and why? b) What
>>> does the UML _host_ side code do to fix it, i.e. what are the mmap()
>>> parameters? c) Does this mmap() fail?
>>>
>>> To me it looks like UML is unable to fix the fault and therefore it
>>> faults over and over again.
>>>
>>> --
>>> Thanks,
>>> //richard
>>>
>>
>>
>

_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel