On Fri, Oct 24, 2008 at 2:44 PM, Rene Herman <[EMAIL PROTECTED]> wrote:
> On 24-10-08 09:25, Prasad Joshi wrote:
>
>> My understanding is that when a process does a fork
>> 1. a new page table will be allocated to the process
>> 2. it will be exactly same copy of the parent process
>> 3. both the page tables (parent's and child's) will have entries
>> marked as read-only
>> So any write to them will be detected as page fault and page fault
>> handler will treat the pages as COW and will essentially create a new
>> page, update the faulted process's page table and returns.
>>
>> The question is
>> dup_mm() is actually copying the current process's mm_struct to the
>> new process mm_struct. So now new process's mm_struct->pgd will be
>> pointing to pgd of parent process.
>>
>> Then it is calling mm_init() to allocate new pgd, why?
>> My understanding conflicts with the code. Is my understanding
>> correct?
>>
>
> Your understanding is correct. If I understand the question correctly it
> seems you are confusing pagetables and the pages themselves here though.
>
> Yes, the pages are shared CoW but the page _tables_ are not. Both parent
> and child have an actual physical copy of the tables; a write will fault,
> and the handler points the faulting process' entry at a newly allocated
> copy of the page and makes that entry writable. The other process' entry
> stays read-only until it faults in turn.
>
Thanks a lot, I got it.

dup_mmap() is the function that copies the page table entries of the parent
process into the child process and marks them read-only.
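
The effect can be observed from user space, though only indirectly: a
program cannot see the write-protect bit, only that a write in the child
never shows up in the parent. Below is a minimal sketch, assuming a POSIX
system; the function name parent_value_after_child_write() and the variable
are made up for illustration.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives in a private, writable mapping (.data), so fork() shares it CoW. */
static volatile int value = 1;

/*
 * Forks, lets the child overwrite `value`, and returns the value the
 * parent still sees afterwards. Returns -1 on any error.
 */
int parent_value_after_child_write(void)
{
    assert(value == 1);            /* precondition for the demonstration */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        value = 2;                 /* write hits the child's private copy */
        _exit(value == 2 ? 0 : 1);
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;

    return value;                  /* parent's copy is untouched */
}
```

Calling it from main() and printing the result should show that the parent
still reads 1 after the child has stored 2.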

Flow of how sys_fork() results in a call to dup_mmap():

do_fork()
    copy_process()
        copy_mm()
            dup_mm()
                allocates memory to hold struct mm_struct
                memcpy(mm, oldmm, sizeof(*mm));    (oldmm is current->mm)
                calls mm_init() ==> creates new pgd for child process (also
                                    pmd and pud)
                calls dup_mmap()

dup_mmap()
{
        duplicates the vma regions
        copies page table entries from parent to child
        then write-protects the entry (pte_wrprotect()) in both the
        child's and the parent's page table
}

static inline pte_t pte_wrprotect(pte_t pte)
{
return __pte(pte_val(pte) & ~_PAGE_RW);
}
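
To convince myself of the whole sequence, here is a toy user-space model of
it. This is only a sketch under my own assumptions, not kernel code: all the
toy_* names, the one-level table and the 16-byte "page" are made up for
illustration. toy_fork() copies the entries and clears the RW bit on both
sides, and toy_write() plays the page fault handler: it copies the page only
if it is still shared, then makes the faulting entry writable.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE    16u   /* hypothetical tiny page, for illustration */
#define TOY_NPTES        4u    /* hypothetical one-level "page table" */
#define TOY_PAGE_PRESENT 0x1u
#define TOY_PAGE_RW      0x2u  /* plays the role of _PAGE_RW */

struct toy_page { int refcount; unsigned char data[TOY_PAGE_SIZE]; };
struct toy_pte  { struct toy_page *page; unsigned flags; };
struct toy_mm   { struct toy_pte pte[TOY_NPTES]; };

/* Map a fresh, zeroed, writable page at slot idx. Returns 0, or -1 on error. */
int toy_map(struct toy_mm *mm, unsigned idx)
{
    assert(idx < TOY_NPTES);
    struct toy_page *pg = calloc(1, sizeof(*pg));
    if (!pg)
        return -1;
    pg->refcount = 1;
    mm->pte[idx].page = pg;
    mm->pte[idx].flags = TOY_PAGE_PRESENT | TOY_PAGE_RW;
    return 0;
}

/* "fork": the child gets its own table; the pages are shared read-only. */
void toy_fork(struct toy_mm *parent, struct toy_mm *child)
{
    for (unsigned i = 0; i < TOY_NPTES; i++) {
        struct toy_pte *src = &parent->pte[i];
        if (!(src->flags & TOY_PAGE_PRESENT)) {
            child->pte[i] = (struct toy_pte){0};
            continue;
        }
        src->flags &= ~TOY_PAGE_RW;   /* write-protect the parent's entry */
        child->pte[i] = *src;         /* child's entry is a read-only copy */
        src->page->refcount++;
    }
}

/* Write one byte. Returns 1 if the "fault" copied the page, 0 if not, -1 on error. */
int toy_write(struct toy_mm *mm, unsigned idx, unsigned off, unsigned char val)
{
    assert(idx < TOY_NPTES && off < TOY_PAGE_SIZE);
    struct toy_pte *pte = &mm->pte[idx];
    int copied = 0;

    if (!(pte->flags & TOY_PAGE_PRESENT))
        return -1;

    if (!(pte->flags & TOY_PAGE_RW)) {        /* write to read-only entry: "fault" */
        if (pte->page->refcount > 1) {        /* still shared: copy the page */
            struct toy_page *pg = malloc(sizeof(*pg));
            if (!pg)
                return -1;
            memcpy(pg->data, pte->page->data, TOY_PAGE_SIZE);
            pg->refcount = 1;
            pte->page->refcount--;
            pte->page = pg;
            copied = 1;
        }
        pte->flags |= TOY_PAGE_RW;            /* only the faulting entry changes */
    }
    pte->page->data[off] = val;
    return copied;
}
```

In this model a write in the child after toy_fork() copies the page and
leaves the parent's entry read-only; the parent's next write then faults
too, finds the page no longer shared, and simply turns RW back on.
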
Thanks Rene,
--Prasad
>
> It is possible to CoW not only the pages but also the tables themselves, and
> there were some initial implementations of that in 2.{4,5} times and in the
> context of the rmap patch (which increased the data that had to be copied),
> but as far as I'm aware, that died as not all that useful in the end. A
> single page-table page holds 1024 PTEs (4-byte entries in a 4 KiB page), and
> a child would write to its stack immediately, so you'd get an immediate
> fault anyway, at least for that.
>
> In these days of ballooning 64-bit address spaces, revisiting CoW tables
> might actually be useful -- if that hasn't in fact been done already,
> of course...
>
> Rene.
>