On 01/08, Mel Gorman wrote:
>
> On Wed, Jan 08, 2014 at 05:13:38PM +0100, Oleg Nesterov wrote:
> >
> > Yes. But, for example, get_futex_key() does
> >
> >     if (unlikely(PageTail(page))) {
> >             put_page(page);
> >
> > why this put_page() can't race with _split? If nothing else, another thread
> > can unmap the part of this vma.
> >
>
> The race is not prevented but that does not mean it matters. Basic
> scenario where a split starts after the PageTail check but before the
> put_page in get_futex_key
>
> CPU A
> get_futex_key
>   -> fast gup, page table removing prevents parallel unmap and free
>     -> gup_huge_pmd (arch/x86/mm/gup.c at least)
>       -> get_huge_page_tail (increment page tail _map_count)
>       -> get_huge_page_multiple (increment ref on head page)
>   -> Check PageTail
>                                       CPU B
>                                       split_huge_page_to_list
>                                         -> split_huge_page_refcount
>                                            spin_lock_irq(lru_lock)
>                                            compound_lock
>   -> put_page(tail_page)
>     ->put_compound_page
>        looks up head page

Yes.

But suppose that CPU B completes split_huge_page_to_list/munmap/etc
and frees this head page.

>        takes reference unless zero

suppose this page_head was reallocated and get_page_unless_zero()
succeds right after set_page_refcounted(),

>        compound_lock (block)
>                                            complete split
>                                            compound_unlock
>        check PageTail
>
> This put_page blocks on the compound lock, finds the page is no longer a
> PageTail

Sure. The problem is that compound_lock() itself can race with prep_new_page()
or I missed something.

> The parallel unmap is prevented by get_huge_page_multiple in the gup path
> and held in place until put_page_compound frees it later.

Again, I can easily miss something. And yes, page_tail returned by gup
has a reference to its page_head (via page_head->_count). But
__split_huge_page_refcount() destroys this connection and decrements
page_head->_count.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to