Re: [patch 0/4] [RFC] EMMU Notifiers V5

2008-02-03 Thread Robin Holt
Great news!  I have taken over Dean's xpmem patch set while he is on
sabbatical.  Before he left, he had his patch mostly working on top of
this patch set.  We had one deadlock.  I have coded for that specific
deadlock and xpmem now passes a simple grant/attach/fault/fork/unmap/map
test.

After analyzing it, I believe we still have a closely related deadlock
which will require some refactoring of code.  I am certain that the
same mechanism I used for this deadlock break will work in that case,
but it will require too many changes for me to finish this weekend.

For our customer base, this case, in the past, has resulted in termination
of the application and our MPI library specifically states that this
mode of operation is not permitted, so I think we will be able to pass
their regression tests.  I will need to coordinate that early next week.

The good news is that, at this point, Christoph's version 5 of the mmu_notifiers
appears to work for xpmem.  The mmu_notifier call-outs where the
in_atomic flag is set still result in a BUG_ON.  That is not an issue
for our normal customer as our MPI already states this is not a valid
mode of operation and provides means to avoid those types of mappings.

Thanks,
Robin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [patch 0/4] [RFC] EMMU Notifiers V5

2008-02-02 Thread Andrea Arcangeli
On Thu, Jan 31, 2008 at 09:04:39PM -0800, Christoph Lameter wrote:
> - Has page tables to track pages whose refcount was elevated(?) but
>   no reverse maps.

Just a correction: rmaps exist, or swap couldn't be sane. It's just
that they're not built on the page_t because the guest memory is really
virtual and not physical at all (hence it swaps really well, thanks to
the regular linux VM algorithms without requiring any KVM knowledge at
all; it all looks like (shared) anonymous memory as far as linux is
concerned ;).


[patch 0/4] [RFC] EMMU Notifiers V5

2008-01-31 Thread Christoph Lameter
This is a patchset implementing MMU notifier callbacks based on Andrea's
earlier work. These are needed if Linux pages are referenced by something
other than what the kernel's rmaps track (an external MMU).

The known immediate users are

KVM
- Establishes a refcount to the page via get_user_pages().
- External references are called spte.
- Has page tables to track pages whose refcount was elevated(?) but
  no reverse maps.

GRU
- Simple additional hardware TLB (possibly covering multiple instances of
  Linux)
- Needs TLB shootdown when the VM unmaps pages.
- Determines page address via follow_page (from interrupt context) but can
  fall back to get_user_pages().
- No page reference possible since no page status is kept.

XPmem
- Allows use of a process's memory by remote instances of Linux.
- Provides its own reverse mappings to track remote pte.
- Establishes refcounts on the exported pages.
- Must sleep in order to wait for remote acks of ptes that are being
  cleared.



Known issues:

- RCU quiescent periods are required on registering
  notifiers to guarantee visibility to other processors.

Andrea's mmu_notifier #4 -> RFC V1

- Merge subsystem rmap based with Linux rmap based approach
- Move Linux rmap based notifiers out of macro
- Try to account for what locks are held while the notifiers are
  called.
- Develop a patch sequence that separates out the different types of
  hooks so that we can review their use.
- Avoid adding include to linux/mm_types.h
- Integrate RCU logic suggested by Peter.

V1->V2:
- Improve RCU support
- Use mmap_sem for mmu_notifier register / unregister
- Drop invalidate_page from COW, mm/fremap.c and mm/rmap.c since we
  already have invalidate_range() callbacks there.
- Clean compile for !MMU_NOTIFIER
- Isolate filemap_xip strangeness into its own diff
- Pass a flag to invalidate_range to indicate if a spinlock
  is held.
- Add invalidate_all()

V2->V3:
- Further RCU fixes
- Fixes from Andrea to fixup aging and move invalidate_range() in do_wp_page
  and sys_remap_file_pages() after the pte clearing.

V3->V4:
- Drop locking and synchronize_rcu() on ->release since we know on release that
  we are the only executing thread. This is also true for invalidate_all() so
  we could drop off the mmu_notifier there early. Use hlist_del_init instead
  of hlist_del_rcu.
- Do the invalidation as begin/end pairs with the requirement that the driver
  holds off new references in between.
- Fixup filemap_xip.c
- Figure out a potential way in which XPmem can deal with locks that are held.
- Robin's patches to make the mmu_notifier logic manage the PageRmapExported
  bit.
- Strip cc list down a bit.
- Drop Peter's new rcu list macro
- Add description to the core patch

V4->V5:
- Provide missing callouts for mremap.
- Provide missing callouts for copy_page_range.
- Reduce mm_struct space to zero if !MMU_NOTIFIER by #ifdeffing out
  structure contents.
- Get rid of the invalidate_all() callback by moving ->release in place
  of invalidate_all.
- Require holding mmap_sem on register/unregister instead of acquiring it
  ourselves. In some contexts where we want to register/unregister we are
  already holding mmap_sem.
- Split out the rmap support patch so that there is no need to apply
  all patches for KVM and GRU.
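
As an illustration of the V3->V4 item about begin/end pairs, here is a
minimal user-space sketch of how a driver might hold off new references
between the two calls. It is only a model under stated assumptions: the
names ext_mmu, ext_invalidate_begin, ext_invalidate_end, ext_map_page and
NPAGES are hypothetical and are not the callbacks defined by this patchset,
and real drivers would need locking that is omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8  /* hypothetical size of the modelled address range */

/* Hypothetical model of a driver that owns an external MMU. */
struct ext_mmu {
    int invalidating;     /* count of begin calls not yet matched by end */
    bool mapped[NPAGES];  /* true if the external MMU references the page */
};

/* Begin: drop external references in [start, end) and block new ones. */
static void ext_invalidate_begin(struct ext_mmu *m, size_t start, size_t end)
{
    m->invalidating++;
    for (size_t i = start; i < end && i < NPAGES; i++)
        m->mapped[i] = false;
}

/* End: new references are allowed again once every open pair has closed. */
static void ext_invalidate_end(struct ext_mmu *m)
{
    assert(m->invalidating > 0);  /* end must follow a matching begin */
    m->invalidating--;
}

/* Fault path: refuse to establish a reference while a pair is open. */
static bool ext_map_page(struct ext_mmu *m, size_t page)
{
    if (page >= NPAGES || m->invalidating > 0)
        return false;  /* the caller would retry after the end call */
    m->mapped[page] = true;
    return true;
}
```

In this model a fault that arrives between begin and end simply fails and
is expected to be retried; a real driver might instead sleep or spin,
depending on which locks are held at the callout.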


