Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-13 Thread Jack Steiner
> GRU
> - Simple additional hardware TLB (possibly covering multiple instances of
>   Linux)
> - Needs TLB shootdown when the VM unmaps pages.
> - Determines page address via follow_page (from interrupt context) but can
>   fall back to get_user_pages().
> - No page reference possible since no page status is kept.

I applied the latest mmuops patch to a 2.6.24 kernel & updated the
GRU driver to use it. As far as I can tell, everything works ok.
Although more testing is needed, all current tests of driver functionality
are working on both a system simulator and a hardware simulator.

The driver itself is still a few weeks from being ready to post but I can
send code fragments of the portions related to mmuops or external TLB
management if anyone is interested.


--- jack

-
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2008.
http://clk.atdmt.com/MRT/go/vse012070mrt/direct/01/
___
kvm-devel mailing list
kvm-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/kvm-devel


Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Andrew Morton wrote:

> Quite possibly none of the infiniband developers even know about it..

Well Andrea's initial approach was even featured on LWN a couple of 
weeks back.




Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 8 Feb 2008 16:05:00 -0800 (PST) Christoph Lameter <[EMAIL PROTECTED]> 
wrote:

> On Fri, 8 Feb 2008, Andrew Morton wrote:
> 
> > You took it correctly, and I didn't understand the answer ;)
> 
> We have done several rounds of discussion on linux-kernel about this so 
> far and the IB folks have not shown up to join in. I have tried to make 
> this as general as possible.

Infiniband would appear to be the major in-kernel client of this new
interface at present.  So as part of proving its usefulness, correctness,
etc., we should surely work on converting infiniband to use it.

Quite possibly none of the infiniband developers even know about it..



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Andrew Morton wrote:

> You took it correctly, and I didn't understand the answer ;)

We have done several rounds of discussion on linux-kernel about this so 
far and the IB folks have not shown up to join in. I have tried to make 
this as general as possible.



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 8 Feb 2008 17:43:02 -0600 Robin Holt <[EMAIL PROTECTED]> wrote:

> On Fri, Feb 08, 2008 at 03:41:24PM -0800, Christoph Lameter wrote:
> > On Fri, 8 Feb 2008, Robin Holt wrote:
> > 
> > > > > What about ib_umem_get()?
> > 
> > Correct.
> > 
> > You missed the turn of the conversation to how ib_umem_get() works. 
> > Currently it seems to pin the same way that the SLES10 XPmem works.
> 
> Ah.  I took Andrew's question as more of a probe about whether we had
> worked with the IB folks to ensure this fits the ib_umem_get needs
> as well.
> 

You took it correctly, and I didn't understand the answer ;)



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Robin Holt
On Fri, Feb 08, 2008 at 03:41:24PM -0800, Christoph Lameter wrote:
> On Fri, 8 Feb 2008, Robin Holt wrote:
> 
> > > > What about ib_umem_get()?
> 
> Correct.
> 
> You missed the turn of the conversation to how ib_umem_get() works. 
> Currently it seems to pin the same way that the SLES10 XPmem works.

Ah.  I took Andrew's question as more of a probe about whether we had
worked with the IB folks to ensure this fits the ib_umem_get needs
as well.

Thanks,
Robin



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Robin Holt wrote:

> > > What about ib_umem_get()?
> > 
> > Ok. It pins using an elevated refcount. Same as XPmem right now. With that 
> > we effectively pin a page (page migration will fail) but we will 
> > continually be reclaiming the page and may repeatedly try to move it. We 
> > have issues with XPmem causing too many pages to be pinned and thus the 
> > OOM getting into weird behavior modes (OOM or stop lru scanning due to 
> > all_reclaimable set).
> > 
> > An elevated refcount will also not be noticed by any of the schemes under 
> > consideration to improve LRU scanning performance.
> 
> Christoph, I am not sure what you are saying here.  With v4 and later,
> I thought we were able to use the rmap invalidation to remove the ref
> count that XPMEM was holding and therefore be able to swapout.  Did I miss
> something?  I agree the existing XPMEM does pin.  I hope we are not saying
> the XPMEM based upon these patches will not be able to swap/migrate.

Correct.

You missed the turn of the conversation to how ib_umem_get() works. 
Currently it seems to pin the same way that the SLES10 XPmem works.






Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Robin Holt
On Fri, Feb 08, 2008 at 03:32:19PM -0800, Christoph Lameter wrote:
> On Fri, 8 Feb 2008, Andrew Morton wrote:
> 
> > What about ib_umem_get()?
> 
> Ok. It pins using an elevated refcount. Same as XPmem right now. With that 
> we effectively pin a page (page migration will fail) but we will 
> continually be reclaiming the page and may repeatedly try to move it. We 
> have issues with XPmem causing too many pages to be pinned and thus the 
> OOM getting into weird behavior modes (OOM or stop lru scanning due to 
> all_reclaimable set).
> 
> An elevated refcount will also not be noticed by any of the schemes under 
> consideration to improve LRU scanning performance.

Christoph, I am not sure what you are saying here.  With v4 and later,
I thought we were able to use the rmap invalidation to remove the ref
count that XPMEM was holding and therefore be able to swapout.  Did I miss
something?  I agree the existing XPMEM does pin.  I hope we are not saying
the XPMEM based upon these patches will not be able to swap/migrate.

Thanks,
Robin



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
On Fri, 8 Feb 2008, Andrew Morton wrote:

> What about ib_umem_get()?

Ok. It pins using an elevated refcount. Same as XPmem right now. With that 
we effectively pin a page (page migration will fail) but we will 
continually be reclaiming the page and may repeatedly try to move it. We 
have issues with XPmem causing too many pages to be pinned and thus the 
OOM getting into weird behavior modes (OOM or stop lru scanning due to 
all_reclaimable set).

An elevated refcount will also not be noticed by any of the schemes under 
consideration to improve LRU scanning performance.
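
A minimal userspace sketch of why an elevated refcount acts as a pin. It
assumes a simplified rule: a page may only be migrated when every reference
is accounted for by a tracked mapping. The struct toy_page type and the
toy_* helpers are hypothetical names for illustration, not kernel APIs, and
the real kernel checks are more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a page; this is NOT the kernel's struct page. */
struct toy_page {
    int refcount;   /* all references, including those from mappings */
    int mapcount;   /* references that come from rmap-tracked mappings */
};

/* A driver "pins" by taking a reference that rmap knows nothing about. */
static void toy_pin(struct toy_page *p)   { p->refcount++; }
static void toy_unpin(struct toy_page *p) { p->refcount--; }

/*
 * Assumed rule: migration can proceed only when the refcount equals one
 * base reference plus one per tracked mapping. Any extra, untracked
 * reference makes the attempt fail, however often it is retried.
 */
static bool toy_can_migrate(const struct toy_page *p)
{
    return p->refcount == 1 + p->mapcount;
}
```

With one tracked mapping (refcount 2, mapcount 1) toy_can_migrate() returns
true; after toy_pin() it returns false until toy_unpin() is called, which
mirrors the repeated, failing attempts to move a pinned page described above.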

 



Re: [kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Andrew Morton
On Fri, 08 Feb 2008 14:06:16 -0800
Christoph Lameter <[EMAIL PROTECTED]> wrote:

> This is a patchset implementing MMU notifier callbacks based on Andrea's
> earlier work. These are needed if Linux pages are referenced by something
> other than what the kernel's rmaps track (an external MMU). MMU
> notifiers allow us to get rid of the page pinning for RDMA and various
> other purposes. They also get rid of the broken use of mlock for page
> pinning (mlock really does *not* pin pages).
> 
> More information on the rationale and the technical details can be found in
> the first patch and the README provided by that patch in
> Documentation/mmu_notifiers.
> 
> The known immediate users are
> 
> KVM
> - Establishes a refcount to the page via get_user_pages().
> - External references are called spte.
> - Has page tables to track pages whose refcount was elevated but
>   no reverse maps.
> 
> GRU
> - Simple additional hardware TLB (possibly covering multiple instances of
>   Linux)
> - Needs TLB shootdown when the VM unmaps pages.
> - Determines page address via follow_page (from interrupt context) but can
>   fall back to get_user_pages().
> - No page reference possible since no page status is kept.
> 
> XPmem
> - Allows use of a process's memory by remote instances of Linux.
> - Provides its own reverse mappings to track remote pte.
> - Establishes refcounts on the exported pages.
> - Must sleep in order to wait for remote acks of ptes that are being
>   cleared.
> 

What about ib_umem_get()?



[kvm-devel] [patch 0/6] MMU Notifiers V6

2008-02-08 Thread Christoph Lameter
This is a patchset implementing MMU notifier callbacks based on Andrea's
earlier work. These are needed if Linux pages are referenced by something
other than what the kernel's rmaps track (an external MMU). MMU
notifiers allow us to get rid of the page pinning for RDMA and various
other purposes. They also get rid of the broken use of mlock for page
pinning (mlock really does *not* pin pages).

More information on the rationale and the technical details can be found in
the first patch and the README provided by that patch in
Documentation/mmu_notifiers.

The known immediate users are

KVM
- Establishes a refcount to the page via get_user_pages().
- External references are called spte.
- Has page tables to track pages whose refcount was elevated but
  no reverse maps.

GRU
- Simple additional hardware TLB (possibly covering multiple instances of
  Linux)
- Needs TLB shootdown when the VM unmaps pages.
- Determines page address via follow_page (from interrupt context) but can
  fall back to get_user_pages().
- No page reference possible since no page status is kept.

XPmem
- Allows use of a process's memory by remote instances of Linux.
- Provides its own reverse mappings to track remote pte.
- Establishes refcounts on the exported pages.
- Must sleep in order to wait for remote acks of ptes that are being
  cleared.
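
The users listed above share one pattern: the VM tells each registered
external MMU when a range is unmapped, and the external MMU drops its own
translations for that range. The following is a userspace toy sketch of
that pattern only. Every type and function name in it (toy_mm,
toy_notifier, toy_ext_tlb, toy_register, toy_unmap_range,
toy_tlb_shootdown) is hypothetical, and the callback signature is an
assumption rather than the one defined by the patches.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_TLB_SLOTS 4

/* Hypothetical external TLB: remembers which virtual pages it translated. */
struct toy_ext_tlb {
    unsigned long vpn[TOY_TLB_SLOTS];
    int valid[TOY_TLB_SLOTS];
};

/* Hypothetical notifier: one callback plus driver-private data. */
struct toy_notifier {
    void (*invalidate_range)(struct toy_notifier *n,
                             unsigned long start, unsigned long end);
    void *priv;
    struct toy_notifier *next;
};

/* Stand-in for an address space carrying a list of notifiers. */
struct toy_mm {
    struct toy_notifier *notifiers;
};

static void toy_register(struct toy_mm *mm, struct toy_notifier *n)
{
    n->next = mm->notifiers;
    mm->notifiers = n;
}

/* Called by the toy VM when it unmaps the pages in [start, end). */
static void toy_unmap_range(struct toy_mm *mm,
                            unsigned long start, unsigned long end)
{
    struct toy_notifier *n;

    for (n = mm->notifiers; n != NULL; n = n->next)
        n->invalidate_range(n, start, end);
}

/* Driver side: drop every cached translation that falls in the range. */
static void toy_tlb_shootdown(struct toy_notifier *n,
                              unsigned long start, unsigned long end)
{
    struct toy_ext_tlb *tlb = n->priv;
    int i;

    for (i = 0; i < TOY_TLB_SLOTS; i++)
        if (tlb->valid[i] && tlb->vpn[i] >= start && tlb->vpn[i] < end)
            tlb->valid[i] = 0;
}
```

A driver would fill in a toy_notifier with toy_tlb_shootdown as the
callback, register it once, and from then on every toy_unmap_range() call
clears the matching slots. The sketch is single-threaded and ignores the
locking and sleeping questions discussed in the changelog below.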

Andrea's mmu_notifier #4 -> RFC V1

- Merge subsystem rmap based with Linux rmap based approach
- Move Linux rmap based notifiers out of macro
- Try to account for what locks are held while the notifiers are
  called.
- Develop a patch sequence that separates out the different types of
  hooks so that we can review their use.
- Avoid adding include to linux/mm_types.h
- Integrate RCU logic suggested by Peter.

V1->V2:
- Improve RCU support
- Use mmap_sem for mmu_notifier register / unregister
- Drop invalidate_page from COW, mm/fremap.c and mm/rmap.c since we
  already have invalidate_range() callbacks there.
- Clean compile for !MMU_NOTIFIER
- Isolate filemap_xip strangeness into its own diff
- Pass a flag to invalidate_range to indicate whether a spinlock
  is held.
- Add invalidate_all()

V2->V3:
- Further RCU fixes
- Fixes from Andrea to fixup aging and move invalidate_range() in do_wp_page
  and sys_remap_file_pages() after the pte clearing.

V3->V4:
- Drop locking and synchronize_rcu() on ->release since we know on release that
  we are the only executing thread. This is also true for invalidate_all() so
  we could drop off the mmu_notifier there early. Use hlist_del_init instead
  of hlist_del_rcu.
- Do the invalidation as begin/end pairs with the requirement that the driver
  holds off new references in between.
- Fixup filemap_xip.c
- Figure out a potential way in which XPmem can deal with locks that are held.
- Robin's patches to make the mmu_notifier logic manage the
  PageRmapExported bit.
- Strip cc list down a bit.
- Drop Peter's new RCU list macro
- Add description to the core patch
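
The begin/end pairing introduced in V4 can be illustrated with a small
userspace sketch: the driver drops its external references at begin and
refuses to establish new ones until the matching end. The struct
toy_driver type and the toy_* function names are hypothetical, and the
sketch is single-threaded, so it leaves out the locking a real driver
would need.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical driver state; names are illustrative only. */
struct toy_driver {
    int invalidations_in_progress; /* begin calls not yet matched by end */
    int external_refs;             /* references held by the external MMU */
};

/* begin: drop existing external references and hold off new ones. */
static void toy_invalidate_begin(struct toy_driver *d)
{
    d->invalidations_in_progress++;
    d->external_refs = 0;
}

/* end: the VM has finished changing its page tables for this range. */
static void toy_invalidate_end(struct toy_driver *d)
{
    d->invalidations_in_progress--;
}

/* External MMU fault path: refuse while any invalidation is still open. */
static bool toy_try_establish_ref(struct toy_driver *d)
{
    if (d->invalidations_in_progress > 0)
        return false;   /* the caller would retry after the matching end */
    d->external_refs++;
    return true;
}
```

Using a counter rather than a boolean lets overlapping begin/end pairs
nest: new references stay blocked until the last open invalidation ends.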

V4->V5:
- Provide missing callouts for mremap.
- Provide missing callouts for copy_page_range.
- Reduce mm_struct space to zero if !MMU_NOTIFIER by #ifdeffing out
  structure contents.
- Get rid of the invalidate_all() callback by moving ->release in place
  of invalidate_all.
- Require holding mmap_sem on register/unregister instead of acquiring it
  ourselves. In some contexts where we want to register/unregister we are
  already holding mmap_sem.
- Split out the rmap support patch so that there is no need to apply
  all patches for KVM and GRU.

V5->V6:
- Provide missing range callouts for mprotect
- Fix do_wp_page control path sequencing
- Clarify locking conventions
- GRU and XPmem confirmed to work with this patchset.
- Provide skeleton code for GRU/KVM type callback and for XPmem type.
- Rework documentation and put it into Documentation/mmu_notifier.

-- 
