Bugge;
Donald Kerr; OpenFabrics General; Supalov, Alexander
Subject: Re: [ofa-general] Memory registration redux
Here's the test program:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/types.h>
#include <linux/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
I assume there will be some ioctl to allow a program to discover at runtime
the version(s) of the device that are supported on a particular system ?
Yeah, I guess. I haven't really thought through the forwards compat
completely I guess.
___
, Alexander
Sent: Wednesday, June 03, 2009 12:26 PM
To: 'Roland Dreier'
Cc: Jeff Squyres; Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky;
Håkon Bugge; Donald Kerr; OpenFabrics General
Subject: RE: [ofa-general] Memory registration redux
Thanks. This is what I was looking for. Let me
Supalov, Alexander wrote:
Hi,
Intel MPI developers are in principle OK with this proposal. What way of delivery is envisioned? Will this become a part of OFED or of the mainstream kernel?
Roland is planning to push it to kernel 2.6.31
And OFED will take it from the kernel.
We will check if we
Are there any comparable Windows plans?
I believe that Windows already provides an equivalent functionality as part of
the OS (Windows 2008 / Vista). See SecureMemoryCacheCallback. There are no
plans for WinOF to provide anything separately from this. (It's likely
impossible without OS support
; Dontje; Lenny Verkhovsky;
Håkon Bugge; Donald Kerr; OpenFabrics General
Subject: Re: [ofa-general] Memory registration redux
Sorry, it's kind of difficult to deduce looking at this QA sequence
what works how and when. Is it possible to create a brief and direct
description of the proposed
: Wednesday, May 27, 2009 9:03 PM
To: Roland Dreier (rdreier)
Cc: Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky; Håkon Bugge;
Donald Kerr; OpenFabrics General; Supalov, Alexander
Subject: Re: [ofa-general] Memory registration redux
Other MPI implementors -- what do you think of this scheme
Sorry, it's kind of difficult to deduce looking at this QA sequence
what works how and when. Is it possible to create a brief and direct
description of the proposed solution?
Did you see the original patch description I sent:
As discussed in
The scheme looks fine to me !
Hans W. Ry
Jeff Squyres skrev:
Other MPI implementors -- what do you think of this scheme?
On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote:
/*
* If type field is INVAL, then user_cookie_counter holds the
* user_cookie for the region
Sounds good to me,
Jeff Squyres wrote:
Other MPI implementors -- what do you think of this scheme?
On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote:
/*
* If type field is INVAL, then user_cookie_counter holds the
* user_cookie for the region being reported; if the
On May 26, 2009, at 7:13 PM, Roland Dreier (rdreier) wrote:
/*
* If type field is INVAL, then user_cookie_counter holds the
* user_cookie for the region being reported; if the HINT flag is set
* then hint_start/hint_end hold the start and end of the mapping that
* was invalidated. (If HINT
/*
* If type field is INVAL, then user_cookie_counter holds the
* user_cookie for the region being reported; if the HINT flag is set
* then hint_start/hint_end hold the start and end of the mapping that
* was invalidated. (If HINT is not set, then multiple events
*
Fixed version below -- returns EINVAL for an attempt to reuse a user
cookie (since otherwise unregister would get confused).
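The cookie rule above can be illustrated with a small userspace sketch. This is not the ummunot kernel code: the table layout and the names region_table, region_register and region_unregister are hypothetical, chosen only to show why a duplicate cookie has to be rejected if unregister-by-cookie is to stay unambiguous.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_REGIONS 16  /* hypothetical fixed capacity for this sketch */

struct region {
	uint64_t start, end;   /* registered range, inclusive */
	uint64_t user_cookie;  /* caller-chosen identifier */
	int      in_use;
};

struct region_table {
	struct region slot[MAX_REGIONS];
};

/* Returns 0 on success, -EINVAL for a bad range or a cookie that is
 * already registered, -ENOMEM when the table is full. */
int region_register(struct region_table *t, uint64_t start, uint64_t end,
		    uint64_t cookie)
{
	struct region *free_slot = NULL;

	assert(t != NULL);
	if (end < start)
		return -EINVAL;

	for (size_t i = 0; i < MAX_REGIONS; i++) {
		struct region *r = &t->slot[i];

		if (r->in_use && r->user_cookie == cookie)
			return -EINVAL;  /* cookie reuse is refused */
		if (!r->in_use && free_slot == NULL)
			free_slot = r;
	}
	if (free_slot == NULL)
		return -ENOMEM;

	free_slot->start = start;
	free_slot->end = end;
	free_slot->user_cookie = cookie;
	free_slot->in_use = 1;
	return 0;
}

/* Unregister by cookie. Because cookies are unique among live regions,
 * at most one slot can match. Returns 0 or -ENOENT. */
int region_unregister(struct region_table *t, uint64_t cookie)
{
	assert(t != NULL);
	for (size_t i = 0; i < MAX_REGIONS; i++) {
		struct region *r = &t->slot[i];

		if (r->in_use && r->user_cookie == cookie) {
			r->in_use = 0;
			return 0;
		}
	}
	return -ENOENT;
}
```

With a zero-initialized table, registering 0x1000-0x3fff under cookie 7 succeeds, and a second registration under cookie 7 fails with -EINVAL until the first one has been unregistered.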
===
ummunot: Userspace support for MMU notifications
As discussed in http://article.gmane.org/gmane.linux.drivers.openib/61925
and follow-up messages, libraries using
Other MPI implementors -- what do you think of this scheme?
On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote:
/*
* If type field is INVAL, then user_cookie_counter holds the
* user_cookie for the region being reported; if the HINT flag is set
* then
Sigh... real version that returns EINVAL for an attempt to reuse a user
cookie (since otherwise unregister would get confused). Previous
posting was the old patch, sorry.
===
ummunot: Userspace support for MMU notifications
As discussed in
Or, ignore the overlapping problem, and use your original technique,
slightly modified:
- Userspace registers a counter with the kernel. Kernel pins the
page, sets up mmu notifiers and increments the counter when
invalidates intersect with registrations
-
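A rough userspace simulation of the counter idea in the first bullet, assuming the counter is a 64-bit value the application can read directly. Nothing here talks to a real kernel: sim_watch and sim_invalidate stand in for the kernel side, cache_is_stale for the check the library would make before trusting its registration cache, and all of these names are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_WATCHED 8  /* hypothetical capacity for this sketch */

struct watched_range {
	uint64_t start, end;  /* inclusive */
};

/* Stand-in for the kernel side: the ranges it watches and the
 * userspace counter it was handed at registration time. */
struct notifier_sim {
	struct watched_range range[MAX_WATCHED];
	size_t nranges;
	volatile uint64_t *counter;
};

/* Add a range to watch. Returns 0, or -1 when the table is full. */
int sim_watch(struct notifier_sim *n, uint64_t start, uint64_t end)
{
	assert(n != NULL && start <= end);
	if (n->nranges == MAX_WATCHED)
		return -1;
	n->range[n->nranges].start = start;
	n->range[n->nranges].end = end;
	n->nranges++;
	return 0;
}

/* Simulated invalidate: bump the counter once if [start, end]
 * intersects any watched range, otherwise do nothing. */
void sim_invalidate(struct notifier_sim *n, uint64_t start, uint64_t end)
{
	assert(n != NULL && n->counter != NULL);
	for (size_t i = 0; i < n->nranges; i++) {
		if (start <= n->range[i].end && n->range[i].start <= end) {
			(*n->counter)++;
			return;
		}
	}
}

/* Userspace fast path: compare the counter with the value seen at the
 * last check. Returns 1 (and records the new value) if it moved. */
int cache_is_stale(const volatile uint64_t *counter, uint64_t *last_seen)
{
	uint64_t now = *counter;

	if (now == *last_seen)
		return 0;
	*last_seen = now;
	return 1;
}
```

The point of the design is that the common case, where nothing was invalidated, costs a single memory read and no system call.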
Here's the test program:
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/types.h>
#include <linux/ioctl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#define UMMUNOT_INTF_VERSION 1
enum {
UMMUNOT_EVENT_TYPE_INVAL = 0,
On Tue, May 26, 2009 at 04:13:08PM -0700, Roland Dreier wrote:
Or, ignore the overlapping problem, and use your original technique,
slightly modified:
- Userspace registers a counter with the kernel. Kernel pins the
page, sets up mmu notifiers and increments the
On May 18, 2009, at 5:15 PM, Roland Dreier (rdreier) wrote:
So you want the registration cache to be reference counted per-page?
Seems like potentially a lot of overhead -- if someone registers a
million pages, then to check for a cache hit, you have to potentially
check millions of reference
On May 7, 2009, at 5:58 PM, Roland Dreier (rdreier) wrote:
Specifically: the actual dereg of 0x1000-0x3fff is blocked on also
releasing 0x2000-0x2fff.
If everyone is doing this, how do you handle the case that Jason
pointed out, namely:
* you register 0x1000 ... 0x3fff
* you want to
On Mon, May 18, 2009 at 9:24 AM, Jeff Squyres jsquy...@cisco.com wrote:
On May 7, 2009, at 5:58 PM, Roland Dreier (rdreier) wrote:
Specifically: the actual dereg of 0x1000-0x3fff is blocked on also
releasing 0x2000-0x2fff.
If everyone is doing this, how do you handle the case that Jason
On May 18, 2009, at 2:02 PM, Caitlin Bestler wrote:
Specifically: the actual dereg of 0x1000-0x3fff is blocked on also
releasing 0x2000-0x2fff.
If everyone is doing this, how do you handle the case that Jason
pointed out, namely:
* you register 0x1000 ... 0x3fff
* you want to
When our memory hooks tell us that memory is about to be removed from
the process, we unregister all pages in the relevant region and remove
those entries from the cache. So the next time you look in the cache
for 0x3000-0x3fff, it won't be there -- it'll be treated as
cache-cold.
So
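The eviction step described in this message might look roughly like the following. This is a sketch under assumptions, not Open MPI's code: it assumes 4 KiB pages, keeps the cache as a flat array of page numbers, and leaves out the actual deregistration call that would accompany each dropped entry. The reg_cache_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EVICT_PAGE_SHIFT 12  /* assumes 4 KiB pages */
#define EVICT_MAX_PAGES  64  /* hypothetical capacity for this sketch */

struct reg_cache {
	uint64_t pfn[EVICT_MAX_PAGES];  /* page numbers currently cached */
	size_t n;
};

/* Returns 1 if the page containing addr is cached, else 0. */
int reg_cache_has(const struct reg_cache *c, uint64_t addr)
{
	assert(c != NULL);
	for (size_t i = 0; i < c->n; i++)
		if (c->pfn[i] == addr >> EVICT_PAGE_SHIFT)
			return 1;
	return 0;
}

/* Cache the page containing addr. Returns 0, or -1 if the cache is full. */
int reg_cache_add(struct reg_cache *c, uint64_t addr)
{
	if (reg_cache_has(c, addr))
		return 0;
	if (c->n == EVICT_MAX_PAGES)
		return -1;
	c->pfn[c->n++] = addr >> EVICT_PAGE_SHIFT;
	return 0;
}

/* Called from the memory hook: drop every cached page that falls in
 * [start, end]. Returns the number of entries dropped. A real cache
 * would also unregister each dropped page here. */
size_t reg_cache_evict(struct reg_cache *c, uint64_t start, uint64_t end)
{
	uint64_t first = start >> EVICT_PAGE_SHIFT;
	uint64_t last = end >> EVICT_PAGE_SHIFT;
	size_t kept = 0, dropped = 0;

	assert(c != NULL && start <= end);
	for (size_t i = 0; i < c->n; i++) {
		if (c->pfn[i] >= first && c->pfn[i] <= last)
			dropped++;
		else
			c->pfn[kept++] = c->pfn[i];  /* compact in place */
	}
	c->n = kept;
	return dropped;
}
```

After an eviction of 0x3000-0x3fff, a lookup for that page misses and the range is handled like any other cache-cold address, while the neighbouring pages stay cached.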
On Tue, May 05, 2009 at 04:57:09PM -0400, Jeff Squyres wrote:
Roland and I chatted on the phone today; I think I now understand
Roland's counter-proposal (I clearly didn't before). Let me try to
summarize:
1. Add a new verb for "set this userspace flag to 1 if mr X ever becomes
invalid"
On Thu, May 7, 2009 at 3:48 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
Right, I was only thinking of a new driver call that was along the
lines of update_mr_pages() that just updates the HCA's mapping with
new page table entries atomically. It really would be device
specific.
On Mon, May 11, 2009 at 02:23:58PM -0700, Caitlin Bestler wrote:
On Thu, May 7, 2009 at 3:48 PM, Jason Gunthorpe
jguntho...@obsidianresearch.com wrote:
Right, I was only thinking of a new driver call that was along the
lines of update_mr_pages() that just updates the HCA's mapping with
On May 6, 2009, at 4:10 PM, Roland Dreier (rdreier) wrote:
By the way, what's the desired behavior of the cache if a process
registers, say, address range 0x1000 ... 0x3fff, and then the same
process registers address range 0x2000 ... 0x2fff (with all the same
permissions, etc)?
The initial
Ry; Terry Dontje; Lenny
Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General;
Alexander Supalov
Subject: Re: [ofa-general] Memory registration redux
On May 6, 2009, at 4:10 PM, Roland Dreier (rdreier) wrote:
By the way, what's the desired behavior of the cache if a process
] On Behalf Of
Jeff Squyres
Sent: Thursday, May 07, 2009 8:54 AM
To: Roland Dreier
Cc: Pavel Shamis; Hans Westgaard Ry; Terry Dontje; Lenny
Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General;
Alexander Supalov
Subject: Re: [ofa-general] Memory registration redux
On May 6, 2009, at 4
No... every HCA just needs to support register and unregister. It
doesn't have to support changing the mapping without full unregister and
reregister.
Well, I would imagine this entire process to be a HCA specific
operation, so HW that supports a better method can use it,
I don't know what the other MPI's do in this scenario, but here's what
OMPI will do:
1. lookup 0x1000-0x3fff in the cache; not find any of it, and
therefore register
- add each page to our cache with a refcount of 1
2. lookup 0x2000-0x2fff in the cache, find that all the pages
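The per-page reference counting in the steps above can be sketched as follows. This is an illustration only, not Open MPI's implementation: it assumes 4 KiB pages, uses a linear array instead of a real lookup structure, and omits the verbs registration that would happen on a miss. The page_cache names are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12  /* assumes 4 KiB pages */
#define SKETCH_MAX_PAGES  64  /* hypothetical capacity for this sketch */

struct page_entry {
	uint64_t pfn;       /* page number (address >> SKETCH_PAGE_SHIFT) */
	unsigned refcount;  /* active registrations covering this page */
};

struct page_cache {
	struct page_entry e[SKETCH_MAX_PAGES];
	size_t n;
};

static struct page_entry *page_find(struct page_cache *c, uint64_t pfn)
{
	for (size_t i = 0; i < c->n; i++)
		if (c->e[i].pfn == pfn)
			return &c->e[i];
	return NULL;
}

/* Take a reference on every page in [start, end]. Pages already cached
 * get their refcount bumped; missing pages are added with refcount 1
 * (a real cache would register them here). Returns the number of
 * misses, or -1 if the cache filled up part-way through. */
int page_cache_acquire(struct page_cache *c, uint64_t start, uint64_t end)
{
	int misses = 0;

	assert(c != NULL && start <= end);
	for (uint64_t pfn = start >> SKETCH_PAGE_SHIFT;
	     pfn <= end >> SKETCH_PAGE_SHIFT; pfn++) {
		struct page_entry *p = page_find(c, pfn);

		if (p != NULL) {
			p->refcount++;
			continue;
		}
		if (c->n == SKETCH_MAX_PAGES)
			return -1;
		c->e[c->n].pfn = pfn;
		c->e[c->n].refcount = 1;
		c->n++;
		misses++;
	}
	return misses;
}

/* Refcount of the page containing addr, or 0 if it is not cached. */
unsigned page_cache_refcount(struct page_cache *c, uint64_t addr)
{
	struct page_entry *p = page_find(c, addr >> SKETCH_PAGE_SHIFT);

	return p != NULL ? p->refcount : 0;
}
```

Acquiring 0x1000-0x3fff on an empty cache reports three misses; a following acquire of 0x2000-0x2fff reports none and leaves that page with a refcount of 2. The linear scan also makes the overhead concern raised elsewhere in the thread easy to see: a hit check touches one entry per page in the range.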
On Thu, May 07, 2009 at 02:46:55PM -0700, Roland Dreier wrote:
Using register/unregister exposes a race for the original case you
brought up - but that race is completely unfixable without hardware
support. At least it now becomes a hw specific race that can be
printk'd and someday
Jeff Squyres wrote:
Roland and I chatted on the phone today; I think I now understand
Roland's counter-proposal (I clearly didn't before). Let me try to
summarize:
1. Add a new verb for "set this userspace flag to 1 if mr X ever
becomes invalid"
2. Add a new verb for "no longer tell me if mr X
On May 6, 2009, at 10:09 AM, Tziporet Koren wrote:
I think the new proposal is good (but I am not an MPI expert)
If we implement it soon we will be able to enable it in OFED 1.5 too
That sounds good, as long as we don't diverge from upstream (like what
happened with XRC).
I think the cache
Roland and I chatted on the phone today; I think I now understand
Roland's counter-proposal (I clearly didn't before). Let me try to
summarize:
1. Add a new verb for "set this userspace flag to 1 if mr X ever
becomes invalid"
2. Add a new verb for "no longer tell me if mr X ever
By the way, what's the desired behavior of the cache if a process
registers, say, address range 0x1000 ... 0x3fff, and then the same
process registers address range 0x2000 ... 0x2fff (with all the same
permissions, etc)?
The initial registration creates an MR that is still valid for the
smaller
On Wed, May 06, 2009 at 01:10:47PM -0700, Roland Dreier wrote:
By the way, what's the desired behavior of the cache if a process
registers, say, address range 0x1000 ... 0x3fff, and then the same
process registers address range 0x2000 ... 0x2fff (with all the same
permissions, etc)?
The
Yuk, doesn't this problem pretty much doom this method entirely? You
can't tear down the entire registration of 0x1000 ... 0x3fff if the app
does something to change 0x2000 .. 0x2fff because it may have active
RDMAs going on in 0x1000 ... 0x1fff.
Yes, I guess if we try to reuse
On Wed, May 06, 2009 at 02:56:25PM -0700, Roland Dreier wrote:
Yuk, doesn't this problem pretty much doom this method entirely? You
can't tear down the entire registration of 0x1000 ... 0x3fff if the app
does something to change 0x2000 .. 0x2fff because it may have active
RDMAs going
Well, this conceptually doesn't seem hard. Go through all the pages in
the MR, if any have changed then pin the new page and replace the
page's physical address in the HCA's page table. Once done, synchronize
with the hardware, then run through again and un-pin and release all
the
On Wed, May 06, 2009 at 03:39:54PM -0700, Roland Dreier wrote:
Well, this conceptually doesn't seem hard. Go through all the pages in
the MR, if any have changed then pin the new page and replace the
page's physical address in the HCA's page table. Once done, synchronize
with the