RE: [ofa-general] Memory registration redux

2009-06-16 Thread Woodruff, Robert J
Bugge; Donald Kerr; OpenFabrics General; Supalov, Alexander Subject: Re: [ofa-general] Memory registration redux Here's the test program: #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <linux/types.h> #include <linux/ioctl.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/types.h>

Re: [ofa-general] Memory registration redux

2009-06-16 Thread Roland Dreier
I assume there will be some ioctl to allow a program to discover at runtime the version(s) of the device that are supported on a particular system? Yeah, I guess. I haven't really thought through the forwards compat completely I guess.

RE: [ofa-general] Memory registration redux

2009-06-08 Thread Supalov, Alexander
, Alexander Sent: Wednesday, June 03, 2009 12:26 PM To: 'Roland Dreier' Cc: Jeff Squyres; Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General Subject: RE: [ofa-general] Memory registration redux Thanks. This is what I was looking for. Let me

Re: [ofa-general] Memory registration redux

2009-06-08 Thread Tziporet Koren
Supalov, Alexander wrote: Hi, Intel MPI developers are in principle OK with this proposal. What way of delivery is envisioned? Will this become a part of OFED or of the mainstream kernel? Roland is planning to push it to kernel 2.6.31, and OFED will take it from the kernel. We will check if we

RE: [ofa-general] Memory registration redux

2009-06-08 Thread Sean Hefty
Are there any comparable Windows plans? I believe that Windows already provides equivalent functionality as part of the OS (Windows 2008 / Vista). See SecureMemoryCacheCallback. There are no plans for WinOF to provide anything separately from this. (It's likely impossible without OS support

RE: [ofa-general] Memory registration redux

2009-06-03 Thread Supalov, Alexander
; Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General Subject: Re: [ofa-general] Memory registration redux Sorry, it's kind of difficult to deduce looking at this Q&A sequence what works how and when. Is it possible to create a brief and direct description of the proposed

RE: [ofa-general] Memory registration redux

2009-06-02 Thread Supalov, Alexander
: Wednesday, May 27, 2009 9:03 PM To: Roland Dreier (rdreier) Cc: Pavel Shamis; Hans Westgaard Ry; Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General; Supalov, Alexander Subject: Re: [ofa-general] Memory registration redux Other MPI implementors -- what do you think of this scheme

Re: [ofa-general] Memory registration redux

2009-06-02 Thread Roland Dreier
Sorry, it's kind of difficult to deduce looking at this Q&A sequence what works how and when. Is it possible to create a brief and direct description of the proposed solution? Did you see the original patch description I sent: As discussed in

Re: [ofa-general] Memory registration redux

2009-05-29 Thread Hans Westgaard Ry
The scheme looks fine to me! Hans W. Ry Jeff Squyres wrote: Other MPI implementors -- what do you think of this scheme? On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote: /* * If type field is INVAL, then user_cookie_counter holds the * user_cookie for the region

Re: [ofa-general] Memory registration redux

2009-05-28 Thread Pavel Shamis (Pasha)
Sounds good to me. Jeff Squyres wrote: Other MPI implementors -- what do you think of this scheme? On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote: /* * If type field is INVAL, then user_cookie_counter holds the * user_cookie for the region being reported; if the

Re: [ofa-general] Memory registration redux

2009-05-27 Thread Jeff Squyres
On May 26, 2009, at 7:13 PM, Roland Dreier (rdreier) wrote: /* * If type field is INVAL, then user_cookie_counter holds the * user_cookie for the region being reported; if the HINT flag is set * then hint_start/hint_end hold the start and end of the mapping that * was invalidated. (If HINT

Re: [ofa-general] Memory registration redux

2009-05-27 Thread Roland Dreier
/* * If type field is INVAL, then user_cookie_counter holds the * user_cookie for the region being reported; if the HINT flag is set * then hint_start/hint_end hold the start and end of the mapping that * was invalidated. (If HINT is not set, then multiple events *

Re: [ofa-general] Memory registration redux

2009-05-27 Thread Roland Dreier
Fixed version below -- returns EINVAL for an attempt to reuse a user cookie (since otherwise unregister would get confused). === ummunot: Userspace support for MMU notifications As discussed in http://article.gmane.org/gmane.linux.drivers.openib/61925 and follow-up messages, libraries using

Re: [ofa-general] Memory registration redux

2009-05-27 Thread Jeff Squyres
Other MPI implementors -- what do you think of this scheme? On May 27, 2009, at 1:49 PM, Roland Dreier (rdreier) wrote: /* * If type field is INVAL, then user_cookie_counter holds the * user_cookie for the region being reported; if the HINT flag is set * then

Re: [ofa-general] Memory registration redux

2009-05-27 Thread Roland Dreier
Sigh... real version that returns EINVAL for an attempt to reuse a user cookie (since otherwise unregister would get confused). Previous posting was the old patch, sorry. === ummunot: Userspace support for MMU notifications As discussed in

Re: [ofa-general] Memory registration redux

2009-05-26 Thread Roland Dreier
Or, ignore the overlapping problem, and use your original technique, slightly modified: - Userspace registers a counter with the kernel. Kernel pins the page, sets up mmu notifiers and increments the counter when invalidates intersect with registrations -

Re: [ofa-general] Memory registration redux

2009-05-26 Thread Roland Dreier
Here's the test program: #include <fcntl.h> #include <stdio.h> #include <unistd.h> #include <linux/types.h> #include <linux/ioctl.h> #include <sys/mman.h> #include <sys/stat.h> #include <sys/types.h> #define UMMUNOT_INTF_VERSION 1 enum { UMMUNOT_EVENT_TYPE_INVAL = 0,

Re: [ofa-general] Memory registration redux

2009-05-26 Thread Jason Gunthorpe
On Tue, May 26, 2009 at 04:13:08PM -0700, Roland Dreier wrote: Or, ignore the overlapping problem, and use your original technique, slightly modified: - Userspace registers a counter with the kernel. Kernel pins the page, sets up mmu notifiers and increments the

Re: [ofa-general] Memory registration redux

2009-05-19 Thread Jeff Squyres
On May 18, 2009, at 5:15 PM, Roland Dreier (rdreier) wrote: So you want the registration cache to be reference counted per-page? Seems like potentially a lot of overhead -- if someone registers a million pages, then to check for a cache hit, you have to potentially check millions of reference

Re: [ofa-general] Memory registration redux

2009-05-18 Thread Jeff Squyres
On May 7, 2009, at 5:58 PM, Roland Dreier (rdreier) wrote: Specifically: the actual dereg of 0x1000-0x3fff is blocked on also releasing 0x2000-0x2fff. If everyone is doing this, how do you handle the case that Jason pointed out, namely: * you register 0x1000 ... 0x3fff * you want to

Re: [ofa-general] Memory registration redux

2009-05-18 Thread Caitlin Bestler
On Mon, May 18, 2009 at 9:24 AM, Jeff Squyres jsquy...@cisco.com wrote: On May 7, 2009, at 5:58 PM, Roland Dreier (rdreier) wrote: Specifically: the actual dereg of 0x1000-0x3fff is blocked on also releasing 0x2000-0x2fff. If everyone is doing this, how do you handle the case that Jason

Re: [ofa-general] Memory registration redux

2009-05-18 Thread Jeff Squyres
On May 18, 2009, at 2:02 PM, Caitlin Bestler wrote: Specifically: the actual dereg of 0x1000-0x3fff is blocked on also releasing 0x2000-0x2fff. If everyone is doing this, how do you handle the case that Jason pointed out, namely: * you register 0x1000 ... 0x3fff * you want to

Re: [ofa-general] Memory registration redux

2009-05-18 Thread Roland Dreier
When our memory hooks tell us that memory is about to be removed from the process, we unregister all pages in the relevant region and remove those entries from the cache. So the next time you look in the cache for 0x3000-0x3fff, it won't be there -- it'll be treated as cache-cold. So

Re: [ofa-general] Memory registration redux

2009-05-11 Thread Jonathan Perkins
On Tue, May 05, 2009 at 04:57:09PM -0400, Jeff Squyres wrote: Roland and I chatted on the phone today; I think I now understand Roland's counter-proposal (I clearly didn't before). Let me try to summarize: 1. Add a new verb for "set this userspace flag to 1 if mr X ever becomes invalid"

Re: [ofa-general] Memory registration redux

2009-05-11 Thread Caitlin Bestler
On Thu, May 7, 2009 at 3:48 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Right, I was only thinking of a new driver call that was along the lines of update_mr_pages() that just updates the HCA's mapping with new page table entries atomically. It really would be device specific.

Re: [ofa-general] Memory registration redux

2009-05-11 Thread Jason Gunthorpe
On Mon, May 11, 2009 at 02:23:58PM -0700, Caitlin Bestler wrote: On Thu, May 7, 2009 at 3:48 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: Right, I was only thinking of a new driver call that was along the lines of update_mr_pages() that just updates the HCA's mapping with

Re: [ofa-general] Memory registration redux

2009-05-07 Thread Jeff Squyres
On May 6, 2009, at 4:10 PM, Roland Dreier (rdreier) wrote: By the way, what's the desired behavior of the cache if a process registers, say, address range 0x1000 ... 0x3fff, and then the same process registers address range 0x2000 ... 0x2fff (with all the same permissions, etc)? The initial

RE: [ofa-general] Memory registration redux

2009-05-07 Thread Tang, Changqing
Ry; Terry Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General; Alexander Supalov Subject: Re: [ofa-general] Memory registration redux On May 6, 2009, at 4:10 PM, Roland Dreier (rdreier) wrote: By the way, what's the desired behavior of the cache if a process

RE: [ofa-general] Memory registration redux

2009-05-07 Thread Matthew Koop
] On Behalf Of Jeff Squyres Sent: Thursday, May 07, 2009 8:54 AM To: Roland Dreier Cc: Pavel Shamis; Hans Westgaard Ry; Terry Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; OpenFabrics General; Alexander Supalov Subject: Re: [ofa-general] Memory registration redux On May 6, 2009, at 4

Re: [ofa-general] Memory registration redux

2009-05-07 Thread Roland Dreier
No... every HCA just needs to support register and unregister. It doesn't have to support changing the mapping without full unregister and reregister. Well, I would imagine this entire process to be an HCA-specific operation, so HW that supports a better method can use it,

Re: [ofa-general] Memory registration redux

2009-05-07 Thread Roland Dreier
I don't know what the other MPIs do in this scenario, but here's what OMPI will do: 1. lookup 0x1000-0x3fff in the cache; not find any of it, and therefore register - add each page to our cache with a refcount of 1 2. lookup 0x2000-0x2fff in the cache, find that all the pages

Re: [ofa-general] Memory registration redux

2009-05-07 Thread Jason Gunthorpe
On Thu, May 07, 2009 at 02:46:55PM -0700, Roland Dreier wrote: Using register/unregister exposes a race for the original case you brought up - but that race is completely unfixable without hardware support. At least it now becomes a hw specific race that can be printk'd and someday

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Tziporet Koren
Jeff Squyres wrote: Roland and I chatted on the phone today; I think I now understand Roland's counter-proposal (I clearly didn't before). Let me try to summarize: 1. Add a new verb for "set this userspace flag to 1 if mr X ever becomes invalid" 2. Add a new verb for "no longer tell me if mr X

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Jeff Squyres
On May 6, 2009, at 10:09 AM, Tziporet Koren wrote: I think the new proposal is good (but I am not an MPI expert). If we implement it soon, we will be able to enable it in OFED 1.5 too. That sounds good, as long as we don't diverge from upstream (like what happened with XRC). I think the cache

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Roland Dreier
Roland and I chatted on the phone today; I think I now understand Roland's counter-proposal (I clearly didn't before). Let me try to summarize: 1. Add a new verb for "set this userspace flag to 1 if mr X ever becomes invalid" 2. Add a new verb for "no longer tell me if mr X ever

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Roland Dreier
By the way, what's the desired behavior of the cache if a process registers, say, address range 0x1000 ... 0x3fff, and then the same process registers address range 0x2000 ... 0x2fff (with all the same permissions, etc)? The initial registration creates an MR that is still valid for the smaller

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Jason Gunthorpe
On Wed, May 06, 2009 at 01:10:47PM -0700, Roland Dreier wrote: By the way, what's the desired behavior of the cache if a process registers, say, address range 0x1000 ... 0x3fff, and then the same process registers address range 0x2000 ... 0x2fff (with all the same permissions, etc)? The

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Roland Dreier
Yuk, doesn't this problem pretty much doom this method entirely? You can't tear down the entire registration of 0x1000 ... 0x3fff if the app does something to change 0x2000 ... 0x2fff because it may have active RDMAs going on in 0x1000 ... 0x1fff. Yes, I guess if we try to reuse

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Jason Gunthorpe
On Wed, May 06, 2009 at 02:56:25PM -0700, Roland Dreier wrote: Yuk, doesn't this problem pretty much doom this method entirely? You can't tear down the entire registration of 0x1000 ... 0x3fff if the app does something to change 0x2000 ... 0x2fff because it may have active RDMAs going

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Roland Dreier
Well, this conceptually doesn't seem hard. Go through all the pages in the MR; if any have changed, then pin the new page and replace the page's physical address in the HCA's page table. Once done, synchronize with the hardware, then run through again and un-pin and release all the

Re: [ofa-general] Memory registration redux

2009-05-06 Thread Jason Gunthorpe
On Wed, May 06, 2009 at 03:39:54PM -0700, Roland Dreier wrote: Well, this conceptually doesn't seem hard. Go through all the pages in the MR; if any have changed, then pin the new page and replace the page's physical address in the HCA's page table. Once done, synchronize with the