Hi,

Memory registration caching has a direct bearing on application performance. If the cache is put into the kernel, can we guarantee that the algorithms used will be good for all applications? How will one control their parameters at runtime? Will one be able to change the algorithm if necessary?
Best regards.

Alexander

-----Original Message-----
From: Jeff Squyres [mailto:[email protected]]
Sent: Thursday, April 30, 2009 4:39 PM
To: Barrett, Brian W
Cc: Roland Dreier (rdreier); OpenFabrics General; Pavel Shamis; Hans Westgaard Ry; Terry Dontje; Lenny Verkhovsky; Håkon Bugge; Donald Kerr; Supalov, Alexander
Subject: Re: [ofa-general] New proposal for memory management

On Apr 29, 2009, at 4:45 PM, Barrett, Brian W wrote:

> If you think this sounds like a hassle, think about what it looks like from
> the point of view of the MPI implementer (or any other developer writing
> libraries which sit between user data and OFED, like GASNet).

If you don't care about what pain MPI implementors have to go through (and you probably don't ;-) ) -- consider that this is a major roadblock to most *anyone* who wants to write to user verbs.

<banging the same old drum>
I heard lots of variations of "Why isn't OFED more popular?" in Sonoma this year. This is at least one big reason why: no (normal/non-superhuman) programmers can write verbs code (IMHO).

MPIs *have* to support OpenFabrics -- HPC customers demand it. But non-HPC customers have a clear alternative: they'll just write sockets code. And the price/performance for using sockets over IB/iWARP may or may not be attractive depending on the customer's buying capacity. Hence -- they just buy gigE (10gigE, when the price drops low enough).

Doesn't OpenFabrics want to grow beyond MPI? Woody said that verbs is designed to support a billion different things -- outside of MPI and a few storage protocols (none of which are widely adopted), how much is OFED used?
</banging the same old drum>

> Jeff and I talked for a while today, and we're pretty sure that as long as
> the byte set by the kernel notifier is written before the pages are returned
> into the unallocated list, there isn't actually a race condition.
> [snip]
>
> However, there's still then the problem with the notifier concept of how the
> kernel passes which pages were given back to the kernel. It has to pass a
> (potentially very large) amount of data back to the user, so the memory
> ownership issues with kernel/user space are interesting. It also has to
> somewhat atomically prepare the list and unset the notifier byte, which is
> also problematic. But probably workable.

I feel compelled to amend this: this notifier concept *may be workable*, but it's still quite complex for the reasons Brian cited.

The goal here is to *reduce* complexity, especially for applications/ULPs using the verbs stack. If we put the registration cache in the network stack, application/ULP complexity will be reduced significantly. My $0.02 is that using a notifier solution is still fairly complex and introduces a new set of problems.

FWIW: Putting the registration cache in the userspace verbs stack means that verbs will now have to do the horrid malloc/mmap/etc. intercept tricks that MPI implementations currently do. Take it from us -- this is not a business you want to be in. Such intercepts break tools like valgrind and other memory-checking debuggers. Even the best intercept hooks available today can still be subverted. Open MPI (and MX!) has to insert a pre-main hook to set up these intercepts, and then check later to ensure that no one else subverted our hooks. Yuck.

It's memory management. And that belongs in the kernel.

--
Jeff Squyres
Cisco Systems
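
For illustration only, the ordering argument in the quoted notifier discussion can be sketched as a toy Python model. Every name here (ToyKernel, ToyRegistrationCache, notifier_byte, release_pages, drain) is hypothetical and invented for this sketch; none of it is an OFED or kernel interface, and a single-threaded model cannot show the real atomicity problem Brian mentions. It only shows the intended sequence: the kernel side sets the byte before pages are given back, and the user-side cache checks the byte before trusting a cached registration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyKernel:
    """Stand-in for the kernel side; not a real OFED or kernel API."""
    notifier_byte: int = 0                          # hypothetical flag visible to user space
    released: list = field(default_factory=list)    # (start, length) ranges awaiting pickup

    def release_pages(self, start, length):
        # Ordering that matters: record the range and set the byte *before*
        # the pages would go back on the unallocated list.
        self.released.append((start, length))
        self.notifier_byte = 1
        # ... only after this point would a real kernel free the pages ...

    def drain(self):
        # Hand over the released ranges and clear the byte in one step.
        # Making this step atomic is the hard part raised in the thread;
        # this toy model simply assumes it.
        ranges, self.released = self.released, []
        self.notifier_byte = 0
        return ranges


class ToyRegistrationCache:
    """User-space cache of fake registration handles keyed by (start, length)."""

    def __init__(self, kernel):
        self.kernel = kernel
        self.entries = {}
        self.next_handle = 1

    def _invalidate(self):
        # Drop every cached entry that overlaps a released range.
        for r_start, r_len in self.kernel.drain():
            stale = [k for k in self.entries
                     if k[0] < r_start + r_len and r_start < k[0] + k[1]]
            for k in stale:
                del self.entries[k]

    def lookup_or_register(self, start, length):
        # Check the notifier byte before trusting any cached registration.
        if self.kernel.notifier_byte:
            self._invalidate()
        key = (start, length)
        if key not in self.entries:
            self.entries[key] = self.next_handle    # pretend to register the memory
            self.next_handle += 1
        return self.entries[key]
```

In this model, a repeated lookup of the same range returns the cached handle, while a lookup that follows release_pages() on an overlapping range gets a fresh one. The list handed back by drain() is exactly the "potentially very large" data whose ownership the thread worries about.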
