Re: [OMPI devel] RFC: convert send to ssend

2009-08-25 Thread Patrick Geoffray
Ralph Castain wrote: Not quite that simple, Patrick. Think of things like MPI_Sendrecv, where the "send" call is below that of the user's code. You have a point, Ralph. Although, that would be 8 more lines to add to the user MPI code to define an MPI_Sendrecv macro :-) Seriously, this
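A minimal sketch of catching both cases through the standard PMPI profiling layer instead of user macros (const-qualified MPI-3 signatures; the wrapper bodies are illustrative, not Open MPI code):

    #include <mpi.h>

    /* Interpose MPI_Send and MPI_Sendrecv; delegate to the PMPI entry
     * points, with the send forced synchronous. */
    int MPI_Send(const void *buf, int count, MPI_Datatype type,
                 int dest, int tag, MPI_Comm comm)
    {
        return PMPI_Ssend(buf, count, type, dest, tag, comm);
    }

    int MPI_Sendrecv(const void *sbuf, int scount, MPI_Datatype stype,
                     int dest, int stag, void *rbuf, int rcount,
                     MPI_Datatype rtype, int src, int rtag,
                     MPI_Comm comm, MPI_Status *status)
    {
        MPI_Request req;
        /* Post the receive first so the forced-synchronous send cannot
         * deadlock against the matching MPI_Sendrecv on the peer. */
        PMPI_Irecv(rbuf, rcount, rtype, src, rtag, comm, &req);
        PMPI_Ssend(sbuf, scount, stype, dest, stag, comm);
        return PMPI_Wait(&req, status);
    }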

Re: [OMPI devel] RFC: convert send to ssend

2009-08-24 Thread Patrick Geoffray
George Bosilca wrote: I know the approach "because we can". We develop an MPI library, and we should keep it that way. Our main focus should not diverge to provide I would join George in the minority on this one. "Because we can" is a slippery slope, there is value in keeping things simple,

Re: [OMPI devel] Heads up on new feature to 1.3.4

2009-08-17 Thread Patrick Geoffray
Jeff, Jeff Squyres wrote: ignored it whenever presenting competitive data. The 1,000,000th time I saw this, I gave up arguing that our competitors were not being fair and simply changed our defaults to always leave memory pinned for OpenFabrics-based networks. Instead, you should have

Re: [OMPI devel] SM init failures

2009-03-30 Thread Patrick Geoffray
Jeff Squyres wrote: Why not? The "owning" process can do the touch; then it'll be affinity'ed properly. Right? Yes, that's what I meant by forcing allocation. From the thread, it looked like nobody touched the pages of the mapped file. If it's already done, no need to write in the whole

Re: [OMPI devel] SM init failures

2009-03-30 Thread Patrick Geoffray
George Bosilca wrote: performance hit on the startup time. And second, we will have to find a pretty smart way to do this or we will completely break the memory affinity stuff. I didn't look at the code, but I sure hope that the SM init code does touch each page to force allocation,
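If it doesn't, a first-touch pass is only a handful of lines; a hedged sketch, assuming the segment comes from mmap() and the platform does first-touch NUMA placement (the function name is illustrative):

    #include <stddef.h>
    #include <unistd.h>

    /* The owning process writes one byte per page so the kernel
     * allocates each page locally (first-touch policy).  A write, not a
     * read: a read may be satisfied by the shared zero page. */
    static void touch_pages(void *base, size_t len)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        volatile char *p = (volatile char *)base;
        for (size_t off = 0; off < len; off += page)
            p[off] = 0;
    }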

Re: [OMPI devel] 1.3.1 fails with GM

2009-03-20 Thread Patrick Geoffray
Hi Christian, Christian Siebert wrote: I just gave the new release 1.3.1 a go. While Ethernet and InfiniBand seem to work properly, I noticed that Myrinet/GM compiles fine but gives a segmentation violation in the first attempt to communicate (MPI_Send in a simple "hello world" application).

Re: [OMPI devel] RFC: sm Latency

2009-01-21 Thread Patrick Geoffray
Eugene Loh wrote: Possibly, you meant to ask how one does directed polling with a wildcard source MPI_ANY_SOURCE. If that was your question, the answer is we punt. We report failure to the ULP, which reverts to the standard code path. Sorry, I meant ANY_SOURCE. If you poll only the queue
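The shape of that punt, as an entirely hypothetical sketch (none of these names are Eugene's code):

    #include <mpi.h>

    typedef struct recv_req recv_req_t;   /* opaque, illustrative */
    int poll_peer_queue(int src, int tag, recv_req_t *req);
    int standard_recv_path(int src, int tag, recv_req_t *req);

    /* Directed polling only applies when the source is fully specified;
     * a wildcard reverts to the standard path that scans all queues. */
    int try_directed_recv(int src, int tag, recv_req_t *req)
    {
        if (src == MPI_ANY_SOURCE)
            return standard_recv_path(src, tag, req);  /* the punt */
        return poll_peer_queue(src, tag, req);
    }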

Re: [OMPI devel] RFC: sm Latency

2009-01-20 Thread Patrick Geoffray
Eugene, All my remarks are related to the receive side. I think the send side optimizations are fine, but don't take my word for it. Eugene Loh wrote: > To recap: > 1) The work is already done. How do you do "directed polling" with ANY_TAG? How do you ensure you check all incoming queues from

Re: [OMPI devel] RFC: sm Latency

2009-01-20 Thread Patrick Geoffray
Hi Eugene, Eugene Loh wrote: >> replace the FIFOs with a single linked list per process in shared >> memory, with senders to this process adding match envelopes >> atomically, with each process reading its own linked list (multiple > *) Doesn't strike me as a "simple" change. Actually, it's
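For reference, the atomic prepend this implies, as a hedged sketch in C11 atomics (a real shared-memory version would store offsets rather than raw pointers, since the segment can map at different addresses in each process):

    #include <stdatomic.h>

    typedef struct envelope {
        struct envelope *next;
        int src, tag;            /* plus a payload descriptor */
    } envelope_t;

    /* Multiple senders, one reader: each sender atomically prepends its
     * match envelope onto the receiver's list head. */
    void push_envelope(_Atomic(envelope_t *) *head, envelope_t *env)
    {
        envelope_t *old = atomic_load_explicit(head, memory_order_relaxed);
        do {
            env->next = old;
        } while (!atomic_compare_exchange_weak_explicit(
                     head, &old, env,
                     memory_order_release, memory_order_relaxed));
    }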

Re: [OMPI devel] 1.3 PML default choice

2009-01-13 Thread Patrick Geoffray
Jeff Squyres wrote: Gaah! I specifically asked Patrick and George about this and they said that the README text was fine. Grr... When I looked at that time, I vaguely remember that _both_ PMLs were initialized but CM was eventually used because it was the last one. It looked broken, but it

Re: [OMPI devel] shared-memory allocations

2008-12-13 Thread Patrick Geoffray
Richard Graham wrote: Yes - it is polling volatile memory, so has to load from memory on every read. Actually, it will poll in cache, and only load from memory when the cache coherency protocol invalidates the cache line. The volatile qualifier only prevents compiler optimizations. It does not
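To make the distinction concrete, a minimal poll loop (nothing Open MPI specific):

    #include <stdint.h>

    /* 'volatile' only forces the compiler to re-issue the load on every
     * iteration; the load itself is satisfied from the local cache, and
     * memory is touched only when the producer's store invalidates the
     * line via the coherency protocol. */
    static void wait_for_flag(volatile uint32_t *flag)
    {
        while (*flag == 0)
            ;   /* spins in L1 until the line is invalidated */
    }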

Re: [OMPI devel] More README questions

2008-11-15 Thread Patrick Geoffray
Jeff Squyres wrote: - There's a big chunk of text about MX that I have no idea if it's still up-to-date / correct or not. Looks good to me. Patrick

[OMPI devel] mallopt and registration cache

2008-10-31 Thread Patrick Geoffray
Gentlemen, I have been looking at data corruption with the MX btl or mtl with the 1.3 branch when trying to use the MX registration cache. The related ticket is #1525, opened by Tim. In 1.3, mallopt() is used to never trim memory, replacing the ptmalloc2 malloc overload. MX
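For context, the 1.3-style mallopt() knobs amount to something like this sketch (glibc parameters; keeping freed pages from ever returning to the kernel is what protects registration-cache entries):

    #include <malloc.h>

    void disable_memory_release(void)
    {
        mallopt(M_TRIM_THRESHOLD, -1);  /* never trim the sbrk heap */
        mallopt(M_MMAP_MAX, 0);         /* never satisfy malloc via mmap,
                                           so free() never munmap()s */
    }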

Re: [OMPI devel] RFC: make mpi_leave_pinned=1 the default

2008-07-06 Thread Patrick Geoffray
Jeff Squyres wrote: WHAT: make mpi_leave_pinned=1 by default when a BTL is used that would benefit from it (when possible; 0 when not, obviously) Comments? The probable reason registration cache (aka leave_pinned) is disabled by default is that it may be unsafe. Even if you use mallopt

Re: [OMPI devel] Notes from mem hooks call today

2008-05-29 Thread Patrick Geoffray
Hi Roland, Roland Dreier wrote: Stick in a separate library then? I don't think we want the complexity in the kernel -- I personally would argue against merging it upstream; and given that the userspace solution is actually faster, it becomes pretty hard to justify. Memory registration has

Re: [OMPI devel] Memory hooks stuff

2008-05-24 Thread Patrick Geoffray
Hi Jeff, Jeff Squyres wrote: the topic of the memory hooks came up again. Brian was wondering if we should [finally] revisit this topic -- there's a few things that could be done to make life "better". Two things jump to mind: - using mallopt on Linux What about using the (probably)

Re: [OMPI devel] RFC: Linuxes shipping libibverbs

2008-05-22 Thread Patrick Geoffray
Brian W. Barrett wrote: With MX, it's one initialization call (mx_init), and it's not clear from the errors it can return that you can differentiate between the two cases. If you run mx_init() on a machine without the MX driver loaded or with no NIC detected by the driver, you get a specific error
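The shape of such a probe, as a hedged sketch (mx_init()/mx_finalize() are the MX API; which mx_return_t values distinguish "no driver" from "no NIC" is left open here, since that is exactly the question):

    #include <myriexpress.h>

    /* Probe whether MX is usable before selecting the BTL/MTL. */
    int mx_is_usable(void)
    {
        mx_return_t rc = mx_init();
        if (rc != MX_SUCCESS)
            return 0;   /* inspect rc to tell driver vs. NIC apart */
        mx_finalize();
        return 1;
    }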

Re: [OMPI devel] patch for building gm btl

2008-01-03 Thread Patrick Geoffray
Paul, Paul H. Hargrove wrote: discuss what tests we will run, but it will probably be a very minimal set. Once we both have MTT setup and running GM tests, we should compare configs to avoid overlap (and thus increase coverage). That would be great. I have only one 32-node 2G cluster I can

Re: [OMPI devel] patch for building gm btl

2008-01-03 Thread Patrick Geoffray
Hi Paul, Paul H. Hargrove wrote: The fact that this has gone unfixed for 2 months suggests to me that nobody is building the GM BTL. So, how would I go about checking ... a) ...if there exists any periodic build of the GM BTL via MTT? We are deploying MTT on all our clusters. Right now,

Re: [OMPI devel] SDP support for OPEN-MPI

2008-01-01 Thread Patrick Geoffray
Lenny Verkhovsky wrote: We would like to add SDP support for OPENMPI. SDP can be used to accelerate job start (oob over sdp) and IPoIB performance. I fail to see the reason to pollute the TCP btl with IB-specific SDP stuff. For the oob, this is arguable, but doesn't SDP allow for
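For what the hook actually looks like: SDP is usually enabled either transparently (LD_PRELOAD of libsdp) or by opening the socket with the SDP address family; a hedged sketch (AF_INET_SDP is the OFED convention, value 27 assumed here, check your headers):

    #include <sys/socket.h>

    #ifndef AF_INET_SDP
    #define AF_INET_SDP 27   /* OFED convention; assumption */
    #endif

    /* Everything past socket creation is plain sockets code. */
    int open_sdp_socket(void)
    {
        return socket(AF_INET_SDP, SOCK_STREAM, 0);
    }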

Re: [OMPI devel] Dynamically Turning On and Off Memory Manager of Open MPI at Runtime??

2007-12-10 Thread Patrick Geoffray
Hi Peter, Peter Wong wrote: Open MPI defines its own malloc (by default), so malloc of glibc is not called. But, without calling glibc's malloc, the libhugetlbfs allocator, which backs text and dynamic data with large pages (e.g., 16MB pages on POWER systems), is not used. You could modify

Re: [OMPI devel] collective problems

2007-11-08 Thread Patrick Geoffray
Hi Gleb, Gleb Natapov wrote: In the case of TCP, the kernel is kind enough to progress messages for you, but only if there is enough space in the kernel's internal buffers. If there is no space there, the TCP BTL will also buffer messages in userspace and will, eventually, have the same problem.
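An illustration of that fallback (not the actual TCP BTL code; queue_cb stands in for whatever user-space queue the BTL maintains):

    #include <errno.h>
    #include <sys/types.h>
    #include <sys/socket.h>

    /* Try a non-blocking send; if the kernel socket buffer is full,
     * park the remainder in a user-space queue for the event loop to
     * flush later -- which is where the same progression problem
     * reappears. */
    ssize_t send_or_queue(int fd, const void *buf, size_t len,
                          int (*queue_cb)(const void *, size_t))
    {
        ssize_t n = send(fd, buf, len, MSG_DONTWAIT);
        if (n == (ssize_t)len)
            return n;                    /* fully handed to the kernel */
        if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;                   /* real error */
        size_t done = (n > 0) ? (size_t)n : 0;
        if (queue_cb((const char *)buf + done, len - done) != 0)
            return -1;
        return (ssize_t)len;
    }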

Re: [OMPI devel] collective problems

2007-11-07 Thread Patrick Geoffray
Jeff Squyres wrote: This is not a problem in the current code base. Remember that this is all in the context of Galen's proposal for btl_send() to be able to return NOT_ON_WIRE -- meaning that the send was successful, but it has not yet been sent (e.g., openib BTL buffered it because it

Re: [OMPI devel] PathScale 3.0 problems with Open MPI 1.2.[34]

2007-10-23 Thread Patrick Geoffray
Hi Bogdan, Bogdan Costescu wrote: I made some progress: if I configure with "--without-memory-manager" (along with all other options that I mentioned before), then it works. This was inspired by the fact that the segmentation fault occurred in ptmalloc2. I have previously tried to remove the

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r15041

2007-06-13 Thread Patrick Geoffray
Jeff Squyres wrote: Let's take a step back and see exactly what we *want*. Then we can talk about how to have an interface for it. I must be missing something, but why is the bandwidth/latency passed by the user (by whatever means)? Would it be easier to automagically get these values by
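Measuring them at startup is a few lines of ping-pong; a minimal sketch (names illustrative): one-way latency is half the round-trip at zero bytes, bandwidth is message size over one-way time at a large size:

    #include <mpi.h>
    #include <stdlib.h>

    /* Ping-pong between ranks 0 and 1 of comm; returns the one-way time
     * in microseconds for a message of 'bytes' bytes. */
    double pingpong_us(MPI_Comm comm, int bytes, int iters)
    {
        int rank;
        MPI_Comm_rank(comm, &rank);
        char *buf = calloc(bytes ? bytes : 1, 1);
        MPI_Barrier(comm);
        double t0 = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, bytes, MPI_BYTE, 1, 0, comm);
                MPI_Recv(buf, bytes, MPI_BYTE, 1, 0, comm, MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, bytes, MPI_BYTE, 0, 0, comm, MPI_STATUS_IGNORE);
                MPI_Send(buf, bytes, MPI_BYTE, 0, 0, comm);
            }
        }
        double one_way = (MPI_Wtime() - t0) / (2.0 * iters);
        free(buf);
        return one_way * 1e6;
    }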

Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-29 Thread Patrick Geoffray
Jeff Squyres (jsquyres) wrote, quoting Patrick Geoffray's message of Wednesday, June 28, 2006 to the Open MPI Developers list, Re: [OMPI devel] Best bw/lat performance for microbenchmark

Re: [OMPI devel] Best bw/lat performance for microbenchmark/debug utility

2006-06-28 Thread Patrick Geoffray
messages, the host CPU overhead and the ability to progress. All of these metrics are measured by existing benchmarks; do you want to write one that covers everything, or something like IMB? Patrick -- Patrick Geoffray Myricom, Inc. http://www.myri.com