Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Mon, Aug 13, 2007 at 04:26:31PM -0500, Dirk Eddelbuettel wrote: > > I'll now compile the 1.2.3 release tarball and see if I can reproduce The 1.2.3 release also works fine: adi@debian:~$ ./ompi123/bin/mpirun -np 2 ring 0: sending message (0) to 1 0: sent message 1: waiting for message 1: got

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Dirk Eddelbuettel
Adrian, On 13 August 2007 at 22:28, Adrian Knoth wrote: | On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: | | > > We (as in the Debian maintainer for Open MPI) got this bug report from | > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD | > > kernel. | > > Any

Re: [OMPI devel] Collectives interface change

2007-08-13 Thread Li-Ta Lo
On Thu, 2007-08-09 at 14:49 -0600, Brian Barrett wrote: > Hi all - > > There was significant discussion this week at the collectives meeting > about improving the selection logic for collective components. While > we'd like the automated collectives selection logic laid out in the > Collv2

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 4:28 PM, Adrian Knoth wrote: I'll now compile the 1.2.3 release tarball and see if I can reproduce the segfaults. On the other hand, I guess nobody is using OMPI on GNU/kFreeBSD, so upgrading the openmpi-package to a subversion snapshot would also fix the problem (think

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: > > We (as in the Debian maintainer for Open MPI) got this bug report from > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > > kernel. > > Any input would be greatly appreciated! > I'll follow the QEMU

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 03:59:28PM -0400, Richard Graham wrote: > > > > On 8/13/07 3:52 PM, "Gleb Natapov" wrote: > > > On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > > Here are the > > items we have identified: > > > All those things sound very

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Gleb Natapov
On Mon, Aug 13, 2007 at 09:12:33AM -0600, Galen Shipman wrote: > Here are the items we have identified: > All those things sound very promising. Is there a tmp branch where you are going to work on this? > > > > > 1)

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Richard Graham
On 8/13/07 12:34 PM, "Galen Shipman" wrote: > Ok here are the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.79 ompi 1.2: 1.59 So at zero bytes ompi is not so bad. Also we can see

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 11:12 AM, Galen Shipman wrote: 1) remove 0 byte optimization of not initializing the convertor This costs us an "if" in MCA_PML_BASE_SEND_REQUEST_INIT and an "if" in mca_pml_ob1_send_request_start_copy +++ Measure the convertor initialization before taking any other
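
A minimal self-contained sketch of the trade-off being discussed, with purely illustrative names (the real logic lives in MCA_PML_BASE_SEND_REQUEST_INIT and mca_pml_ob1_send_request_start_copy and is considerably more involved):

#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the PML send request and
 * datatype convertor; illustrative only, not OMPI structures. */
typedef struct { const void *base; size_t bytes; } toy_convertor_t;
typedef struct { toy_convertor_t conv; size_t bytes_packed; } toy_send_req_t;

static void toy_convertor_init(toy_convertor_t *c, const void *buf, size_t bytes)
{
    c->base  = buf;
    c->bytes = bytes;
}

/* The trade-off: keeping the 0-byte shortcut means every send pays for this
 * branch; dropping it means every 0-byte send pays for an unnecessary
 * convertor initialization. */
static void toy_send_request_init(toy_send_req_t *req, const void *buf, size_t bytes)
{
    if (0 == bytes) {              /* the extra "if" referred to above */
        req->bytes_packed = 0;     /* skip convertor setup for empty messages */
        return;
    }
    toy_convertor_init(&req->conv, buf, bytes);
    req->bytes_packed = bytes;
}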

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
Brian Barrett wrote: On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here are the numbers on my machines: 0

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Galen Shipman
Ok here are the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching: 1.79 ompi 1.2: 1.59 So at zero bytes ompi is not so bad. Also we can see that header caching decreases the mvapich latency by 0.23 1 byte mvapich with header caching: 1.58 mvapich

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Pavel Shamis (Pasha)
George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here are the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Brian Barrett
On Aug 13, 2007, at 9:33 AM, George Bosilca wrote: On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here are the numbers on my machines: 0 bytes mvapich with

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:28 AM, Pavel Shamis (Pasha) wrote: Jeff Squyres wrote: I guess reading the graph that Pasha sent is difficult; Pasha -- can you send the actual numbers? Ok here are the numbers on my machines: 0 bytes mvapich with header caching: 1.56 mvapich without header caching:

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
On Aug 13, 2007, at 11:07 AM, Jeff Squyres wrote: Such a scheme is certainly possible, but I see even fewer use cases for it than for the existing microbenchmarks. Specifically, header caching *can* happen in real applications (i.e., repeatedly send short messages with the same MPI

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
We're working on it. Give us a few weeks to finish implementing all the planned optimizations/cleanups in the PML and then we can talk about tricks. We're expecting/hoping to slim down the PML layer by more than 0.5, so this header caching optimization might not make any sense at that point.

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:49 AM, George Bosilca wrote: You want a dirtier trick for benchmarks ... Here it is ... Implement a compression-like algorithm based on checksums. The data-type engine can compute a checksum for each fragment and if the checksum matches one in the peer's [limited] history

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Christian Bell
On Sun, 12 Aug 2007, Gleb Natapov wrote: > > Any objections? We can discuss what approaches we want to take > > (there's going to be some complications because of the PML driver, > > etc.); perhaps in the Tuesday Mellanox teleconf...? > > > My main objection is that the only reason you

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread George Bosilca
You want a dirtier trick for benchmarks ... Here it is ... Implement a compression-like algorithm based on checksums. The data-type engine can compute a checksum for each fragment and if the checksum matches one in the peer's [limited] history (so we can claim our communication protocol is
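
A minimal, purely hypothetical sketch of such a per-peer checksum history (not OMPI code; the checksum function and table size are arbitrary choices for illustration):

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

#define HISTORY_SLOTS 64  /* hypothetical size of the per-peer history */

/* Trivial stand-in checksum (FNV-1a style); the datatype engine could use
 * anything stronger. */
static uint32_t frag_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        sum = (sum ^ buf[i]) * 16777619u;
    }
    return sum;
}

/* Returns true if the fragment's checksum is already in the peer's history,
 * i.e. (per the joke) we could send just the checksum instead of the data.
 * Otherwise the checksum is recorded so a later identical fragment matches. */
static bool frag_seen_before(uint32_t history[HISTORY_SLOTS],
                             const uint8_t *buf, size_t len)
{
    uint32_t sum  = frag_checksum(buf, len);
    size_t   slot = sum % HISTORY_SLOTS;
    if (history[slot] == sum) {
        return true;                   /* "compress": send checksum only */
    }
    history[slot] = sum;               /* remember for future fragments */
    return false;
}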

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 10:34 AM, Jeff Squyres wrote: All this being said -- is there another reason to lower our latency? My main goal here is to lower the latency. If header caching is unattractive, then another method would be fine. Oops: s/reason/way/. That makes my sentence make much more

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 13, 2007, at 6:36 AM, Gleb Natapov wrote: Pallas, Presta (as I know) also use static rank. So let's start to fix all "bogus" benchmarks :-) ? All benchmarks are bogus. I have a better optimization. Check the name of the executable and if this is some known benchmark send one byte instead of real

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Jeff Squyres
On Aug 12, 2007, at 3:49 PM, Gleb Natapov wrote: - Mellanox tested MVAPICH with the header caching; latency was around 1.4us - Mellanox tested MVAPICH without the header caching; latency was around 1.9us As far as I remember Mellanox's results, and according to our testing, the difference between

Re: [OMPI devel] Problem in mpool rdma finalize

2007-08-13 Thread Jeff Squyres
FWIW: we fixed this recently in the openib BTL by ensuring that all registered memory is freed during the BTL finalize (vs. the mpool finalize). This is a new issue because the mpool finalize was just recently expanded to un-register all of its memory as part of the NIC-restart effort
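
A rough sketch of the ordering idea described above, with hypothetical names rather than the actual openib BTL code: the BTL releases its own registrations in its finalize, so the mpool finalize has nothing left to unregister.

#include <stdlib.h>

/* Hypothetical list of memory registrations owned by the BTL; illustrative only. */
typedef struct reg_item {
    struct reg_item *next;
    void            *registered_mem;   /* stand-in for pinned/registered memory */
} reg_item_t;

static void toy_btl_finalize(reg_item_t **reg_list)
{
    reg_item_t *item = *reg_list;
    while (NULL != item) {
        reg_item_t *next = item->next;
        free(item->registered_mem);     /* deregister/free while NIC state is still valid */
        free(item);
        item = next;
    }
    *reg_list = NULL;                   /* nothing left for the mpool finalize */
}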

Re: [OMPI devel] openib btl header caching

2007-08-13 Thread Terry D. Dontje
Jeff Squyres wrote: With Mellanox's new HCA (ConnectX), extremely low latencies are possible for short messages between two MPI processes. Currently, OMPI's latency is around 1.9us while all other MPIs (HP MPI, Intel MPI, MVAPICH[2], etc.) are around 1.4us. A big reason for this
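
For readers new to the thread, a minimal sketch of the general idea behind header caching as discussed here (hypothetical structures, not MVAPICH or OMPI code): the sender remembers the last match header sent to each peer, and when the envelope repeats it can send a much shorter "same as before" marker instead of the full header.

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical MPI match header: just the envelope fields a receiver needs
 * to match an incoming message to a posted receive. */
typedef struct {
    uint16_t context_id;   /* communicator */
    int32_t  src_rank;
    int32_t  tag;
} match_hdr_t;

typedef struct {
    match_hdr_t last_hdr;  /* last full header sent to this peer */
    bool        valid;
} peer_hdr_cache_t;

static bool same_hdr(const match_hdr_t *a, const match_hdr_t *b)
{
    return a->context_id == b->context_id &&
           a->src_rank   == b->src_rank   &&
           a->tag        == b->tag;
}

/* Returns true when the envelope repeats, i.e. the wire message could carry
 * only a tiny "replay previous header" marker instead of the full header;
 * the bytes saved on the wire are where the latency win comes from. */
static bool can_reuse_header(peer_hdr_cache_t *cache, const match_hdr_t *hdr)
{
    if (cache->valid && same_hdr(&cache->last_hdr, hdr)) {
        return true;
    }
    cache->last_hdr = *hdr;    /* remember for the next send to this peer */
    cache->valid    = true;
    return false;
}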