Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-30 Thread Michael Kluskens
I have tested for the MPI_ABORT problem I was seeing and it appears to be fixed in the trunk. Michael On Oct 28, 2006, at 8:45 AM, Jeff Squyres wrote: Sorry for the delay on this -- is this still the case with the OMPI trunk? We think we finally have all the issues solved with MPI_ABORT on

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-28 Thread Ake Sandgren
On Sat, 2006-10-28 at 08:45 -0400, Jeff Squyres wrote: > Sorry for the delay on this -- is this still the case with the OMPI > trunk? > > We think we finally have all the issues solved with MPI_ABORT on the > trunk. > Nah, it was a problem with overutilization, i.e. 4 tasks on 2 cpus in one n

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-28 Thread Jeff Squyres
Sorry for the delay on this -- is this still the case with the OMPI trunk? We think we finally have all the issues solved with MPI_ABORT on the trunk. On Oct 16, 2006, at 8:29 AM, Åke Sandgren wrote: On Mon, 2006-10-16 at 10:13 +0200, Åke Sandgren wrote: On Fri, 2006-10-06 at 00:04 -04

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-16 Thread Åke Sandgren
On Mon, 2006-10-16 at 10:13 +0200, Åke Sandgren wrote: > On Fri, 2006-10-06 at 00:04 -0400, Jeff Squyres wrote: > > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > > > > > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > > > Intel ifort 9.0.32 all tests with 4 processors (c

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-16 Thread Åke Sandgren
On Fri, 2006-10-06 at 00:04 -0400, Jeff Squyres wrote: > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > > > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > > Intel ifort 9.0.32 all tests with 4 processors (comments below) > > > > OpenMPI 1.1.1 patched and OpenMPI 1.1.2 p

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-10 Thread Michael Kluskens
On Oct 6, 2006, at 12:04 AM, Jeff Squyres wrote: On 10/5/06 2:42 PM, "Michael Kluskens" wrote: System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, Intel ifort 9.0.32 all tests with 4 processors (comments below) Good. Can you expand on what you mean by "slowed down"? Bad

[OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-10 Thread Michael Kluskens
On Oct 5, 2006, at 4:41 PM, George Bosilca wrote: Once you run the performance tests please let me know the outcome. Ignoring the other issue I just posted here are timings for BLACS 1.1p3 Tester with OpenMPI & MPICH2 on two nodes of a dual-opteron system running Debian Linux 3.1r3, compi

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-06 Thread Ralph Castain
On 10/5/06 10:04 PM, "Jeff Squyres" wrote: > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > >> The final auxiliary test is for BLACS_ABORT. >> Immediately after this message, all processes should be killed. >> If processes survive the call, your BLACS_ABORT is incorrect. >> {0,2}, pnum=2,
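The auxiliary BLACS_ABORT test quoted above exercises, underneath, the same MPI abort path followed up elsewhere in this thread. A minimal sketch of the required behaviour (not the BLACS tester itself; the aborting rank number and the sleep length are illustrative only): one rank calls MPI_Abort and the launcher must kill every process in the job.

/* Minimal sketch of what the auxiliary BLACS_ABORT test checks underneath:
 * one rank calls MPI_Abort and no process in the job may survive the call.
 * If any rank reaches MPI_Finalize, the abort path is broken. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 2) {   /* illustrative choice, mirroring "pnum=2" in the test output */
        fprintf(stderr, "rank %d calling MPI_Abort\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    sleep(30);         /* surviving ranks indicate a broken abort in the MPI layer */
    MPI_Finalize();
    return 0;
}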

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-06 Thread Jeff Squyres
On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > Intel ifort 9.0.32 all tests with 4 processors (comments below) > > OpenMPI 1.1.1 patched and OpenMPI 1.1.2 patched: > C & F tests: no errors with default data set. F test

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-05 Thread George Bosilca
Thanks Michael. The seg-fault is related to some orterun problem. I noticed it yesterday and we are trying to find a fix. For the rest I'm quite happy that the BLACS problem was solved. Thanks for your help, george. On Oct 5, 2006, at 2:42 PM, Michael Kluskens wrote: On Oct 4, 2006, at 7:

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-05 Thread Michael Kluskens
On Oct 4, 2006, at 7:51 PM, George Bosilca wrote: This is the correct patch (same as previous minus the debugging statements). On Oct 4, 2006, at 7:42 PM, George Bosilca wrote: The problem was found and fixed. Until the patch gets applied to the 1.1 and 1.2 branches please use the attached pa

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
This is the correct patch (same as previous minus the debugging statements). Thanks, george. (attachment: ddt.patch) On Oct 4, 2006, at 7:42 PM, George Bosilca wrote: The problem was found and fixed. Until the patch gets applied to the 1.1 and 1.2 branches please use the

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
The problem was found and fixed. Until the patch gets applied to the 1.1 and 1.2 branches please use the attached patch. Thanks for your help in discovering and fixing this bug, george. (attachment: ddt.patch) On Oct 4, 2006, at 5:32 PM, George Bosilca wrote: That's just

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
That's just amazing. We pass all the trapezoidal tests but we fail the general ones (rectangular matrix) if the leading dimension of the matrix on the destination processor is greater than the leading dimension on the sender. At least now I have narrowed down the place where the error occurs ...
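A hedged reproducer sketch of the failure mode described above, assuming the usual pattern of describing a column-major M x N submatrix with MPI_Type_vector where the receiver's leading dimension (stride) is larger than the sender's. The matrix sizes, leading dimensions, and the use of MPI_DOUBLE are illustrative, not taken from the BLACS tester.

/* Sketch: send an M x N column-major submatrix where the receiving side's
 * leading dimension is larger than the sending side's -- the case reported
 * to fail for the rectangular (general) tests. */
#include <mpi.h>

int main(int argc, char **argv)
{
    enum { M = 4, N = 3, LDA_SEND = 4, LDA_RECV = 8 };  /* illustrative sizes */
    int rank, i;
    MPI_Datatype sub_send, sub_recv;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* N columns, M doubles per column, stride = leading dimension */
    MPI_Type_vector(N, M, LDA_SEND, MPI_DOUBLE, &sub_send);
    MPI_Type_vector(N, M, LDA_RECV, MPI_DOUBLE, &sub_recv);
    MPI_Type_commit(&sub_send);
    MPI_Type_commit(&sub_recv);

    if (rank == 0) {
        double a[LDA_SEND * N];
        for (i = 0; i < LDA_SEND * N; i++) a[i] = (double)i;
        MPI_Send(a, 1, sub_send, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        double b[LDA_RECV * N];
        MPI_Recv(b, 1, sub_recv, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        /* verify the M x N block arrived intact despite the larger stride */
    }

    MPI_Type_free(&sub_send);
    MPI_Type_free(&sub_recv);
    MPI_Finalize();
    return 0;
}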

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
OK, that was my 5-minute hall of shame. Setting the verbosity level in bt.dat to 6 gives me enough information to know exactly the data-type share. Now, I know how to fix things ... george. On Oct 4, 2006, at 4:35 PM, George Bosilca wrote: I'm working on this bug. As far as I see the pat

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
I'm working on this bug. As far as I see the patch from bug 365 does not help us here. However, on my 64-bit machines (not Opteron but G5) I don't get the segfault. Anyway, I get the bad data transmission for tests #1 and #51. So far my main problem is that I cannot reproduce these errors

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread Michael Kluskens
On Oct 4, 2006, at 8:22 AM, Harald Forbert wrote: The TRANSCOMM setting that we are using here and that I think is the correct one is "-DUseMpi2" since OpenMPI implements the corresponding mpi2 calls. You need a recent version of BLACS for this setting to be available (1.1 with patch 3 should be
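For context, a minimal sketch of the MPI-2 handle-conversion calls that the "-DUseMpi2" TRANSCOMM setting relies on (MPI_Comm_f2c / MPI_Comm_c2f, which Open MPI provides). The wrapper function below is hypothetical, not a BLACS routine; it only illustrates translating a Fortran communicator handle into a C MPI_Comm and back.

/* Sketch of the MPI-2 Fortran/C communicator handle conversions that
 * "-DUseMpi2" tells BLACS to use.  The helper name is hypothetical. */
#include <mpi.h>

/* Called from Fortran, which passes its communicator as an integer handle. */
void use_fortran_comm(MPI_Fint *f_comm_handle)
{
    MPI_Comm c_comm = MPI_Comm_f2c(*f_comm_handle);   /* MPI-2: Fortran -> C */
    int rank;
    MPI_Comm_rank(c_comm, &rank);

    MPI_Fint back = MPI_Comm_c2f(c_comm);             /* MPI-2: C -> Fortran */
    (void)back;
    (void)rank;
}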

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread Harald Forbert
> Additional note on the BLACS vs. OpenMPI 1.1.1 & 1.3 problems: > > The BLACS install program xtc_CsameF77 says to not use -DCsameF77 > with OpenMPI; however, because of an oversight I used it in my first > tests -- for OpenMPI 1.1.1 the errors are the same with and without > this set

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Michael Kluskens
Additional note on the BLACS vs. OpenMPI 1.1.1 & 1.3 problems: The BLACS install program xtc_CsameF77 says to not use -DCsameF77 with OpenMPI; however, because of an oversight I used it in my first tests -- for OpenMPI 1.1.1 the errors are the same with and without this setting; howeve

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Jeff Squyres
Thanks Michael -- I've updated ticket 356 with this info for v1.1, and created ticket 464 for the trunk (v1.3) issue. https://svn.open-mpi.org/trac/ompi/ticket/356 https://svn.open-mpi.org/trac/ompi/ticket/464 On 10/3/06 10:53 AM, "Michael Kluskens" wrote: > Summary: > > OpenMPI 1.1.1 and 1.3a

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Michael Kluskens
Summary: OpenMPI 1.1.1 and 1.3a1r11943 have different bugs with regard to BLACS 1.1p3. 1.3 fails where 1.1.1 passes and vice versa. (1.1.1): Integer, real, double precision SDRV tests fail cases 1 & 51, then lots of errors until the Integer SUM test, then all tests pass. (1.3): No errors un