Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-30 Thread Michael Kluskens
I have tested for the MPI_ABORT problem I was seeing and it appears to be fixed in the trunk. Michael On Oct 28, 2006, at 8:45 AM, Jeff Squyres wrote: Sorry for the delay on this -- is this still the case with the OMPI trunk? We think we finally have all the issues solved with MPI_ABORT on

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-28 Thread Ake Sandgren
On Sat, 2006-10-28 at 08:45 -0400, Jeff Squyres wrote: > Sorry for the delay on this -- is this still the case with the OMPI > trunk? > > We think we finally have all the issues solved with MPI_ABORT on the > trunk. > Nah, it was a problem with overutilization, i.e. 4 tasks on 2 cpus in one

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-16 Thread Åke Sandgren
On Mon, 2006-10-16 at 10:13 +0200, Åke Sandgren wrote: > On Fri, 2006-10-06 at 00:04 -0400, Jeff Squyres wrote: > > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > > > > > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > > > Intel ifort 9.0.32 all tests

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-07 Thread Ralph Castain
On 10/5/06 10:04 PM, "Jeff Squyres" wrote: > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > >> The final auxiliary test is for BLACS_ABORT. >> Immediately after this message, all processes should be killed. >> If processes survive the call, your
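
A minimal sketch (plain MPI in C, not the BLACS tester itself) of what the BLACS_ABORT auxiliary test exercises: one rank calls MPI_Abort, and a correct MPI implementation should terminate every process in MPI_COMM_WORLD -- none should survive to reach the code after the call.

/* Run with e.g.: mpirun -np 4 ./abort_test
 * Rank 0 calls MPI_Abort; the whole job is expected to be killed. */
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        printf("rank 0: calling MPI_Abort, all ranks should die now\n");
        fflush(stdout);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* Surviving ranks would eventually print this line; if any of these
     * messages appear, the abort did not kill the whole job. */
    sleep(10);
    printf("rank %d: still alive after the abort (this is a bug)\n", rank);

    MPI_Finalize();
    return 0;
}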

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-06 Thread Jeff Squyres
On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > Intel ifort 9.0.32 all tests with 4 processors (comments below) > > OpenMPI 1.1.1 patched and OpenMPI 1.1.2 patched: > C & F tests: no errors with default

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-05 Thread George Bosilca
Thanks Michael. The seg-fault is related to an orterun problem. I noticed it yesterday and we are trying to find a fix. For the rest, I'm quite happy that the BLACS problem was solved. Thanks for your help, george. On Oct 5, 2006, at 2:42 PM, Michael Kluskens wrote: On Oct 4, 2006, at

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-05 Thread Michael Kluskens
On Oct 4, 2006, at 7:51 PM, George Bosilca wrote: This is the correct patch (same as previous minus the debugging statements). On Oct 4, 2006, at 7:42 PM, George Bosilca wrote: The problem was found and fixed. Until the patch gets applied to the 1.1 and 1.2 branches please use the attached

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
The problem was found and fixed. Until the patch gets applied to the 1.1 and 1.2 branches, please use the attached patch. Thanks for your help in discovering and fixing this bug, george. (attachment: ddt.patch) On Oct 4, 2006, at 5:32 PM, George Bosilca wrote: That's just

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
That's just amazing. We pass all the trapezoidal tests but we fail the general ones (rectangular matrix) if the leading dimension of the matrix on the destination processor is greater than the leading dimension on the sender. At least now I have narrowed down the place where the error occurs ...
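
A rough sketch (sizes and variable names are illustrative, not taken from the BLACS tester) of the failing pattern: the same m x n column-major submatrix is described with MPI_Type_vector on both sides, but the destination uses a larger leading dimension (stride) than the sender.

/* Run with: mpirun -np 2 ./ld_test
 * Rank 0 sends an m x n submatrix out of an array with leading dimension
 * lda_src; rank 1 receives it into an array with a larger leading
 * dimension lda_dst. The type signatures match (m*n doubles each), only
 * the strides differ. */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int m = 4, n = 3;      /* submatrix: m rows by n columns      */
    const int lda_src = 4;       /* leading dimension on the sender     */
    const int lda_dst = 8;       /* larger leading dimension on receive */
    double a[4 * 3], b[8 * 3];
    int rank, i;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        MPI_Datatype src_t;
        for (i = 0; i < lda_src * n; i++) a[i] = (double)i;
        /* n columns of m doubles, spaced lda_src doubles apart */
        MPI_Type_vector(n, m, lda_src, MPI_DOUBLE, &src_t);
        MPI_Type_commit(&src_t);
        MPI_Send(a, 1, src_t, 1, 0, MPI_COMM_WORLD);
        MPI_Type_free(&src_t);
    } else if (rank == 1) {
        MPI_Datatype dst_t;
        /* same submatrix, but columns spaced lda_dst doubles apart */
        MPI_Type_vector(n, m, lda_dst, MPI_DOUBLE, &dst_t);
        MPI_Type_commit(&dst_t);
        MPI_Recv(b, 1, dst_t, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("b[0]=%g b[%d]=%g\n", b[0], lda_dst, b[lda_dst]);
        MPI_Type_free(&dst_t);
    }

    MPI_Finalize();
    return 0;
}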

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
OK, that was my five-minute hall of shame. Setting the verbosity level in bt.dat to 6 gives me enough information to know exactly the data-type share. Now I know how to fix things ... george. On Oct 4, 2006, at 4:35 PM, George Bosilca wrote: I'm working on this bug. As far as I see the

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread George Bosilca
I'm working on this bug. As far as I can see, the patch from bug 365 does not help us here. However, on my 64-bit machines (not Opteron but G5) I don't get the segfault. Anyway, I get the bad data transmission for tests #1 and #51. So far my main problem is that I cannot reproduce these

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread Michael Kluskens
On Oct 4, 2006, at 8:22 AM, Harald Forbert wrote: The TRANSCOMM setting that we are using here, and that I think is the correct one, is "-DUseMpi2" since OpenMPI implements the corresponding MPI-2 calls. You need a recent version of BLACS for this setting to be available (1.1 with patch 3 should
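
A small sketch of what -DUseMpi2 relies on (the helper function name is hypothetical, not part of BLACS): rather than assuming that C and Fortran communicators share the same representation, as -DCsameF77 does, the MPI-2 routines MPI_Comm_f2c and MPI_Comm_c2f convert between the two handle types.

/* Hypothetical C helper callable from Fortran (trailing-underscore name
 * convention assumed): it takes a Fortran communicator handle and uses
 * it through the C API. */
#include <mpi.h>
#include <stdio.h>

void report_comm_size_(MPI_Fint *fortran_comm)
{
    MPI_Comm c_comm = MPI_Comm_f2c(*fortran_comm);   /* Fortran -> C */
    int size;
    MPI_Comm_size(c_comm, &size);
    printf("communicator has %d processes\n", size);

    /* Going the other way (C -> Fortran) uses MPI_Comm_c2f: */
    MPI_Fint back = MPI_Comm_c2f(c_comm);
    (void)back;
}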

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-04 Thread Harald Forbert
> Additional note on the BLACS vs. OpenMPI 1.1.1 & 1.3 problems: > > The BLACS install program xtc_CsameF77 says to not use -DCsameF77 > with OpenMPI; however, because of an oversight I used it in my first > tests -- for OpenMPI 1.1.1 the errors are the same with and without > this

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Michael Kluskens
Additional note on the BLACS vs. OpenMPI 1.1.1 & 1.3 problems: The BLACS install program xtc_CsameF77 says to not use -DCsameF77 with OpenMPI; however, because of an oversight I used it in my first tests -- for OpenMPI 1.1.1 the errors are the same with and without this setting;

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Jeff Squyres
Thanks Michael -- I've updated ticket 356 with this info for v1.1, and created ticket 464 for the trunk (v1.3) issue. https://svn.open-mpi.org/trac/ompi/ticket/356 https://svn.open-mpi.org/trac/ompi/ticket/464 On 10/3/06 10:53 AM, "Michael Kluskens" wrote: > Summary: > >

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-03 Thread Michael Kluskens
Summary: OpenMPI 1.1.1 and 1.3a1r11943 have different bugs with regard to BLACS 1.1p3; 1.3 fails where 1.1.1 passes and vice versa. (1.1.1): Integer, real, and double precision SDRV tests fail cases 1 & 51, then lots of errors until the Integer SUM test, after which all tests pass. (1.3): No errors