other kind
Chunk 1/2: Trying 524288 of 256xdouble, chunked, 1073741824 bytes: successfully read 524288
Chunk 2/2: Trying 524289 of 256xdouble, chunked, 1073743872 bytes: successfully read 524289
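(The output above suggests a simple pattern: keep each MPI-IO call's byte count
below 2**31-1 by reading a bounded number of the 256xdouble elements per call.
A minimal sketch, assuming hypothetical names fh, buf, ntotal, and a contiguous
256xdouble datatype rowtype -- not the original test code:)

   subroutine chunked_read(fh, buf, ntotal, rowtype)
      use mpi
      implicit none
      integer, intent(in) :: fh, ntotal, rowtype
      double precision :: buf(*)
      ! 524288 elements * 2048 bytes (256 doubles each) = 1 GiB per call
      integer, parameter :: maxchunk = 524288
      integer :: nleft, nread, ierr
      integer :: status(MPI_STATUS_SIZE)

      nleft = ntotal
      do while (nleft > 0)
         nread = min(nleft, maxchunk)
         ! buf offset in doubles: each rowtype element is 256 doubles
         call MPI_File_read(fh, buf((ntotal - nleft)*256 + 1), nread, &
                            rowtype, status, ierr)
         nleft = nleft - nread
      end do
   end subroutine chunked_read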
- Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
#include
It seems like this might also be an issue for gatherv and reduce_scatter
as well.
- Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
On 23 May 9:37PM, Jonathan Dursi wrote:
On the other hand, it works everywhere if I pad the rcounts array with
an extra valid value (0 or 1, or for that matter 783), or replace the
allgatherv with an allgather.
.. and it fails with 7 even where it worked (but succeeds with 8) if I
replace the allgatherv with an allgather.
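(For concreteness, a minimal sketch of the padding workaround, assuming
hypothetical names -- nprocs ranks, doubles, MPI_COMM_WORLD. Only
rcounts(1:nprocs) and displs(1:nprocs) are actually used by the call; the
extra slot just holds a valid value:)

   integer, allocatable :: rcounts(:), displs(:)
   integer :: ierr

   ! one extra entry beyond nprocs, left as a valid value (here 0)
   allocate(rcounts(nprocs+1), displs(nprocs+1))
   rcounts = 0
   displs = 0
   ! ... fill rcounts(1:nprocs) and displs(1:nprocs) as usual ...
   call MPI_Allgatherv(sendbuf, scount, MPI_DOUBLE_PRECISION, &
                       recvbuf, rcounts, displs, MPI_DOUBLE_PRECISION, &
                       MPI_COMM_WORLD, ierr)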
- Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada | www.SciNetHPC.ca
For what it's worth, 1.4.4 built with the Intel 12.1.0.233 compilers has been
the default MPI at our centre for over a month and we haven't had any
problems...
- jonathan
--
Jonathan Dursi; SciNet, Compute/Calcul Canada
-Original Message-
From: Richard Walsh <richard
reading the data in again and printing them out, I always have:
buf(0)=0

If you compile your code with -check bounds and run, you'll get an error
pointing out that buf(0) is an illegal access; in Fortran, arrays start at 1
by default.
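(A two-line illustration, with a hypothetical array of size n: index 0 is only
legal if the lower bound is declared explicitly.)

   double precision :: buf(n)       ! valid indices are buf(1) .. buf(n)
   double precision :: buf0(0:n-1)  ! explicit lower bound makes buf0(0) legal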
- Jonathan
--
Jonathan Dursi | SciNet, Compute/Calcul Canada
encies, no discernible bandwidth improvements).
- Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca> SciNet, Compute/Calcul Canada
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
fewer larger messages.
Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
shared memory transport with
gcc 4.4.x.
Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
of OpenMPI (1.3.2 and
1.3.3). It was working correctly with OpenMPI version 1.2.7.
[...]
GCC version :
$ mpicc --version
gcc (Ubuntu 4.4.1-4ubuntu7) 4.4.1
Does it work if you turn off the shared memory transport layer? That is, does
mpirun -n 6 -mca btl ^sm ./testmpi
work?
- Jonathan
--
Jonathan Dursi
isn't ready for
real production use on our system.
- Jonathan
On 2009-09-24, at 4:16PM, Eugene Loh wrote:
Jonathan Dursi wrote:
So to summarize:
- OpenMPI 1.3.2 + gcc 4.4.0
- Test problem with periodic (left neighbour of proc 0 is proc N-1)
  Sendrecv()s: default always hangs
- A percentage of certain single-node jobs hang; turning off sm or setting
  num_fifos to NP-1 eliminates this
- Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
to

   if (leftneighbour .eq. -1) then
      leftneighbour = MPI_PROC_NULL
   endif
   if (rightneighbour .eq. nprocs) then
      rightneighbour = MPI_PROC_NULL
   endif
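(Communication with MPI_PROC_NULL is a no-op, so after that change the
boundary ranks can issue the same Sendrecv as everyone else. A hedged sketch
with hypothetical buffer names, status being an integer array of size
MPI_STATUS_SIZE:)

   call MPI_Sendrecv(u(1), 1, MPI_DOUBLE_PRECISION, leftneighbour, 1, &
                     u(n), 1, MPI_DOUBLE_PRECISION, rightneighbour, 1, &
                     MPI_COMM_WORLD, status, ierr)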
On Sep 21, 2009, at 5:09 PM, Jonathan Dursi wrote:
Continuing the conversation with myself:
Google pointed me to Trac ticket #1944
randomly) with 1.3.3,
mpirun -np 6 -mca btl tcp,self ./diffusion-mpi
or
mpirun -np 6 -mca btl_sm_num_fifos 5 -mca btl sm,self ./diffusion-mpi
always succeeds, with (as one might guess) the second being much faster...
Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
frequently - it hangs one time out of every ten or
so. But obviously this is still far too often to deploy in a production
environment.
Where should we be looking to track down this problem?
- Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
(Attachment: config.log.gz)
10 times it will
be successful 9 or so times. But the hangs still occur.
- Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
-np 6 -mca btl self,tcp ./diffusion-mpi
never gives any problems.
Any suggestions? I notice a mention of `improved flow control in sm' in
the ChangeLog to 1.3.3; is that a significant bugfix?
- Jonathan
--
Jonathan Dursi <ljdu...@scinet.utoronto.ca>
program d