Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2013-12-18 Thread Ake Sandgren
On Wed, 2013-12-18 at 11:47 -0500, Noam Bernstein wrote: > Yes - I never characterized it fully, but we attached with gdb to every > single vasp running process, and all were stuck in the same > call to MPI_allreduce() every time. It's only happening on rather large > jobs, so it's not the

[OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Ake Sandgren
Hi! Not sure if this has been caught already or not, but there is a typo in opal/memoryhooks/memory.h in 1.6.5. #ifndef OPAL_MEMORY_MEMORY_H #define OPAl_MEMORY_MEMORY_H Note the lower case "l" in the define. /Åke S.
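Because of the lower-case "l", the macro tested by the #ifndef is never actually defined, so the guard does not protect against multiple inclusion. For reference, a minimal sketch of the presumably intended include-guard pattern (the header body is omitted here):

#ifndef OPAL_MEMORY_MEMORY_H
#define OPAL_MEMORY_MEMORY_H

/* ... header contents ... */

#endif /* OPAL_MEMORY_MEMORY_H */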

Re: [OMPI users] Calling MPI_send MPI_recv from a fortran subroutine

2013-02-28 Thread Ake Sandgren
On Fri, 2013-03-01 at 01:24 +0900, Pradeep Jha wrote: > Sorry for those mistakes. I addressed all three problems > - I put "implicit none" at the top of the main program > - I initialized tag. > - changed MPI_INT to MPI_INTEGER > - "send_length" should be just "send", it was a typo. > > > But

[OMPI users] libmpi_f90 shared lib version number change in 1.6.3

2013-01-12 Thread Ake Sandgren
Hi! Was the change for libmpi_f90 in VERSION intentional or a typo? This is from openmpi 1.6.3 libmpi_f90_so_version=4:0:1 1.6.1 had libmpi_f90_so_version=2:0:1 -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 07:14 -0800, Ralph Castain wrote: > > Well, it isn't :-) > > configure says: > > --- MCA component grpcomm:pmi (m4 configuration macro) > > checking for MCA component grpcomm:pmi compile mode... dso > > checking if user requested PMI support... no > > checking if MCA

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 07:00 -0800, Ralph Castain wrote: > On Jan 3, 2013, at 6:52 AM, Ake Sandgren <ake.sandg...@hpc2n.umu.se> wrote: > > > On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote: > >> On Jan 3, 2013, at 3:01 AM, Ake Sandgren <ake.sandg...@hpc2n.umu

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 06:18 -0800, Ralph Castain wrote: > On Jan 3, 2013, at 3:01 AM, Ake Sandgren <ake.sandg...@hpc2n.umu.se> wrote: > > > On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote: > >> On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > &g

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 11:54 +0100, Ake Sandgren wrote: > On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > > Hi! > > > > The grpcomm component hier seems to have vanished between 1.6.1 and > > 1.6.3. > > Why? > > It seems that the versio

Re: [OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
On Thu, 2013-01-03 at 11:15 +0100, Ake Sandgren wrote: > Hi! > > The grpcomm component hier seems to have vanished between 1.6.1 and > 1.6.3. > Why? > It seems that the version of slurm we are using (not the latest at the > moment) is using it for star

[OMPI users] grpcomm component hier gone...

2013-01-03 Thread Ake Sandgren
Hi! The grpcomm component hier seems to have vanished between 1.6.1 and 1.6.3. Why? It seems that the version of slurm we are using (not the latest at the moment) is using it for startup. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90

Re: [OMPI users] fortran bindings for MPI_Op_commutative

2012-09-27 Thread Ake Sandgren
On Thu, 2012-09-27 at 16:31 +0200, Ake Sandgren wrote: > Hi! > > Building 1.6.1 and 1.6.2, I seem to be missing the actual fortran > bindings for MPI_Op_commutative and a bunch of other functions. > > My configure is > ./configure --enable-orterun-prefix-by-default --

[OMPI users] fortran bindings for MPI_Op_commutative

2012-09-27 Thread Ake Sandgren
. mpi_init_ is there (as a weak symbol), as it should be. All compilers give me the same result. Any ideas why? -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

[OMPI users] Bug in openmpi 1.5.4 in paffinity

2011-09-04 Thread Ake Sandgren
Hi! I'm getting a segfault in hwloc_setup_distances_from_os_matrix in the call to hwloc_bitmap_or due to objs or objs[i]->cpuset being freed and containing garbage, objs[i]->cpuset has infinite < 0. I only get this when using slurm with cgroups, asking for 2 nodes with 1 cpu each. The cpuset is

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
On Wed, 2010-09-22 at 14:16 +0200, Ake Sandgren wrote: > On Wed, 2010-09-22 at 07:42 -0400, Jeff Squyres wrote: > > This is a problem with the Pathscale compiler and old versions of GCC. See: > > > > > > http://www.open-mpi.org/faq/?category=building#pathscale-

Re: [OMPI users] PathScale problems persist

2010-09-22 Thread Ake Sandgren
------- > > [host1:29931] 3 more processes have sent help message > > help-mpi-errors.txt / mpi_errors_are_fatal > > [host1:29931] Set MCA parameter "orte_base_help_aggregate" to 0 to see > > all help / error messages > > > > There are no problems when Open MPI 1.4.2 is built with GCC (GCC 4.1.2). > > No problems are found with Open MPI 1.2.6 and PathScale either. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

[OMPI users] opal_mutex_lock(): Resource deadlock avoided

2010-05-06 Thread Ake Sandgren
likely caused by our setup. openmpi version is 1.4.2 (fails with 1.3.3 too) Filesystem used is GPFS openmpi built with mpi-threads but without progress-threads -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126

Re: [OMPI users] Segmentation fault in mca_btl_tcp

2010-04-15 Thread Ake Sandgren
cted non-mpi related packets coming in on the sockets will sometimes cause havoc. We've been getting http traffic in the jobs stdout/err sometimes. That really makes the users confused :-) And yes, we are going to block this but we haven't had time... -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
olatile__ ( SMPLOCK "cmpxchgl %3,%2 \n\t" "sete %0 \n\t" : "=qm" (ret), "+a" (oldval), "+m" (*addr) : "q"(newval)

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
On Wed, 2010-02-10 at 08:21 -0500, Jeff Squyres wrote: > On Feb 10, 2010, at 7:47 AM, Ake Sandgren wrote: > > > According to people who knows asm statements fairly well (compiler > > developers), it should be > > > static inline int opal_atomic_cmpset

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-10 Thread Ake Sandgren
asm statements fairly well (compiler developers), it should be

static inline int opal_atomic_cmpset_32( volatile int32_t *addr,
                                         int32_t oldval, int32_t newval)
{
   unsigned char ret;
   __asm__ __volatile__ (
                       SMPLOCK "cmpxchgl %3,%2 \n\t"
                               "sete %0 \n\t"
                       : "=qm" (ret), "=a" (oldval), "=m" (*addr)
                       : "q"(newval), "2"(*addr), "1"(oldval)
                       : "memory", "cc");

   return (int)ret;
}

-- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Ake Sandgren
e of SciCortex last year. I hope they will be able to release a new version fairly soon. In my opinion (working mostly with Fortran codes, shudder) it is the best compiler around. Although they have had problems over the years in coming out with fixes for bugs in a timely fashion. -- Ake Sa

Re: [OMPI users] Problems compiling OpenMPI 1.4 with PGI 9.0-3

2010-01-07 Thread Ake Sandgren
4.1-rc1 > which should work with PGI-10 and see if it fixes your problems too. Our PGI 9.0-3 doesn't have any problems building openmpi 1.3.3 or 1.4 -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW: http://www.hpc2n.umu.se

Re: [OMPI users] MPI_Irecv segmentation fault

2009-09-22 Thread Ake Sandgren
r, 1, ...) > The segfault disappears if I comment out the MPI_Irecv call in > recv_func so I'm assuming that there's something wrong with the > parameters that I'm passing to it. Thoughts? -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone
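For comparison, a minimal sketch of a correctly parameterized nonblocking receive; the buffer, count, and tag here are placeholders rather than the poster's actual arguments, and a self-send is used only to keep the example self-contained:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, out[4] = {1, 2, 3, 4}, in[4];
    MPI_Request req;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* buffer, count, datatype, source, tag, communicator, request */
    MPI_Irecv(in, 4, MPI_INT, rank, 0, MPI_COMM_WORLD, &req);

    /* matching send; the receive is already posted, so a self-send is safe */
    MPI_Send(out, 4, MPI_INT, rank, 0, MPI_COMM_WORLD);

    /* the receive buffer must stay valid until the request completes */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    printf("rank %d received %d %d %d %d\n", rank, in[0], in[1], in[2], in[3]);

    MPI_Finalize();
    return 0;
}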

Re: [OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
On Fri, 2009-09-11 at 13:18 +0200, Ake Sandgren wrote: > Hi! > > The following code shows a bad behaviour when running over openib. Oops. Red Face big time. I happened to run the IB test between two systems that don't have IB connectivity. Goes and hide in a dark corner... -- Ake

[OMPI users] Bad MPI_Bcast behaviour when running over openib

2009-09-11 Thread Ake Sandgren
t should be allowed to behave as it does. This example is a bit engineered but there are codes where a similar situation can occur, i.e. the Bcast sender doing lots of other work after the Bcast before the next MPI call. VASP is a candidate for this. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sw
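A minimal sketch of the call pattern being described, with a sleep() standing in for the long non-MPI compute phase on the root; whether any delay is actually visible on the other ranks will depend on message size, transport, and the broadcast algorithm used:

#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    int rank, value = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0)
        value = 42;

    /* root broadcasts ... */
    MPI_Bcast(&value, 1, MPI_INT, 0, MPI_COMM_WORLD);

    /* ... and then does a long stretch of work with no further MPI calls */
    if (rank == 0)
        sleep(60);

    printf("rank %d has value %d\n", rank, value);

    /* next MPI call after the compute phase */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}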

[OMPI users] Need help with tuning of IB for OpenMPI 1.3.3

2009-08-25 Thread Ake Sandgren
p up a lot better but not completely. OS: CentOS5.3 (OFED 1.3.2 and 1.4.2 tested) HW: Mellanox MT25208 InfiniHost III Ex (128MB) -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-28 Thread Ake Sandgren
mes to a complete standstill at the integer bsbr tests > >> It consumes cpu all the time but nothing happens. > > > > Actually if I'm not too impatient it will progress but VERY slowly. > > A complete run of the blacstest takes +30min cpu-time... > >> From the bsbr t