[OMPI users] Bug in ompi/errhandler/errcode.h (1.8.6)?

2015-06-29 Thread Åke Sandgren
Hi! static inline int ompi_mpi_errnum_is_class ( int errnum ) { ompi_mpi_errcode_t *err; if (errno < 0) { return false; } I assume it should be errnum < 0. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134
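
The mix-up is easy to reproduce in isolation. The sketch below is a simplified stand-in, not the Open MPI source: both function names are hypothetical and only the sign check is modelled. It illustrates why a compiler can accept the typo silently: once `<errno.h>` is in scope, `errno` is a perfectly valid expression, so the only hint is that the parameter is never used.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the quoted check; not the Open MPI code. */

/* Buggy variant: the parameter is ignored and the global errno is tested. */
static inline bool errnum_in_range_buggy(int errnum)
{
    (void)errnum;
    if (errno < 0) {
        return false;
    }
    return true;
}

/* Intended variant: the argument itself is tested. */
static inline bool errnum_in_range_fixed(int errnum)
{
    if (errnum < 0) {
        return false;
    }
    return true;
}
```

With `errno` at zero, the buggy variant accepts a negative `errnum` while the fixed one rejects it.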

[OMPI users] my_sense in ompi_osc_sm_module_t not always protected by OPAL_HAVE_POSIX_THREADS

2015-06-29 Thread Åke Sandgren
Hi! The my_sense entity in struct ompi_osc_sm_module_t is protected by OPAL_HAVE_POSIX_THREADS in the definition (ompi/mca/osc/sm/osc_sm.h) But in ./ompi/mca/osc/sm/osc_sm_active_target.c it is not. (Tripped on this due to a compiler problem which caused it to only partially detect threads

Re: [OMPI users] Bug in ompi/errhandler/errcode.h (1.8.6)?

2015-06-29 Thread Åke Sandgren
That's what i said. The code in openmpi checks errno and not errnum. On 06/29/2015 05:27 PM, Nathan Hjelm wrote: I see a typo. You are checking errno instead of errnum. -Nathan On Mon, Jun 29, 2015 at 01:28:11PM +0200, Åke Sandgren wrote: Hi! static inline int ompi_mpi_errnum_is_class

Re: [OMPI users] Bug in ompi/errhandler/errcode.h (1.8.6)?

2015-06-29 Thread Åke Sandgren
The interesting thing is that gcc/intel/portland all failed to detect this. Pathscale found it, and clang probably would. On 06/29/2015 05:37 PM, Jeff Squyres (jsquyres) wrote: Good catch; fixed. Thanks! On Jun 29, 2015, at 7:28 AM, Åke Sandgren <ake.sandg...@hpc2n.umu.se> wrot

Re: [OMPI users] my_sense in ompi_osc_sm_module_t not always protected by OPAL_HAVE_POSIX_THREADS

2015-06-29 Thread Åke Sandgren
, 1.8, and 1.10. -Nathan On Mon, Jun 29, 2015 at 05:26:30PM +0200, Åke Sandgren wrote: Hi! The my_sense entity in struct ompi_osc_sm_module_t is protected by OPAL_HAVE_POSIX_THREADS in the definition (ompi/mca/osc/sm/osc_sm.h) But in ./ompi/mca/osc/sm/osc_sm_active_target.c it is not. (Tripped

[OMPI users] Missing init of rc in modex (orte/mca/grpcomm/pmi/grpcomm_pmi_module.c), 1.8.6

2015-07-08 Thread Åke Sandgren
Hi! rc in modex in orte/mca/grpcomm/pmi/grpcomm_pmi_module.c is not properly initialized and is causing problems at least with the intel compiler. diff -ru site/orte/mca/grpcomm/pmi/grpcomm_pmi_module.c intel/orte/mca/grpcomm/pmi/grpcomm_pmi_module.c ---

Re: [OMPI users] Open MPI 1.8.8 and hcoll in system space

2015-08-11 Thread Åke Sandgren
Please fix the hcoll test (and code) to be correct. Any configure test that adds /usr/lib and/or /usr/include to any compile flags is broken. And if hcoll include files are under $HCOLL_HOME/include/hcoll (and hcoll/api) then the include directives in the source should be #include and

Re: [OMPI users] Open MPI 1.8.8 and hcoll in system space

2015-08-11 Thread Åke Sandgren
On 08/11/2015 10:22 AM, Gilles Gouaillardet wrote: i do not know the context, so i should not jump to any conclusion ... if xxx.h is in $HCOLL_HOME/include/hcoll in hcoll version Y, but in $HCOLL_HOME/include/hcoll/api in hcoll version Z, then the relative path to $HCOLL_HOME/include cannot be

Re: [OMPI users] Bug in ompi/errhandler/errcode.h (1.8.6)?

2015-08-14 Thread Åke Sandgren
This problem still exists in 1.8.8 On 06/29/2015 05:37 PM, Jeff Squyres (jsquyres) wrote: Good catch; fixed. Thanks! On Jun 29, 2015, at 7:28 AM, Åke Sandgren <ake.sandg...@hpc2n.umu.se> wrote: Hi! static inline int ompi_mpi_errnum_is_class ( int errnum ) { ompi_mpi_errcode_

Re: [OMPI users] forrtl: severe (174): SIGSEGV, segmentation fault occurred

2014-01-02 Thread Åke Sandgren
On 01/02/2014 11:08 AM, Hongyi Zhao wrote: Hi all, I compiled openmpi-1.6.5 with ifort-14.0.0, then I use the mpif90 wrapper of openmpi to compile the siesta package - a DFT package, obtain from here:http://departments.icmab.es/leem/siesta/ . After I successfully compile the siesta

[OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-01-27 Thread Åke Sandgren
Hi! I just started trying to build 1.7.4rc1 with the new Pathscale EkoPath5 compiler and stumbled onto this. When building without --enable-mpi-f08-subarray-prototype i get into problems with ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h It defines subroutine

Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-01-27 Thread Åke Sandgren
On 01/27/2014 03:28 PM, Åke Sandgren wrote: Hi! I just started trying to build 1.7.4rc1 with the new Pathscale EkoPath5 compiler and stumbled onto this. When building without --enable-mpi-f08-subarray-prototype i get into problems with ompi/mpi/fortran/use-mpi-f08/mpi-f-interfaces-bind.h

Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-01-27 Thread Åke Sandgren
On 01/27/2014 04:31 PM, Jeff Squyres (jsquyres) wrote: We *do* still have a problem in the mpi_f08 module that we probably won't fix before 1.7.4 is released. Here's the ticket: https://svn.open-mpi.org/trac/ompi/ticket/4157 Craig has a suggested patch, but a) I haven't had time to

Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-01-27 Thread Åke Sandgren
On 01/27/2014 04:44 PM, Åke Sandgren wrote: On 01/27/2014 04:31 PM, Jeff Squyres (jsquyres) wrote: We *do* still have a problem in the mpi_f08 module that we probably won't fix before 1.7.4 is released. Here's the ticket: https://svn.open-mpi.org/trac/ompi/ticket/4157 Craig has

Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-01-31 Thread Åke Sandgren
On 01/28/2014 08:26 PM, Jeff Squyres (jsquyres) wrote: Ok, will do. Yesterday, I put in a temporary behavioral test in configure that will exclude ekopath 5.0 in 1.7.4. We'll remove this behavioral test once OMPI fixes the bug correctly (for 1.7.5). I'm not 100% sure yet (my F2k3 spec is

Re: [OMPI users] openmpi 1.7.4rc1 and f08 interface

2014-02-03 Thread Åke Sandgren
On 02/01/2014 03:12 PM, Jeff Squyres (jsquyres) wrote: I think that ompi_funloc_variant1 needs to do IMPORT to have access to the callback_variant1 definition before using it to define "FN" I.e. ! function ompi_funloc_variant1(fn) use, intrinsic :: iso_c_binding, only:

[OMPI users] probable bug in 1.9a1r31409

2014-04-16 Thread Åke Sandgren
Hi! Found this problem when building r31409 with Pathscale 5.0 pshmem_barrier.c:81:6: error: redeclaration of 'pshmem_barrier_all' must have the 'overloadable' attribute void shmem_barrier_all(void) ^ ../../../../oshmem/shmem/c/profile/defines.h:193:37: note: expanded from macro

Re: [OMPI users] probable bug in 1.9a1r31409

2014-04-16 Thread Åke Sandgren
On 04/16/2014 02:25 PM, Åke Sandgren wrote: Hi! Found this problem when building r31409 with Pathscale 5.0 pshmem_barrier.c:81:6: error: redeclaration of 'pshmem_barrier_all' must have the 'overloadable' attribute void shmem_barrier_all(void) ^ ../../../../oshmem/shmem/c/profile

Re: [OMPI users] OpenMPI 1.8 and PGI compilers

2014-04-29 Thread Åke Sandgren
On 04/29/2014 12:15 AM, Jeff Squyres (jsquyres) wrote: Brian: Can you report this bug to PGI and see what they say? PGC-S-0094-Illegal type conversion required (btl_scif_component.c: 215) PGC/x86-64 Linux 14.3-0: compilation completed with severe errors make[2]: *** [btl_scif_component.lo]

Re: [OMPI users] OpenMPI 1.8 and PGI compilers

2014-04-29 Thread Åke Sandgren
On 04/29/2014 07:55 AM, Åke Sandgren wrote: On 04/29/2014 12:15 AM, Jeff Squyres (jsquyres) wrote: Brian: Can you report this bug to PGI and see what they say? PGC-S-0094-Illegal type conversion required (btl_scif_component.c: 215) PGC/x86-64 Linux 14.3-0: compilation completed with severe

Re: [OMPI users] OpenMPI 1.8 and PGI compilers

2014-04-30 Thread Åke Sandgren
On 04/29/2014 09:33 AM, Åke Sandgren wrote: On 04/29/2014 07:55 AM, Åke Sandgren wrote: On 04/29/2014 12:15 AM, Jeff Squyres (jsquyres) wrote: Brian: Can you report this bug to PGI and see what they say? PGC-S-0094-Illegal type conversion required (btl_scif_component.c: 215) PGC/x86-64 Linux

Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView

2009-10-01 Thread Åke Sandgren
On Thu, 2009-10-01 at 13:58 -0400, Jeff Squyres wrote: > Did that make it over to the v1.3 branch? No it didn't. And memalign is obsolete according to the manpage. posix_memalign is the one to use. > > > > I think Jeff has already addressed this problem. > > > >
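
For reference, a minimal sketch of the `posix_memalign` call mentioned above. The helper names `alloc_aligned` and `is_aligned` are hypothetical and are not taken from Open MPI. Note that `posix_memalign` requires the alignment to be a power of two and a multiple of `sizeof(void *)`, which is stricter than the old `memalign` interface.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper: allocate `size` bytes aligned to `alignment`.
 * posix_memalign returns 0 on success and an error number otherwise;
 * this wrapper maps any failure to NULL.
 */
static inline void *alloc_aligned(size_t alignment, size_t size)
{
    void *ptr = NULL;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return NULL;
    }
    return ptr;
}

/* Returns 1 if ptr is aligned to `alignment` bytes, 0 otherwise. */
static inline int is_aligned(const void *ptr, size_t alignment)
{
    return ((uintptr_t)ptr % alignment) == 0;
}
```

Memory obtained this way is released with the ordinary `free()`.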

Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView

2009-10-01 Thread Åke Sandgren
On Thu, 2009-10-01 at 19:56 +0100, Ashley Pittman wrote: > Simple malloc() returns pointers that are at least eight byte aligned > anyway, I'm not sure what the reason for calling memalign() with a value > of four would be anyway. That is not necessarily true on all systems. -- Ake Sandgren,

Re: [OMPI users] memalign usage in OpenMPI and its consequences for TotalView

2009-10-01 Thread Åke Sandgren
On Thu, 2009-10-01 at 15:19 -0400, Jeff Squyres wrote: > On Oct 1, 2009, at 2:19 PM, Åke Sandgren wrote: > > > No it didn't. And memalign is obsolete according to the manpage. > > posix_memalign is the one to use. > > > > > This particular call is testing the m

Re: [OMPI users] ScaLAPACK and OpenMPI > 1.3.1

2010-01-21 Thread Åke Sandgren
On Thu, 2010-01-21 at 14:48 -0600, Champagne, Nathan J. (JSC-EV)[Jacobs Technology] wrote: > We started having a problem with OpenMPI beginning with version 1.3.2 > where the program output can be correct, junk, or NaNs (result is not > predictable). The output is the solution of a matrix equation

Re: [OMPI users] ScaLAPACK and OpenMPI > 1.3.1

2010-01-21 Thread Åke Sandgren
On Thu, 2010-01-21 at 15:40 -0600, Champagne, Nathan J. (JSC-EV)[Jacobs Technology] wrote: > >What is a correct result then? > > The correct results are output by v1.3.1. The filename in the archive is > "sol_1.3.1_96.txt". > > >How often do you get junk or NaNs compared to correct result. > We

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-01-25 Thread Åke Sandgren
1 - Do you have problems with openmpi 1.4 too? (I don't, haven't built 1.4.1 yet) 2 - There is a bug in the pathscale compiler with -fPIC and -g that generates incorrect dwarf2 data so debuggers get really confused and will have BIG problems debugging the code. I'm chasing them to get a fix... 3 -

Re: [OMPI users] Problems building Open MPI 1.4.1 with Pathscale

2010-02-09 Thread Åke Sandgren
On Tue, 2010-02-09 at 13:42 -0500, Jeff Squyres wrote: > Perhaps someone with a pathscale compiler support contract can investigate > this with them. > > Have them contact us if they want/need help understanding our atomics; we're > happy to explain, etc. (the atomics are fairly localized to a

Re: [OMPI users] MPI_Allreduce on local machine

2010-07-28 Thread Åke Sandgren
On Wed, 2010-07-28 at 11:48 -0400, Gus Correa wrote: > Hi Hugo, Jeff, list > > Hugo: I think David Zhang's suggestion was to use > MPI_REAL8 not MPI_REAL, instead of MPI_DOUBLE_PRECISION in your > MPI_Allreduce call. > > Still, to me it looks like OpenMPI is making double precision 4-byte >

Re: [OMPI users] [Fwd: MPI question/problem] including code attachments

2007-06-21 Thread Åke Sandgren
On Thu, 2007-06-21 at 13:27 -0500, Anthony Chan wrote: > It seems the hang only occurs when OpenMPI is built with > --enable-mpi-threads --enable-progress-threads. [My OpenMPI builds use > gcc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4)]. Probably > --enable-mpi-threads is the relevant option to cause

Re: [OMPI users] [Fwd: MPI question/problem] including code attachments

2007-06-22 Thread Åke Sandgren
On Thu, 2007-06-21 at 14:14 -0500, Anthony Chan wrote: > What test are you referring to? > > config.log contains the test results of the features that configure is > looking for. Failure of some thread test does not mean OpenMPI can't > support threads. In fact, I was able to run a

[OMPI users] incorrect configure code (1.2.4 and earlier)

2007-09-27 Thread Åke Sandgren
Hi! There are a couple of bugs in the configure scripts regarding threads checking. In ompi_check_pthread_pids.m4 the actual code for testing is wrong and is also missing a CFLAG save/add-THREAD_CFLAGS/restore resulting in the linking always failing for the -pthread test with gcc. config.log

Re: [OMPI users] incorrect configure code (1.2.4 and earlier)

2007-09-27 Thread Åke Sandgren
On Thu, 2007-09-27 at 09:09 -0400, Tim Prins wrote: > Hi Ake, > > Looking at the svn logs it looks like you reported the problems with > these checks quite a while ago and we fixed them (in r13773 > https://svn.open-mpi.org/trac/ompi/changeset/13773), but we never moved > them to the 1.2

Re: [OMPI users] incorrect configure code (1.2.4 and earlier)

2007-09-27 Thread Åke Sandgren
On Thu, 2007-09-27 at 14:18 -0400, Tim Prins wrote: > Åke Sandgren wrote: > > On Thu, 2007-09-27 at 09:09 -0400, Tim Prins wrote: > >> Hi Ake, > >> > >> Looking at the svn logs it looks like you reported the problems with > >> these checks q

[OMPI users] Bug in common_mx.c (1.2.5a0r16522)

2007-10-24 Thread Åke Sandgren
Hi! In common_mx.c the following looks wrong. ompi_common_mx_finalize(void) { mx_return_t mx_return; ompi_common_mx_initialize_ref_cnt--; if(ompi_common_mx_initialize == 0) { That should be if(ompi_common_mx_initialize_ref_cnt == 0) right? -- Ake Sandgren, HPC2N, Umea University,
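
A minimal sketch of the reference-counting pattern the snippet is aiming for. All names here are hypothetical and the real MX setup and teardown calls are replaced by a flag. The sketch assumes that `ompi_common_mx_initialize` in the original is the init function itself, in which case comparing it with 0 tests a function address and can never be true, so the teardown branch would never run.

```c
#include <stdbool.h>

/* Hypothetical names modelled on the snippet above; not the Open MPI code. */
static int  mx_ref_cnt = 0;         /* outstanding initialize calls */
static bool mx_library_up = false;  /* stands in for the real library state */

static inline void common_mx_initialize(void)
{
    if (mx_ref_cnt++ == 0) {
        mx_library_up = true;       /* first user brings the library up */
    }
}

static inline void common_mx_finalize(void)
{
    if (mx_ref_cnt == 0) {
        return;                     /* unbalanced call: nothing to do */
    }
    mx_ref_cnt--;
    if (mx_ref_cnt == 0) {          /* test the counter, not the function */
        mx_library_up = false;      /* last user tears the library down */
    }
}
```

With two initialize calls, the library stays up after the first finalize and goes down only after the second.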

Re: [OMPI users] Bug in common_mx.c (1.2.5a0r16522)

2007-10-24 Thread Åke Sandgren
On Wed, 2007-10-24 at 09:00 +0200, Åke Sandgren wrote: > Hi! > > In common_mx.c the following looks wrong. > ompi_common_mx_finalize(void) > { > mx_return_t mx_return; > ompi_common_mx_initialize_ref_cnt--; > if(ompi_common_mx_initialize

Re: [OMPI users] mpicc Segmentation Fault with Intel Compiler

2007-11-06 Thread Åke Sandgren
On Tue, 2007-11-06 at 10:28 +0100, Michael Schulz wrote: > Hi, > > I've the same problem described by some other users, that I can't > compile anything if I'm using the open-mpi compiled with the Intel- > Compiler. > > > ompi_info --all > Segmentation fault > > OpenSUSE 10.3 > Kernel:

Re: [OMPI users] mpicc Segmentation Fault with Intel Compiler

2007-11-07 Thread Åke Sandgren
On Tue, 2007-11-06 at 20:49 -0500, Jeff Squyres wrote: > On Nov 6, 2007, at 4:42 AM, Åke Sandgren wrote: > > > I had the same problem with pathscale. > > There is a known outstanding problem with the pathscale problem. I am > still waiting for a solution from their eng

Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote: > Hello, > > After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get > > ->mpicxx --showme > Segmentation fault > > ->ompi_info > Segmentation fault > > The 10.1.008 is the only one I know that officially supports

Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/ openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote: > On Sun, 2007-12-02 at 21:27 -0500, de Almeida, Valmor F. wrote: > > Hello, > > > > After compiling ompi-1.2.4 with the intel compiler suite 10.1.008, I get > > > > ->mpicxx --showme > >

Re: [OMPI users] Segmentation fault: intel 10.1.008 compilers w/openmpi-1.2.4

2007-12-04 Thread Åke Sandgren
On Tue, 2007-12-04 at 15:28 -0500, de Almeida, Valmor F. wrote: > > -Original Message- > > From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On > > > > On Tue, 2007-12-04 at 09:33 +0100, Åke Sandgren wrote: > > > On Sun, 2007-12-02 at

Re: [OMPI users] SCALAPACK: Segmentation Fault (11) and Signal code:Address not mapped (1)

2008-01-31 Thread Åke Sandgren
On Wed, 2008-01-30 at 10:01 -0600, Backlund, Daniel wrote: > Jeff, thank you for your suggestion, I am sure that the correct mpif.h is > being included. One > thing that I did not do in my original message was submit the job to SGE. I > did that and the > program still failed with the same

Re: [OMPI users] Help: Trouble building OpenMPI v1.2.4 with PGI v7.0-6

2008-02-01 Thread Åke Sandgren
On Thu, 2008-01-31 at 16:01 -0800, Adam Moody wrote: > Here is some more info. The build works if I do either of: > > (1) Build with PGI v7.1-3 instead of PGI v7.0-3 > (2) Or, drop the "-g" option in CXXFLAGS, i.e., > change: > CXXFLAGS="-Msignextend -g -O2" > to just: >

[OMPI users] Problems using Intel MKL with OpenMPI and Pathscale

2008-04-09 Thread Åke Sandgren
Hi! I have an annoying problem that i hope someone here has some info on. I'm trying to build a code with OpenMPI+Intel MKL+Pathscale. When using the sequential (non-threaded) MKL everything is ok, but when using the threaded MKL i get a segfault. This doesn't happen when using MVAPICH so i

Re: [OMPI users] Problems using Intel MKL with OpenMPI and Pathscale

2008-04-13 Thread Åke Sandgren
On Sun, 2008-04-13 at 08:00 -0400, Jeff Squyres wrote: > Do you get the same error if you disable the memory handling in Open > MPI? You can configure OMPI with: > > --disable-memory-manager Ah, I have apparently missed that config flag, will try on monday. -- Ake Sandgren, HPC2N, Umea

Re: [OMPI users] Problems using Intel MKL with OpenMPI and Pathscale

2008-04-14 Thread Åke Sandgren
On Sun, 2008-04-13 at 08:00 -0400, Jeff Squyres wrote: > Do you get the same error if you disable the memory handling in Open > MPI? You can configure OMPI with: > > --disable-memory-manager Doesn't help, it still compiles ptmalloc2 and trying to turn off ptmalloc2 during runtime doesn't

Re: [OMPI users] OpenMPI scaling > 512 cores

2008-06-04 Thread Åke Sandgren
On Wed, 2008-06-04 at 11:43 -0700, Scott Shaw wrote: > Hi, I was wondering if anyone had any comments with regarding to my > posting of questions. Am I off base with my questions or is this the > wrong forum for these types of questions? > > > > > Hi, I hope this is the right forum for my

[OMPI users] Problem with btl_openib_endpoint_post_rr

2008-08-26 Thread Åke Sandgren
Hi! We have a code that (at least sometimes) gets the following error message: [p-bc2909][0,1,98][btl_openib_endpoint.h:201:btl_openib_endpoint_post_rr] error posting receive errno says Numerical result out of range Any ideas as to where i should start searching for the problem? -- Ake

Re: [OMPI users] Problem with btl_openib_endpoint_post_rr

2008-08-26 Thread Åke Sandgren
On Tue, 2008-08-26 at 15:02 +0300, Pavel Shamis (Pasha) wrote: > Hi, > Can you please provide more information about your setup: > - OpenMPI version > - Runtime tuning > - Platform > - IB vendor and driver version openmpi: 1.2.6 runtime: mpirun -mca mpi_yield_when_idle 1 (PBS -l nodes=32:ppn=8)

[OMPI users] Bug in openmpi 1.3 orte/mca/plm/tm/Makefile.am

2009-02-11 Thread Åke Sandgren
Hi! orte/mca/plm/tm/Makefile.am is missing a mca_plm_tm_la_LIBADD = $(plm_tm_LIBS) like the corresponding line in orte/mca/ras/tm/Makefile.am mca_ras_tm_la_LIBADD... I think this is the cause for the "undefined symbol: tm_init" mail from 2009-02-09 20:41:45 by Brett Pemberton I have the same

Re: [OMPI users] undefined symbol: tm_init

2009-02-12 Thread Åke Sandgren
On Wed, 2009-02-11 at 17:14 -0700, Ralph Castain wrote: > Actually, this was also the subject of another email thread on the > user list earlier today. The user noted that we had lost an important > line in our Makefile.am for the tm plm module, and that this was the > root cause of the

Re: [OMPI users] openib RETRY EXCEEDED ERROR

2009-02-27 Thread Åke Sandgren
On Fri, 2009-02-27 at 09:54 -0700, Matt Hughes wrote: > 2009/2/26 Brett Pemberton : > > [[1176,1],0][btl_openib_component.c:2905:handle_wc] from tango092.vpac.org > > to: tango090 error polling LP CQ with status RETRY EXCEEDED ERROR status > > number 12 for wr_id 38996224 opcode 0

[OMPI users] valgrind complaint in openmpi 1.3 (mca_mpool_sm_alloc)

2009-03-10 Thread Åke Sandgren
Hi! Valgrind seems to think that there is a use of an uninitialized value in mca_mpool_sm_alloc, i.e. the if(mpool_sm->mem_node >= 0) { Backtracking that i found that mem_node is not set during initialization in mca_mpool_sm_init. The resources parameter is never used and the mpool_module->mem_node
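
The class of problem Valgrind flags here can be sketched as follows. This is a cut-down, hypothetical module type, not the Open MPI struct, and the choice of -1 as the "no node" default is an assumption made for illustration. The point is only that a field read in the allocation path needs an explicit value at init time, since `malloc` leaves it indeterminate.

```c
#include <stdlib.h>

/* Hypothetical, cut-down module type; the real struct has more fields. */
struct sm_module {
    int mem_node;   /* memory node to bind to; negative means "none" (assumed) */
};

/* Allocate a module and give mem_node a defined default. */
static inline struct sm_module *sm_module_init(void)
{
    struct sm_module *m = malloc(sizeof(*m));
    if (m == NULL) {
        return NULL;
    }
    m->mem_node = -1;   /* explicit default so later reads are defined */
    return m;
}

/* Mirrors the kind of check quoted above: is a node configured? */
static inline int sm_module_has_node(const struct sm_module *m)
{
    return m->mem_node >= 0;
}
```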

Re: [OMPI users] valgrind complaint in openmpi 1.3 (mca_mpool_sm_alloc)

2009-03-10 Thread Åke Sandgren
On Tue, 2009-03-10 at 09:23 -0800, Eugene Loh wrote: > Åke Sandgren wrote: > > >Hi! > > > >Valgrind seems to think that there is an use of uninitialized value in > >mca_mpool_sm_alloc, i.e. the if(mpool_sm->mem_node >= 0) { > >Backtracking that i found th

[OMPI users] Possible regression from 1.2 to 1.3 when BLACS is involved

2009-03-24 Thread Åke Sandgren
Hi! We're having problems with code that uses BLACS and openmpi 1.3.x. When compiled with memory-manager turned on (base only), code using BLACS either start leaking memory or gets into some kind of deadlock. The first code-case can be circumvented by using mpi_leave_pinned_pipeline 0, but the

Re: [OMPI users] [Open MPI Announce] Critical bug notice

2009-03-27 Thread Åke Sandgren
On Fri, 2009-03-27 at 11:34 -0700, Jeff Squyres wrote: > The Open MPI team has uncovered a serious bug in Open MPI v1.3.0 and > v1.3.1: when running on OpenFabrics-based networks, silent data > corruption is possible in some cases. There are two workarounds to > avoid the issue -- please

Re: [OMPI users] PGI Fortran pthread support

2009-04-14 Thread Åke Sandgren
On Mon, 2009-04-13 at 16:48 -0600, Orion Poplawski wrote: > Seeing the following building openmpi 1.3.1 on CentOS 5.3 with PGI pgf90 > 8.0-5 fortran compiler: > checking for PTHREAD_MUTEX_ERRORCHECK_NP... yes > checking for PTHREAD_MUTEX_ERRORCHECK... yes > checking for working POSIX threads

[OMPI users] Problems with "error polling LP CQ with status RNR"

2009-05-13 Thread Åke Sandgren
Hi! I'm having problem with getting the "error polling LP CQ with status RNR..." on an otherwise completely empty system. There are no errors visible in the error counters in any of the HCAs or switches or anywhere else. I'm running OMPI 1.3.2 built with pathscale 3.2 If i add -mca btl

Re: [OMPI users] Problems with "error polling LP CQ with status RNR"

2009-05-14 Thread Åke Sandgren
On Thu, 2009-05-14 at 09:24 -0400, Jeff Squyres wrote: > On May 13, 2009, at 4:55 PM, Åke Sandgren wrote: > > > I'm having problem with getting the "error polling LP CQ with status > > RNR..." on an otherwise completely empty system. > > There are no errors vi

Re: [OMPI users] OpenMPI 1.3.2 with PathScale 3.2

2009-05-14 Thread Åke Sandgren
On Thu, 2009-05-14 at 13:35 -0700, Joshua Bernstein wrote: > Greetings All, > > I'm trying to build OpenMPI 1.3.2 with the Pathscale compiler, version > 3.2. A > bit of the way through the build the compiler dies with what it thinks is a > bad > optimization. Has anybody else seen this,

Re: [OMPI users] Receiving MPI messages of unknown size

2009-06-04 Thread Åke Sandgren
On Thu, 2009-06-04 at 14:54 +1000, Lars Andersson wrote: > Hi Gus, > > Thanks for the suggestion. I've been thinking along those lines, but > it seems to have drawbacks. Consider the following MPI conversation: > > TimeNODE 1 NODE 2 > 0local work

[OMPI users] oob-tcp problem, unreachable in orted_comm

2009-06-06 Thread Åke Sandgren
Just got this in a user job. Any idea why it complains like this? The original error was the infamous "RETRY EXCEEDED ERROR" but instead of killing the job it showed this and never died. I have never seen this happen before. openmpi 1.3.2, built with intel 10.1 This binary is used a lot (+50% of

Re: [OMPI users] Valgrind writev() errors with 1.3.2.

2009-06-09 Thread Åke Sandgren
On Tue, 2009-06-09 at 12:01 -0600, Ralph Castain wrote: > I can't speak to all of the OMPI code, but I can certainly create a > new configure option --valgrind-friendly that would initialize the OOB > comm buffers and other RTE-related memory to eliminate such warnings. > > I would prefer to

[OMPI users] openmpi 1.8.8: Problems with MPI_Send and mmap:ed buffer

2015-10-08 Thread Åke Sandgren
Hi! The attached code shows a problem when using mmap:ed buffer with MPI_Send and vader btl. With OMPI_MCA_btl='^vader' it works in all cases i have tested. Intel MPI also have problems with this, failing to receive the complete data, getting a NULL at position 6116 when the receiver is on

Re: [OMPI users] openmpi 1.8.8: Problems with MPI_Send and mmap:ed buffer

2015-11-18 Thread Åke Sandgren
Did anyone take notice of this? I haven't seen any response. On 10/08/2015 11:14 AM, Åke Sandgren wrote: Hi! The attached code shows a problem when using mmap:ed buffer with MPI_Send and vader btl. With OMPI_MCA_btl='^vader' it works in all cases i have tested. Intel MPI also have problems

Re: [OMPI users] my_sense in ompi_osc_sm_module_t not always protected by OPAL_HAVE_POSIX_THREADS

2015-12-07 Thread Åke Sandgren
The #if OPAL_HAVE_POSIX_THREADS is still there around my_sense in osc_sm.h in 1.10.1 On 06/29/2015 05:42 PM, Åke Sandgren wrote: Yeah, i thought so. Well code reductions are good when correct :-) On 06/29/2015 05:39 PM, Nathan Hjelm wrote: Open MPI has required posix threads for some time

Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-11 Thread Åke Sandgren
Looks like you are compiling with slurm support. If so, you need to remove the "-pthread" from libslurm.la and libpmi.la On 07/11/2016 02:54 PM, Michael Di Domenico wrote: > I'm trying to get openmpi compiled using the PGI compiler. > > the configure goes through and the code starts to compile,

Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-14 Thread Åke Sandgren
No, you have to edit those two .la files by hand after installation. It's basically a libtool problem. It generates the .la file with an option that PGI doesn't understand. On 07/14/2016 04:06 PM, Michael Di Domenico wrote: > On Mon, Jul 11, 2016 at 9:52 AM, Åke Sandgren <ake

Re: [OMPI users] Blacs tester failure due to bug in datatype_unpack.c (?)

2006-09-09 Thread Åke Sandgren
On Fri, 2006-09-08 at 20:31 -0400, Jeff Squyres wrote: > Thanks Harald -- I filed bug 356 on this: > > http://svn.open-mpi.org/trac/ompi/ticket/356 > > > On 9/6/06 10:39 AM, "Harald Forbert" > wrote: > > > I think I traced a bug found by blacs

Re: [OMPI users] BLACS Tester installation errors

2006-09-21 Thread Åke Sandgren
On Thu, 2006-09-21 at 09:26 -0400, Benjamin Gaudio wrote: > I have installed OpenMPI 1.1.1 for the first time yesterday and am > now having trouble getting the BLACS Tester to install properly. > OpenMPI seemed to build without error, and BLACS also built without > any apparent errors. When I

Re: [OMPI users] BLACS & OpenMPI

2006-10-03 Thread Åke Sandgren
On Mon, 2006-10-02 at 18:39 -0400, Michael Kluskens wrote: > Having trouble getting BLACS to pass tests. > > OpenMPI, BLACS, and blacstester built just fine. Tester reports > errors for integer and real cases #1 and #51 and more for the other > types.. > >

[OMPI users] Bugs in config tests for threads (1.1.2rc3 at least)

2006-10-06 Thread Åke Sandgren
Hi! Attached is a patch that fixes some errors in the configure tests for pthreads on linux (both for gcc and pgi). -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46 90 7866134 Fax: +46 90 7866126 Mobile: +46 70 7716134 WWW:

Re: [OMPI users] Bugs in config tests for threads (1.1.2rc3 at least)

2006-10-06 Thread Åke Sandgren
On Fri, 2006-10-06 at 11:35 +0200, Åke Sandgren wrote: > Hi! > > Attached is a patch that fixes some errors in the configure tests for > pthreads on linux (both for gcc and pgi). Oops, forgot part of the patch. Here is an updated patch. diff -ru site/config/ompi_config

Re: [OMPI users] Bugs in config tests for threads (1.1.2rc3 at least)

2006-10-06 Thread Åke Sandgren
On Fri, 2006-10-06 at 10:18 -0400, Brian W. Barrett wrote: > Is there a platform on which this breaks? It seems to have worked well > for years... I'll take a closer look early next week. It should be a general problem as far as i know. It might have "worked well for years" but it has never

Re: [OMPI users] Bugs in config tests for threads (1.1.2rc3 at least)

2006-10-11 Thread Åke Sandgren
On Fri, 2006-10-06 at 10:18 -0400, Brian W. Barrett wrote: > Is there a platform on which this breaks? It seems to have worked well > for years... I'll take a closer look early next week. Have you had a chance to look at this yet? I could use a new "release" to build from since something

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-16 Thread Åke Sandgren
On Fri, 2006-10-06 at 00:04 -0400, Jeff Squyres wrote: > On 10/5/06 2:42 PM, "Michael Kluskens" wrote: > > > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3.3.5, > > Intel ifort 9.0.32 all tests with 4 processors (comments below) > > > > OpenMPi 1.1.1 patched

Re: [OMPI users] BLACS vs. OpenMPI 1.1.1 & 1.3

2006-10-16 Thread Åke Sandgren
On Mon, 2006-10-16 at 10:13 +0200, Åke Sandgren wrote: > On Fri, 2006-10-06 at 00:04 -0400, Jeff Squyres wrote: > > On 10/5/06 2:42 PM, "Michael Kluskens" <mk...@ieee.org> wrote: > > > > > System: BLACS 1.1p3 on Debian Linux 3.1r3 on dual-opteron, gcc 3

[OMPI users] Problems running Intel Mpi Benchmark (formerly PMB) with ompi 1.1.2 and 1.2b1

2006-11-16 Thread Åke Sandgren
Hi! I'm having problems running the Allgather test of the IMB 3.0. System: Ubuntu Dapper, dual Amd Opteron, Myricom MX 1.1.5 OMPI version: 1.1.2 and 1.2b1 buildflags -O0 -g started with mpirun -mca mpi_yield_when_idle 1 -mca mpi_keep_peer_hostnames 0 (The problem also exists when

Re: [OMPI users] Fortran90 interfaces--problem?

2007-03-06 Thread Åke Sandgren
On Tue, 2007-03-06 at 09:51 -0500, Jeff Squyres wrote: > On Mar 5, 2007, at 9:50 AM, Michael wrote: > > > I have discovered a problem with the Fortran90 interfaces for all > > types of communication when one uses derived datatypes (I'm currently > > using openmpi-1.3a1r13918 [for testing] and

[hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Åke Sandgren
Hi! On a 48 core AMD bulldozer node with Ubuntu kernel 3.13.0-57-generic i get this with hwloc 1.11.0 * hwloc 1.11.0 has encountered what looks like an error from the operating system. * * L3 (cpuset 0x03f0)

Re: [hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Åke Sandgren
Yes the BIOS is the same. Anything else i should check? On 07/09/2015 04:10 PM, Brice Goglin wrote: Hello The 3.13 kernel reports invalid L3 cache information in sysfs. 0x3f0 is not possible on this processor, it should be either 0x3f or 0xfc (there's exactly one L3 per NUMA node, with the

Re: [hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-09 Thread Åke Sandgren
Attached tar file with data from both systems. See Readme file for kernel versions On 07/09/2015 07:54 PM, Brice Goglin wrote: Can you send the output of this command on both nodes? cat /sys/devices/system/cpu/cpu{?,??}/cache/index3/shared_cpu_map | uniq -c And send the XML output of lstopo

Re: [hwloc-users] hwloc 1.11.0 seems to have problem with 3.13 kernel on AMD bulldozer

2015-07-24 Thread Åke Sandgren
No i haven't yet. Went on summer vacation before i had time. On 07/24/2015 12:38 AM, Bill Broadley wrote: I have the same problem with ubuntu 14.04.2 (fully patched) using the 3.13.0-58 and hwloc-1.11.0: * hwloc 1.11.0 has encountered what looks like an error from the operating system. * * L3

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-23 Thread Åke Sandgren
E5-2697A which version? v4? On 03/23/2017 09:53 AM, Götz Waschk wrote: > Hi Åke, > > I have E5-2697A CPUs and Mellanox ConnectX-3 FDR Infiniband. I'm using > EL7.3 as the operating system. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se Phone: +46

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-23 Thread Åke Sandgren
Since i'm seeing similar Bus errors from both openmpi and other places on our system I'm wondering, what hardware do you have? CPU:s, interconnect etc. On 03/23/2017 08:45 AM, Götz Waschk wrote: > Hi Howard, > > I have attached my config.log file for version 2.1.0. I have based it > on the

Re: [OMPI users] Openmpi 1.10.4 crashes with 1024 processes

2017-03-23 Thread Åke Sandgren
Ok, we have E5-2690v4's and Connect-IB. On 03/23/2017 10:11 AM, Götz Waschk wrote: > On Thu, Mar 23, 2017 at 9:59 AM, Åke Sandgren <ake.sandg...@hpc2n.umu.se> > wrote: >> E5-2697A which version? v4? > Hi, yes, that one: > Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.

Re: [OMPI users] Compiler error with PGI: pgcc-Error-Unknown switch: -pthread

2017-04-03 Thread Åke Sandgren
This usually comes from slurm, so we always do perl -pi -e 's/-pthread//' /lap/slurm/${version}/lib/libpmi.la /lap/slurm/${version}/lib/libslurm.la when installing a new slurm version. Thus no need for a fakepg wrapper. On 04/03/2017 04:20 PM, Prentice Bisbal wrote: > Greeting Open MPI users!

Re: [OMPI users] Compiler error with PGI: pgcc-Error-Unknown switch: -pthread

2017-04-03 Thread Åke Sandgren
: > This is the second suggestion to rebuild Slurm > > The other from Åke Sandgren, who recommended this: > >> This usually comes from slurm, so we always do >> >> perl -pi -e 's/-pthread//' /lap/slurm/${version}/lib/libpmi.la >> /lap/slurm/${version}/lib/libslurm.la >> >

[OMPI users] Lustre support uses deprecated include.

2017-03-13 Thread Åke Sandgren
Hi! The lustre support in ompi/mca/fs/lustre/fs_lustre.h is using a deprecated include. #include is deprecated in newer lustre versions (at least from 2.8) and #include should be used instead. -- Ake Sandgren, HPC2N, Umea University, S-90187 Umea, Sweden Internet: a...@hpc2n.umu.se

Re: [OMPI users] [Open MPI Announce] Open MPI v2.1.2 released

2017-09-20 Thread Åke Sandgren
Hi! The OB1 PML problem, how long has it been around and, apart from the hang, how can i check if it is likely that i get hit by it? And are there any specific situations when it does appear? Will try 2.1.2 (and 3.0.0) out on our problem case soon but it takes a couple of days for the hang we're

[OMPI users] Problems building OpenMPI 2.1.1 on Intel KNL

2017-11-20 Thread Åke Sandgren
Hi! When the xppsl-libmemkind-dev package version 1.5.3 is installed, building OpenMPI fails. opal/mca/mpool/memkind uses the macro MEMKIND_NUM_BASE_KIND which has been moved to memkind/internal/memkind_private.h Current master is also using that so I think that will also fail. Is there anyone

Re: [OMPI users] Problems building OpenMPI 2.1.1 on Intel KNL

2017-11-20 Thread Åke Sandgren
nks very much for reporting this, > > Howard > > > 2017-11-20 3:26 GMT-07:00 Åke Sandgren <ake.sandg...@hpc2n.umu.se > <mailto:ake.sandg...@hpc2n.umu.se>>: > > Hi! > > When the xppsl-libmemkind-dev package version 1.5.3 is installed >

Re: [hwloc-users] call for testing on KNL

2018-02-09 Thread Åke Sandgren
Any specific configure flags you'd want me to use? And does node config matter, i.e., hemi/snc2 etc? On 02/09/2018 05:56 PM, Brice Goglin wrote: > Hello > > As you may know, hwloc only discovers KNL MCDRAM Cache details if > hwloc-dump-hwdata ran as root earlier. There's an issue with that tool

Re: [OMPI users] Seg fault in opal_progress

2018-07-12 Thread Åke Sandgren
Are you running with ulimit -s unlimited? If not, that looks like an out-of-stack crash, which VASP frequently causes. If you are running with unlimited stack, I could perhaps run that input case on our VASP build. (Which has a bunch of fixes for bad stack usage among other things) On 07/11/2018
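
The same question can be asked from inside a process. The sketch below is not part of the thread and the function name is hypothetical: it reads the soft stack limit with `getrlimit`, which is the value `ulimit -s` reports, and says whether it is unlimited.

```c
#include <sys/resource.h>

/*
 * Returns 1 if the soft stack limit is unlimited, 0 if it is finite,
 * and -1 if the limit could not be read.
 */
static inline int stack_is_unlimited(void)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_STACK, &lim) != 0) {
        return -1;
    }
    return lim.rlim_cur == RLIM_INFINITY ? 1 : 0;
}
```

Calling this early in a rank's startup would show whether the limit set in the job script actually reached the MPI processes, which is not guaranteed on every launcher.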