Re: [OMPI users] building openmpi 1.8.1 with intel 14.0.1

2014-08-21 Thread Gus Correa
Hi Peter If I remember right from my compilation of OMPI on a Mac years ago, you need to have X-Code installed, in case you don't. If vampir-trace is the only problem, you can disable it when you configure OMPI (--disable-vt). My two cents, Gus Correa On 08/21/2014 03:35 PM, Bosler, Peter

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
Was the error that you listed the *first* error? Apparently various object files are missing from the ../../Modules/ directory, and were not compiled, suggesting something is amiss even before the compilation of the executable (epw.x). On 09/03/2014 05:20 PM, Elio Physics wrote: Dear all, I

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
they have a mailing list or bulletin board where you could get specific help for their software? (Either on EPW or on Quantum ESPRESSO (which seems to be required): http://www.quantum-espresso.org/) That would probably be the right forum to ask your questions. My two cents, Gus Correa On 09/03

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
he EPW web site? http://epw.org.uk/Main/DownloadAndInstall ** I hope this helps, Gus Correa On 09/03/2014 06:48 PM, Elio Physics wrote: I have already done all of the steps you mentioned. I have installed the older version of quantum espresso, configured it and followed all the steps on the E

Re: [OMPI users] compilation problem with ifort

2014-09-03 Thread Gus Correa
d top EPW directory (which per the recipe is right below the top QE) plays a role. Anyway, phonons are not my playground, just trying to help two-cent-wise, although this is not really an MPI or OpenMPI issue, more of a Makefile/configure issue specific to QE and EPW. Thanks, Gus Correa On 09/03/2014 07:

Re: [OMPI users] compilation problem with ifort

2014-09-04 Thread Gus Correa
that it needs. And this is *exactly what the error message in your first email showed*, a bunch of object files that were not found. *** Sorry, but I cannot do any better than this. I hope this helps, Gus Correa On 09/03/2014 08:59 PM, Elio Physics wrote: Ray and Gus, Thanks a lot for your help. I fo

Re: [OMPI users] compilation problem with ifort

2014-09-04 Thread Gus Correa
, lapack, fft) and to build them. At least that is what seems to have happened on my computer. So, I don't think you need any other libraries. Good luck, Gus Correa On 09/04/2014 04:17 PM, Elio Physics wrote: Dear Gus, Firstly I really need to thank you for the effort you are doing to help me

Re: [OMPI users] About debugging and asynchronous communication

2014-09-18 Thread Gus Correa
There is no guarantee that the messages will be received in the same order that they were sent. Use tags or another mechanism to match the messages on send and recv ends. On 09/18/2014 10:42 AM, XingFENG wrote: I have found some thing strange. Basically, in my codes, processes send and receive

Re: [OMPI users] General question about running single-node jobs.

2014-10-02 Thread Gus Correa
and $OMPI/lib to LD_LIBRARY_PATH and are these environment variables propagated to the job execution nodes (especially those that are failing)? Anyway, just a bunch of guesses ... Gus Correa * QCSCRATCH Defines the directory in which Q-Chem will store

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-15 Thread Gus Correa
, Gus Correa On 10/15/2014 11:12 AM, Jeff Squyres (jsquyres) wrote: We talked off-list -- fixed this on master and just filed https://github.com/open-mpi/ompi-release/pull/33 to get this into the v1.8 branch. On Oct 14, 2014, at 7:39 PM, Ralph Castain <r...@open-mpi.org> wrote: On

Re: [OMPI users] Hybrid OpenMPI/OpenMP leading to deadlocks?

2014-10-16 Thread Gus Correa
+ short job queue time policy is very common out there. Here most problems with long runs (we have some non-restartable serial code die-hards), happen due to NFS issues (busy, slow response, etc), and code with poorly designed IO. My two cents, Gus Correa On 10/16/2014 10:16 AM, McGrattan, Kevin B. Dr

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-16 Thread Gus Correa
mpiexec options: -bind-to-core, rmaps_base_schedule_policy, orte_process_binding, etc. Thank you, Gus Correa On 10/15/2014 11:10 PM, Ralph Castain wrote: On Oct 15, 2014, at 11:46 AM, Gus Correa <g...@ldeo.columbia.edu <mailto:g...@ldeo.columbia.edu>> wrote: Thank you Ralph and Jeff fo

[OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
vidence I have that knem is active in 1.6.5 but not in 1.8.3 comes only from the statistics in /dev/knem. *** Thank you, Gus Correa *** PS - As an aside, I also have some questions on the knem setup, which I mostly copied from the knem web site (hopefully Brice Goglin is listening ...): -

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
? I am on CentOS 6.5, stock kernel 2.6.32, no 3.1, no CMA Linux, so I believe I need knem for now. I tried '-mca btl_base_verbose 30' but no knem information came out. Many thanks, Gus Correa On 10/16/2014 04:40 PM, Aurélien Bouteiller wrote: Are you sure you are not using the vader BTL? Setting

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
openib, etc)? How does it affect knem? What are vader's pros/cons w.r.t. using the other btls? In which conditions is it good or bad to use it vs. the other btls? What do I gain/lose if I do "btl = sm,self,openib" (which presumably will knock off tcp and "vader"), or maybe "btl=^tcp,^v

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
their MPI applications running in production mode, hopefully with Open MPI 1.8, can somebody explain more clearly what "vader" is about? Thank you, Gus Correa On Thu, Oct 16, 2014 at 01:49:09PM -0700, Ralph Castain wrote: FWIW: vader is the default in 1.8 On Oct 16, 2014, at 1:40 PM

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
me any incentive to upgrade our production codes to OMPI 1.8. Will this be fixed in the next Open MPI 1.8 release? Thank you, Gus Correa PS - Many thanks to Aurelien Bouteiller for pointing out the existence of the vader btl. Without his tip I would still be in the dark side. On 10/16/2014 05:46

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
On 10/16/2014 05:38 PM, Nathan Hjelm wrote: On Thu, Oct 16, 2014 at 05:27:54PM -0400, Gus Correa wrote: Thank you, Aurelien! Aha, "vader btl", that is new to me! I thought Vader was that man dressed in black in Star Wars, Obi-Wan Kenobi's nemesis. That was a while ago, my kids wer

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
. On Oct 16, 2014, at 4:06 PM, Gus Correa <g...@ldeo.columbia.edu> wrote: Hi All Back to the original issue of knem in Open MPI 1.8.3. It really seems to be broken. I launched the Intel MPI benchmarks (IMB) job both with '-mca btl ^vader,tcp', and with '-mca btl sm,self,openib'. Both syntaxe

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-16 Thread Gus Correa
th-tm=/opt/torque/4.2.5/gnu-4.4.7 \ --with-verbs=/usr \ --with-knem=/opt/knem-1.1.1 \ 2>&1 | tee configure_${build_id}.log Many thanks, Gus On Oct 16, 2014, at 4:24 PM, Gus Correa <g...@ldeo.columbia.edu> wrote: On 10/16/2014 05:38 PM, Nathan Hjelm wrote: On Thu, Oct 16, 2014 a

Re: [OMPI users] Open MPI 1.8.3 openmpi-mca-params.conf: old and new parameters

2014-10-17 Thread Gus Correa
rocess placement conceptual model, along with its syntax and examples. Thank you, Gus Correa On 10/17/2014 12:10 AM, Ralph Castain wrote: I know this commit could be a little hard to parse, but I have updated the mpirun man page on the trunk and will port the change over to the 1.8 serie

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-17 Thread Gus Correa
omatically) * -mca btl openib,self (and vader will come along automatically) * -mca btl openib,self,vader (because vader is default only for 1-node jobs) * something else (or several alternatives) Whatever happened to the "self" btl in this new context? Gone? Still there? Many thanks, Gus Corr

Re: [OMPI users] New ib locked pages behavior?

2014-10-21 Thread Gus Correa
Hi Bill Maybe you're missing these settings in /etc/modprobe.d/mlx4_core.conf ? http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem I hope this helps, Gus Correa On 10/21/2014 06:36 PM, Bill Broadley wrote: I've setup several clusters over the years with OpenMPI. I often get

Re: [OMPI users] New ib locked pages behavior?

2014-10-21 Thread Gus Correa
this (but apparently no solution): http://www.open-mpi.org/community/lists/users/2013/02/21430.php Maybe Mellanox has more information about this? Gus Correa On 10/21/2014 08:15 PM, Bill Broadley wrote: On 10/21/2014 04:18 PM, Gus Correa wrote: Hi Bill Maybe you're missing these settings in /etc

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
time with the btl_vader_single_copy_mechanism parameter? Or must OMPI be configured with only one memory copy mechanism? Many thanks, Gus Correa On 10/30/2014 05:44 PM, Nathan Hjelm wrote: I want to close the loop on this issue. 1.8.5 will address it in several ways: - knem support in btl/sm has been fixed. A sanity c

Re: [OMPI users] knem in Open MPI 1.8.3

2014-10-30 Thread Gus Correa
questions below (specially the 12 vader parameters). Many thanks, Gus Correa On Oct 30, 2014, at 4:24 PM, Gus Correa <g...@ldeo.columbia.edu> wrote: Hi Nathan Thank you very much for addressing this problem. I read your notes on Jeff's blog about vader, and that clarified many

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-12 Thread Gus Correa
e sensible. :) Cheers, Gus Correa It tries so independent from the internal or external name of the headnode given in the machinefile - I hit ^C then. I attached the output of Open MPI 1.8.1 for this setup too. -- Reuti ___ users mailing list us...@op

Re: [OMPI users] How OMPI picks ethernet interfaces

2014-11-13 Thread Gus Correa
(... well, I don't have voting rights on that, but I'll vote anyway ...) is to keep the current approach. It is wise and flexible, and easy to adjust and configure to specific machines with their own oddities, via MCA parameters, as I tried to explain in previous postings. My two cents, Gus Corre

Re: [OMPI users] Open mpi based program runs as root and gives SIGSEGV under unprivileged user

2014-12-10 Thread Gus Correa
number of open files is yet another hurdle. And if you're using InfiniBand, the max locked memory size should be unlimited. Check /etc/security/limits.conf and "ulimit -a". I hope this helps, Gus Correa On 12/10/2014 08:28 AM, Gilles Gouaillardet wrote: Luca, your email mentions ope
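The limits mentioned in this message can be inspected with a short shell sketch. It assumes a bash-like shell; check_memlock is a hypothetical helper name used only for illustration, not an Open MPI or system tool.

```shell
#!/bin/bash
# Sketch: report the limits discussed above for the current shell.
# check_memlock is a hypothetical helper, not part of Open MPI.
check_memlock() {
    # $1: value as printed by "ulimit -l" (kbytes or "unlimited")
    if [ "$1" = "unlimited" ]; then
        echo "ok"
    else
        echo "low"
    fi
}

memlock=$(ulimit -l)
echo "max locked memory: $memlock ($(check_memlock "$memlock"))"
echo "max open files:    $(ulimit -n)"
```

Limits seen in an interactive login shell may differ from those a batch job or remote daemon inherits, so it is worth running the same check from inside a job as well.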

Re: [OMPI users] Icreasing OFED registerable memory

2014-12-30 Thread Gus Correa
=openfabrics#ib-locked-pages-more http://www.open-mpi.org/faq/?category=openfabrics#ib-low-reg-mem *** Having said that, a question remains unanswered: Why is Infiniband such a nightmare? *** I hope this helps, Gus Correa On 12/30/2014 09:16 AM, Waleed Lotfy wrote: Thank Devendar for your response. I'll

Re: [OMPI users] Icreasing OFED registerable memory

2015-01-06 Thread Gus Correa
ent you before for more details. I hope this helps, Gus Correa On 01/06/2015 01:37 PM, Deva wrote: Hi Waleed, -- Memlock limit: 65536 -- such a low limit should be due to per-user lock memory limit . Can you make sure it is set to "unlimited" on all nodes ( "

Re: [OMPI users] libpsm_infinipath issues?

2015-01-08 Thread Gus Correa
Hi Michael, Andrew, list knem doesn't work in OMPI 1.8.3. See this thread: http://www.open-mpi.org/community/lists/users/2014/10/25511.php A fix was promised for OMPI 1.8.4: http://www.open-mpi.org/software/ompi/v1.8/ Have you tried it? I hope this helps, Gus Correa On 01/08/2015 04:36 PM

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-08 Thread Gus Correa
suggested a while back. I hope this helps, Gus Correa Thanks again Diego On 8 January 2015 at 23:24, George Bosilca <bosi...@icl.utk.edu <mailto:bosi...@icl.utk.edu>> wrote: Diego, Please find below the corrected example. There were several issues but the mos

Re: [OMPI users] MPI_Type_Create_Struct + MPI_TYPE_CREATE_RESIZED

2015-01-08 Thread Gus Correa
Hi Diego *EITHER* declare your QQ and PR (?) structure components as DOUBLE PRECISION *OR* keep them REAL(dp) but *fix* your "dp" definition, as George Bosilca suggested. Gus Correa On 01/08/2015 06:36 PM, Diego Avesani wrote: Dear Gus, Dear All, so are you suggesting to

Re: [OMPI users] MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-13 Thread Gus Correa
(as you did in your previous code, with all the surprises regarding alignment, etc), not array sections. Also, MPI type vector should be more easy going (and probably more efficient) than MPI type struct, with less memory alignment problems. I hope this helps, Gus Correa PS - These books have

Re: [OMPI users] MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-15 Thread Gus Correa
I/content6.html Gus Correa On 01/15/2015 06:53 PM, Diego Avesani wrote: dear George, dear Gus, dear all, Could you please tell me where I can find a good example? I am sorry but I can not understand the 3D array. Really Thanks Diego On 15 January 2015 at 20:13, George Bosilca <bosi..

[OMPI users] How to handle strides in MPI_Create_type_subarray - Re: MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-16 Thread Gus Correa
ere any simple example of how to achieve stride effect with MPI_Create_type_subarray in a multi-dimensional array? BTW, when are you gentlemen going to write an updated version of the "MPI - The Complete Reference"? :) Thank you, Gus Correa (Hijacking Diego Avesani's thread, apologies t

Re: [OMPI users] How to handle strides in MPI_Create_type_subarray - Re: MPI_type_create_struct + MPI_Type_vector + MPI_Type_contiguous

2015-01-16 Thread Gus Correa
Hi George Many thanks for your answer and interest in my questions. ... so ... more questions inline ... On 01/16/2015 03:41 PM, George Bosilca wrote: Gus, Please see my answers inline. On Jan 16, 2015, at 14:24 , Gus Correa <g...@ldeo.columbia.edu> wrote: Hi George It is still not

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
cure about this, not making clear the difference between /export/apps and /share/apps. Issuing the Rocks commands: "tentakel 'ls -d /export/apps'" "tentakel 'ls -d /share/apps'" may show something useful. I hope this helps, Gus Correa On 02/27/2015 11:47 AM, Syed Ahsan Ali wrote: I am try

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
Hi Syed Ahsan Ali To avoid any leftovers and further confusion, I suggest that you completely delete the old installation directory. Then start fresh from the configure step with the prefix set via --prefix=/share/apps/openmpi-1.8.4_gcc-4.9.2 I hope this helps, Gus Correa On 02/27/2015 12

Re: [OMPI users] mpirun fails across cluster

2015-02-27 Thread Gus Correa
that is a common cause of trouble. OpenMPI needs PATH and LD_LIBRARY_PATH at runtime also. I hope this helps, Gus Correa On Fri, Feb 27, 2015 at 10:44 PM, Syed Ahsan Ali <ahsansha...@gmail.com> wrote: Dear Gus Thanks once again for suggestion. Yes I did that before installation to new path

Re: [OMPI users] shared memory performance

2015-07-22 Thread Gus Correa
n this case, I guess the mpirun options would be: mpirun --machinefile machine_mpi_bug.txt --mca btl self,vader,tcp I am not even sure if with "vader" the "self" btl is needed, as it was the case with "sm". An OMPI developer could jump into this conversation and

Re: [OMPI users] pbs vs openmpi node allocation

2015-08-03 Thread Gus Correa
node file would be $PBS_NODEFILE. [You don't need to do it if Open MPI was built with Torque support.] I hope this helps. Gus Correa Thank you. -- Abhisek Mondal /Research Fellow / /Structural Biology and Bioinformatics / /Indian Institute of Chemical Biology/ /Kolkata 700032 / /INDIA / ___

Re: [OMPI users] How to reduce Isend & Irecv bandwidth?

2013-05-01 Thread Gus Correa
Maybe start the data exchange by sending a (presumably short) list/array/index-function of the dirty/not-dirty blocks status (say, 0=not-dirty,1=dirty), then putting if conditionals before the Isend/Irecv so that only dirty blocks are exchanged? I hope this helps, Gus Correa On 05/01/2013 01

Re: [OMPI users] How to reduce Isend & Irecv bandwidth?

2013-05-01 Thread Gus Correa
Hi Thomas/Jacky Maybe using MPI_Probe (and maybe also MPI_Cancel) to probe the message size, and receive only those with size>0? Anyway, I'm just code-guessing. I hope it helps, Gus Correa On 05/01/2013 05:14 PM, Thomas Watson wrote: Hi Gus, Thanks for your suggestion! The prob

Re: [OMPI users] mpirun error

2013-05-09 Thread Gus Correa
FAQ: http://www.open-mpi.org/faq/?category=running#run-prereqs I hope this helps, Gus Correa On 05/09/2013 12:15 PM, Pepito Perez wrote: Pradeep Jha<pradeep ccs.engg.nagoya-u.ac.jp> writes: Hello, When I am trying to run a perfectly running parallel code on a new Linux machine,

Re: [OMPI users] distributed file system

2013-05-16 Thread Gus Correa
use it. I hope it helps, Gus Correa On 05/16/2013 11:38 AM, Jeff Squyres (jsquyres) wrote: See http://www.open-mpi.org/faq/?category=building#where-to-install On May 16, 2013, at 11:30 AM, Ralph Castain<r...@open-mpi.org> wrote: no, as long as ompi is installed in same location

Re: [OMPI users] plm:tm: failed to spawn daemon, error code = 17000 Error when running jobs on 600 or more nodes

2013-05-16 Thread Gus Correa
g /tmp): http://www.supercluster.org/pipermail/torqueusers/2011-March/012453.html http://www.open-mpi.org/faq/?category=all#poor-sm-btl-performance http://www.open-mpi.org/faq/?category=all#network-vs-local I hope this helps, Gus Correa On 05/16/2013 12:21 PM, Ralph Castain wrote: Check the torque error co

Re: [OMPI users] Configuration with Intel C++ Composer 12.0.2 on OSX 10.7.5

2013-05-16 Thread Gus Correa
se/version on all of them, not mix 13.X.Y with 12.W.Z. However, the error message you sent seems to have happened very early during the configure step, and the compiler version mix is probably not the reason. I hope this helps, Gus Correa On 05/16/2013 02:16 PM, Geraldine Hochman-Klarenberg wrote

Re: [OMPI users] Error when build openmpi on Mac Pro

2013-06-13 Thread Gus Correa
Hi all On 06/13/2013 05:02 PM, Jeff Squyres (jsquyres) wrote: It looks like you might have a busted C++ compiler. Why not use CXX=g++? Would this be the problem? >> $ export CXX=gcc Gus Correa Can you compile any non-MPI C++ programs that use the STL? On Jun 12, 2013, at 6:58 AM

Re: [OMPI users] mpif90 error with different openmpi editions

2013-06-18 Thread Gus Correa
On 06/18/2013 12:28 AM, xu wrote: my code get this error under openmpi 1.6.4 mpif90 -O2 -m64 -fbounds-check -ffree-line-length-0 -c -o 2dem_mpi.o 2dem_mpi.f90 Fatal Error: Reading module mpi at line 110 column 30: Expected string If I use mpif90: Open MPI 1.3.3 It compiles ok. What the problem

Re: [OMPI users] Trouble with Sending Multiple messages to the Same Machine

2013-06-18 Thread Gus Correa
On 06/18/2013 01:14 PM, Claire Williams wrote: Hi guys ☺! I'm working with a simple "Hello, World" MPI program that has one master and is sending one message to each worker, receives a message back from each of the workers, and re-sends a new message. This unfortunately is not working :(. When

Re: [OMPI users] error with openmpi on snow leopard

2013-06-19 Thread Gus Correa
BRARY_PATH Actually, if I remember right, on a Mac, you need to set DYLD_LIBRARY_PATH, instead of LD_LIBRARY_PATH, right? And you need (or already have) X-code installed, right? Or am I missing something here? I hope this helps, Gus Correa On 06/19/2013 03:37 PM, Ralph Castain wrote: y

Re: [OMPI users] mpif90 error with different openmpi editions

2013-06-26 Thread Gus Correa
rs can help, but I'd suggest that besides the information above, you send in your configure command line for each OMPI version. It is hard to guess what is the problem from the tidbits of information that you sent. I hope this helps, Gus Correa On 06/26/2013 04:22 PM, xu wrote: No. I didn

Re: [OMPI users] errors testing openmpi1.6.5 ----

2013-07-24 Thread Gus Correa
Hi Yuping Did you set your PATH and LD_LIBRARY_PATH? Please, see these FAQ: http://www.open-mpi.org/faq/?category=running#run-prereqs http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path I hope this helps, Gus Correa On 07/24/2013 08:09 PM, Yuping Sun wrote: Dear All: I
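A minimal sketch of the PATH and LD_LIBRARY_PATH setup asked about here, assuming a bash-like shell. The OMPI variable and the /opt/openmpi default are hypothetical placeholders; use whatever --prefix the installation was actually configured with.

```shell
# OMPI is a hypothetical install prefix; adjust it to your own --prefix.
OMPI=${OMPI:-/opt/openmpi}

# Put the Open MPI wrappers and libraries ahead of any other installation.
export PATH="$OMPI/bin:$PATH"
export LD_LIBRARY_PATH="$OMPI/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show which mpirun now comes first in the PATH, if any.
command -v mpirun || echo "mpirun not found in PATH"
```

These lines would normally go in a shell startup file that is also read on the compute nodes, since the settings are needed at run time and not only when compiling.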

Re: [OMPI users] errors testing openmpi1.6.5 ----

2013-07-25 Thread Gus Correa
the second case, you need to make sure your home directory is shared across the machines, or if it is not shared, you need to make the modifications above in each machine. I hope this helps. Gus Correa On 07/25/2013 03:11 PM, Yuping Sun wrote: Hi Gus: I went back and set the PATH and LD_LI

Re: [OMPI users] errors testing openmpi1.6.5 ----

2013-07-25 Thread Gus Correa
our environment. They will haunt you sooner or later. I hope this helps, Gus Correa On 07/25/2013 05:58 PM, Yuping Sun wrote: Hi Gus: Thank you. I did these as I use .bash_profile to add the path and LD, but it did not help. Thank you. Is there anything else you can think of? Best regards, Yuping -O

Re: [OMPI users] errors testing openmpi1.6.5 ----

2013-07-25 Thread Gus Correa
distributions. So, I am afraid I can't help you with this. Gus Correa On 07/25/2013 08:15 PM, Yuping Sun wrote: Hi Gus: I use yum install to install openmpi 1.4.7 and it went through. Then I tested a small code, hello world, and it worked. Now I have two questions for you: 1. do we have

Re: [OMPI users] openmpi+infiniband

2013-07-30 Thread Gus Correa
://www.open-mpi.org/faq/?category=openfabrics#ib-btl I hope this helps, Gus Correa On 07/30/2013 09:01 AM, christian schmitt wrote: Hello, I'm trying to get openmpi (1.6.5) running with/over infiniband. My system is a CentOS 6.3. I have installed the Mellanox OFED driver (2.0) and everything seems

[OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled

2013-08-12 Thread Gus Correa
he daemon, but nothing changed, the job continues to fail with the same errors. ** Any hints of what is going on, how to diagnose it, and how to fix it? Any gentler way than rebooting everything and power cycling the IB switch? (And would this brute force method work, at least?) Thank you, Gus Correa

Re: [OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled

2013-08-12 Thread Gus Correa
it! :( ] Thank you, Gus Correa On 08/12/2013 03:01 PM, Ralph Castain wrote: Check ompi_info - was it built with openib support? Then check that the mca_btl_openib library is present in the prefix/lib/openmpi directory Sounds like it isn't finding the openib plugin On Aug 12, 2013, at 11:57 AM
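The ompi_info check suggested in the quoted reply could be scripted roughly as follows. This is only a sketch: has_component is a hypothetical helper, and it assumes ompi_info lists each component on a line of the form "MCA btl: openib (...)".

```shell
# has_component is a hypothetical helper: it reads ompi_info output on
# stdin and looks for a line such as "MCA btl: openib (MCA v2.0, ...)".
has_component() {
    grep -q "MCA $1: $2 "
}

# Only query ompi_info when it is actually installed and in the PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_component btl openib; then
        echo "openib btl is present"
    else
        echo "openib btl not found in this build"
    fi
fi
```

A positive answer only shows that the plugin was built; it does not prove that the plugin can be loaded and used on every node.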

Re: [OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled

2013-08-12 Thread Gus Correa
"large number" that would be OK for openib? I thought a 12GB memlock limit would be OK, but maybe it is not. The nodes have 64GB RAM. Thank you, Gus Correa *\ [node15.cluster][[8097,1],0][../../../../../ompi/mca/btl/openib/btl_openib_component

Re: [OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled

2013-08-12 Thread Gus Correa
on Infiniband, now complaining about the IB driver, but also that it cannot allocate memory. Weird because when I ssh to the node and do ibstat it responds (see below, please). I actually ran ibstat everywhere, and all IB host adapters seem OK. Thank you, Gus Correa *** the job

Re: [OMPI users] Error - BTLs attempted: self sm - on a cluster with IB and openib btl enabled

2013-08-13 Thread Gus Correa
try a less_than_unlimited memlock value later (tests are not easy on production machines). In any case, it is kind of hard to come up with a sensible number (RAM/number_of_cores? less? more? a magic value?). Any suggestions are welcome, of course. Thank you, Gus Correa On 08/12/2013 07:43 PM

Re: [OMPI users] Finalize() does not return

2013-08-13 Thread Gus Correa
the old-fashioned printf commands right before MPI_Finalize? Something like this: printf("Rank %d before MPI_Finalize\n",myrank); fflush(stdout); MPI_Finalize(); or Fortran: print *, 'Rank ', myrank, ' before MPI_Finalize' call flush(6) call MPI_Finalize(ierror) My two cents, Gus Cor

Re: [OMPI users] Changing directory from /tmp

2013-09-04 Thread Gus Correa
Hi Lee-Ping See this FAQ: http://www.open-mpi.org/faq/?category=sm#where-sm-file OMPI FAQ is your friend! And so is ompi_info. Gus Correa

Re: [OMPI users] OPEN MPI error

2013-09-18 Thread Gus Correa
". Simpler may be better: Have you tried to use just "--mca btl openib,sm,self" ? This way OMPI will find the Infiniband interface(s) for you. Justa guessed guess, Gus Correa On 09/18/2013 01:49 PM, justa tester tester wrote: I'm new to OPEN MPI and have a question in regards to the er

Re: [OMPI users] intermittent node file error running with torque/maui integration

2013-09-20 Thread Gus Correa
", just to make sure the file it is available and filled with the node list, before mpiexec takes over? My two cents, Gus Correa On 09/20/2013 09:55 AM, Noam Bernstein wrote: Hi - we've been using openmpi for a while, but only for the last few months with torque/maui. Intermittently (mayb

Re: [OMPI users] intermittent node file error running with torque/maui integration

2013-09-20 Thread Gus Correa
On 09/20/2013 12:48 PM, Noam Bernstein wrote: On Sep 20, 2013, at 11:52 AM, Gus Correa<g...@ldeo.columbia.edu> wrote: Hi Noam Could it be that Torque, or probably more likely NFS, is too slow to create/make available the PBS_NODEFILE? What if you insert a "sleep 2", or
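Instead of a fixed sleep, the job script could poll until the node file exists and is non-empty. The sketch below assumes a bash-like shell; wait_for_nodefile is a hypothetical helper and the default of 10 tries is arbitrary.

```shell
# wait_for_nodefile is a hypothetical helper: it polls once per second
# until the file exists and is non-empty, or the tries run out.
wait_for_nodefile() {
    file=$1
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -s "$file" ] && return 0   # file exists and has content
        sleep 1
        tries=$((tries - 1))
    done
    return 1                         # gave up
}

# In a Torque job script one might then write:
#   wait_for_nodefile "$PBS_NODEFILE" && mpiexec ./my_program
```

If the guess about a slow Torque or NFS is right, the loop returns as soon as the file shows up; if the file never appears, the job fails cleanly instead of starting mpiexec without a node list.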

Re: [OMPI users] non-functional mpif90 compiler

2013-09-30 Thread Gus Correa
extra" OMPI installations, which can be ahead of your path. "which mpif90" will tell you what you are actually using. For what it is worth, disabling shared libraries at configure time may be challenging. I hope this helps, Gus Correa On 09/30/2013 11:58 AM, Damiano Natali wrote: De

Re: [OMPI users] non-functional mpif90 compiler

2013-09-30 Thread Gus Correa
they are passed to OpenFOAM (which I don't know and don't use, sorry). My two cents, Gus Correa On 09/30/2013 01:48 PM, Damiano Natali wrote: Hi Gus, first of all thank you very much for you help. I really appreciate! Then you are right, I have OpenFOAM so 'which mpif90' addresses to another installation

Re: [OMPI users] non-functional mpif90 compiler

2013-09-30 Thread Gus Correa
the old build (make distclean, or just start fresh from the OMPI tarball.) On 09/30/2013 03:44 PM, Gus Correa wrote: Hi Damiano OpenFOAM may have something funny in the Makefiles, perhaps? Make sure you set the PATH and LD_LIBRARY_PATH right. A suggestion. Try compiling something VERY SIMPLE

Re: [OMPI users] non-functional mpif90 compiler

2013-10-01 Thread Gus Correa
tatic is faster" view still valid? Does it apply to Open MPI in particular? Was it ever true? Many thanks, Gus Correa On 10/01/2013 05:16 AM, Jeff Squyres (jsquyres) wrote: If you are using a TCP network for MPI communications, static is fine. However, if you're trying to use an OS-bypa

Re: [OMPI users] MPI_File_write hangs on NFS-mounted filesystem

2013-11-07 Thread Gus Correa
mounts, the various parallel file systems (PVFS/OrangeFS, Lustre, GlusterFS, etc). And perhaps provide some setup information, plus functionality, and performance comparisons. My two cents, Gus Correa On 11/07/2013 12:21 PM, Dmitry N. Mikushin wrote: Not sure if this is related, but: I've seen

Re: [OMPI users] openmpi+torque: How run job in a subset of the allocation?

2013-11-27 Thread Gus Correa
han independent work on two instances of the same problem. Ralph: Does the MPI model assume that MPMD/MIMD executables have to necessarily communicate with each other, or perhaps share a common MPI_COMM_WORLD? [I guess not.] Anyway, just a guess, Gus Correa On 11/27/2013 10:23 AM, Ralph Castain wro

Re: [OMPI users] typo in opal/memoryhooks/memory.h (1.6.5)

2013-12-16 Thread Gus Correa
, and keep OMPI 1.6.5 programs running? Would opal be lobotomized? Many thanks, Gus Correa On 12/16/2013 01:45 PM, Jeff Squyres (jsquyres) wrote: Fixed -- thanks! (I confirmed that it's not an issue in the 1.7 series, too) On Dec 16, 2013, at 1:36 PM, Ake Sandgren<ake.sandg...@hpc2n.umu.se>

[OMPI users] Segmentation fault on OMPI 1.6.5 built with gcc 4.4.7 and PGI pgfortran 11.10

2013-12-23 Thread Gus Correa
to address it? Thank you for your help, Gus Correa error message *** [1,31]:[node30:17008] *** Process received signal *** [1,31]:[node30:17008] Signal: Segmentation fault (11) [1,31]:[node30:17008] Signal code: Address not mapped (1) [1,31]:[node30:1700

Re: [OMPI users] simple test problem hangs on mpi_finalize and consumes all system resources

2014-01-24 Thread Gus Correa
CentOS 5.2, Linux kernel 2.6.18, gcc 4.1.2, Python 2.4.3, etc. Parallel programs compile and run with OMPI 1.6.5 without problems. I hope this helps, Gus Correa -Original Message- From: Fischer, Greg A. Sent: Friday, January 24, 2014 11:41 AM To: 'Open MPI Users' Cc: Fischer, Greg A. Subject

Re: [OMPI users] Open MPI and multiple Torque versions

2014-01-27 Thread Gus Correa
one, as it requires editing the initialization files every time you want to change the environment, but also works. I hope this helps, Gus Correa

[OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-02-27 Thread Gus Correa
? ** Thank you, Gus Correa Machine (P#0 total=134199384KB DMIProductName=H8DGU DMIProductVersion=1234567890 DMIProductSerial=1234567890 DMIProductUUID=534D4349-0002-8390-2500-839025001D97 DMIBoardVendor=Supermicro DMIBoardName=H8DGU DMIBoardVersion=1234567890 DMIBoardSerial=NM29S71392

Re: [OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-02-27 Thread Gus Correa
d replaced doesn't show the hwloc-gather-topology error, though. Does the error message below (Socket P#0 ...) suggest anything that I should be looking for on the hardware side? (Thermal compound on the heatsink, memory modules, etc) Thank you, Gus Correa [root@node14 ~]# /usr/bin/hwloc-gathe

Re: [OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-02-28 Thread Gus Correa
. After motherboard replacement I reinstalled the OS on both, but it doesn't hurt to do it again. Gus Correa On 02/28/2014 03:26 AM, Brice Goglin wrote: Hello Gus, I'll need the tarball generated by gather-topology on node14 to debug this. node15 doesn't have any issue. We've seen issues on AMD

Re: [OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-02-28 Thread Gus Correa
de -s bios-version 3.0 [root@node15 ~]# dmidecode -s bios-release-date 08/31/2012 ** Thanks again for your help and advice. Gus Correa Machine (P#0 total=1

Re: [OMPI users] OpenMPI job initializing problem

2014-02-28 Thread Gus Correa
-compilers I hope this helps, Gus Correa On 02/28/2014 12:36 PM, Ralph Castain wrote: Almost certainly, the redhat package wasn't built with matching infiniband support and so we aren't picking it up. I'd suggest downloading the latest 1.7.4 or 1.7.5 nightly tarball, or even the latest 1.6 tarball if you

Re: [OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-03-03 Thread Gus Correa
ubdirectories should I check? Or alternatively which files in the hwloc-gather-topology output? Many thanks for your help, Gus Correa On 02/28/2014 03:53 PM, Brice Goglin wrote: Le 28/02/2014 21:30, Gus Correa a écrit : Hi Brice The (pdf) output of lstopo shows one L1d (16k) for each co

Re: [OMPI users] OpenMPI job initializing problem

2014-03-03 Thread Gus Correa
y to exist and be local to the cluster nodes. [But the cluster nodes may be diskless ... :( ] I hope this helps, Gus Correa On 03/03/2014 07:10 PM, Beichuan Yan wrote: How to set TMPDIR to a local filesystem? Is /home/yanb/tmp a local filesystem? I don't know how to tell a directory is local f
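A minimal sketch of one way to tell whether a directory sits on a local or a network filesystem, which is the question raised in the quoted message. It assumes a Linux node with GNU stat; the function names classify_fstype and fstype_of are made up for illustration, and the list of network filesystem types is an assumption and not exhaustive.

```shell
#!/bin/sh
# Classify a filesystem type name as "network", "local" or "unknown".
# The set of network types below is an assumption; extend it for your site.
classify_fstype() {
    case "$1" in
        nfs|nfs4|lustre|gpfs|panfs|cifs|smbfs|afs) echo network ;;
        unknown|"") echo unknown ;;
        *) echo local ;;
    esac
}

# Print the filesystem type of a directory (GNU stat syntax, Linux only).
# Falls back to "unknown" if stat cannot report it.
fstype_of() {
    stat -f -c %T "$1" 2>/dev/null || echo unknown
}

# Check the directory that TMPDIR points to, or /tmp if TMPDIR is unset.
dir=${TMPDIR:-/tmp}
if [ -d "$dir" ]; then
    echo "$dir: $(classify_fstype "$(fstype_of "$dir")")"
fi
```

Running this inside a batch job, on a compute node rather than on the login node, shows what the MPI processes themselves will see. The output of df -h for the same directory gives the same information in a less scriptable form.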

Re: [OMPI users] OpenMPI job initializing problem

2014-03-03 Thread Gus Correa
you create the directory /home/yanb/tmp beforehand? Anyway, you may need to ask the help of a system administrator of this machine. Gus Correa On 03/03/2014 07:43 PM, Beichuan Yan wrote: Gus, I am using this system: http://centers.hpc.mil/systems/unclassified.html#Spirit. I don't know exactly

Re: [OMPI users] OpenMPI job initializing problem

2014-03-04 Thread Gus Correa
ther possible/SGI/pre-existent MPI items? Those are pretty (ugly) common problems. ** I hope this helps, Gus Correa On 03/03/2014 10:13 PM, Beichuan Yan wrote: 1. info from a compute node -bash-4.1$ hostname r32i1n1 -bash-4.1$ df -h /home Filesystem Size Used Avail Use% Mounted on

Re: [OMPI users] bind-to core warning even with numactl

2014-03-04 Thread Gus Correa
hope this helps, Gus Correa On 03/04/2014 12:15 PM, Saliya Ekanayake wrote: I actually did a rebuild and install. Is there a quick test to see if these were picked up correctly? I checked OMPI_INFO and can see numaif.h has been found. Is this the correct indication? I'll check the link

Re: [OMPI users] hwloc error in topology.c in OMPI 1.6.5

2014-03-04 Thread Gus Correa
On 03/03/2014 05:06 PM, Brice Goglin wrote: On 03/03/2014 23:02, Gus Correa wrote: I rebooted the node and ran hwloc-gather-topology again. This time it didn't throw any errors on the terminal window, which may be a good sign. [root@node14 ~]# hwloc-gather-topology /tmp/`date +"%Y%m%

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Gus Correa
I hope this helps, Gus Correa On 03/06/2014 01:49 PM, Beichuan Yan wrote: 1. For $TMPDIR and $TCP, there are four combinations by commenting on/off (note the system's default TMPDIR=/work3/yanb): export TMPDIR=/work1/home/yanb/tmp TCP="--mca btl_tcp_if_include 10.148.0.0/16" 2.

Re: [OMPI users] OpenMPI job initializing problem

2014-03-06 Thread Gus Correa
d try simply -mca btl sm,openib,self, which is likely to give you the IB transport with verbs, plus shared memory intra-node, plus the (mandatory?) self (loopback interface?). In my experience, this will also help identify any malfunctioning IB HCA in the nodes (with a failure/error message). I
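A minimal sketch of the BTL selection suggested above, assuming an Open MPI installation of that era built with openib support. The helper name build_mpirun_cmd, the program name ./my_mpi_program and the process count are placeholders, not taken from the thread. The script only prints the command line; it does not launch anything.

```shell
#!/bin/sh
# Assemble an mpirun command line that restricts Open MPI to a given BTL list.
# Arguments: number of processes, comma-separated BTL list, program to run.
build_mpirun_cmd() {
    nprocs=$1
    btl_list=$2
    program=$3
    echo "mpirun -np $nprocs -mca btl $btl_list $program"
}

# sm = shared memory inside a node, openib = InfiniBand verbs, self = loopback.
build_mpirun_cmd 4 sm,openib,self ./my_mpi_program
```

Because the list names openib explicitly and leaves out tcp, a node with a broken HCA should fail with an error instead of silently falling back to TCP, which is the diagnostic benefit mentioned in the message.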

Re: [OMPI users] OpenMPI job initializing problem

2014-03-07 Thread Gus Correa
project. Gus Correa -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa Sent: Thursday, March 06, 2014 13:51 To: Open MPI Users Subject: Re: [OMPI users] OpenMPI job initializing problem On 03/06/2014 03:35 PM, Beichuan Yan wrote: Gus, Yes

Re: [OMPI users] Question about '--mca btl tcp,self'

2014-03-17 Thread Gus Correa
y exclude the use of "sm", but IMHO are too vague and somewhat misleading. I think this issue was reported/discussed before on the list, but somehow the FAQ was not fixed. Thank you, Gus Correa It is much faster than both the TCP loopback device (which OMPI excludes by default, BTW

Re: [OMPI users] OpenMPI job initializing problem

2014-03-20 Thread Gus Correa
MPI and Open MPI. Thanks, Beichuan -Original Message- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa Sent: Friday, March 07, 2014 18:41 To: Open MPI Users Subject: Re: [OMPI users] OpenMPI job initializing problem On 03/06/2014 04:52 PM, Beichuan Yan wrote

Re: [OMPI users] OpenMPI job initializing problem

2014-03-20 Thread Gus Correa
essage- From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Gus Correa Sent: Friday, March 07, 2014 18:41 To: Open MPI Users Subject: Re: [OMPI users] OpenMPI job initializing problem On 03/06/2014 04:52 PM, Beichuan Yan wrote: No, I did all these and none worked. I just found, with exa

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-26 Thread Gus Correa
workload increases. Queue systems do support interactive jobs (even with X-windows GUIs, if needed). You submit the interactive job, the queue system puts you in a free node, and you work normally there. I hope this helps, Gus Correa

Re: [OMPI users] busy waiting and oversubscriptions

2014-03-27 Thread Gus Correa
nce my mindset. Gus Correa

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
friend! I hope this helps, Gus Correa On 03/27/2014 01:53 PM, Sasso, John (GE Power & Water, Non-GE) wrote: When a piece of software built against OpenMPI fails, I will see an error referring to the rank of the MPI task which incurred the failure. For example: MPI_ABORT was invoke

Re: [OMPI users] Mapping ranks to hosts (from MPI error messages)

2014-03-27 Thread Gus Correa
pi.org/faq/?category=tuning#setting-mca-params Again, the OMPI FAQ page is your friend! :) http://www.open-mpi.org/faq/ I hope this helps, Gus Correa On 03/27/2014 02:06 PM, Gus Correa wrote: Hi John Take a look at the mpiexec/mpirun options: -report-bindings (this one should report wha
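A minimal sketch of one way to see which host each rank lands on, complementing the -report-bindings option mentioned above. It relies on the OMPI_COMM_WORLD_RANK environment variable that Open MPI sets for the processes it launches; the function name rank_line and the script name used below are hypothetical.

```shell
#!/bin/sh
# Print the rank of this process (as set by Open MPI) and the host it runs on.
# Outside of mpirun the variable is unset, so the rank is shown as "unknown".
rank_line() {
    echo "rank ${OMPI_COMM_WORLD_RANK:-unknown} on $(uname -n)"
}

rank_line
```

Saved as, say, rank_host.sh and made executable, it could be launched with something like mpirun -np 4 -report-bindings ./rank_host.sh using the same hostfile or queue allocation as the failing job, so that a rank number in an MPI_ABORT message can be matched to a node.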
