[OMPI users] question about running openmpi with different interconnects

2011-04-04 Thread Borenstein, Bernard S
We have added clusters with different interconnects and decided to build one Open MPI 1.4.3 version to handle all the possible interconnects and run everywhere. I have two questions about this: 1 - Is there a way for Open MPI to print out the interconnect it selected to use at run time? I am

[OMPI users] how to tell if openmpi is using rsh or ssh

2010-09-29 Thread Borenstein, Bernard S
We are eliminating the use of rsh at our company and I'm trying to test out openmpi with the NASA Overflow program using ssh. I've been testing other MPIs (MPICH1 and LAM/MPI) and if I tried to use rsh the programs would just die when running using PBS. I submitted my Overflow job using --mca

[OMPI users] need help with a code segment

2009-08-11 Thread Borenstein, Bernard S
I'm trying to build a code with OPENMPI 1.3.3 that compiles with LAM/MPI. It is using mpicc and here is the code segment and error : void drt_pll_init(int my_rank,int num_processors); #ifdef DRT_USE_MPI #include MPI_Comm drt_pll_mpi_split_comm_world(int key); #else int

[OMPI users] inconsistent FAQ entries - building openmpi with sge and running openmpi with sge

2009-03-09 Thread Borenstein, Bernard S
The building openmpi with sge faq says : For Open MPI v1.2, SGE support is built automatically; there is nothing that you need to do. Note that SGE support first appeared in v1.2. NOTE: For Open MPI v1.3, or starting with trunk revision number r16422, you will need to explicitly request the

[OMPI users] only see ras info doing ompi_info for sge

2009-03-09 Thread Borenstein, Bernard S
With version 1.3, should I see both MCA ras and MCA pls when running ompi_info? After doing my build with 1.3, I only see the ras component. Bernie Borenstein Yes, I know I didn't attach any info, but I'm just trying to determine if there is a problem or something has changed between

[OMPI users] Want to build a static openmpi with both myrinet and tcp

2008-07-27 Thread Borenstein, Bernard S
We now have a cluster with myrinet and another cluster with tcp. I want to build a static OPENMPI that will detect if there is myrinet on the cluster and use that, if myrinet is not available, run with tcp. I see the --enable-mca-static option but am confused as to how to use it for what I want

[OMPI users] different interconnects (myrinet and gige)

2008-05-21 Thread Borenstein, Bernard S
We now have a myrinet cluster along with our GIGE clusters and I was wondering how to have openmpi select the right interconnect. We use PBS and would like to have Openmpi select the right interconnect automatically, depending on whether we are on the Myrinet cluster or the Gige cluster. Any

[OMPI users] minor program build problem

2006-07-26 Thread Borenstein, Bernard S
When building the NASA Overflow 2.0aa code with openmpi 1.1.1b3 on an Opteron cluster running SLES 9 with the Intel 9 compilers, I get the following warnings when linking: /acct/bsb3227/openmpi_1.1.1b3/bin/mpif90 -xW -O2 -convert big_endian -align all -o

[OMPI users] how can I tell for sure that I'm using mvapi

2006-04-13 Thread Borenstein, Bernard S
I'm running on a cluster with mvapi. I built with mvapi and it runs, but I want to make absolutely sure that I'm using the IB interconnect and nothing else. How can I tell specifically what interconnect I'm using when I run? Bernie Borenstein The Boeing Company

[O-MPI users] problem with Nasa Overflow 1.8ab code and open-mpi 1.0.1rc5

2005-12-05 Thread Borenstein, Bernard S
I built the NASA Overflow code with Open-mpi 1.0.1rc5 and now get this error message: [w052:19034] *** An error occurred in MPI_Cart_get [w051:19104] *** An error occurred in MPI_Cart_get [w051:19104] *** on communicator MPI_COMM_WORLD [w051:19104] *** MPI_ERR_COMM: invalid communicator

[O-MPI users] another overflow 1.8ab problem

2005-11-21 Thread Borenstein, Bernard S
Just tried to run a very large case on another cluster with TCP. It cranks away for quite a while, then I get this message: FOR GRID 78 AT STEP 733 L2NORM = 0.30385345E-03 FOR GRID 79 AT STEP 733 L2NORM = 0.26182533E+00 [hsd660:02490] spawn: in

[O-MPI users] problem with overflow 1.8ab code using GM

2005-11-21 Thread Borenstein, Bernard S
Things have improved a lot since I ran the code using the earlier betas, but it still fails near the end of the run. The error messages are: FOR GRID 4 AT STEP 466 L2NORM = 0.74601987E-09 FOR GRID 5 AT STEP 466 L2NORM = 0.86085437E-08 FOR GRID 6 AT

[O-MPI users] help with openmpi rc5r8005

2005-11-08 Thread Borenstein, Bernard S
I am again trying to build and run the NASA Overflow 1.8ab version using Open MPI and have run into this error message: [hsd653:05053] *** An error occurred in MPI_Allreduce: the reduction operation MPI_OP_MIN is not defined on the MPI_DBLPREC datatype [hsd653:05053] *** on communicator

[O-MPI users] Continued problems with Nasa Overflow code

2005-10-06 Thread Borenstein, Bernard S
I built the NASA Overflow 1.8ab code yesterday with openmpi-1.0a1r7632. It runs fine with 4 or 8 Opteron processors on a Myrinet Linux cluster. But if I increase the number of processors to 20, I get errors like this: [e053:01260] *** An error occurred in MPI_Free_mem [e030:15585] *** An error

[O-MPI users] more information on my overflow problem

2005-09-28 Thread Borenstein, Bernard S
I posted an issue with the NASA Overflow 1.8 code and have traced it further to a program failure in the malloc areas of the code (data in these areas gets corrupted). Overflow is mostly Fortran, but since it is an old program, it uses some C routines to do dynamic memory allocation. I'm still

[O-MPI users] problem running on a myrinet linux cluster

2005-09-21 Thread Borenstein, Bernard S
I was able to get open-mpi working fine on a cluster with GigE, but when building and trying to run the NASA Overflow program on a cluster with Myrinet, it does not work. The program starts to run and then gives the following error: spawn: in job_state_callback(jobid = 1, state = 0xa) spawn: in