[OMPI users] Confused on simple MPI/OpenMP program

2013-04-03 Thread Ed Blosch
Consider this Fortran program snippet:

program test

  ! everybody except rank=0 exits.
  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD,irank,ierr)
  if (irank /= 0) then
    call mpi_finalize(ierr)
    stop
  endif

  ! rank 0 tries to set number of OpenMP threads to 4
  call omp_set_num_threads(4)
  nthreads = omp_get_max_threads()
  print*, "nthreads = ", nthreads

  call mpi_finalize(ierr)

end program test

It is compiled like this: 'mpif90 -o test -O2 -openmp test.f90'  (Intel
11.x)

When I run it like this:  mpirun -np 2 ./test

The output is:  "nthreads = 0"

Does that make sense?  I was expecting 4.

If I comment out the MPI lines and run the program serially (but still
compiled with mpif90), then I get the expected output value 4.

I'm sure I must be overlooking something basic here.  Please enlighten me.
Does this have anything to do with how I've configured OpenMPI?

Thanks,

Ed




Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Gus Correa

Hi Reza

It is hard to guess with so little information.
Other things you could check:

1) Are you allowed to increase the stack size at all
(say, has the sys admin capped it in limits.conf)?
If you are using a job queue system,
does it limit the stack size somehow?

2) If you can compile and
run the Open MPI examples (hello_c.c, ring_c.c, connectivity_c.c),
then it is unlikely that the problem is with Open MPI.
This is kind of a first line of defense for diagnosing this type
of problem and checking the health of your Open MPI installation.

Your error message says "Connection reset by peer", so
I wonder if there is any firewall or other network roadblock
or configuration issue.  Worth testing Open MPI
with simpler MPI programs,
and even (for network setup) with shell commands like "hostname".
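
If it helps, here is a minimal sanity check in the spirit of those
examples (just a sketch, not a replacement for hello_c.c or
connectivity_c.c) that makes every rank report which host it landed on:

/* sanity_check.c -- each MPI rank reports its rank and host name.
 * Build: mpicc sanity_check.c -o sanity_check
 * Run:   mpirun -np 2 --hostfile ./myhosts ./sanity_check
 */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank, size, len;
    char host[MPI_MAX_PROCESSOR_NAME];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(host, &len);

    printf("rank %d of %d running on %s\n", rank, size, host);

    MPI_Finalize();
    return 0;
}

If this runs cleanly across both instances but hpcc does not, the
network layer is probably fine and the problem lies elsewhere.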

3) Make sure there is no mixup of MPI implementations (e.g. MPICH
and Open MPI) or versions, both for mpicc and mpiexec.
Make sure the LD_LIBRARY_PATH is pointing to the right OpenMPI
lib location (and to the right BLAS/LAPACK location, for that matter).

4) No mixup of architectures either (32 vs 64 bit).
I wonder why your Open MPI library is installed in
/usr/lib/openmpi not /usr/lib64,
but your HPL ARCH = intel64 and everything else seems to be x86_64.
If you apt-get an Open MPI package, check if it is
i386 or x86_64.
(It may be simpler to download and install
the Open MPI tarball in /usr/local or in your home directory.)

5) Check whether you are using a threaded or OpenMP-enabled
BLAS/LAPACK library, and whether it runs with more than one thread.

6) Is the problem size (N) in your HPL.dat parameter file
consistent with the physical memory available?
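
As a rough guide (just a sketch, assuming double precision and that the
HPL matrix should use about 80% of the total memory): HPL factors an
N x N matrix of doubles, i.e. roughly 8*N^2 bytes, so a common starting
point is N ~ sqrt(0.8 * total_memory_in_bytes / 8), rounded down to a
multiple of the block size NB:

/* hpl_n.c -- back-of-the-envelope HPL problem size estimate.
 * The 16 GiB total and the 80% fraction are assumed example figures.
 * Build: cc hpl_n.c -o hpl_n -lm
 */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double total_mem_gib = 16.0;   /* assumed aggregate memory across all ranks */
    double bytes = total_mem_gib * 1024.0 * 1024.0 * 1024.0;
    double n = sqrt(0.80 * bytes / 8.0);   /* 8 bytes per double */

    printf("suggested N around %.0f\n", n);
    return 0;
}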

I hope this helps,
Gus Correa

On 04/03/2013 02:32 PM, Ralph Castain wrote:

I agree with Gus - check your stack size. This isn't occurring in OMPI
itself, so I suspect it is in the system setup.


On Apr 3, 2013, at 10:17 AM, Reza Bakhshayeshi wrote:


Thanks for your answers.

@Ralph Castain:
Do you mean what error I receive?
It's the output when I'm running the program:

*** Process received signal ***
Signal: Segmentation fault (11)
Signal code: Address not mapped (1)
Failing at address: 0x1b7f000
[ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6a84b524a0]
[ 1] hpcc(HPCC_Power2NodesMPIRandomAccessCheck+0xa04) [0x423834]
[ 2] hpcc(HPCC_MPIRandomAccess+0x87a) [0x41e43a]
[ 3] hpcc(main+0xfbf) [0x40a1bf]
[ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)
[0x7f6a84b3d76d]
[ 5] hpcc() [0x40aafd]
*** End of error message ***
[
][[53938,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--
mpirun noticed that process rank 1 with PID 4164 on node 192.168.100.6
exited on signal 11 (Segmentation fault).
--

@Gus Correa:
I did it both on server and on instances but it didn't solve the problem.


On 3 April 2013 19:14, Gus Correa wrote:

Hi Reza

Check the system stacksize first ('limit stacksize' or 'ulimit -s').
If it is small, you can try to increase it
before you run the program.
Say (tcsh):

limit stacksize unlimited

or (bash):

ulimit -s unlimited

I hope this helps,
Gus Correa


On 04/03/2013 10:29 AM, Ralph Castain wrote:

Could you perhaps share the stacktrace from the segfault? It's
impossible to advise you on the problem without seeing it.


On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi wrote:

Hi
I have installed the HPCC benchmark suite and openmpi on private cloud
instances.
Unfortunately I mostly get a segmentation fault error when I want to run
it simultaneously on two or more instances with:
mpirun -np 2 --hostfile ./myhosts hpcc

Everything is on Ubuntu server 12.04 (updated)
and this is my make.intel64 file:

shell --
# --
#
SHELL = /bin/sh
#
CD = cd
CP = cp
LN_S = ln -s
MKDIR = mkdir
RM = /bin/rm -f
TOUCH = touch
#
# --
# - Platform identifier --
# --
#

Re: [OMPI users] memory per core/process

2013-04-03 Thread Ralph Castain
Here is a v1.6 port of what was committed to the trunk. Let me know if/how it 
works for you. The option you will want to use is:

mpirun -mca opal_set_max_sys_limits stacksize:unlimited

or whatever number you want to give (see ulimit for the units). Note that you 
won't see any impact if you run it with a non-OMPI executable like "sh ulimit" 
since it only gets called during MPI_Init.
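
If you want to double-check that the limit really changed inside the MPI
processes (a quick sketch, not part of the patch), have each rank print
its RLIMIT_STACK after MPI_Init:

/* stackcheck.c -- each rank reports its stack soft limit after MPI_Init.
 * Build: mpicc stackcheck.c -o stackcheck
 * Run:   mpirun -np 2 -mca opal_set_max_sys_limits stacksize:unlimited ./stackcheck
 */
#include <mpi.h>
#include <stdio.h>
#include <sys/resource.h>

int main(int argc, char **argv)
{
    int rank;
    struct rlimit rl;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    getrlimit(RLIMIT_STACK, &rl);
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("rank %d: stack soft limit = unlimited\n", rank);
    else
        printf("rank %d: stack soft limit = %llu KB\n",
               rank, (unsigned long long)(rl.rlim_cur / 1024));

    MPI_Finalize();
    return 0;
}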


On Apr 2, 2013, at 9:48 AM, Duke Nguyen  wrote:

> On 4/2/13 11:03 PM, Gus Correa wrote:
>> On 04/02/2013 11:40 AM, Duke Nguyen wrote:
>>> On 3/30/13 8:46 PM, Patrick Bégou wrote:
 Ok, so your problem is identified as a stack size problem. I went into
 these limitations using Intel fortran compilers on large data problems.
 
 First, it seems you can increase your stack size, since "ulimit -s
 unlimited" works (the system hard limit is not enforced). The best
 way is to put this setting in your .bashrc file so it works on
 every node.
 But setting it to unlimited may not be really safe. I.e., if you get into
 a badly coded recursive function calling itself without a stop
 condition, you can request all the system memory and crash the node. So
 set a large but limited value; it's safer.
 
>>> 
>>> Now I feel the pain you mentioned :). With -s unlimited now some of our
>>> nodes are easily down (completely) and needed to be hard reset!!!
>>> (whereas we never had any node down like that before even with the
>>> killed or badly coded jobs).
>>> 
>>> Looking for a safer number of ulimit -s other than "unlimited" now... :(
>>> 
>> 
>> In my opinion this is a trade off between who feels the pain.
>> It can be you (sys admin) feeling the pain of having
>> to power up offline nodes,
>> or it could be the user feeling the pain for having
>> her/his code killed by segmentation fault due to small memory
>> available for the stack.
> 
> ... in case that user is at a large institute that promises to provide the best
> service and unlimited resources/unlimited *everything* to end users. If not, the
> user should really think about how to make the best use of the available resources.
> Unfortunately many (most?) end users don't.
> 
>> There is only so much that can be done to make everybody happy.
> 
> So true... especially HPC resource is still luxurious here in Vietnam, and we 
> have a quite small (and not-so-strong) cluster.
> 
>> If you share the nodes among jobs, you could set the
>> stack size limit to
>> some part of the physical_memory divided by the number_of_cores,
>> saving some memory for the OS etc beforehand.
>> However, this can be a straitjacket for jobs that could run with
>> a bit more memory, and won't because of this limit.
>> If you do not share the nodes, then you could make stacksize
>> closer to physical memory.
> 
> Great. Thanks for this advice Gus.
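
For what it's worth, here is a tiny sketch of the per-core arithmetic Gus
describes above (the 16 GB node, 2 GB OS reserve, and 12 cores are only
assumed example figures):

/* percore.c -- illustrative per-core memory/stack budget: physical
 * memory minus an OS reserve, divided by the cores on the node.
 */
#include <stdio.h>

int main(void)
{
    double node_mem_gib = 16.0;   /* assumed physical memory per node */
    double os_reserve   = 2.0;    /* assumed headroom for OS, buffers */
    int    cores        = 12;     /* assumed cores per node           */

    double per_core_gib = (node_mem_gib - os_reserve) / cores;
    printf("per-core budget: about %.2f GiB (%.0f KB for ulimit -s)\n",
           per_core_gib, per_core_gib * 1024.0 * 1024.0);
    return 0;
}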
> 
>> 
>> Anyway, this is less of an OpenMPI than of a
>> resource manager / queuing system conversation.
> 
> Yeah, and I have learned a lot other than just openmpi stuffs here :)
> 
>> 
>> Best,
>> Gus Correa
>> 
 I'm managing a cluster and I always set a maximum value to stack size.
 I also limit the memory available for each core for system stability.
 If a user requests only one of the 12 cores of a node, he can only
 access 1/12 of the node's memory. If he needs more memory he has
 to request 2 cores, even if he runs a sequential code. This avoids
 crashing other users' jobs on the same node because of memory
 requirements. But this is not configured on your node.
 
 Duke Nguyen wrote:
> On 3/30/13 3:13 PM, Patrick Bégou wrote:
>> I do not know about your code but:
>> 
>> 1) did you check stack limitations? Typically Intel Fortran codes
>> need a large amount of stack when the problem size increases.
>> Check ulimit -a
> 
> First time I heard of stack limitations. Anyway, ulimit -a gives
> 
> $ ulimit -a
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> scheduling priority (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 127368
> max locked memory (kbytes, -l) unlimited
> max memory size (kbytes, -m) unlimited
> open files (-n) 1024
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> real-time priority (-r) 0
> stack size (kbytes, -s) 10240
> cpu time (seconds, -t) unlimited
> max user processes (-u) 1024
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
> 
> So stack size is 10MB??? Does this create a problem? How do I
> change this?
> 
>> 
>> 2) does your node use cpusets and memory limitations, like fake NUMA, to
>> set the maximum amount of memory available for a job?
> 
> I don't really understand (it's also the first time I've heard of fake numa), but I am
> pretty sure we do not have such things. The server I tried was a
> dedicated server with 2 x5420 and 16GB 

Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Ralph Castain
I agree with Gus - check your stack size. This isn't occurring in OMPI itself, 
so I suspect it is in the system setup.


On Apr 3, 2013, at 10:17 AM, Reza Bakhshayeshi  wrote:

> Thanks for your answers.
> 
> @Ralph Castain: 
> Do you mean what error I receive?
> It's the output when I'm running the program:
> 
>   *** Process received signal ***
>   Signal: Segmentation fault (11)
>   Signal code: Address not mapped (1)
>   Failing at address: 0x1b7f000
>   [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6a84b524a0]
>   [ 1] hpcc(HPCC_Power2NodesMPIRandomAccessCheck+0xa04) [0x423834]
>   [ 2] hpcc(HPCC_MPIRandomAccess+0x87a) [0x41e43a]
>   [ 3] hpcc(main+0xfbf) [0x40a1bf]
>   [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed) 
> [0x7f6a84b3d76d]
>   [ 5] hpcc() [0x40aafd]
>   *** End of error message ***
> [ 
> ][[53938,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
>  mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
> --
> mpirun noticed that process rank 1 with PID 4164 on node 192.168.100.6 exited 
> on signal 11 (Segmentation fault).
> --
> 
> @Gus Correa:
> I did it both on server and on instances but it didn't solve the problem.
> 
> 
> On 3 April 2013 19:14, Gus Correa  wrote:
> Hi Reza
> 
> Check the system stacksize first ('limit stacksize' or 'ulimit -s').
> If it is small, you can try to increase it
> before you run the program.
> Say (tcsh):
> 
> limit stacksize unlimited
> 
> or (bash):
> 
> ulimit -s unlimited
> 
> I hope this helps,
> Gus Correa
> 
> 
> On 04/03/2013 10:29 AM, Ralph Castain wrote:
> Could you perhaps share the stacktrace from the segfault? It's
> impossible to advise you on the problem without seeing it.
> 
> 
> On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi wrote:
> 
> ​Hi
> ​​I have installed HPCC benchmark suite and openmpi on a private cloud
> instances.
> Unfortunately I get Segmentation fault error mostly when I want to run
> it simultaneously on two or more instances with:
> mpirun -np 2 --hostfile ./myhosts hpcc
> 
> Everything is on Ubuntu server 12.04 (updated)
> and this is my make.intel64 file:
> 
> shell --
> # --
> #
> SHELL = /bin/sh
> #
> CD = cd
> CP = cp
> LN_S = ln -s
> MKDIR = mkdir
> RM = /bin/rm -f
> TOUCH = touch
> #
> # --
> # - Platform identifier 
> # --
> #
> ARCH = intel64
> #
> # --
> # - HPL Directory Structure / HPL library --
> # --
> #
> TOPdir = ../../..
> INCdir = $(TOPdir)/include
> BINdir = $(TOPdir)/bin/$(ARCH)
> LIBdir = $(TOPdir)/lib/$(ARCH)
> #
> HPLlib = $(LIBdir)/libhpl.a
> #
> # --
> # - Message Passing library (MPI) --
> # --
> # MPinc tells the C compiler where to find the Message Passing library
> # header files, MPlib is defined to be the name of the library to be
> # used. The variable MPdir is only used for defining MPinc and MPlib.
> #
> MPdir = /usr/lib/openmpi
> MPinc = -I$(MPdir)/include
> MPlib = $(MPdir)/lib/libmpi.so
> #
> # --
> # - Linear Algebra library (BLAS or VSIPL) -
> # --
> # LAinc tells the C compiler where to find the Linear Algebra library
> # header files, LAlib is defined to be the name of the library to be
> # used. The variable LAdir is only used for defining LAinc and LAlib.
> #
> LAdir = /usr/local/ATLAS/obj64
> LAinc = -I$(LAdir)/include
> LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
> #
> # --
> # - F77 / C interface --
> # --
> # You can skip this section if and only if you are not planning to use
> # a BLAS library featuring a Fortran 77 interface. Otherwise, it is
> # necessary to fill out the F2CDEFS variable with the appropriate
> # options. **One and only one** option should be chosen in **each** of
> # the 3 following categories:
> #
> # 1) name space (How C calls a Fortran 77 routine)
> #
> # 

Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Reza Bakhshayeshi
Thanks for your answers.

@Ralph Castain:
Do you mean what error I receive?
It's the output when I'm running the program:

  *** Process received signal ***
  Signal: Segmentation fault (11)
  Signal code: Address not mapped (1)
  Failing at address: 0x1b7f000
  [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x364a0) [0x7f6a84b524a0]
  [ 1] hpcc(HPCC_Power2NodesMPIRandomAccessCheck+0xa04) [0x423834]
  [ 2] hpcc(HPCC_MPIRandomAccess+0x87a) [0x41e43a]
  [ 3] hpcc(main+0xfbf) [0x40a1bf]
  [ 4] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)
[0x7f6a84b3d76d]
  [ 5] hpcc() [0x40aafd]
  *** End of error message ***
[
][[53938,1],0][../../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:216:mca_btl_tcp_frag_recv]
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
--
mpirun noticed that process rank 1 with PID 4164 on node 192.168.100.6
exited on signal 11 (Segmentation fault).
--

@Gus Correa:
I did it both on server and on instances but it didn't solve the problem.


On 3 April 2013 19:14, Gus Correa  wrote:

> Hi Reza
>
> Check the system stacksize first ('limit stacksize' or 'ulimit -s').
> If it is small, you can try to increase it
> before you run the program.
> Say (tcsh):
>
> limit stacksize unlimited
>
> or (bash):
>
> ulimit -s unlimited
>
> I hope this helps,
> Gus Correa
>
>
> On 04/03/2013 10:29 AM, Ralph Castain wrote:
>
>> Could you perhaps share the stacktrace from the segfault? It's
>> impossible to advise you on the problem without seeing it.
>>
>>
>> On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi wrote:
>>
>>  ​Hi
>>> ​​I have installed HPCC benchmark suite and openmpi on a private cloud
>>> instances.
>>> Unfortunately I get Segmentation fault error mostly when I want to run
>>> it simultaneously on two or more instances with:
>>> mpirun -np 2 --hostfile ./myhosts hpcc
>>>
>>> Everything is on Ubuntu server 12.04 (updated)
>>> and this is my make.intel64 file:
>>>
>>> shell --
>>> # --
>>> #
>>> SHELL = /bin/sh
>>> #
>>> CD = cd
>>> CP = cp
>>> LN_S = ln -s
>>> MKDIR = mkdir
>>> RM = /bin/rm -f
>>> TOUCH = touch
>>> #
>>> # --
>>> # - Platform identifier --
>>> # --
>>> #
>>> ARCH = intel64
>>> #
>>> # --
>>> # - HPL Directory Structure / HPL library --
>>> # --
>>> #
>>> TOPdir = ../../..
>>> INCdir = $(TOPdir)/include
>>> BINdir = $(TOPdir)/bin/$(ARCH)
>>> LIBdir = $(TOPdir)/lib/$(ARCH)
>>> #
>>> HPLlib = $(LIBdir)/libhpl.a
>>> #
>>> # --
>>> # - Message Passing library (MPI) --
>>> # --
>>> # MPinc tells the C compiler where to find the Message Passing library
>>> # header files, MPlib is defined to be the name of the library to be
>>> # used. The variable MPdir is only used for defining MPinc and MPlib.
>>> #
>>> MPdir = /usr/lib/openmpi
>>> MPinc = -I$(MPdir)/include
>>> MPlib = $(MPdir)/lib/libmpi.so
>>> #
>>> # --
>>> # - Linear Algebra library (BLAS or VSIPL) -
>>> # --
>>> # LAinc tells the C compiler where to find the Linear Algebra library
>>> # header files, LAlib is defined to be the name of the library to be
>>> # used. The variable LAdir is only used for defining LAinc and LAlib.
>>> #
>>> LAdir = /usr/local/ATLAS/obj64
>>> LAinc = -I$(LAdir)/include
>>> LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
>>> #
>>> # --
>>> # - F77 / C interface --
>>> # --
>>> # You can skip this section if and only if you are not planning to use
>>> # a BLAS library featuring a Fortran 77 interface. Otherwise, it is
>>> # necessary to fill out the F2CDEFS variable with the appropriate
>>> # options. **One and only one** option should be chosen in **each** of
>>> # the 3 following categories:
>>> #
>>> # 1) name space (How C calls a Fortran 77 routine)
>>> #
>>> # 

Re: [OMPI users] FCA collectives disabled by default

2013-04-03 Thread Brock Palen
That would do it. 

Thanks!

Now to make even the normal ones work

Brock Palen
www.umich.edu/~brockp
CAEN Advanced Computing
bro...@umich.edu
(734)936-1985



On Apr 3, 2013, at 10:31 AM, Ralph Castain  wrote:

> Looking at the source code, it is because those other collectives aren't 
> implemented yet :-)
> 
> 
> On Apr 2, 2013, at 12:07 PM, Brock Palen  wrote:
> 
>> We are starting to play with FCA on our Mellanox-based IB fabric.
>> 
>> I noticed from ompi_info that FCA support for a lot of collectives is
>> disabled by default:
>> 
>> Any idea why only barrier/bcast/reduce  are on by default and all the more 
>> complex values are disabled?
>> 
>>   MCA coll: parameter "coll_fca_enable_barrier" (current value: 
>> <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_bcast" (current value: 
>> <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_reduce" (current value: 
>> <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_reduce_scatter" (current 
>> value: <0>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_allreduce" (current 
>> value: <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_allgather" (current 
>> value: <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_allgatherv" (current 
>> value: <1>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_gather" (current value: 
>> <0>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_gatherv" (current value: 
>> <0>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_alltoall" (current value: 
>> <0>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_alltoallv" (current 
>> value: <0>, data source: default value)
>>   MCA coll: parameter "coll_fca_enable_alltoallw" (current 
>> value: <0>, data source: default value)
>> 
>> Brock Palen
>> www.umich.edu/~brockp
>> CAEN Advanced Computing
>> bro...@umich.edu
>> (734)936-1985
>> 
>> 
>> 
>> 




Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Gus Correa

Hi Reza

Check the system stacksize first ('limit stacksize' or 'ulimit -s').
If it is small, you can try to increase it
before you run the program.
Say (tcsh):

limit stacksize unlimited

or (bash):

ulimit -s unlimited

I hope this helps,
Gus Correa

On 04/03/2013 10:29 AM, Ralph Castain wrote:

Could you perhaps share the stacktrace from the segfault? It's
impossible to advise you on the problem without seeing it.


On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi wrote:


​Hi
​​I have installed HPCC benchmark suite and openmpi on a private cloud
instances.
Unfortunately I get Segmentation fault error mostly when I want to run
it simultaneously on two or more instances with:
mpirun -np 2 --hostfile ./myhosts hpcc

Everything is on Ubuntu server 12.04 (updated)
and this is my make.intel64 file:

shell --
# --
#
SHELL = /bin/sh
#
CD = cd
CP = cp
LN_S = ln -s
MKDIR = mkdir
RM = /bin/rm -f
TOUCH = touch
#
# --
# - Platform identifier 
# --
#
ARCH = intel64
#
# --
# - HPL Directory Structure / HPL library --
# --
#
TOPdir = ../../..
INCdir = $(TOPdir)/include
BINdir = $(TOPdir)/bin/$(ARCH)
LIBdir = $(TOPdir)/lib/$(ARCH)
#
HPLlib = $(LIBdir)/libhpl.a
#
# --
# - Message Passing library (MPI) --
# --
# MPinc tells the C compiler where to find the Message Passing library
# header files, MPlib is defined to be the name of the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
#
MPdir = /usr/lib/openmpi
MPinc = -I$(MPdir)/include
MPlib = $(MPdir)/lib/libmpi.so
#
# --
# - Linear Algebra library (BLAS or VSIPL) -
# --
# LAinc tells the C compiler where to find the Linear Algebra library
# header files, LAlib is defined to be the name of the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#
LAdir = /usr/local/ATLAS/obj64
LAinc = -I$(LAdir)/include
LAlib = $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
#
# --
# - F77 / C interface --
# --
# You can skip this section if and only if you are not planning to use
# a BLAS library featuring a Fortran 77 interface. Otherwise, it is
# necessary to fill out the F2CDEFS variable with the appropriate
# options. **One and only one** option should be chosen in **each** of
# the 3 following categories:
#
# 1) name space (How C calls a Fortran 77 routine)
#
# -DAdd_ : all lower case and a suffixed underscore (Suns,
# Intel, ...), [default]
# -DNoChange : all lower case (IBM RS6000),
# -DUpCase : all upper case (Cray),
# -DAdd__ : the FORTRAN compiler in use is f2c.
#
# 2) C and Fortran 77 integer mapping
#
# -DF77_INTEGER=int : Fortran 77 INTEGER is a C int, [default]
# -DF77_INTEGER=long : Fortran 77 INTEGER is a C long,
# -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
#
# 3) Fortran 77 string handling
#
# -DStringSunStyle : The string address is passed at the string loca-
# tion on the stack, and the string length is then
# passed as an F77_INTEGER after all explicit
# stack arguments, [default]
# -DStringStructPtr : The address of a structure is passed by a
# Fortran 77 string, and the structure is of the
# form: struct {char *cp; F77_INTEGER len;},
# -DStringStructVal : A structure is passed by value for each Fortran
# 77 string, and the structure is of the form:
# struct {char *cp; F77_INTEGER len;},
# -DStringCrayStyle : Special option for Cray machines, which uses
# Cray fcd (fortran character descriptor) for
# interoperation.
#
F2CDEFS =
#
# --
# - HPL includes / libraries / specifics ---
# --
#
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm
#
# - Compile time options ---
#
# -DHPL_COPY_L force the copy of the panel L before bcast;
# -DHPL_CALL_CBLAS call the cblas interface;
# -DHPL_CALL_VSIPL call the 

Re: [OMPI users] FCA collectives disabled by default

2013-04-03 Thread Ralph Castain
Looking at the source code, it is because those other collectives aren't 
implemented yet :-)


On Apr 2, 2013, at 12:07 PM, Brock Palen  wrote:

> We are starting to play with FCA on our Mellanox-based IB fabric.
> 
> I noticed from ompi_info that FCA support for a lot of collectives is
> disabled by default:
> 
> Any idea why only barrier/bcast/reduce  are on by default and all the more 
> complex values are disabled?
> 
>MCA coll: parameter "coll_fca_enable_barrier" (current value: 
> <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_bcast" (current value: 
> <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_reduce" (current value: 
> <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_reduce_scatter" (current 
> value: <0>, data source: default value)
>MCA coll: parameter "coll_fca_enable_allreduce" (current 
> value: <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_allgather" (current 
> value: <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_allgatherv" (current 
> value: <1>, data source: default value)
>MCA coll: parameter "coll_fca_enable_gather" (current value: 
> <0>, data source: default value)
>MCA coll: parameter "coll_fca_enable_gatherv" (current value: 
> <0>, data source: default value)
>MCA coll: parameter "coll_fca_enable_alltoall" (current value: 
> <0>, data source: default value)
>MCA coll: parameter "coll_fca_enable_alltoallv" (current 
> value: <0>, data source: default value)
>MCA coll: parameter "coll_fca_enable_alltoallw" (current 
> value: <0>, data source: default value)
> 
> Brock Palen
> www.umich.edu/~brockp
> CAEN Advanced Computing
> bro...@umich.edu
> (734)936-1985
> 
> 
> 
> 




Re: [OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Ralph Castain
Could you perhaps share the stacktrace from the segfault? It's impossible to 
advise you on the problem without seeing it.


On Apr 3, 2013, at 5:28 AM, Reza Bakhshayeshi  wrote:

> ​Hi
> ​​I have installed HPCC benchmark suite and openmpi on a private cloud 
> instances. 
> Unfortunately I get Segmentation fault error mostly when I want to run it 
> simultaneously on two or more instances with:
> mpirun -np 2 --hostfile ./myhosts hpcc
> 
> Everything is on Ubuntu server 12.04 (updated)
> and this is my make.intel64 file:
> 
> shell --
> # --
> #
> SHELL= /bin/sh
> #
> CD   = cd
> CP   = cp
> LN_S = ln -s
> MKDIR= mkdir
> RM   = /bin/rm -f
> TOUCH= touch
> #
> # --
> # - Platform identifier 
> # --
> #
> ARCH = intel64
> #
> # --
> # - HPL Directory Structure / HPL library --
> # --
> #
> TOPdir   = ../../..
> INCdir   = $(TOPdir)/include
> BINdir   = $(TOPdir)/bin/$(ARCH)
> LIBdir   = $(TOPdir)/lib/$(ARCH)
> #
> HPLlib   = $(LIBdir)/libhpl.a 
> #
> # --
> # - Message Passing library (MPI) --
> # --
> # MPinc tells the  C  compiler where to find the Message Passing library
> # header files,  MPlib  is defined  to be the name of  the library to be
> # used. The variable MPdir is only used for defining MPinc and MPlib.
> #
> MPdir= /usr/lib/openmpi
> MPinc= -I$(MPdir)/include
> MPlib= $(MPdir)/lib/libmpi.so
> #
> # --
> # - Linear Algebra library (BLAS or VSIPL) -
> # --
> # LAinc tells the  C  compiler where to find the Linear Algebra  library
> # header files,  LAlib  is defined  to be the name of  the library to be
> # used. The variable LAdir is only used for defining LAinc and LAlib.
> #
> LAdir= /usr/local/ATLAS/obj64
> LAinc= -I$(LAdir)/include
> LAlib= $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
> #
> # --
> # - F77 / C interface --
> # --
> # You can skip this section  if and only if  you are not planning to use
> # a  BLAS  library featuring a Fortran 77 interface.  Otherwise,  it  is
> # necessary  to  fill out the  F2CDEFS  variable  with  the  appropriate
> # options.  **One and only one**  option should be chosen in **each** of
> # the 3 following categories:
> #
> # 1) name space (How C calls a Fortran 77 routine)
> #
> # -DAdd_  : all lower case and a suffixed underscore  (Suns,
> #   Intel, ...),   [default]
> # -DNoChange  : all lower case (IBM RS6000),
> # -DUpCase: all upper case (Cray),
> # -DAdd__ : the FORTRAN compiler in use is f2c.
> #
> # 2) C and Fortran 77 integer mapping
> #
> # -DF77_INTEGER=int   : Fortran 77 INTEGER is a C int, [default]
> # -DF77_INTEGER=long  : Fortran 77 INTEGER is a C long,
> # -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
> #
> # 3) Fortran 77 string handling
> #
> # -DStringSunStyle: The string address is passed at the string loca-
> #   tion on the stack, and the string length is then
> #   passed as  an  F77_INTEGER  after  all  explicit
> #   stack arguments,   [default]
> # -DStringStructPtr   : The address  of  a  structure  is  passed  by  a
> #   Fortran 77  string,  and the structure is of the
> #   form: struct {char *cp; F77_INTEGER len;},
> # -DStringStructVal   : A structure is passed by value for each  Fortran
> #   77 string,  and  the  structure is  of the form:
> #   struct {char *cp; F77_INTEGER len;},
> # -DStringCrayStyle   : Special option for  Cray  machines,  which  uses
> #   Cray  fcd  (fortran  character  descriptor)  for
> #   interoperation.
> #
> F2CDEFS  =
> #
> # --
> # - HPL includes / libraries / specifics 

[OMPI users] Segmentation fault with HPCC benchmark

2013-04-03 Thread Reza Bakhshayeshi
Hi
I have installed the HPCC benchmark suite and openmpi on private cloud
instances.
Unfortunately I mostly get a segmentation fault error when I want to run it
simultaneously on two or more instances with:
mpirun -np 2 --hostfile ./myhosts hpcc

Everything is on Ubuntu server 12.04 (updated)
and this is my make.intel64 file:

shell --
# --
#
SHELL= /bin/sh
#
CD   = cd
CP   = cp
LN_S = ln -s
MKDIR= mkdir
RM   = /bin/rm -f
TOUCH= touch
#
# --
# - Platform identifier 
# --
#
ARCH = intel64
#
# --
# - HPL Directory Structure / HPL library --
# --
#
TOPdir   = ../../..
INCdir   = $(TOPdir)/include
BINdir   = $(TOPdir)/bin/$(ARCH)
LIBdir   = $(TOPdir)/lib/$(ARCH)
#
HPLlib   = $(LIBdir)/libhpl.a
#
# --
# - Message Passing library (MPI) --
# --
# MPinc tells the  C  compiler where to find the Message Passing library
# header files,  MPlib  is defined  to be the name of  the library to be
# used. The variable MPdir is only used for defining MPinc and MPlib.
#
MPdir= /usr/lib/openmpi
MPinc= -I$(MPdir)/include
MPlib= $(MPdir)/lib/libmpi.so
#
# --
# - Linear Algebra library (BLAS or VSIPL) -
# --
# LAinc tells the  C  compiler where to find the Linear Algebra  library
# header files,  LAlib  is defined  to be the name of  the library to be
# used. The variable LAdir is only used for defining LAinc and LAlib.
#
LAdir= /usr/local/ATLAS/obj64
LAinc= -I$(LAdir)/include
LAlib= $(LAdir)/lib/libcblas.a $(LAdir)/lib/libatlas.a
#
# --
# - F77 / C interface --
# --
# You can skip this section  if and only if  you are not planning to use
# a  BLAS  library featuring a Fortran 77 interface.  Otherwise,  it  is
# necessary  to  fill out the  F2CDEFS  variable  with  the  appropriate
# options.  **One and only one**  option should be chosen in **each** of
# the 3 following categories:
#
# 1) name space (How C calls a Fortran 77 routine)
#
# -DAdd_  : all lower case and a suffixed underscore  (Suns,
#   Intel, ...),   [default]
# -DNoChange  : all lower case (IBM RS6000),
# -DUpCase: all upper case (Cray),
# -DAdd__ : the FORTRAN compiler in use is f2c.
#
# 2) C and Fortran 77 integer mapping
#
# -DF77_INTEGER=int   : Fortran 77 INTEGER is a C int, [default]
# -DF77_INTEGER=long  : Fortran 77 INTEGER is a C long,
# -DF77_INTEGER=short : Fortran 77 INTEGER is a C short.
#
# 3) Fortran 77 string handling
#
# -DStringSunStyle: The string address is passed at the string loca-
#   tion on the stack, and the string length is then
#   passed as  an  F77_INTEGER  after  all  explicit
#   stack arguments,   [default]
# -DStringStructPtr   : The address  of  a  structure  is  passed  by  a
#   Fortran 77  string,  and the structure is of the
#   form: struct {char *cp; F77_INTEGER len;},
# -DStringStructVal   : A structure is passed by value for each  Fortran
#   77 string,  and  the  structure is  of the form:
#   struct {char *cp; F77_INTEGER len;},
# -DStringCrayStyle   : Special option for  Cray  machines,  which  uses
#   Cray  fcd  (fortran  character  descriptor)  for
#   interoperation.
#
F2CDEFS  =
#
# --
# - HPL includes / libraries / specifics ---
# --
#
HPL_INCLUDES = -I$(INCdir) -I$(INCdir)/$(ARCH) $(LAinc) $(MPinc)
HPL_LIBS = $(HPLlib) $(LAlib) $(MPlib) -lm
#
# - Compile time options ---
#
# -DHPL_COPY_L   force the copy of the panel L before bcast;
# -DHPL_CALL_CBLAS   call the cblas interface;
#