Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Nick Wright
I think this issue is now resolved and thanks everybody for your help. I 
certainly learnt a lot!



For the first case you describe, as OPENMPI is now, the call sequence
from fortran is

mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank

For the second case, as MPICH is now, it's

mpi_comm_rank -> PMPI_Comm_rank



AFAIK, all known/popular MPI implementations' fortran binding 
layers are implemented with C MPI functions, including
MPICH2 and OpenMPI.   If MPICH2's fortran layer were implemented 
the way you said, typical profiling tools, including MPE, would
fail to work with fortran applications.

e.g. check mpich2-xxx/src/binding/f77/sendf.c.


To answer this specific point, see for example the comment in

src/binding/f77/comm_sizef.c

/* This defines the routine that we call, which must be the PMPI version
   since we're renameing the Fortran entry as the pmpi version */

and the workings of the definition in MPICH

#ifndef MPICH_MPI_FROM_PMPI

This is what makes MPICH's behaviour different from Open MPI's in this matter.
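
To make that mechanism concrete, here is a simplified sketch (my own
illustration, not the actual MPICH source) of how a Fortran binding file
can be pointed at the PMPI entry with that macro:

#include "mpi.h"

/* When MPICH_MPI_FROM_PMPI is not defined, the C name the Fortran
   wrapper calls is remapped onto the PMPI entry, so a tool that
   intercepts MPI_Comm_size in C never sees the Fortran-originated
   call. */
#ifndef MPICH_MPI_FROM_PMPI
#undef  MPI_Comm_size
#define MPI_Comm_size PMPI_Comm_size
#endif

void mpi_comm_size_(MPI_Fint *comm, MPI_Fint *size, MPI_Fint *ierr)
{
    int c_size;
    /* This call compiles to PMPI_Comm_size because of the #define above. */
    *ierr = (MPI_Fint) MPI_Comm_size(MPI_Comm_f2c(*comm), &c_size);
    *size = (MPI_Fint) c_size;
}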

Regards, Nick.


A.Chan

So for the first case if I have a pure fortran/C++ code I have to 
profile at the C interface.


So is the patch now retracted ?

Nick.

I think you have an incorrect definition of "correctly" :).  According 
to the MPI standard, an MPI implementation is free to either layer 
language bindings (and only allow profiling at the lowest layer) or not
layer the language bindings (and require profiling libraries intercept 
each language).  The only requirement is that the implementation 
document what it has done.


Since everyone is pretty clear on what Open MPI has done, I don't think 
you can claim Open MPI is doing it "incorrectly".  Different from MPICH 
is not necessarily incorrect.  (BTW, LAM/MPI handles profiling the same 
way as Open MPI).

Brian

On Fri, 5 Dec 2008, Nick Wright wrote:


Hi Antony

That will work, yes, but it's not portable to other MPIs that do 
implement the profiling layer correctly, unfortunately.


I guess we will just need to detect that we are using openmpi when our 
tool is configured and add some macros to deal with that accordingly. 
Is there an easy way to do this built into openmpi?

Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C where you do your real instrumentation,
you don't need to implement the fortran layer, you can simply link
with Fortran to C MPI wrapper library -lmpi_f77. i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C. If you 
don't want to intercept the MPI call twice for a fortran program,
you need to implement the fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your fortran layer, 
e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
printf("mpi_comm_rank call successfully intercepted\n");
*info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran program.

I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one e.g. intercepts the call to 
mpi_comm_rank_ (the fortran hook) and then calls pmpi_comm_rank_, this then
calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes at the same time, one ends up intercepting the mpi call twice, 
which is not desirable and not what should happen (and indeed doesn't 
happen in other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to 
avoid this issue that would be great!

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
   printf("MPI_comm_rank call successfully intercepted\n");
   return PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end




Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Anthony Chan
Hi George,

- "George Bosilca"  wrote:

> On Dec 5, 2008, at 03:16 , Anthony Chan wrote:
> 
> > void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> >printf("mpi_comm_rank call successfully intercepted\n");
> >*info = PMPI_Comm_rank(comm,rank);
> > }
> 
> Unfortunately this example is not correct. The real Fortran prototype 
> 
> for the MPI_Comm_rank function is
> void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr).

Yes, you are right.  I was being sloppy (it was late, so just cut/paste
from Nick's code), the correct code should be

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *info) {
   int c_rank;
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = (MPI_Fint) PMPI_Comm_rank(MPI_Comm_f2c(*comm), &c_rank);
   *rank = (MPI_Fint) c_rank;
}


A.Chan
> 
> As you might notice there is no MPI_Comm (and believe me for Open MPI 
> 
> MPI_Comm is different than MPI_Fint), and there is no guarantee that 
> 
> the C int is the same as the Fortran int (looks weird but true).  
> Therefore, several conversions are required in order to be able to go 
> 
> from the Fortran layer into the C one.
> 
> As a result, a tool should never cross the language boundary by  
> itself. Instead it should call the pmpi function as provided by the  
> MPI library. This doesn't really fix the issue that started this email
>  
> thread, but at least clarify it a little bit.
> 
>george.
> 
> >
> > A.Chan
> >
> > - "Nick Wright"  wrote:
> >
> >> Hi
> >>
> >> I am trying to use the PMPI interface with OPENMPI to profile a
> >> fortran
> >> program.
> >>
> >> I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile
> switched
> >> on.
> >>
> >> The problem seems to be that if one eg. intercepts to call to
> >> mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this 
> 
> >> then
> >>
> >> calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.
> >>
> >> So if one wants to create a library that can profile C and Fortran
> >> codes
> >> at the same time one ends up intercepting the mpi call twice. Which
>  
> >> is
> >>
> >> not desirable and not what should happen (and indeed doesn't happen
>  
> >> in
> >>
> >> other MPI implementations).
> >>
> >> A simple example to illustrate is below. If somebody knows of a fix
>  
> >> to
> >>
> >> avoid this issue that would be great !
> >>
> >> Thanks
> >>
> >> Nick.
> >>
> >> pmpi_test.c: mpicc pmpi_test.c -c
> >>
> >> #include <stdio.h>
> >> #include "mpi.h"
> >> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> >>   printf("mpi_comm_rank call successfully intercepted\n");
> >>   pmpi_comm_rank_(comm,rank,info);
> >> }
> >> int MPI_Comm_rank(MPI_Comm comm, int *rank) {
> >>   printf("MPI_comm_rank call successfully intercepted\n");
> >>   PMPI_Comm_rank(comm,rank);
> >> }
> >>
> >> hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o
> >>
> >>   program hello
> >>implicit none
> >>include 'mpif.h'
> >>integer ierr
> >>integer myid,nprocs
> >>character*24 fdate,host
> >>call MPI_Init( ierr )
> >>   myid=0
> >>   call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
> >>   call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
> >>   call getenv('HOST',host)
> >>   write (*,*) 'Hello World from proc',myid,' out
> of',nprocs,host
> >>   call mpi_finalize(ierr)
> >>   end
> >>
> >>
> >>


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Anthony Chan
Hi Nick,

- "Nick Wright"  wrote:

> For the first case you describe, as OPENMPI is now, the call sequence
> 
> from fortran is
> 
> mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank
> 
> For the second case, as MPICH is now, its
> 
> mpi_comm_rank -> PMPI_Comm_rank
> 

AFAIK, all known/popular MPI implementations' fortran binding 
layers are implemented with C MPI functions, including
MPICH2 and OpenMPI.   If MPICH2's fortran layer were implemented 
the way you said, typical profiling tools, including MPE, would
fail to work with fortran applications.

e.g. check mpich2-xxx/src/binding/f77/sendf.c.

A.Chan

> So for the first case if I have a pure fortran/C++ code I have to 
> profile at the C interface.
> 
> So is the patch now retracted ?
> 
> Nick.
> 
> > I think you have an incorrect definition of "correctly" :). 
> According 
> > to the MPI standard, an MPI implementation is free to either layer 
> > language bindings (and only allow profiling at the lowest layer)  or
> not
> > layer the language bindings (and require profiling libraries
> intercept 
> > each language).  The only requirement is that the implementation 
> > document what it has done.
> > 
> > Since everyone is pretty clear on what Open MPI has done, I don't
> think 
> > you can claim Open MPI is doing it "incorrectly".  Different from
> MPICH 
> > is not necessarily incorrect.  (BTW, LAM/MPI handles profiling the
> same 
> > way as Open MPI).
> > 
> > Brian
> > 
> > On Fri, 5 Dec 2008, Nick Wright wrote:
> > 
> >> Hi Antony
> >>
> >> That will work yes, but its not portable to other MPI's that do 
> >> implement the profiling layer correctly unfortunately.
> >>
> >> I guess we will just need to detect that we are using openmpi when
> our 
> >> tool is configured and add some macros to deal with that
> accordingly. 
> >> Is there an easy way to do this built into openmpi?
> >>
> >> Thanks
> >>
> >> Nick.
> >>
> >> Anthony Chan wrote:
> >>> Hope I didn't misunderstand your question.  If you implement
> >>> your profiling library in C where you do your real
> instrumentation,
> >>> you don't need to implement the fortran layer, you can simply
> link
> >>> with Fortran to C MPI wrapper library -lmpi_f77. i.e.
> >>>
> >>> /bin/mpif77 -o foo foo.f -L/lib -lmpi_f77
> -lYourProfClib
> >>>
> >>> where libYourProfClib.a is your profiling tool written in C. If
> you 
> >>> don't want to intercept the MPI call twice for fortran program,
> >>> you need to implement the fortran layer.  In that case, I would think
> you
> >>> can just call C version of PMPI_xxx directly from your fortran
> layer, 
> >>> e.g.
> >>>
> >>> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> >>> printf("mpi_comm_rank call successfully intercepted\n");
> >>> *info = PMPI_Comm_rank(comm,rank);
> >>> }
> >>>
> >>> A.Chan
> >>>
> >>> - "Nick Wright"  wrote:
> >>>
>  Hi
> 
>  I am trying to use the PMPI interface with OPENMPI to profile a
>  fortran program.
> 
>  I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile
> switched
>  on.
> 
>  The problem seems to be that if one eg. intercepts to call to 
>  mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this
> then
> 
>  calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it
> should.
> 
>  So if one wants to create a library that can profile C and
> Fortran
>  codes at the same time one ends up intercepting the mpi call
> twice. 
>  Which is
> 
>  not desirable and not what should happen (and indeed doesn't
> happen in
> 
>  other MPI implementations).
> 
>  A simple example to illustrate is below. If somebody knows of a
> fix to
> 
>  avoid this issue that would be great !
> 
>  Thanks
> 
>  Nick.
> 
>  pmpi_test.c: mpicc pmpi_test.c -c
> 
>  #include <stdio.h>
>  #include "mpi.h"
>  void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> printf("mpi_comm_rank call successfully intercepted\n");
> pmpi_comm_rank_(comm,rank,info);
>  }
>  int MPI_Comm_rank(MPI_Comm comm, int *rank) {
> printf("MPI_comm_rank call successfully intercepted\n");
> PMPI_Comm_rank(comm,rank);
>  }
> 
>  hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o
> 
> program hello
>  implicit none
>  include 'mpif.h'
>  integer ierr
>  integer myid,nprocs
>  character*24 fdate,host
>  call MPI_Init( ierr )
> myid=0
> call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
> call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
> call getenv('HOST',host)
> write (*,*) 'Hello World from proc',myid,' out
> of',nprocs,host
> call mpi_finalize(ierr)
> end
> 
> 
> 

Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Anthony Chan
Hi Nick,

- "Nick Wright"  wrote:

> Hi Antony
> 
> That will work yes, but its not portable to other MPI's that do 
> implement the profiling layer correctly unfortunately.

I guess I must have missed something here.  What is not portable ?

> 
> I guess we will just need to detect that we are using openmpi when our
> 
> tool is configured and add some macros to deal with that accordingly.
> Is 
> there an easy way to do this built into openmpi?

MPE by default provides a fortran to C wrapper library; that way the user
does not have to know about the MPI implementation's fortran to C layer.
The MPE user can specify the implementation's fortran to C layer
during MPE configure.

Since an MPI implementation's fortran to C library does not change often,
writing a configure test to check for libmpi_f77.*, libfmpich.*,
or libfmpi.* should get you covered for most platforms.

A.Chan
> 
> Thanks
> 
> Nick.
> 
> Anthony Chan wrote:
> > Hope I didn't misunderstand your question.  If you implement
> > your profiling library in C where you do your real instrumentation,
> > you don't need to implement the fortran layer, you can simply link
> > with Fortran to C MPI wrapper library -lmpi_f77. i.e.
> > 
> > /bin/mpif77 -o foo foo.f -L/lib -lmpi_f77
> -lYourProfClib
> > 
> > where libYourProfClib.a is your profiling tool written in C. 
> > If you don't want to intercept the MPI call twice for fortran
> program,
> > you need to implement the fortran layer.  In that case, I would think
> you
> > can just call C version of PMPI_xxx directly from your fortran
> layer, e.g.
> > 
> > void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> > printf("mpi_comm_rank call successfully intercepted\n");
> > *info = PMPI_Comm_rank(comm,rank);
> > }
> > 
> > A.Chan
> > 
> > - "Nick Wright"  wrote:
> > 
> >> Hi
> >>
> >> I am trying to use the PMPI interface with OPENMPI to profile a
> >> fortran 
> >> program.
> >>
> >> I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile
> switched
> >> on.
> >>
> >> The problem seems to be that if one eg. intercepts to call to 
> >> mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this
> then
> >>
> >> calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.
> >>
> >> So if one wants to create a library that can profile C and Fortran
> >> codes 
> >> at the same time one ends up intercepting the mpi call twice. Which
> is
> >>
> >> not desirable and not what should happen (and indeed doesn't happen
> in
> >>
> >> other MPI implementations).
> >>
> >> A simple example to illustrate is below. If somebody knows of a fix
> to
> >>
> >> avoid this issue that would be great !
> >>
> >> Thanks
> >>
> >> Nick.
> >>
> >> pmpi_test.c: mpicc pmpi_test.c -c
> >>
> >> #include <stdio.h>
> >> #include "mpi.h"
> >> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
> >>printf("mpi_comm_rank call successfully intercepted\n");
> >>pmpi_comm_rank_(comm,rank,info);
> >> }
> >> int MPI_Comm_rank(MPI_Comm comm, int *rank) {
> >>printf("MPI_comm_rank call successfully intercepted\n");
> >>PMPI_Comm_rank(comm,rank);
> >> }
> >>
> >> hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o
> >>
> >>program hello
> >> implicit none
> >> include 'mpif.h'
> >> integer ierr
> >> integer myid,nprocs
> >> character*24 fdate,host
> >> call MPI_Init( ierr )
> >>myid=0
> >>call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
> >>call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
> >>call getenv('HOST',host)
> >>write (*,*) 'Hello World from proc',myid,' out
> of',nprocs,host
> >>call mpi_finalize(ierr)
> >>end
> >>
> >>
> >>


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread George Bosilca
After spending a few hours pondering this problem, we came to the  
conclusion that the best approach is to keep what we had before (i.e.  
the original approach). This means I'll undo my patch in the trunk,  
and not change the behavior in the next releases (1.3 and 1.2.9). This  
approach, while different from other MPI implementations, is as legal  
as possible from the MPI standard's point of view. Any suggestions on  
this topic, or about the inconsistent behavior between MPI  
implementations, should be directed to the MPI Forum Tools group for  
further evaluation.


The main reason for this is being nice to tool developers. In the  
current incarnation, they can either catch the Fortran calls or the C  
calls. If they provide both, then they will have to figure out how to  
cope with the double calls (as your example highlights).


Here is the behavior Open MPI will stick to:
Fortran MPI  -> C MPI
Fortran PMPI -> C MPI
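
To illustrate the consequence for a tool under this mapping, here is a
minimal, hedged sketch (my own illustration, not Open MPI code) of
intercepting both layers without double counting. The profiling hook
prof_count_call() is a hypothetical placeholder, the single-underscore
Fortran name mangling is an assumption, and threading is ignored for
brevity:

#include <stdio.h>
#include "mpi.h"

/* Library-provided Fortran PMPI entry (name mangling assumed). */
void pmpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr);

static int in_fortran_wrapper = 0;

static void prof_count_call(const char *name)   /* hypothetical hook */
{
    printf("profiled: %s\n", name);
}

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr)
{
    prof_count_call("MPI_Comm_rank (Fortran)");
    in_fortran_wrapper = 1;
    /* With the mapping above, this re-enters the C MPI_Comm_rank
       wrapper below, hence the flag. */
    pmpi_comm_rank_(comm, rank, ierr);
    in_fortran_wrapper = 0;
}

int MPI_Comm_rank(MPI_Comm comm, int *rank)
{
    if (!in_fortran_wrapper)
        prof_count_call("MPI_Comm_rank (C)");
    return PMPI_Comm_rank(comm, rank);
}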

  george.

PS: There was another possible approach, which could avoid the double  
calls while preserving the tool-writer friendliness. This possible  
approach would do:

Fortran MPI  -> C MPI
Fortran PMPI -> C PMPI

Unfortunately, we would have to heavily modify all files in the Fortran  
interface layer in order to support this approach. We're too close to  
a major release to start such time-consuming work.


  george.

On Dec 5, 2008, at 13:27 , Nick Wright wrote:


Brian

Sorry I picked the wrong word there. I guess this is more  
complicated than I thought it was.


For the first case you describe, as OPENMPI is now, the call  
sequence from fortran is


mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank

For the second case, as MPICH is now, its

mpi_comm_rank -> PMPI_Comm_rank

So for the first case if I have a pure fortran/C++ code I have to  
profile at the C interface.


So is the patch now retracted ?

Nick.

I think you have an incorrect definition of "correctly" :).   
According to the MPI standard, an MPI implementation is free to  
either layer language bindings (and only allow profiling at the  
lowest layer)  or not
layer the language bindings (and require profiling libraries  
intercept each language).  The only requirement is that the  
implementation document what it has done.
Since everyone is pretty clear on what Open MPI has done, I don't  
think you can claim Open MPI is doing it "incorrectly".  Different  
from MPICH is not necessarily incorrect.  (BTW, LAM/MPI handles  
profiling the same way as Open MPI).

Brian
On Fri, 5 Dec 2008, Nick Wright wrote:

Hi Antony

That will work yes, but its not portable to other MPI's that do  
implement the profiling layer correctly unfortunately.


I guess we will just need to detect that we are using openmpi when  
our tool is configured and add some macros to deal with that  
accordingly. Is there an easy way to do this built into openmpi?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C where you do your real instrumentation,
you don't need to implement the fortran layer, you can simply link
with Fortran to C MPI wrapper library -lmpi_f77. i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib


where libYourProfClib.a is your profiling tool written in C. If  
you don't want to intercept the MPI call twice for fortran program,
you need to implement the fortran layer.  In that case, I would think  
you
can just call C version of PMPI_xxx directly from your fortran  
layer, e.g.


void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran program.

I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile  
switched

on.

The problem seems to be that if one eg. intercepts to call to  
mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_  
this then


calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes at the same time one ends up intercepting the mpi call  
twice. Which is


not desirable and not what should happen (and indeed doesn't  
happen in


other MPI implementations).

A simple example to illustrate is below. If somebody knows of a  
fix to


avoid this issue that would be great !

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_comm_rank call successfully intercepted\n");
  PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

  program hello
   implicit none
   include 'mpif.h'
   integer 

Re: [OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Justin
The reason I'd like to disable these eager buffers is to help detect the 
deadlock better.  I would not run with this for a normal run, but it 
would be useful for debugging.  If the deadlock is indeed due to our 
code, then disabling any shared buffers or eager sends would make that 
deadlock reproducible.  In addition, we might be able to lower the 
number of processors.  Right now, determining which processor 
deadlocks when we are using 8K cores and each processor has hundreds of 
messages sent out would be quite difficult.
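
To make the kind of hang being discussed concrete, here is a toy pattern
(a hedged sketch, not our application code) that completes as long as the
messages fit in the eager buffers but blocks once they cross the
rendezvous threshold:

#include <stdio.h>
#include "mpi.h"

int main(int argc, char **argv)
{
    static double buf[1 << 20];   /* large enough to force rendezvous */
    int rank, peer;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = rank ^ 1;              /* pair ranks 0-1, 2-3, ...;
                                     run with an even number of ranks */

    /* Both sides send first: above the eager limit, both block in
       MPI_Send waiting for a receive that is never posted. */
    MPI_Send(buf, 1 << 20, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD);
    MPI_Recv(buf, 1 << 20, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);

    printf("rank %d done\n", rank);
    MPI_Finalize();
    return 0;
}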


Thanks for your suggestions,
Justin
Brock Palen wrote:
OpenMPI has different eager limits for all the network types; on your 
system run:


ompi_info --param btl all

and look for the eager_limits
You can set these values to 0 using the syntax I showed you before. 
That would disable eager messages.

There might be a better way to disable eager messages.
Not sure why you would want to disable them, they are there for 
performance.


Maybe you would still see a deadlock even if every message were below the 
threshold. I think there is a limit on the number of eager messages a 
receiving CPU will accept. Not sure about that though; I still kind 
of doubt it.


Try tweaking your buffer sizes: make the openib btl eager limit the 
same as shared memory, and see if you get lockups between hosts and 
not just shared memory.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 2:10 PM, Justin wrote:

Thank you for this info.  I should add that our code tends to post a 
lot of sends prior to the other side posting receives.  This causes a 
lot of unexpected messages to exist.  Our code explicitly matches up 
all tags and processors (that is, we do not use MPI wild cards).  If 
we had a deadlock I would think we would see it regardless of 
whether or not we cross the rendezvous threshold.  I guess one way to 
test this would be to set this threshold to 0.  If it then deadlocks 
we would likely be able to track down the deadlock.  Are there 
any other parameters we can pass MPI that will turn off buffering?


Thanks,
Justin

Brock Palen wrote:
Whenever this happens we found the code to have a deadlock.  Users 
never saw it until they crossed the eager->rendezvous threshold.


Yes you can disable shared memory with:

mpirun --mca btl ^sm

Or you can try increasing the eager limit.

ompi_info --param btl sm

MCA btl: parameter "btl_sm_eager_limit" (current value:
  "4096")

You can modify this limit at run time,  I think (can't test it right 
now) it is just:


mpirun --mca btl_sm_eager_limit 40960

I think you can also, when tweaking these values, use env vars in place 
of putting it all on the mpirun line:


export OMPI_MCA_btl_sm_eager_limit=40960

See:
http://www.open-mpi.org/faq/?category=tuning


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 12:22 PM, Justin wrote:


Hi,

We are currently using OpenMPI 1.3 on Ranger for large processor 
jobs (8K+).  Our code appears to be occasionally deadlocking at 
random within point-to-point communication (see stacktrace below).  
This code has been tested on many different MPI versions and as far 
as we know it does not contain a deadlock.  However, in the past we 
have run into problems with shared memory optimizations within MPI 
causing deadlocks.  We can usually avoid these by setting a few 
environment variables to either increase the size of shared memory 
buffers or disable shared memory optimizations altogether.  Does 
OpenMPI have any known deadlocks that might be causing our 
deadlocks?  If so, are there any workarounds?  Also, how do we disable 
shared memory within OpenMPI?


Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0



Thanks,
Justin




Re: [OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Brock Palen
OpenMPI has different eager limits for all the network types; on your  
system run:


ompi_info --param btl all

and look for the eager_limits
You can set these values to 0 using the syntax I showed you before.  
That would disable eager messages.

There might be a better way to disable eager messages.
Not sure why you would want to disable them, they are there for  
performance.


Maybe you would still see a deadlock even if every message were below the  
threshold. I think there is a limit on the number of eager messages a  
receiving CPU will accept. Not sure about that though; I still kind  
of doubt it.


Try tweaking your buffer sizes: make the openib btl eager limit the  
same as shared memory, and see if you get lockups between hosts and  
not just shared memory.


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 2:10 PM, Justin wrote:

Thank you for this info.  I should add that our code tends to post  
a lot of sends prior to the other side posting receives.  This  
causes a lot of unexpected messages to exist.  Our code explicitly  
matches up all tags and processors (that is, we do not use MPI wild  
cards).  If we had a deadlock I would think we would see it  
regardless of whether or not we cross the rendezvous threshold.  I  
guess one way to test this would be to set this threshold to 0.   
If it then deadlocks we would likely be able to track down the  
deadlock.  Are there any other parameters we can pass MPI that will  
turn off buffering?


Thanks,
Justin

Brock Palen wrote:
Whenever this happens we found the code to have a deadlock.   
Users never saw it until they crossed the eager->rendezvous threshold.


Yes you can disable shared memory with:

mpirun --mca btl ^sm

Or you can try increasing the eager limit.

ompi_info --param btl sm

MCA btl: parameter "btl_sm_eager_limit" (current value:
  "4096")

You can modify this limit at run time,  I think (can't test it  
right now) it is just:


mpirun --mca btl_sm_eager_limit 40960

I think you can also in tweaking these values use env Vars in  
place of putting it all in the mpirun line:


export OMPI_MCA_btl_sm_eager_limit=40960

See:
http://www.open-mpi.org/faq/?category=tuning


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 12:22 PM, Justin wrote:


Hi,

We are currently using OpenMPI 1.3 on Ranger for large processor  
jobs (8K+).  Our code appears to be occasionally deadlocking at  
random within point to point communication (see stacktrace  
below).  This code has been tested on many different MPI versions  
and as far as we know it does not contain a deadlock.  However,  
in the past we have ran into problems with shared memory  
optimizations within MPI causing deadlocks.  We can usually avoid  
these by setting a few environment variables to either increase  
the size of shared memory buffers or disable shared memory  
optimizations all together.   Does OpenMPI have any known  
deadlocks that might be causing our deadlocks?  If are there any  
work arounds?  Also how do we disable shared memory within OpenMPI?


Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from / 
opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from /opt/apps/ 
intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from /opt/apps/ 
intel10_1/openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from / 
opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from /opt/apps/ 
intel10_1/openmpi/1.3//lib/libmpi.so.0



Thanks,
Justin






Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread George Bosilca

On Dec 5, 2008, at 03:16 , Anthony Chan wrote:


void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}


Unfortunately this example is not correct. The real Fortran prototype  
for the MPI_Comm_rank function is

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr).

As you might notice there is no MPI_Comm (and believe me, for Open MPI  
MPI_Comm is different from MPI_Fint), and there is no guarantee that  
the C int is the same as the Fortran integer (looks weird but true).  
Therefore, several conversions are required in order to be able to go  
from the Fortran layer into the C one.


As a result, a tool should never cross the language boundary by  
itself. Instead it should call the pmpi function as provided by the  
MPI library. This doesn't really fix the issue that started this email  
thread, but at least it clarifies it a little bit.
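
A minimal sketch of what that advice looks like in practice (my own
illustration; the single-underscore name mangling is an assumption about
the Fortran compiler in use):

#include <stdio.h>
#include "mpi.h"

/* Library-provided Fortran PMPI entry point. */
void pmpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr);

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *ierr)
{
    printf("mpi_comm_rank (Fortran) intercepted\n");
    /* No MPI_Comm_f2c or int conversions here: the Fortran handles are
       passed straight through to the PMPI entry the MPI library
       provides, which does any conversions itself. */
    pmpi_comm_rank_(comm, rank, ierr);
}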


  george.



A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran
program.

I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile switched
on.

The problem seems to be that if one eg. intercepts to call to
mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this  
then


calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes
at the same time one ends up intercepting the mpi call twice. Which  
is


not desirable and not what should happen (and indeed doesn't happen  
in


other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix  
to


avoid this issue that would be great !

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_comm_rank call successfully intercepted\n");
  PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

  program hello
   implicit none
   include 'mpif.h'
   integer ierr
   integer myid,nprocs
   character*24 fdate,host
   call MPI_Init( ierr )
  myid=0
  call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
  call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
  call getenv('HOST',host)
  write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
  call mpi_finalize(ierr)
  end







Re: [OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Justin
Thank you for this info.  I should add that our code tends to post a lot 
of sends prior to the other side posting receives.  This causes a lot of 
unexpected messages to exist.  Our code explicitly matches up all tags 
and processors (that is, we do not use MPI wild cards).  If we had a 
deadlock I would think we would see it regardless of whether or not we cross 
the rendezvous threshold.  I guess one way to test this would be to 
set this threshold to 0.  If it then deadlocks we would likely be able 
to track down the deadlock.  Are there any other parameters we can pass 
MPI that will turn off buffering?


Thanks,
Justin

Brock Palen wrote:
Whenever this happens we found the code to have a deadlock.  Users 
never saw it until they crossed the eager->rendezvous threshold.


Yes you can disable shared memory with:

mpirun --mca btl ^sm

Or you can try increasing the eager limit.

ompi_info --param btl sm

MCA btl: parameter "btl_sm_eager_limit" (current value:
  "4096")

You can modify this limit at run time,  I think (can't test it right 
now) it is just:


mpirun --mca btl_sm_eager_limit 40960

I think you can also in tweaking these values use env Vars in place of 
putting it all in the mpirun line:


export OMPI_MCA_btl_sm_eager_limit=40960

See:
http://www.open-mpi.org/faq/?category=tuning


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 12:22 PM, Justin wrote:


Hi,

We are currently using OpenMPI 1.3 on Ranger for large processor jobs 
(8K+).  Our code appears to be occasionally deadlocking at random 
within point to point communication (see stacktrace below).  This 
code has been tested on many different MPI versions and as far as we 
know it does not contain a deadlock.  However, in the past we have 
ran into problems with shared memory optimizations within MPI causing 
deadlocks.  We can usually avoid these by setting a few environment 
variables to either increase the size of shared memory buffers or 
disable shared memory optimizations all together.   Does OpenMPI have 
any known deadlocks that might be causing our deadlocks?  If are 
there any work arounds?  Also how do we disable shared memory within 
OpenMPI?


Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0



Thanks,
Justin




Re: [OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Brock Palen
Whenever this happens we found the code to have a deadlock.  Users  
never saw it until they crossed the eager->rendezvous threshold.


Yes you can disable shared memory with:

mpirun --mca btl ^sm

Or you can try increasing the eager limit.

ompi_info --param btl sm

MCA btl: parameter "btl_sm_eager_limit" (current value:
  "4096")

You can modify this limit at run time,  I think (can't test it right  
now) it is just:


mpirun --mca btl_sm_eager_limit 40960

I think you can also, when tweaking these values, use env vars in place  
of putting it all on the mpirun line:


export OMPI_MCA_btl_sm_eager_limit=40960

See:
http://www.open-mpi.org/faq/?category=tuning


Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Dec 5, 2008, at 12:22 PM, Justin wrote:


Hi,

We are currently using OpenMPI 1.3 on Ranger for large processor  
jobs (8K+).  Our code appears to be occasionally deadlocking at  
random within point to point communication (see stacktrace below).   
This code has been tested on many different MPI versions and as far  
as we know it does not contain a deadlock.  However, in the past we  
have ran into problems with shared memory optimizations within MPI  
causing deadlocks.  We can usually avoid these by setting a few  
environment variables to either increase the size of shared memory  
buffers or disable shared memory optimizations all together.   Does  
OpenMPI have any known deadlocks that might be causing our  
deadlocks?  If are there any work arounds?  Also how do we disable  
shared memory within OpenMPI?


Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from / 
opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from /opt/apps/ 
intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from /opt/apps/intel10_1/ 
openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from / 
opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from /opt/apps/intel10_1/ 
openmpi/1.3//lib/libmpi.so.0



Thanks,
Justin
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users






Re: [OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Scott Atchley

On Dec 5, 2008, at 12:22 PM, Justin wrote:

Does OpenMPI have any known deadlocks that might be causing our  
deadlocks?


Known deadlocks, no. We are assisting a customer, however, with a  
deadlock that occurs in IMB Alltoall (and some other IMB tests) when  
using 128 hosts and the MX BTL. We have not yet determined if it is a  
problem with MX, the MX BTL, or something else.


Scott


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Nick Wright

Brian

Sorry I picked the wrong word there. I guess this is more complicated 
than I thought it was.


For the first case you describe, as OPENMPI is now, the call sequence 
from fortran is


mpi_comm_rank -> MPI_Comm_rank -> PMPI_Comm_rank

For the second case, as MPICH is now, its

mpi_comm_rank -> PMPI_Comm_rank

So for the first case if I have a pure fortran/C++ code I have to 
profile at the C interface.


So is the patch now retracted ?

Nick.

I think you have an incorrect definition of "correctly" :).  According 
to the MPI standard, an MPI implementation is free to either layer 
language bindings (and only allow profiling at the lowest layer)  or not
layer the language bindings (and require profiling libraries intercept 
each language).  The only requirement is that the implementation 
document what it has done.


Since everyone is pretty clear on what Open MPI has done, I don't think 
you can claim Open MPI is doing it "incorrectly".  Different from MPICH 
is not necessarily incorrect.  (BTW, LAM/MPI handles profiling the same 
way as Open MPI).


Brian

On Fri, 5 Dec 2008, Nick Wright wrote:


Hi Antony

That will work, yes, but it's not portable to other MPIs that do 
implement the profiling layer correctly, unfortunately.


I guess we will just need to detect that we are using openmpi when our 
tool is configured and add some macros to deal with that accordingly. 
Is there an easy way to do this built into openmpi?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C where you do your real instrumentation,
you don't need to implement the fortran layer, you can simply link
with Fortran to C MPI wrapper library -lmpi_f77. i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C. If you 
don't want to intercept the MPI call twice for a fortran program,
you need to implement the fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your fortran layer, 
e.g.


void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
printf("mpi_comm_rank call successfully intercepted\n");
*info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran program.

I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile switched
on.

The problem seems to be that if one eg. intercepts to call to 
mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this then


calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes at the same time one ends up intercepting the mpi call twice. 
Which is


not desirable and not what should happen (and indeed doesn't happen in

other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to

avoid this issue that would be great !

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
   printf("MPI_comm_rank call successfully intercepted\n");
   PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

   program hello
implicit none
include 'mpif.h'
integer ierr
integer myid,nprocs
character*24 fdate,host
call MPI_Init( ierr )
   myid=0
   call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
   call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
   call getenv('HOST',host)
   write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
   call mpi_finalize(ierr)
   end





Re: [OMPI users] Processor/core selection/affinity for large shared memory systems

2008-12-05 Thread V. Ram
Terry Frankcombe wrote:

> Isn't it up to the OS scheduler what gets run where?

I was under the impression that the processor affinity API was designed
to let the OS (at least Linux) know how a given task preferred to be
bound in terms of the system topology.
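
A minimal Linux-specific sketch (my own illustration) of the API being
referred to is sched_setaffinity(), where the process hands the kernel an
explicit mask of the cores it may run on:

#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    cpu_set_t mask;

    CPU_ZERO(&mask);
    CPU_SET(2, &mask);                  /* request core 2 only */

    if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
        perror("sched_setaffinity");
        return 1;
    }
    printf("pid %d pinned to core 2\n", (int) getpid());
    return 0;
}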
-- 
  V. Ram
  v_r_...@fastmail.fm

-- 
http://www.fastmail.fm - Accessible with your email software
  or over the web



Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Brian W. Barrett

Nick -

I think you have an incorrect definition of "correctly" :).  According to 
the MPI standard, an MPI implementation is free to either layer language 
bindings (and only allow profiling at the lowest layer) or not layer the 
language bindings (and require profiling libraries intercept each 
language).  The only requirement is that the implementation document what 
it has done.


Since everyone is pretty clear on what Open MPI has done, I don't think 
you can claim Open MPI is doing it "incorrectly".  Different from MPICH is 
not necessarily incorrect.  (BTW, LAM/MPI handles profiling the same way 
as Open MPI).


Brian

On Fri, 5 Dec 2008, Nick Wright wrote:


Hi Antony

That will work, yes, but it's not portable to other MPIs that do implement the 
profiling layer correctly, unfortunately.


I guess we will just need to detect that we are using openmpi when our tool 
is configured and add some macros to deal with that accordingly. Is there an 
easy way to do this built into openmpi?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C where you do your real instrumentation,
you don't need to implement the fortran layer, you can simply link
with Fortran to C MPI wrapper library -lmpi_f77. i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C. If you don't 
want to intercept the MPI call twice for a fortran program,
you need to implement the fortran layer.  In that case, I would think you
can just call C version of PMPI_xxx directly from your fortran layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
printf("mpi_comm_rank call successfully intercepted\n");
*info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran program.

I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile switched
on.

The problem seems to be that if one eg. intercepts to call to 
mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this then


calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes at the same time one ends up intercepting the mpi call twice. Which 
is


not desirable and not what should happen (and indeed doesn't happen in

other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to

avoid this issue that would be great !

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
   printf("MPI_comm_rank call successfully intercepted\n");
   PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

   program hello
implicit none
include 'mpif.h'
integer ierr
integer myid,nprocs
character*24 fdate,host
call MPI_Init( ierr )
   myid=0
   call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
   call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
   call getenv('HOST',host)
   write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
   call mpi_finalize(ierr)
   end







Re: [OMPI users] Processor/core selection/affinity for large shared memory systems

2008-12-05 Thread V. Ram
Ralph Castain wrote:

> Thanks - yes, that helps. Can you do add --display-map to you cmd
> line? That will tell us what mpirun thinks it is doing.

The output from display map is below.  Note that I've sanitized a few
items, but nothing relevant to this:

[granite:29685]  Map for job: 1  Generated by mapping mode: byslot
Starting vpid: 0Vpid range: 16  Num app_contexts: 1
Data for app_context: index 0   app: /path/to/executable
Num procs: 16
Argv[0]: /path/to/executable
Env[0]: OMPI_MCA_rmaps_base_display_map=1
Env[1]:

OMPI_MCA_orte_precondition_transports=e16b0004a956445e-0515b892592a4a02
Env[2]: OMPI_MCA_rds=proxy
Env[3]: OMPI_MCA_ras=proxy
Env[4]: OMPI_MCA_rmaps=proxy
Env[5]: OMPI_MCA_pls=proxy
Env[6]: OMPI_MCA_rmgr=proxy
Working dir: /home/user/case (user: 0)
Num maps: 0
Num elements in nodes list: 1
Mapped node:
Cell: 0 Nodename: graniteLaunch id: -1   Username:
NULL
Daemon name:
Data type: ORTE_PROCESS_NAMEData Value: NULL
Oversubscribed: TrueNum elements in procs list: 16
Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,0]
Proc Rank: 0Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,1]
Proc Rank: 1Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,2]
Proc Rank: 2Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,3]
Proc Rank: 3Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,4]
Proc Rank: 4Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,5]
Proc Rank: 5Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,6]
Proc Rank: 6Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,7]
Proc Rank: 7Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,8]
Proc Rank: 8Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,9]
Proc Rank: 9Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,10]
Proc Rank: 10   Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,11]
Proc Rank: 11   Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,12]
Proc Rank: 12   Proc PID: 0 App_context
index: 0

Mapped proc:
Proc Name:
Data type: ORTE_PROCESS_NAMEData Value:
[0,1,13]
Proc Rank: 13   

Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Nick Wright
I hope you are aware that *many* tools and applications actually profile 
the fortran MPI layer by intercepting the C function calls. This allows 
them to not have to deal with f2c translation of MPI objects and not 
worry about the name mangling issue. Would there be a way to have both 
options, e.g. as a configure flag? The current commit basically breaks 
all of these applications...


Edgar,

I haven't seen the fix so I can't comment on that.

Anyway, in general though this can't be true. Such a profiling tool 
would *only* work with openmpi if it were written that way today. I 
guess such a fix will break openmpi-specific tools (are there any?).


For MPICH, for example, one must provide a hook into e.g. mpi_comm_rank_, as 
that calls PMPI_Comm_rank (as it should), and thus if one were only 
intercepting C calls one would not see any fortran profiling information.


Nick.



George Bosilca wrote:

Nick,

Thanks for noticing this. It's unbelievable that nobody noticed that 
over the last 5 years. Anyway, I think we have a one line fix for this 
problem. I'll test it asap, and then push it in the 1.3.


  Thanks,
george.

On Dec 5, 2008, at 10:14 , Nick Wright wrote:


Hi Antony

That will work yes, but its not portable to other MPI's that do 
implement the profiling layer correctly unfortunately.


I guess we will just need to detect that we are using openmpi when 
our tool is configured and add some macros to deal with that 
accordingly. Is there an easy way to do this built into openmpi?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C where you do your real instrumentation,
you don't need to implement the fortran layer, you can simply link
with Fortran to C MPI wrapper library -lmpi_f77. i.e.
/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib
where libYourProfClib.a is your profiling tool written in C. If you 
don't want to intercept the MPI call twice for fortran program,

you need to implement the fortran layer.  In that case, I would think you
can just call C version of PMPI_xxx directly from your fortran 
layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}
A.Chan
- "Nick Wright"  wrote:

Hi

I am trying to use the PMPI interface with OPENMPI to profile a
fortran program.

I have tried with 1.28 and 1.3rc1 with --enable-mpi-profile switched
on.

The problem seems to be that if one eg. intercepts to call to 
mpi_comm_rank_ (the fortran hook) then calls pmpi_comm_rank_ this then


calls MPI_Comm_rank (the C hook) not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran
codes at the same time one ends up intercepting the mpi call twice. 
Which is


not desirable and not what should happen (and indeed doesn't happen in

other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to

avoid this issue that would be great !

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"
void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_comm_rank call successfully intercepted\n");
  PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

  program hello
   implicit none
   include 'mpif.h'
   integer ierr
   integer myid,nprocs
   character*24 fdate,host
   call MPI_Init( ierr )
  myid=0
  call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
  call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
  call getenv('HOST',host)
  write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
  call mpi_finalize(ierr)
  end







Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Jeff Squyres

On Dec 5, 2008, at 11:29 AM, Nick Wright wrote:


I think we can just look at OPEN_MPI as you say and then

OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION & OMPI_RELEASE_VERSION

from mpi.h, and if the version is less than 1.2.9, implement a workaround  
as Antony suggested. It's not the most elegant solution but it will  
work, I think?


Ya, that should work.
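
A hedged sketch of that detection (the OPEN_MPI and OMPI_*_VERSION macros
come from Open MPI's mpi.h, as in the message above; the
OMPI_NEEDS_C_LAYER_PROFILING flag is a hypothetical name the tool would
act on, and the version cut-off simply follows the message above):

#include "mpi.h"

#if defined(OPEN_MPI)
#  if (OMPI_MAJOR_VERSION < 1) || \
      (OMPI_MAJOR_VERSION == 1 && OMPI_MINOR_VERSION < 2) || \
      (OMPI_MAJOR_VERSION == 1 && OMPI_MINOR_VERSION == 2 && \
       OMPI_RELEASE_VERSION < 9)
#    define OMPI_NEEDS_C_LAYER_PROFILING 1
#  endif
#endif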

--
Jeff Squyres
Cisco Systems



[OMPI users] Deadlock on large numbers of processors

2008-12-05 Thread Justin

Hi,

We are currently using OpenMPI 1.3 on Ranger for large processor jobs 
(8K+).  Our code appears to be occasionally deadlocking at random within 
point-to-point communication (see stacktrace below).  This code has been 
tested on many different MPI versions and as far as we know it does not 
contain a deadlock.  However, in the past we have run into problems with 
shared memory optimizations within MPI causing deadlocks.  We can 
usually avoid these by setting a few environment variables to either 
increase the size of shared memory buffers or disable shared memory 
optimizations altogether.  Does OpenMPI have any known deadlocks that 
might be causing our deadlocks?  If so, are there any workarounds?  Also, 
how do we disable shared memory within OpenMPI?


Here is an example of where processors are hanging:

#0  0x2b2df3522683 in mca_btl_sm_component_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_btl_sm.so
#1  0x2b2df2cb46bf in mca_bml_r2_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/openmpi/mca_bml_r2.so
#2  0x2b2df0032ea4 in opal_progress () from 
/opt/apps/intel10_1/openmpi/1.3/lib/libopen-pal.so.0
#3  0x2b2ded0d7622 in ompi_request_default_wait_some () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0
#4  0x2b2ded109e34 in PMPI_Waitsome () from 
/opt/apps/intel10_1/openmpi/1.3//lib/libmpi.so.0



Thanks,
Justin


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Edgar Gabriel

George,

I hope you are aware that *many* tools and applications actually profile 
the Fortran MPI layer by intercepting the C function calls. This lets 
them avoid dealing with the f2c translation of MPI objects and with the 
name-mangling issue. Would there be a way to have both options, e.g. as a 
configure flag? The current commit basically breaks all of these 
applications...


Thanks
Edgar

George Bosilca wrote:

Nick,

Thanks for noticing this. It's unbelievable that nobody noticed it 
over the last 5 years. Anyway, I think we have a one-line fix for this 
problem. I'll test it ASAP and then push it into 1.3.


  Thanks,
george.

On Dec 5, 2008, at 10:14 , Nick Wright wrote:


Hi Antony

That will work, yes, but unfortunately it's not portable to other MPIs 
that do implement the profiling layer correctly.


I guess we will just need to detect that we are using Open MPI when our 
tool is configured and add some macros to deal with that accordingly. 
Is there an easy way to do this built into Open MPI?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C, where you do your real instrumentation,
you don't need to implement the Fortran layer; you can simply link
with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.
/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib
where libYourProfClib.a is your profiling tool written in C. If you 
don't want to intercept the MPI call twice for a Fortran program,
you need to implement the Fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your Fortran layer, 
e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}
A.Chan
- "Nick Wright"  wrote:

Hi

I am trying to use the PMPI interface with OPENMPI to profile a
Fortran program.

I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one, e.g., intercepts the call to
mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this in
turn calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran codes
at the same time, one ends up intercepting the MPI call twice, which is
not desirable and not what should happen (and indeed doesn't happen in
other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to
avoid this issue, that would be great!

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"

void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);  /* Fortran PMPI entry point */

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_Comm_rank call successfully intercepted\n");
  return PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Nick Wright

I think we can just look at OPEN_MPI as you say and then

OMPI_MAJOR_VERSION, OMPI_MINOR_VERSION & OMPI_RELEASE_VERSION

from mpi.h and, if the version is less than 1.2.9, implement a workaround as 
Antony suggested. It's not the most elegant solution, but it will work, I 
think?


Nick.

Jeff Squyres wrote:

On Dec 5, 2008, at 10:55 AM, David Skinner wrote:


FWIW, if that one-liner fix works (George and I just chatted about this
on the phone), we can probably also push it into v1.2.9.


great! thanks.



It occurs to me that this is likely not going to be enough for you, 
though.  :-\


Like it or not, there are still installed OMPIs out there that will show 
this old behavior.  Do you need to know / adapt for those?  If so, I can 
see two ways for you to figure it out:


1. At run time, do a simple call to (Fortran) MPI_INITIALIZED and see if 
you intercept it twice (both in Fortran and in C).


2. If that's not attractive, we can probably add a line into the 
ompi_info output that you can grep for when using OMPI (you can look for 
the OPEN_MPI macro from our mpi.h to know if it's Open MPI or not).  
Specifically, this line can be there for the "fixed" versions, and it 
simply won't be there for non-fixed versions.




Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Jeff Squyres

On Dec 5, 2008, at 10:55 AM, David Skinner wrote:

FWIW, if that one-liner fix works (George and I just chatted about  
this

on the phone), we can probably also push it into v1.2.9.


great! thanks.



It occurs to me that this is likely not going to be enough for you,  
though.  :-\


Like it or not, there are still installed OMPIs out there that will  
show this old behavior.  Do you need to know / adapt for those?  If  
so, I can see two ways for you to figure it out:


1. At run time, do a simple call to (Fortran) MPI_INITIALIZED and see  
if you intercept it twice (both in Fortran and in C).


2. If that's not attractive, we can probably add a line into the  
ompi_info output that you can grep for when using OMPI (you can look  
for the OPEN_MPI macro from our mpi.h to know if it's Open MPI or  
not).  Specifically, this line can be there for the "fixed" versions,  
and it simply won't be there for non-fixed versions.
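
A minimal sketch of option 1 (an illustration only: the counter and the
Fortran symbol names are assumptions, including the usual single trailing
underscore):

#include "mpi.h"

static int c_hits = 0;

extern void pmpi_initialized_(MPI_Fint *flag, MPI_Fint *ierr);

int MPI_Initialized(int *flag) {              /* our C wrapper */
    c_hits++;
    return PMPI_Initialized(flag);
}

void mpi_initialized_(MPI_Fint *flag, MPI_Fint *ierr) {   /* our Fortran wrapper */
    int before = c_hits;
    pmpi_initialized_(flag, ierr);            /* Fortran PMPI entry point */
    if (c_hits > before) {
        /* The Fortran bindings are layered on the C ones: this call was
           intercepted twice, so drop one of the two wrapper layers. */
    }
}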


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Jeff Squyres
FWIW, if that one-liner fix works (George and I just chatted about  
this on the phone), we can probably also push it into v1.2.9.



On Dec 5, 2008, at 10:49 AM, George Bosilca wrote:


Nick,

Thanks for noticing this. It's unbelievable that nobody noticed it  
over the last 5 years. Anyway, I think we have a one-line fix for  
this problem. I'll test it ASAP and then push it into 1.3.


 Thanks,
   george.

On Dec 5, 2008, at 10:14 , Nick Wright wrote:


Hi Antony

That will work, yes, but unfortunately it's not portable to other MPIs  
that do implement the profiling layer correctly.


I guess we will just need to detect that we are using Open MPI when  
our tool is configured and add some macros to deal with that  
accordingly. Is there an easy way to do this built into Open MPI?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C, where you do your real instrumentation,
you don't need to implement the Fortran layer; you can simply link
with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.
/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib
where libYourProfClib.a is your profiling tool written in C. If  
you don't want to intercept the MPI call twice for a Fortran program,
you need to implement the Fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your Fortran  
layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  *info = PMPI_Comm_rank(comm,rank);
}
A.Chan
- "Nick Wright"  wrote:

Hi

I am trying to use the PMPI interface with OPENMPI to profile a
Fortran program.

I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one, e.g., intercepts the call to
mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this in
turn calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran codes
at the same time, one ends up intercepting the MPI call twice, which is
not desirable and not what should happen (and indeed doesn't happen in
other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to
avoid this issue, that would be great!

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"

void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);  /* Fortran PMPI entry point */

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_Comm_rank call successfully intercepted\n");
  return PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Hybrid program

2008-12-05 Thread Jeff Squyres

Nifty -- good to know.  Thanks for looking into this!

Do any kernel-hacker types on this list know roughly in what version  
thread affinity was brought into the Linux kernel?


FWIW: all the same concepts here (using pid==0) should also work for  
PLPA, so you can set via socket/core, etc.



On Dec 5, 2008, at 10:45 AM, Edgar Gabriel wrote:

OK, so I dug a little deeper, and have some good news. Let me start  
with a set of routines that we didn't even discuss yet, but which  
work for setting thread affinity, and then discuss libnuma and  
sched_setaffinity() again.


---
On Linux systems, the pthread library has a set of routines to  
modify and determine thread-affinity-related information:


#define __USE_GNU

int pthread_setaffinity_np (pthread_t __th, size_t __cpusetsize,
   const cpu_set_t *__cpuset);
int pthread_getaffinity_np (pthread_t __th, size_t __cpusetsize,
   cpu_set_t *__cpuset)

These two routines can be used to modify the affinity of an existing  
thread. If you would like to modify the affinity of a thread  
*before* creating it, you can use a similar routine.


int pthread_attr_setaffinity_np (pthread_attr_t *__attr,
size_t __cpusetsize,
const cpu_set_t *__cpuset)

I tested the first two routines, and they did work for me.
---

Now to libnuma vs. sched_setaffinity(): after digging a little deeper  
in the libnuma sources, I realized that one of the differences between  
what they do and what I did in my test cases was that libnuma uses  
the sched_setaffinity() call with a pid of 0, instead of determining  
the pid using the getpid() function. According to the  
sched_setaffinity() manpage, a pid of zero means 'apply the new rules  
to the current process', but it does in fact mean 'to the current  
task/thread'. I wrote a set of tests where I used sched_setaffinity()  
with pid zero, and I was indeed able to modify the affinity of an  
individual thread using sched_setaffinity(). If you pass in the pid of  
the process, it will affect the affinity of all threads of that process.


Bottom line is, you can modify the affinity of a thread using both  
libnuma on a per socket basis and the sched_setaffinity() calls on a  
per core basis. Alternatively, you can use the  
pthread_setaffinity_np() function to modify the affinity of a thread  
using a cpu_set_t similar to sched_setaffinity.
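
For concreteness, a small sketch of the two per-thread calls described
above (Linux-specific; the helper name is just an illustration):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single core. */
static int pin_calling_thread(int core)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core, &set);

    /* pid 0 applies to the calling task/thread, not the whole process */
    if (sched_setaffinity(0, sizeof(set), &set) != 0)
        return -1;

    /* equivalent, via the pthread interface */
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}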


Thanks
Edgar


Jeff Squyres wrote:
Fair enough; let me know what you find.  It would be good to  
understand exactly why you're seeing what you're seeing...

On Dec 2, 2008, at 5:47 PM, Edgar Gabriel wrote:
its on OpenSuSE 11 with kernel 2.6.25.11. I don't know the libnuma  
library version, but I suspect that its fairly new.


I will try to investigate that in the next days a little more. I  
do think that they use sched_setaffinity() underneath the hood  
(because in one of my failed attempts when I passed in the wrong  
argument, I got actually the same error message that I got earlier  
with sched_setaffinity), but they must do something additionally  
underneath.


Anyway, I just wanted to report the result, and that there is  
obviously a difference, even if can't explain it right now in  
details.


Thanks
Edgar

Jeff Squyres wrote:

On Dec 2, 2008, at 11:27 AM, Edgar Gabriel wrote:
so I ran a couple of tests today and I cannot confirm your  
statement. I wrote a simple test code where a process first sets an  
affinity mask and then spawns a number of threads. The threads modify  
the affinity mask, and every thread (including the master thread)  
prints out its affinity mask at the end.


With sched_getaffinity() and sched_setaffinity() it was indeed  
such that the master thread had the same affinity mask as the  
threads that it spawned. This means that the modification of the  
affinity mask by a new thread did in fact affect the master  
thread.


Executing the same code sequence using the libnuma calls, however,  
the master thread was not affected by the new affinity mask of the  
children. So clearly, libnuma must be doing something differently.
What distro/version of Linux are you using, and what version of  
libnuma?
Libnuma v2.0.x very definitely is just a wrapper around the  
syscall for sched_setaffinity().  I downloaded it from:

  ftp://oss.sgi.com/www/projects/libnuma/download


--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


--
Edgar Gabriel
Assistant Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science   

Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread George Bosilca

Nick,

Thanks for noticing this. It's unbelievable that nobody noticed it  
over the last 5 years. Anyway, I think we have a one-line fix for this  
problem. I'll test it ASAP and then push it into 1.3.


  Thanks,
george.

On Dec 5, 2008, at 10:14 , Nick Wright wrote:


Hi Antony

That will work, yes, but unfortunately it's not portable to other MPIs  
that do implement the profiling layer correctly.


I guess we will just need to detect that we are using Open MPI when  
our tool is configured and add some macros to deal with that  
accordingly. Is there an easy way to do this built into Open MPI?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C, where you do your real instrumentation,
you don't need to implement the Fortran layer; you can simply link
with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.
/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib
where libYourProfClib.a is your profiling tool written in C. If you  
don't want to intercept the MPI call twice for a Fortran program,
you need to implement the Fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your Fortran  
layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
   printf("mpi_comm_rank call successfully intercepted\n");
   *info = PMPI_Comm_rank(comm,rank);
}
A.Chan
- "Nick Wright"  wrote:

Hi

I am trying to use the PMPI interface with OPENMPI to profile a
Fortran program.

I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one, e.g., intercepts the call to
mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this in
turn calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran codes
at the same time, one ends up intercepting the MPI call twice, which is
not desirable and not what should happen (and indeed doesn't happen in
other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to
avoid this issue, that would be great!

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"

void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);  /* Fortran PMPI entry point */

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_Comm_rank call successfully intercepted\n");
  return PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()

2008-12-05 Thread Jeff Squyres

On Dec 5, 2008, at 10:33 AM, Jens wrote:


thanks a lot. This fixed a bug in my code.
I already like open-mpi for this :)


LOL!  Glad to help.  :-)

FWIW, we're working on new Fortran bindings for MPI-3 that fix some of  
the shortcomings of the F90 bindings.


--
Jeff Squyres
Cisco Systems



Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()

2008-12-05 Thread Jens
Hi Jeff,

thanks a lot. This fixed a bug in my code.
I already like open-mpi for this :)

Greetings
Jens

Jeff Squyres schrieb:
> These functions do exist in Open MPI, but your code is not quite
> correct.  Here's a new version that is correct:
> 
> -
> program main
> use mpi
> implicit none
> integer :: ierr, rank, size
> integer :: mpi1_val
> integer(kind = MPI_ADDRESS_KIND) :: mpi2_val
> logical :: attr_flag
> 
> call MPI_INIT(ierr)
> call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
> call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
> 
> call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, mpi2_val, attr_flag, ierr)
> call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, mpi1_val, attr_flag, ierr)
> 
> print *, "Hello, world, I am ", rank, " of ", size
> call MPI_FINALIZE(ierr)
> end
> -
> 
> Note three things:
> 
> 1. attr_flag is supposed to be of type logical, not integer
> 2. In MPI-1 (MPI_ATTR_GET) the type of the value is integer
> 3. In MPI-2 (MPI_COMM_GET_ATTR), the type of the value is
> integer(kind=MPI_ADDRESS_KIND)
> 
> F90 is strongly typed, so the F90 compiler is correct in claiming that
> functions of the signature you specified were not found.
> 
> Make sense?
> 
> I'm not sure why your original code works with MPICH2 -- perhaps they
> don't have F90 bindings for these functions, and therefore they're
> falling through to the F77 bindings (where no type checking is
> done)...?  If so, you're getting lucky that it works; perhaps
> sizeof(INTEGER) == sizeof(LOGICAL), and sizeof(INTEGER) ==
> sizeof(INTEGER(KIND=MPI_ADDRESS_KIND)).  That's a guess.
> 
> 
> 
> On Dec 5, 2008, at 4:49 AM, Jens wrote:
> 
>> Hi,
>>
>> I just switched from MPICH2 to openmpi because of sge-support, but I am
>> missing some mpi-functions for fortran 90.
>>
>> Does anyone know why
>> MPI_COMM_GET_ATTR()
>> MPI_ATTR_GET()
>> are not available? They work fine with MPICH2.
>>
>> I compiled openmpi 1.2.8/1.3rc on a clean CentOS 5.2 with GNU-compilers
>> and Intel 11.0. Both give me the same error:
>>
>> GNU:
>> Error: There is no specific subroutine for the generic 'mpi_attr_get'
>> at (1)
>>
>> Intel 11.0:
>> hello_f90.f90(22): error #6285: There is no matching specific subroutine
>> for this generic subroutine call.   [MPI_ATTR_GET]
>>call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)
>>
>> Any ideas ...?
>>
>> Greetings
>> Jens
>>
>> 
>> program main
>> use mpi
>> implicit none
>> integer :: ierr, rank, size
>> integer :: attr_val, attr_flag
>>
>> call MPI_INIT(ierr)
>> call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
>> call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)
>>
>> call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)
>> call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)
>>
>> print *, "Hello, world, I am ", rank, " of ", size
>> call MPI_FINALIZE(ierr)
>> end
>> ---
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Nick Wright

Hi Antony

That will work, yes, but unfortunately it's not portable to other MPIs 
that do implement the profiling layer correctly.


I guess we will just need to detect that we are using Open MPI when our 
tool is configured and add some macros to deal with that accordingly. Is 
there an easy way to do this built into Open MPI?


Thanks

Nick.

Anthony Chan wrote:

Hope I didn't misunderstand your question.  If you implement
your profiling library in C, where you do your real instrumentation,
you don't need to implement the Fortran layer; you can simply link
with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C. 
If you don't want to intercept the MPI call twice for a Fortran program,
you need to implement the Fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your Fortran layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
    printf("mpi_comm_rank call successfully intercepted\n");
    *info = PMPI_Comm_rank(comm,rank);
}

A.Chan

- "Nick Wright"  wrote:


Hi

I am trying to use the PMPI interface with OPENMPI to profile a
Fortran program.

I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.

The problem seems to be that if one, e.g., intercepts the call to
mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this in
turn calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.

So if one wants to create a library that can profile C and Fortran codes
at the same time, one ends up intercepting the MPI call twice, which is
not desirable and not what should happen (and indeed doesn't happen in
other MPI implementations).

A simple example to illustrate is below. If somebody knows of a fix to
avoid this issue, that would be great!

Thanks

Nick.

pmpi_test.c: mpicc pmpi_test.c -c

#include <stdio.h>
#include "mpi.h"

void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);  /* Fortran PMPI entry point */

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
  printf("mpi_comm_rank call successfully intercepted\n");
  pmpi_comm_rank_(comm,rank,info);
}
int MPI_Comm_rank(MPI_Comm comm, int *rank) {
  printf("MPI_Comm_rank call successfully intercepted\n");
  return PMPI_Comm_rank(comm,rank);
}

hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o

      program hello
      implicit none
      include 'mpif.h'
      integer ierr
      integer myid,nprocs
      character*24 fdate,host
      call MPI_Init( ierr )
      myid=0
      call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
      call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
      call getenv('HOST',host)
      write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
      call mpi_finalize(ierr)
      end



___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()

2008-12-05 Thread Jeff Squyres
These functions do exist in Open MPI, but your code is not quite  
correct.  Here's a new version that is correct:


-
program main
use mpi
implicit none
integer :: ierr, rank, size
integer :: mpi1_val
integer(kind = MPI_ADDRESS_KIND) :: mpi2_val
logical :: attr_flag

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, mpi2_val, attr_flag,  
ierr)

call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, mpi1_val, attr_flag, ierr)

print *, "Hello, world, I am ", rank, " of ", size
call MPI_FINALIZE(ierr)
end
-

Note three things:

1. attr_flag is supposed to be of type logical, not integer
2. In MPI-1 (MPI_ATTR_GET) the type of the value is integer
3. In MPI-2 (MPI_COMM_GET_ATTR), the type of the value is  
integer(kind=MPI_ADDRESS_KIND)


F90 is strongly typed, so the F90 compiler is correct in claiming that  
functions of the signature you specified were not found.


Make sense?

I'm not sure why your original code works with MPICH2 -- perhaps they  
don't have F90 bindings for these functions, and therefore they're  
falling through to the F77 bindings (where no type checking is  
done)...?  If so, you're getting lucky that it works; perhaps  
sizeof(INTEGER) == sizeof(LOGICAL), and sizeof(INTEGER) ==  
sizeof(INTEGER(KIND=MPI_ADDRESS_KIND)).  That's a guess.




On Dec 5, 2008, at 4:49 AM, Jens wrote:


Hi,

I just switched from MPICH2 to openmpi because of sge-support, but I  
am

missing some mpi-functions for fortran 90.

Does anyone know why
MPI_COMM_GET_ATTR()
MPI_ATTR_GET()
are not available? They work fine with MPICH2.

I compiled openmpi 1.2.8/1.3rc on a clean CentOS 5.2 with GNU- 
compilers

and Intel 11.0. Both give me the same error:

GNU:
Error: There is no specific subroutine for the generic  
'mpi_attr_get' at (1)


Intel 11.0:
hello_f90.f90(22): error #6285: There is no matching specific  
subroutine

for this generic subroutine call.   [MPI_ATTR_GET]
   call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag,  
ierr)


Any ideas ...?

Greetings
Jens


program main
use mpi
implicit none
integer :: ierr, rank, size
integer :: attr_val, attr_flag

call MPI_INIT(ierr)
call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag,  
ierr)

call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)

print *, "Hello, world, I am ", rank, " of ", size
call MPI_FINALIZE(ierr)
end
---
___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



--
Jeff Squyres
Cisco Systems



[OMPI users] MCA parameter

2008-12-05 Thread Yasmine Yacoub
Thank you for your response, and these are the details for my problem:

I have installed pwscf and then tried to run SCF calculations, but 
before getting the output I got this warning message:
 
 
WARNING: There are more than one active ports on host 'stallo-2.local', but the
default subnet GID prefix was detected on more than one of these
ports.  If these ports are connected to different physical IB
networks, this configuration will fail in Open MPI.  This version of
Open MPI requires that every physically separate IB subnet that is
used between connected MPI processes must have different subnet ID
values.
 
Please see this FAQ entry for more details:
 
  http://www.open-mpi.org/faq/?category=openfabrics#ofa-default-subnet-gid
 
NOTE: You can turn off this warning by setting the MCA parameter
  btl_openib_warn_default_gid_prefix to 0.
 
So the question is: how can I turn off this warning, or how can I change the 
MCA parameter from 1 to 0? Which command do I have to use? I have tried the 
link above, but it doesn't work; perhaps I'm not using the right command.
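
(A sketch of the usual ways to set an MCA parameter, not from this thread;
any one of them should silence the warning:)

mpirun --mca btl_openib_warn_default_gid_prefix 0 ...    # on the command line
export OMPI_MCA_btl_openib_warn_default_gid_prefix=0     # or in the environment
echo "btl_openib_warn_default_gid_prefix = 0" >> $HOME/.openmpi/mca-params.conf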
 
 
Thanks,


  

[OMPI users] Fortran90 functions missing: MPI_COMM_GET_ATTR / MPI_ATTR_GET()

2008-12-05 Thread Jens
Hi,

I just switched from MPICH2 to Open MPI because of SGE support, but I am
missing some MPI functions for Fortran 90.

Does anyone know why
 MPI_COMM_GET_ATTR()
 MPI_ATTR_GET()
are not available? They work fine with MPICH2.

I compiled openmpi 1.2.8/1.3rc on a clean CentOS 5.2 with GNU-compilers
and Intel 11.0. Both give me the same error:

GNU:
Error: There is no specific subroutine for the generic 'mpi_attr_get' at (1)

Intel 11.0:
hello_f90.f90(22): error #6285: There is no matching specific subroutine
for this generic subroutine call.   [MPI_ATTR_GET]
call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)

Any ideas ...?

Greetings
Jens


program main
 use mpi
 implicit none
 integer :: ierr, rank, size
 integer :: attr_val, attr_flag

 call MPI_INIT(ierr)
 call MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
 call MPI_COMM_SIZE(MPI_COMM_WORLD, size, ierr)

 call MPI_COMM_GET_ATTR(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)
 call MPI_ATTR_GET(MPI_COMM_WORLD, MPI_IO, attr_val, attr_flag, ierr)

 print *, "Hello, world, I am ", rank, " of ", size
 call MPI_FINALIZE(ierr)
end
---


Re: [OMPI users] Issue with Profiling Fortran code

2008-12-05 Thread Anthony Chan

Hope I didn't misunderstand your question.  If you implement
your profiling library in C, where you do your real instrumentation,
you don't need to implement the Fortran layer; you can simply link
with the Fortran-to-C MPI wrapper library -lmpi_f77, i.e.

/bin/mpif77 -o foo foo.f -L/lib -lmpi_f77 -lYourProfClib

where libYourProfClib.a is your profiling tool written in C.
If you don't want to intercept the MPI call twice for a Fortran program,
you need to implement the Fortran layer.  In that case, I would think you
can just call the C version of PMPI_xxx directly from your Fortran layer, e.g.

void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
    printf("mpi_comm_rank call successfully intercepted\n");
    *info = PMPI_Comm_rank(comm,rank);
}
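
A slightly more portable variant of the same idea (a sketch, not from the
original mail: it converts the Fortran handle with MPI_Comm_f2c and assumes
the usual trailing-underscore name mangling):

#include <stdio.h>
#include "mpi.h"

void mpi_comm_rank_(MPI_Fint *comm, MPI_Fint *rank, MPI_Fint *info) {
    MPI_Comm c_comm = MPI_Comm_f2c(*comm);   /* Fortran handle -> C handle */
    int c_rank;
    printf("mpi_comm_rank call successfully intercepted\n");
    *info = (MPI_Fint) PMPI_Comm_rank(c_comm, &c_rank);
    *rank = (MPI_Fint) c_rank;
}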

A.Chan

- "Nick Wright"  wrote:

> Hi
> 
> I am trying to use the PMPI interface with OPENMPI to profile a
> Fortran program.
> 
> I have tried with 1.2.8 and 1.3rc1 with --enable-mpi-profile switched on.
> 
> The problem seems to be that if one, e.g., intercepts the call to
> mpi_comm_rank_ (the Fortran hook) and then calls pmpi_comm_rank_, this in
> turn calls MPI_Comm_rank (the C hook), not PMPI_Comm_rank as it should.
> 
> So if one wants to create a library that can profile C and Fortran codes
> at the same time, one ends up intercepting the MPI call twice, which is
> not desirable and not what should happen (and indeed doesn't happen in
> other MPI implementations).
> 
> A simple example to illustrate is below. If somebody knows of a fix to
> avoid this issue, that would be great!
> 
> Thanks
> 
> Nick.
> 
> pmpi_test.c: mpicc pmpi_test.c -c
> 
> #include <stdio.h>
> #include "mpi.h"
> 
> void pmpi_comm_rank_(MPI_Comm *comm, int *rank, int *info);  /* Fortran PMPI entry point */
> 
> void mpi_comm_rank_(MPI_Comm *comm, int *rank, int *info) {
>    printf("mpi_comm_rank call successfully intercepted\n");
>    pmpi_comm_rank_(comm,rank,info);
> }
> int MPI_Comm_rank(MPI_Comm comm, int *rank) {
>    printf("MPI_Comm_rank call successfully intercepted\n");
>    return PMPI_Comm_rank(comm,rank);
> }
> 
> hello_mpi.f: mpif77 hello_mpi.f pmpi_test.o
> 
>       program hello
>       implicit none
>       include 'mpif.h'
>       integer ierr
>       integer myid,nprocs
>       character*24 fdate,host
>       call MPI_Init( ierr )
>       myid=0
>       call mpi_comm_rank(MPI_COMM_WORLD, myid, ierr )
>       call mpi_comm_size(MPI_COMM_WORLD , nprocs, ierr )
>       call getenv('HOST',host)
>       write (*,*) 'Hello World from proc',myid,' out of',nprocs,host
>       call mpi_finalize(ierr)
>       end
> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users