Hi Ken,

Could you post the output of your ompi_info?

I have PrgEnv-gnu/5.2.56 and gcc/4.9.2 loaded in my environment on the
NERSC system, and I used the following configure line:

./configure --enable-mpi-java --prefix=my_favorite_install_location

The general rule of thumb on Crays with master (though not with older
versions) is that a plain ./configure with an install prefix is all you
need for a vanilla build; no complicated platform files, etc., are
required.
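
A minimal sketch of the full sequence (the module name is from my NERSC
environment; the install prefix and make parallelism are just examples):

module load PrgEnv-gnu
./configure --prefix=$HOME/install/ompi-master
make -j 8 install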

As you're probably guessing, I'm going to say it works for me, at least up
to 68 slave ranks.

I do notice there's some glitch with the mapping of the ranks, though.
The binding logic seems to think the cores are oversubscribed even when
they should not be. I had to use the

--bind-to none

option on the command line once I asked for more than 22 slave ranks;
the Edison system has 24 cores/node.
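
For example, something like this (the executable name and rank count are
illustrative; --bind-to none is the relevant part):

mpirun -np 1 --bind-to none ./cpi-master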

Howard



2015-06-11 12:10 GMT-06:00 Leiter, Kenneth W CIV USARMY ARL (US)
<kenneth.w.leiter2....@mail.mil>:

> I will try on a non-Cray machine as well.
>
> - Ken
>
> -----Original Message-----
> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Howard Pritchard
> Sent: Thursday, June 11, 2015 12:21 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] orted segmentation fault in pmix on master
>
> Hello Ken,
>
> Could you give the details of the allocation request (qsub args) as well
> as the mpirun command line args? I'm trying to reproduce on the NERSC
> system.
>
> It would also be interesting, if you have access to a similar-size
> non-Cray cluster, to see whether you get the same problems there.
>
> Howard
>
>
> 2015-06-11 9:13 GMT-06:00 Ralph Castain <r...@open-mpi.org>:
>
>
>         I don’t have a Cray, but let me see if I can reproduce this on
>         something else.
>
>         > On Jun 11, 2015, at 7:26 AM, Leiter, Kenneth W CIV USARMY ARL
>         > (US) <kenneth.w.leiter2....@mail.mil> wrote:
>         >
>         > Hello,
>         >
>         > I am attempting to use the Open MPI development master for a
>         > code that uses dynamic process management (i.e. MPI_Comm_spawn)
>         > on our Cray XC40 at the Army Research Laboratory. After reading
>         > through the mailing list, I came to the conclusion that the
>         > master branch is the only hope for getting this to work on the
>         > newer Cray machines.
>         >
>         > To test, I am using the cpi-master.c / cpi-worker.c example.
>         > The test works when executing on a small number of processors,
>         > five or fewer, but begins to fail with segmentation faults in
>         > orted when using more processors. Even with five or fewer
>         > processors, I am spreading the computation across more than one
>         > node. I am using the Cray ugni btl through the ALPS scheduler.
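>         >
>         > (For reference, the parent's spawn call is essentially the
>         > following; this is a sketch in the spirit of the example, not
>         > its exact source:)
>         >
>         >     #include <mpi.h>
>         >
>         >     int main(int argc, char *argv[])
>         >     {
>         >         MPI_Comm workers;
>         >         MPI_Init(&argc, &argv);
>         >         /* spawn the workers; the binary path and count here
>         >            are illustrative */
>         >         MPI_Comm_spawn("./cpi-worker", MPI_ARGV_NULL, 4,
>         >                        MPI_INFO_NULL, 0, MPI_COMM_SELF,
>         >                        &workers, MPI_ERRCODES_IGNORE);
>         >         /* ... master and workers exchange their pieces of pi
>         >            over the intercommunicator ... */
>         >         MPI_Finalize();
>         >         return 0;
>         >     }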
>         >
>         > I get a core file from orted and have the seg fault tracked
>         > down to pmix_server_process_msgs.c:420, where req->proxy is
>         > NULL. I have tried reading the code to understand how this
>         > happens, but am unsure. I do see that in the if statement where
>         > I take the else branch, the other branch specifically checks
>         > "if (NULL == req->proxy)" - however, no such check is done in
>         > the else branch.
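>         >
>         > (Schematically, paraphrasing the structure rather than quoting
>         > the actual pmix source:)
>         >
>         >     if (/* first case */) {
>         >         if (NULL == req->proxy) {
>         >             /* NULL proxy is handled on this path */
>         >         }
>         >         /* ... */
>         >     } else {
>         >         /* no NULL check on this path, so when req->proxy is
>         >            NULL the dereference crashes orted */
>         >         /* ... req->proxy used here ... */
>         >     }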
>         >
>         > I have debug output dumped for the failing runs. I can provide
>         > the output along with ompi_info output and config.log to anyone
>         > who is interested.
>         >
>         > - Ken Leiter
>         >
