Re: [OMPI users] [EXTERNAL] strange pml error

2021-11-03 Thread Michael Di Domenico via users
this seemed to help me as well, so far at least.  still have a lot
more testing to do

On Tue, Nov 2, 2021 at 4:15 PM Shrader, David Lee  wrote:
>
> As a workaround for now, I have found that setting OMPI_MCA_pml=ucx seems to 
> get around this issue. I'm not sure why this works, but perhaps there is 
> different initialization that happens such that the offending device search 
> problem doesn't occur?
>
>
> Thanks,
>
> David
>
>
>
> 
> From: Shrader, David Lee
> Sent: Tuesday, November 2, 2021 2:09 PM
> To: Open MPI Users
> Cc: Michael Di Domenico
> Subject: Re: [EXTERNAL] [OMPI users] strange pml error
>
>
> I too have been getting this using 4.1.1, but not with the master nightly 
> tarballs from mid-October. I still have it on my to-do list to open a github 
> issue. The problem seems to come from device detection in the ucx pml: on 
> some ranks, it fails to find a device and thus the ucx pml disqualifies 
> itself. Which then just leaves the ob1 pml.
>
>
> Thanks,
>
> David
>
>
>
> 
> From: users  on behalf of Michael Di 
> Domenico via users 
> Sent: Tuesday, November 2, 2021 1:35 PM
> To: Open MPI Users
> Cc: Michael Di Domenico
> Subject: [EXTERNAL] [OMPI users] strange pml error
>
> fairly frequently, but not every time, when trying to run xhpl on a new
> machine i'm bumping into this.  it happens with a single node or
> multiple nodes
>
> node1 selected pml ob1, but peer on node1 selected pml ucx
>
> if i rerun the exact same command a few minutes later, it works fine.
> the machine is new and i'm the only one using it so there are no user
> conflicts
>
> the software stack is
>
> slurm 21.8.2.1
> ompi 4.1.1
> pmix 3.2.3
> ucx 1.9.0
>
> the hardware is HPE w/ mellanox edr cards (but i doubt that matters)
>
> any thoughts?


[OMPI users] strange pml error

2021-11-02 Thread Michael Di Domenico via users
fairly frequently, but not every time, when trying to run xhpl on a new
machine i'm bumping into this.  it happens with a single node or
multiple nodes

node1 selected pml ob1, but peer on node1 selected pml ucx

if i rerun the exact same command a few minutes later, it works fine.
the machine is new and i'm the only one using it so there are no user
conflicts

the software stack is

slurm 21.8.2.1
ompi 4.1.1
pmix 3.2.3
ucx 1.9.0

the hardware is HPE w/ mellanox edr cards (but i doubt that matters)

any thoughts?
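one way to see which pml each rank actually settled on (a hedged suggestion
using the standard mca verbosity parameters, nothing xhpl specific):

  mpirun --mca pml_base_verbose 10 -np 16 ./xhpl 2>&1 | grep -i select

running that a few times while the problem is intermittent should show
whether some ranks are occasionally falling back from ucx to ob1 at startup,
which is what the mismatch message implies.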


Re: [OMPI users] [EXTERNAL] building openshem on opa

2021-03-22 Thread Michael Di Domenico via users
On Mon, Mar 22, 2021 at 11:13 AM Pritchard Jr., Howard  wrote:
> https://github.com/Sandia-OpenSHMEM/SOS
> if you want to use OpenSHMEM over OPA.
> If you have lots of cycles for development work, you could write an OFI SPML 
> for the  OSHMEM component of Open MPI.

thanks, i am aware of the sandia version.  the devs in my organization
don't really use shmem, but there was a call for it recently.  i
hadn't even noticed shmem didn't build on our opa cluster.  for now we
have a smaller mellanox cluster they can build against.

my ability to code an spml is nil.  but if we had more interest from
the internal devs i'd certainly be willing to fund someone to do it..
:)


[OMPI users] building openshem on opa

2021-03-22 Thread Michael Di Domenico via users
i can build and run openmpi on an opa network just fine, but it turns
out building openshmem fails.  the message is "no spml found"

looking at the config log it looks like it tries to build spml ikrit
and ucx, which fail.  i turn ucx off because it doesn't support opa and
isn't needed.

so this message is really just a confirmation that openshmem and opa
are not capable of being built together, or did i do something wrong?

and, out of curiosity, does anyone know what kind of effort would be
involved in getting it to work?
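a hedged way to confirm what actually got built into the install is to ask
ompi_info, which lists the oshmem spml components alongside everything else:

  ompi_info | grep -i spml

if nothing comes back (or only ikrit/ucx show up and neither is usable on
opa), then oshmem really has no transport to select on that machine, which
matches the "no spml found" message.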


Re: [OMPI users] [EXTERNAL] Re: OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Michael Di Domenico via users
if you have OPA cards, for openmpi you only need --with-ofi; you don't
need psm/psm2/verbs/ucx.  but this assumes you're running a rhel based
distro and have installed the OPA fabric suite of software from
Intel/CornelisNetworks, which is what i have.  perhaps there's
something really odd in debian, or an incompatibility with the
older ofed drivers included with debian.  unfortunately i
don't have access to a debian system, so i can't be much more help

if i had to guess, totally pulling junk from the air, there's probably
something incompatible between PSM and OPA when running specifically on
debian (likely due to library versioning).  i don't know how common
that is, so it's not clear how fleshed out and tested it is
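for reference, a minimal configure sketch along those lines; the prefix and
ofi path are assumptions, not a recipe:

  ./configure --prefix=/opt/openmpi-4.0.5 \
      --with-ofi=/usr \
      --without-ucx --without-verbs --without-psm --without-psm2
  make -j all install

the idea is that libfabric's psm2 provider sits on top of the OPA stack, so
the psm/psm2/verbs/ucx components can be left out of the Open MPI build
entirely.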




On Wed, Jan 27, 2021 at 3:07 PM Patrick Begou via users
 wrote:
>
> Hi Howard and Michael
>
> first, many thanks for testing with my short application. Yes, when the
> test code runs fine it just shows the max RSS size of the rank 0 process.
> When it runs wrong it prints a message about each invalid value found.
>
> As I said, I have also deployed OpenMPI on various clusters (in the DELL
> data center at Austin) when I was testing some architectures a few months
> ago, and neither on AMD/Mellanox_IB nor on Intel/Omni-path did I get any
> problem. The goal was to run my tests with the same software stack and be
> sure I could deploy my software stack on the selected solution.
> But like your clusters (and my small local clusters), they were all running
> RedHat (or similar Linux flavors) and a modern GNU compiler (9 or 10).
> The university cluster I have access to is running Debian stretch and
> provides GCC 6 as the default compiler.
>
> I cannot ask for a different OS, but I can deploy a local gcc10 and
> build OpenMPI again.  UCX is not available on this cluster; should I
> deploy a local UCX too?
>
> Libpsm2 seems good:
> dahu103 : dpkg -l |grep psm
> ii  libfabric-psm          1.10.0-2-1ifs+deb9        amd64  Dynamic PSM provider for user-space Open Fabric Interfaces
> ii  libfabric-psm2         1.10.0-2-1ifs+deb9        amd64  Dynamic PSM2 provider for user-space Open Fabric Interfaces
> ii  libpsm-infinipath1     3.3-19-g67c0807-2ifs+deb9 amd64  PSM Messaging library for Intel Truescale adapters
> ii  libpsm-infinipath1-dev 3.3-19-g67c0807-2ifs+deb9 amd64  Development files for libpsm-infinipath1
> ii  libpsm2-2              11.2.185-1-1ifs+deb9      amd64  Intel PSM2 Libraries
> ii  libpsm2-2-compat       11.2.185-1-1ifs+deb9      amd64  Compat library for Intel PSM2
> ii  libpsm2-dev            11.2.185-1-1ifs+deb9      amd64  Development files for Intel PSM2
> ii  psmisc                 22.21-2.1+b2              amd64  utilities that use the proc file system
>
> This will be my next try to install OpenMPI on this cluster.
>
> Patrick
>
>
> > On 27/01/2021 at 18:09, Pritchard Jr., Howard via users wrote:
> > Hi Folks,
> >
> > I'm also have problems reproducing this on one of our OPA clusters:
> >
> > libpsm2-11.2.78-1.el7.x86_64
> > libpsm2-devel-11.2.78-1.el7.x86_64
> >
> > cluster runs RHEL 7.8
> >
> > hca_id:   hfi1_0
> >   transport:  InfiniBand (0)
> >   fw_ver: 1.27.0
> >   node_guid:  0011:7501:0179:e2d7
> >   sys_image_guid: 0011:7501:0179:e2d7
> >   vendor_id:  0x1175
> >   vendor_part_id: 9456
> >   hw_ver: 0x11
> >   board_id:   Intel Omni-Path Host Fabric Interface 
> > Adapter 100 Series
> >   phys_port_cnt:  1
> >   port:   1
> >   state:  PORT_ACTIVE (4)
> >   max_mtu:4096 (5)
> >   active_mtu: 4096 (5)
> >   sm_lid: 1
> >   port_lid:       99
> >   port_lmc:   0x00
> >   link_layer: InfiniBand
> >
> > using gcc/gfortran 9.3.0
> >
> > Built Open MPI 4.0.5 without any special configure options.
> >
> > Howard
> >
> > On 1/27/21, 9:47 AM, "users on behalf of Michael Di Domenico via users" 
> >  
> > wrote:
> >
> > for whatever it's worth running the test program on my OPA cluster
> > seems to work.  well it keeps spitting out [INFO MEMORY] lines, not
> > sure if it's supposed to stop at some point
> >
> > i'm running rhel7, gcc 10.1, openmpi 4.0.5rc2, with-ofi, 
> > without-{psm,ucx,verbs}
>

Re: [OMPI users] OpenMPI 4.0.5 error with Omni-path

2021-01-27 Thread Michael Di Domenico via users
for whatever it's worth running the test program on my OPA cluster
seems to work.  well it keeps spitting out [INFO MEMORY] lines, not
sure if it's supposed to stop at some point

i'm running rhel7, gcc 10.1, openmpi 4.0.5rc2, with-ofi, without-{psm,ucx,verbs}

On Tue, Jan 26, 2021 at 3:44 PM Patrick Begou via users
 wrote:
>
> Hi Michael
>
> indeed I'm a little bit lost with all these parameters in OpenMPI, mainly
> because for years it has worked just fine out of the box in all my deployments
> on various architectures, interconnects and linux flavors. Some weeks ago I
> deployed OpenMPI 4.0.5 on CentOS 8 with gcc10, slurm and UCX on an AMD epyc2
> cluster with ConnectX-6, and it just works fine.  It is the first time I've had
> such trouble deploying this library.
>
> If you look at my mail posted on 25/01/2021 in this discussion at 18h54 (maybe
> Paris TZ), there is a small test case attached that shows the problem. Did
> you get it or did the list strip the attachments? I can provide it again.
>
> Many thanks
>
> Patrick
>
> On 26/01/2021 at 19:25, Heinz, Michael William wrote:
>
> Patrick how are you using original PSM if you’re using Omni-Path hardware? 
> The original PSM was written for QLogic DDR and QDR Infiniband adapters.
>
> As far as needing openib - the issue is that the PSM2 MTL doesn’t support a 
> subset of MPI operations that we previously used the pt2pt BTL for. For 
> recent version of OMPI, the preferred BTL to use with PSM2 is OFI.
>
> Is there any chance you can give us a sample MPI app that reproduces the 
> problem? I can’t think of another way I can give you more help without being 
> able to see what’s going on. It’s always possible there’s a bug in the PSM2 
> MTL but it would be surprising at this point.
>
> Sent from my iPad
>
> On Jan 26, 2021, at 1:13 PM, Patrick Begou via users 
>  wrote:
>
> 
> Hi all,
>
> I ran many tests today. I saw that an older 4.0.2 version of OpenMPI packaged
> with Nix was running using openib. So I added the --with-verbs option to set up
> this module.
>
> What I can see now is that:
>
> mpirun -hostfile $OAR_NODEFILE  --mca mtl psm -mca btl_openib_allow_ib true 
> 
>
> - the testcase test_layout_array is running without error
>
> - the bandwidth measured with osu_bw is half of what it should be:
>
> # OSU MPI Bandwidth Test v5.7
> # Size  Bandwidth (MB/s)
> 1   0.54
> 2   1.13
> 4   2.26
> 8   4.51
> 16  9.06
> 32 17.93
> 64 33.87
> 128    69.29
> 256   161.24
> 512   333.82
> 1024  682.66
> 2048 1188.63
> 4096 1760.14
> 8192 2166.08
> 16384   2036.95
> 32768   3466.63
> 65536   6296.73
> 131072   7509.43
> 262144   9104.78
> 524288   6908.55
> 1048576  5530.37
> 2097152  4489.16
> 4194304  3498.14
>
> mpirun -hostfile $OAR_NODEFILE  --mca mtl psm2 -mca btl_openib_allow_ib true 
> ...
>
> - the testcase test_layout_array is not giving correct results
>
> - the bandwidth measured with osu_bw is the right one:
>
> # OSU MPI Bandwidth Test v5.7
> # Size  Bandwidth (MB/s)
> 1   3.73
> 2   7.96
> 4  15.82
> 8  31.22
> 16 51.52
> 32107.61
> 64196.51
> 128   438.66
> 256   817.70
> 512  1593.90
> 1024 2786.09
> 2048 4459.77
> 4096 6658.70
> 8192 8092.95
> 16384   8664.43
> 32768   8495.96
> 65536   11458.77
> 131072  12094.64
> 262144  11781.84
> 524288  12297.58
> 1048576 12346.92
> 2097152 12206.53
> 4194304 12167.00
>
> But yes, I know openib is deprecated too in 4.0.5.
>
> Patrick
>
>


[OMPI users] openmpi/pmix/ucx

2020-02-07 Thread Michael Di Domenico via users
i haven't compiled openmpi in a while, but i'm in the process of
upgrading our cluster.

the last time i did this there were specific versions of mpi/pmix/ucx
that were all tested and supposed to work together.  my understanding
was that this was because pmix/ucx were under rapid development and the
APIs were changing

is that still an issue or can i take the latest stable branches from
git for each and have a relatively good shot at it all working
together?

the one semi-immovable i have right now is ucx which is at 1.7.0 as
installed by mellanox ofed.  if the above is true, is there a matrix
of versions i should be using for all the others?  nothing jumped out
at me on the openmpi website
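one quick sanity check (hedged, since it only tells you about the installed
build, not about what combinations are officially tested):

  ompi_info | grep -i -e ucx -e pmix
  ucx_info -v

the first shows which ucx/pmix components this openmpi build actually
carries; the second, shipped with ucx itself, reports the ucx library
version so you can confirm what the mellanox ofed install really provides.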


Re: [OMPI users] local rank to rank comms

2019-03-20 Thread Michael Di Domenico
unfortunately it takes a while to export the data, but here's what i see

On Mon, Mar 11, 2019 at 11:02 PM Gilles Gouaillardet  wrote:
>
> Michael,
>
>
> this is odd, I will have a look.
>
> Can you confirm you are running on a single node ?
>
>
> At first, you need to understand which component is used by Open MPI for
> communications.
>
> There are several options here, and since I do not know how Open MPI was
> built, nor which dependencies are installed,
>
> I can only list a few
>
>
> - pml/cm uses mtl/psm2 => omnipath is used for both inter and intra node
> communications
>
> - pml/cm uses mtl/ofi => libfabric is used for both inter and intra node
> communications. it definitely uses libpsm2 for inter node
> communications, and I do not know enough about the internals to tell how
> intra node communications are handled
>
> - pml/ob1 is used; I guess it uses btl/ofi for inter node communications
> and btl/vader for intra node communications (in that case the NIC device
> is not used for intra node communications)
>
> there could be others I am missing (does UCX support OmniPath ? could
> btl/ofi also be used for intra node communications ?)
>
>
> mpirun --mca pml_base_verbose 10 --mca btl_base_verbose 10 --mca
> mtl_base_verbose 10 ...
>
> should tell you what is used (feel free to compress and post the full
> output if you have some hard time understanding the logs)
>
>
> Cheers,
>
>
> Gilles
>
> On 3/12/2019 1:41 AM, Michael Di Domenico wrote:
> > On Mon, Mar 11, 2019 at 12:09 PM Gilles Gouaillardet
> >  wrote:
> >> You can force
> >> mpirun --mca pml ob1 ...
> >> And btl/vader (shared memory) will be used for intra node communications 
> >> ... unless MPI tasks are from different jobs (read MPI_Comm_spawn())
> > if i run
> >
> > mpirun -n 16 IMB-MPI1 alltoallv
> > things run fine, 12us on average for all ranks
> >
> > if i run
> >
> > mpirun -n 16 --mca pml ob1 IMB-MPI1 alltoallv
> > the program runs, but then it hangs at "List of benchmarks to run:
> > #Alltoallv"  and no tests run


ompi.run.ob1
Description: Binary data


ompi.run.cm
Description: Binary data

Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 12:09 PM Gilles Gouaillardet
 wrote:
> You can force
> mpirun --mca pml ob1 ...
> And btl/vader (shared memory) will be used for intra node communications ... 
> unless MPI tasks are from different jobs (read MPI_Comm_spawn())

if i run

mpirun -n 16 IMB-MPI1 alltoallv
things run fine, 12us on average for all ranks

if i run

mpirun -n 16 --mca pml ob1 IMB-MPI1 alltoallv
the program runs, but then it hangs at "List of benchmarks to run:
#Alltoallv"  and no tests run


Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 12:19 PM Ralph H Castain  wrote:
> OFI uses libpsm2 underneath it when omnipath is detected
>
> > On Mar 11, 2019, at 9:06 AM, Gilles Gouaillardet 
> >  wrote:
> > It might show that pml/cm and mtl/psm2 are used. In that case, then yes, 
> > the OmniPath library is used even for intra node communications. If this 
> > library is optimized for intra node, then it will internally uses shared 
> > memory instead of the NIC.

would it be fair to assume that, if we assume the opa library is
optimized for intra-node using shared memory, there shouldn't be much
of a difference between the opa library and the ompi library for local
rank to rank comms

is there a way or tool to measure that?  i'd like to run the tests
toggling opa vs ompi libraries and see whether, and really how much of,
a difference there is
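one rough way to measure it (a sketch built from the mca switches already
mentioned in this thread) is to pin two ranks on the same node and run the
same benchmark over each path:

  # shared-memory path: ob1 pml with the vader btl
  mpirun -np 2 --bind-to core --mca pml ob1 --mca btl self,vader ./osu_latency

  # omnipath library path: cm pml with the psm2 mtl
  mpirun -np 2 --bind-to core --mca pml cm --mca mtl psm2 ./osu_latency

if the psm2 numbers track the vader numbers closely, the opa library is
presumably taking its own shared-memory shortcut for on-node peers; a large
gap would mean the traffic really is going through the hfi.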


Re: [OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
On Mon, Mar 11, 2019 at 11:51 AM Ralph H Castain  wrote:
> You are probably using the ofi mtl - could be psm2 uses loopback method?

according to ompi_info i do in fact have the ofi, psm, and psm2 mtl's.  i
haven't changed any of the defaults, so are you saying that in order to
change the behaviour i have to run mpirun --mca mtl psm2?  if true, what's
the recourse to not using the ofi mtl?


[OMPI users] local rank to rank comms

2019-03-11 Thread Michael Di Domenico
i have a user that's claiming when two ranks on the same node want to
talk with each other, they're using the NIC to talk rather than just
talking directly.

i've never had to test such a scenario.  is there a way for me to
prove one way or another whether two ranks are talking through say the
kernel (or however it actually works) or using the nic?

i didn't set any flags when i compiled openmpi to change this.

i'm running ompi 3.1, pmix 2.2.1, and slurm 18.05 running atop omnipath


Re: [OMPI users] pmix and srun

2019-01-18 Thread Michael Di Domenico
seems to be better now.  jobs are running

On Fri, Jan 18, 2019 at 6:17 PM Ralph H Castain  wrote:
>
> I have pushed a fix to the v2.2 branch - could you please confirm it?
>
>
> > On Jan 18, 2019, at 2:23 PM, Ralph H Castain  wrote:
> >
> > Aha - I found it. It’s a typo in the v2.2.1 release. Sadly, our Slurm 
> > plugin folks seem to be off somewhere for awhile and haven’t been testing 
> > it. Sigh.
> >
> > I’ll patch the branch and let you know - we’d appreciate the feedback.
> > Ralph
> >
> >
> >> On Jan 18, 2019, at 2:09 PM, Michael Di Domenico  
> >> wrote:
> >>
> >> here's the branches i'm using.  i did a git clone on the repo's and
> >> then a git checkout
> >>
> >> [ec2-user@labhead bin]$ cd /hpc/src/pmix/
> >> [ec2-user@labhead pmix]$ git branch
> >> master
> >> * v2.2
> >> [ec2-user@labhead pmix]$ cd ../slurm/
> >> [ec2-user@labhead slurm]$ git branch
> >> * (detached from origin/slurm-18.08)
> >> master
> >> [ec2-user@labhead slurm]$ cd ../ompi/
> >> [ec2-user@labhead ompi]$ git branch
> >> * (detached from origin/v3.1.x)
> >> master
> >>
> >>
> >> attached is the debug out from the run with the debugging turned on
> >>
> >> On Fri, Jan 18, 2019 at 4:30 PM Ralph H Castain  wrote:
> >>>
> >>> Looks strange. I’m pretty sure Mellanox didn’t implement the event 
> >>> notification system in the Slurm plugin, but you should only be trying to 
> >>> call it if OMPI is registering a system-level event code - which OMPI 3.1 
> >>> definitely doesn’t do.
> >>>
> >>> If you are using PMIx v2.2.0, then please note that there is a bug in it 
> >>> that slipped through our automated testing. I replaced it today with 
> >>> v2.2.1 - you probably should update if that’s the case. However, that 
> >>> wouldn’t necessarily explain this behavior. I’m not that familiar with 
> >>> the Slurm plugin, but you might try adding
> >>>
> >>> PMIX_MCA_pmix_client_event_verbose=5
> >>> PMIX_MCA_pmix_server_event_verbose=5
> >>> OMPI_MCA_pmix_base_verbose=10
> >>>
> >>> to your environment and see if that provides anything useful.
> >>>
> >>>> On Jan 18, 2019, at 12:09 PM, Michael Di Domenico 
> >>>>  wrote:
> >>>>
> >>>> i compilied pmix slurm openmpi
> >>>>
> >>>> ---pmix
> >>>> ./configure --prefix=/hpc/pmix/2.2 --with-munge=/hpc/munge/0.5.13
> >>>> --disable-debug
> >>>> ---slurm
> >>>> ./configure --prefix=/hpc/slurm/18.08 --with-munge=/hpc/munge/0.5.13
> >>>> --with-pmix=/hpc/pmix/2.2
> >>>> ---openmpi
> >>>> ./configure --prefix=/hpc/ompi/3.1 --with-hwloc=external
> >>>> --with-libevent=external --with-slurm=/hpc/slurm/18.08
> >>>> --with-pmix=/hpc/pmix/2.2
> >>>>
> >>>> everything seemed to compile fine, but when i do an srun i get the
> >>>> below errors, however, if i salloc and then mpirun it seems to work
> >>>> fine.  i'm not quite sure where the breakdown is or how to debug it
> >>>>
> >>>> ---
> >>>>
> >>>> [ec2-user@labcmp1 linux]$ srun --mpi=pmix_v2 -n 16 xhpl
> >>>> [labcmp6:18353] PMIX ERROR: NOT-SUPPORTED in file
> >>>> event/pmix_event_registration.c at line 101
> >>>> [labcmp6:18355] PMIX ERROR: NOT-SUPPORTED in file
> >>>> event/pmix_event_registration.c at line 101
> >>>> [labcmp5:18355] PMIX ERROR: NOT-SUPPORTED in file
> >>>> event/pmix_event_registration.c at line 101
> >>>> --
> >>>> It looks like MPI_INIT failed for some reason; your parallel process is
> >>>> likely to abort.  There are many reasons that a parallel process can
> >>>> fail during MPI_INIT; some of which are due to configuration or 
> >>>> environment
> >>>> problems.  This failure appears to be an internal failure; here's some
> >>>> additional information (which may only be relevant to an Open MPI
> >>>> developer):
> >>>>
> >>>> ompi_interlib_declare
> >>>> --> Returned "Would block" (-10) instead of "Success" (0)
> >>>> ...snipped.

Re: [OMPI users] Fwd: pmix and srun

2019-01-18 Thread Michael Di Domenico
here's the branches i'm using.  i did a git clone on the repo's and
then a git checkout

[ec2-user@labhead bin]$ cd /hpc/src/pmix/
[ec2-user@labhead pmix]$ git branch
  master
* v2.2
[ec2-user@labhead pmix]$ cd ../slurm/
[ec2-user@labhead slurm]$ git branch
* (detached from origin/slurm-18.08)
  master
[ec2-user@labhead slurm]$ cd ../ompi/
[ec2-user@labhead ompi]$ git branch
* (detached from origin/v3.1.x)
  master


attached is the debug out from the run with the debugging turned on

On Fri, Jan 18, 2019 at 4:30 PM Ralph H Castain  wrote:
>
> Looks strange. I’m pretty sure Mellanox didn’t implement the event 
> notification system in the Slurm plugin, but you should only be trying to 
> call it if OMPI is registering a system-level event code - which OMPI 3.1 
> definitely doesn’t do.
>
> If you are using PMIx v2.2.0, then please note that there is a bug in it that 
> slipped through our automated testing. I replaced it today with v2.2.1 - you 
> probably should update if that’s the case. However, that wouldn’t necessarily 
> explain this behavior. I’m not that familiar with the Slurm plugin, but you 
> might try adding
>
> PMIX_MCA_pmix_client_event_verbose=5
> PMIX_MCA_pmix_server_event_verbose=5
> OMPI_MCA_pmix_base_verbose=10
>
> to your environment and see if that provides anything useful.
>
> > On Jan 18, 2019, at 12:09 PM, Michael Di Domenico  
> > wrote:
> >
> > i compilied pmix slurm openmpi
> >
> > ---pmix
> > ./configure --prefix=/hpc/pmix/2.2 --with-munge=/hpc/munge/0.5.13
> > --disable-debug
> > ---slurm
> > ./configure --prefix=/hpc/slurm/18.08 --with-munge=/hpc/munge/0.5.13
> > --with-pmix=/hpc/pmix/2.2
> > ---openmpi
> > ./configure --prefix=/hpc/ompi/3.1 --with-hwloc=external
> > --with-libevent=external --with-slurm=/hpc/slurm/18.08
> > --with-pmix=/hpc/pmix/2.2
> >
> > everything seemed to compile fine, but when i do an srun i get the
> > below errors, however, if i salloc and then mpirun it seems to work
> > fine.  i'm not quite sure where the breakdown is or how to debug it
> >
> > ---
> >
> > [ec2-user@labcmp1 linux]$ srun --mpi=pmix_v2 -n 16 xhpl
> > [labcmp6:18353] PMIX ERROR: NOT-SUPPORTED in file
> > event/pmix_event_registration.c at line 101
> > [labcmp6:18355] PMIX ERROR: NOT-SUPPORTED in file
> > event/pmix_event_registration.c at line 101
> > [labcmp5:18355] PMIX ERROR: NOT-SUPPORTED in file
> > event/pmix_event_registration.c at line 101
> > --
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.  There are many reasons that a parallel process can
> > fail during MPI_INIT; some of which are due to configuration or environment
> > problems.  This failure appears to be an internal failure; here's some
> > additional information (which may only be relevant to an Open MPI
> > developer):
> >
> >  ompi_interlib_declare
> >  --> Returned "Would block" (-10) instead of "Success" (0)
> > ...snipped...
> > [labcmp6:18355] *** An error occurred in MPI_Init
> > [labcmp6:18355] *** reported by process [140726281390153,15]
> > [labcmp6:18355] *** on a NULL communicator
> > [labcmp6:18355] *** Unknown error
> > [labcmp6:18355] *** MPI_ERRORS_ARE_FATAL (processes in this
> > communicator will now abort,
> > [labcmp6:18355] ***and potentially your MPI job)
> > [labcmp6:18352] *** An error occurred in MPI_Init
> > [labcmp6:18352] *** reported by process [1677936713,12]
> > [labcmp6:18352] *** on a NULL communicator
> > [labcmp6:18352] *** Unknown error
> > [labcmp6:18352] *** MPI_ERRORS_ARE_FATAL (processes in this
> > communicator will now abort,
> > [labcmp6:18352] ***and potentially your MPI job)
> > [labcmp6:18354] *** An error occurred in MPI_Init
> > [labcmp6:18354] *** reported by process [140726281390153,14]
> > [labcmp6:18354] *** on a NULL communicator
> > [labcmp6:18354] *** Unknown error
> > [labcmp6:18354] *** MPI_ERRORS_ARE_FATAL (processes in this
> > communicator will now abort,
> > [labcmp6:18354] ***and potentially your MPI job)
> > srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
> > slurmstepd: error: *** STEP 24.0 ON labcmp3 CANCELLED AT 
> > 2019-01-18T20:03:33 ***
> > [labcmp5:18358] PMIX ERROR: NOT-SUPPORTED in file
> > event/pmix_event_registration.c at line 101
> > --
> > It looks like MPI_INIT failed for some reason; your parallel process is
> > likely to abort.

[OMPI users] Fwd: pmix and srun

2019-01-18 Thread Michael Di Domenico
i compilied pmix slurm openmpi

---pmix
./configure --prefix=/hpc/pmix/2.2 --with-munge=/hpc/munge/0.5.13
--disable-debug
---slurm
./configure --prefix=/hpc/slurm/18.08 --with-munge=/hpc/munge/0.5.13
--with-pmix=/hpc/pmix/2.2
---openmpi
./configure --prefix=/hpc/ompi/3.1 --with-hwloc=external
--with-libevent=external --with-slurm=/hpc/slurm/18.08
--with-pmix=/hpc/pmix/2.2

everything seemed to compile fine, but when i do an srun i get the
below errors, however, if i salloc and then mpirun it seems to work
fine.  i'm not quite sure where the breakdown is or how to debug it

---

[ec2-user@labcmp1 linux]$ srun --mpi=pmix_v2 -n 16 xhpl
[labcmp6:18353] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
[labcmp6:18355] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
[labcmp5:18355] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_interlib_declare
  --> Returned "Would block" (-10) instead of "Success" (0)
...snipped...
[labcmp6:18355] *** An error occurred in MPI_Init
[labcmp6:18355] *** reported by process [140726281390153,15]
[labcmp6:18355] *** on a NULL communicator
[labcmp6:18355] *** Unknown error
[labcmp6:18355] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
[labcmp6:18355] ***and potentially your MPI job)
[labcmp6:18352] *** An error occurred in MPI_Init
[labcmp6:18352] *** reported by process [1677936713,12]
[labcmp6:18352] *** on a NULL communicator
[labcmp6:18352] *** Unknown error
[labcmp6:18352] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
[labcmp6:18352] ***and potentially your MPI job)
[labcmp6:18354] *** An error occurred in MPI_Init
[labcmp6:18354] *** reported by process [140726281390153,14]
[labcmp6:18354] *** on a NULL communicator
[labcmp6:18354] *** Unknown error
[labcmp6:18354] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
[labcmp6:18354] ***and potentially your MPI job)
srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
slurmstepd: error: *** STEP 24.0 ON labcmp3 CANCELLED AT 2019-01-18T20:03:33 ***
[labcmp5:18358] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
--
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

  ompi_interlib_declare
  --> Returned "Would block" (-10) instead of "Success" (0)
--
[labcmp5:18357] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
[labcmp5:18356] PMIX ERROR: NOT-SUPPORTED in file
event/pmix_event_registration.c at line 101
srun: error: labcmp6: tasks 12-15: Exited with exit code 1
srun: error: labcmp3: tasks 0-3: Killed
srun: error: labcmp4: tasks 4-7: Killed
srun: error: labcmp5: tasks 8-11: Killed
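one slurm-side check that may help narrow this down (hedged; it says nothing
about the openmpi build itself):

  srun --mpi=list

that should list the pmi plugins this slurm build actually loaded (e.g.
pmix, pmix_v2); if pmix_v2 is absent, or slurm's plugin was built against a
different pmix install than the one openmpi uses, the NOT-SUPPORTED errors
above would be consistent with that mismatch.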


Re: [OMPI users] OpenFabrics warning

2018-11-12 Thread Michael Di Domenico
On Mon, Nov 12, 2018 at 8:08 AM Andrei Berceanu
 wrote:
>
> Running a CUDA+MPI application on a node with 2 K80 GPUs, I get the following 
> warnings:
>
> --
> WARNING: There is at least non-excluded one OpenFabrics device found,
> but there are no active ports detected (or Open MPI was unable to use
> them).  This is most certainly not what you wanted.  Check your
> cables, subnet manager configuration, etc.  The openib BTL will be
> ignored for this job.
>
>   Local host: gpu01
> --
> [gpu01:107262] 1 more process has sent help message help-mpi-btl-openib.txt / 
> no active ports found
> [gpu01:107262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all 
> help / error messages
>
> Any idea of what is going on and how I can fix this?
> I am using OpenMPI 3.1.2.

looks like openmpi found something like an infiniband card in the
compute node you're using, but it is not active/usable

as for a fix, it depends.

if you have an IB card should it be active?  if so, you'd have to
check the connections to see why it's disabled

if not, you can tell openmpi to disregard the IB ports, which will
clear the warning, but that might mean you're potentially using a
slower interface for message passing
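a hedged example of that second option; the program name is a placeholder
and the remaining btl list depends on what the node should actually use:

  mpirun --mca btl ^openib -np 2 ./my_cuda_app

the ^ prefix excludes the openib btl outright, which silences the warning;
message passing then falls back to whatever is left (typically self, vader
and tcp on a box like this).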


Re: [OMPI users] Problem running with UCX/oshmem on single node?

2018-05-14 Thread Michael Di Domenico
On Wed, May 9, 2018 at 9:45 PM, Howard Pritchard  wrote:
>
> You either need to go and buy a connectx4/5 HCA from mellanox (and maybe a
> switch), and install that
> on your system, or else install xpmem (https://github.com/hjelmn/xpmem).
> Note there is a bug right now
> in UCX that you may hit if you try to go thee xpmem only  route:

How stringent is the Connect-X 4/5 requirement?  i have Connect-X 3
cards; will they work?  during the configure step it seems to yell at
me that mlx5 won't compile because i don't have Mellanox OFED v3.1
installed; is that also a requirement?  (i'm using the RHEL 7.4 bundled
version of ofed, not the vendor versions)


[OMPI users] shmem

2018-05-09 Thread Michael Di Domenico
before i debug ucx further (cause it's totally not working for me), i
figured i'd check to see if it's *really* required to use shmem inside
of openmpi.  i'm pretty sure the answer is yes, but i wanted to double
check.


Re: [OMPI users] openmpi/slurm/pmix

2018-04-25 Thread Michael Di Domenico
On Mon, Apr 23, 2018 at 6:07 PM, r...@open-mpi.org  wrote:
> Looks like the problem is that you didn’t wind up with the external PMIx. The 
> component listed in your error is the internal PMIx one which shouldn’t have 
> built given that configure line.
>
> Check your config.out and see what happened. Also, ensure that your 
> LD_LIBRARY_PATH is properly pointing to the installation, and that you built 
> into a “clean” prefix.

the "clean prefix" part seemed to fix my issue.  i'm not exactly sure
i understand why/how though.  i recompiled pmix and removed the old
installation before doing a make install

when i recompiled openmpi it seems to have figured itself out

i think things are still a little wonky, but at least that issue is gone

[OMPI users] openmpi/slurm/pmix

2018-04-23 Thread Michael Di Domenico
i'm trying to get slurm 17.11.5 and openmpi 3.0.1 working with pmix.

everything compiled, but when i run something it get

: symbol lookup error: /openmpi/mca_pmix_pmix2x.so: undefined symbol:
opal_libevent2022_evthread_use_pthreads

i'm more than sure i did something wrong, but i'm not sure what.  here's what i did

compile libevent 2.1.8

./configure --prefix=/libevent-2.1.8

compile pmix 2.1.0

./configure --prefix=/pmix-2.1.0 --with-psm2
--with-munge=/munge-0.5.13 --with-libevent=/libevent-2.1.8

compile openmpi

./configure --prefix=/openmpi-3.0.1 --with-slurm=/slurm-17.11.5
--with-hwloc=external --with-mxm=/opt/mellanox/mxm
--with-cuda=/usr/local/cuda --with-pmix=/pmix-2.1.0
--with-libevent=/libevent-2.1.8

when i look at the symbols in the mca_pmix_pmix2x.so library the
function is indeed undefined (U) in the output, but checking ldd
against the library doesn't show anything missing

any thoughts?
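one way to poke at it (a sketch; paths are whatever your --prefix was):

  # does the plugin resolve its libevent dependency, and against which copy?
  ldd /openmpi-3.0.1/lib/openmpi/mca_pmix_pmix2x.so | grep -i event

  # is the missing symbol exported by the external libevent you built?
  nm -D /libevent-2.1.8/lib/libevent_pthreads.so | grep evthread_use_pthreads

the opal_libevent2022_* prefix on the missing symbol suggests that component
was built against openmpi's internal (embedded) libevent while the rest of
the build picked up the external 2.1.8, so checking which libevent each
piece resolves to is usually the fastest way to confirm a mixed build.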


Re: [OMPI users] disabling libraries?

2018-04-10 Thread Michael Di Domenico
On Sat, Apr 7, 2018 at 3:50 PM, Jeff Squyres (jsquyres)
<jsquy...@cisco.com> wrote:
> On Apr 6, 2018, at 8:12 AM, Michael Di Domenico <mdidomeni...@gmail.com> 
> wrote:
>> it would be nice if openmpi had (or may already have) a simple switch
>> that lets me disable entire portions of the library chain, ie this
>> host doesn't have a particular interconnect, so don't load any of the
>> libraries.  this might run counter to how openmpi discovers and load
>> libs though.
>
> We've actually been arguing about exactly how to do this for quite a while.  
> It's complicated (I can explain further, if you care).  :-\

i have no doubt it's complicated.  i'm not overly interested in the
detail, but others i'm sure might be.  in reality you're correct, i
don't care that openmpi failed to load the libs given the fact that
the job continues to run without issue.  and in fact i don't even care
about the warnings, but my users will complain and ask questions.

achieving a single build binary where i can disable the
interconnects/libraries at runtime would be HIGHLY beneficial to me
(perhaps others as well).  it cuts my build version combinations from
like 12 to 4 (or less), that's a huge reduction in labour/maintenance.
which also means i can upgrade openmpi quicker and stay more up to
date.

i would guess this is probably not a high priority for the team
working on openmpi, but if there's something my organization or I can
do to push this higher, let me know.

> That being said, I think we *do* have a workaround that might be good enough 
> for you: disable those warnings about plugins not being able to be opened:
> mpirun --mca mca_component_show_load_errors 0 ...

disabled this: mca_base_component_repository_open: unable to open
mca_oob_ud: libibverbs.so.1
but not this: pmix_mca_base_component_repository_open: unable to open
mca_pnet_opa: libpsm2.so.2
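for the single-build goal, the runtime knob that exists today is
per-framework component exclusion; a hedged sketch of what a "no fancy
interconnect here" environment could look like, using the component names
from the warning list above:

  export OMPI_MCA_pml=^yalla
  export OMPI_MCA_mtl=^mxm,psm,psm2,ofi
  export OMPI_MCA_btl=^openib,usnic
  export OMPI_MCA_oob=^ud

the ^ lists keep those components from being selected on hosts that lack
the matching hardware; note that the remaining pnet_opa message comes from
pmix's own component system, which reads PMIX_MCA_* variables rather than
OMPI_MCA_*, so it has to be silenced (if at all) on the pmix side.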


Re: [OMPI users] disabling libraries?

2018-04-06 Thread Michael Di Domenico
On Thu, Apr 5, 2018 at 7:59 PM, Gilles Gouaillardet
 wrote:
> That being said, the error suggest mca_oob_ud.so is a module from a
> previous install,
> Open MPI was not built on the system it is running, or libibverbs.so.1
> has been removed after
> Open MPI was built.

yes, understood, i compiled openmpi on a node that has all the
libraries installed for our various interconnects, opa/psm/mxm/ib, but
i ran mpirun on a node that has none of them

so the resulting warnings i get

mca_btl_openib: lbrdmacm.so.1
mca_btl_usnic: libfabric.so.1
mca_oob_ud: libibverbs.so.1
mca_mtl_mxm: libmxm.so.2
mca_mtl_ofi: libfabric.so.1
mca_mtl_psm: libpsm_infinipath.so.1
mca_mtl_psm2: libpsm2.so.2
mca_pml_yalla: libmxm.so.2

you referenced them as "errors" above, but mpi actually runs just fine
for me even with these msgs, so i would consider them more warnings.

> So I do encourage you to take a step back, and think if you can find a
> better solution for your site.

there are two alternatives

1 i can compile a specific version of openmpi for each of our clusters
with each specific interconnect libraries

2 i can install all the libraries on all the machines regardless of
whether the interconnect is present

both are certainly plausible, but my effort here is to see if i can
reduce the size of our software stack and/or reduce the number of
compiled versions of openmpi

it would be nice if openmpi had (or may already have) a simple switch
that lets me disable entire portions of the library chain, ie this
host doesn't have a particular interconnect, so don't load any of the
libraries.  this might run counter to how openmpi discovers and load
libs though.


[OMPI users] disabling libraries?

2018-04-05 Thread Michael Di Domenico
i'm trying to compile openmpi to support all of our interconnects,
psm/openib/mxm/etc

this works fine, openmpi finds all the libs, compiles and runs on each
of the respective machines

however, we don't install the libraries for everything everywhere

so when i run things like ompi_info and mpirun i get

mca_base_component_repository_open: unable to open mca_oob_ud:
libibverbs.so.1: cannot open shared object file: no such file or
directory (ignored)

and so on, for a bunch of other libs.

i understand how the lib linking works so this isn't unexpected and
doesn't stop the mpi programs from running.

here's the part i don't understand: how can i trace the above warning
and others like it back to the required --mca parameters i need to add
to the configuration to make the warnings go away?

as an aside, i believe i can set most of them via environment
variables as well as on the command line, but what i'd really like to do
is set them from a file.  i know i can create a default param file, but
is there a way to feed in a param file at invocation depending on where
mpirun is being run?
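a hedged sketch of the file-based side of that; the file name and its
contents are made up for illustration:

  # site-ib.conf, same "param = value" format as openmpi-mca-params.conf
  btl = ^openib,usnic
  mtl = ^mxm,psm,psm2

  OMPI_MCA_mca_base_param_files=/path/to/site-ib.conf mpirun -np 16 ./a.out

mca_base_param_files is the parameter behind the default
$prefix/etc/openmpi-mca-params.conf, so, assuming your release honours it
from the environment (recent ones do), pointing it at a per-cluster file
gives the "choose a param file at invocation" behaviour.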


[OMPI users] openmpi hang on IB disconnect

2018-01-17 Thread Michael Di Domenico
openmpi-2.0.2 running on rhel 7.4 with qlogic QDR infiniband
switches/adapters, also using slurm

i have a user that's running a job over multiple days.  unfortunately
after a few days at random the job will seemingly hang.  the latest
instance was caused by an infiniband adapter that went offline and
online several times.

the card is in a semi-working state at the moment, it's passing
traffic, but i suspect some of the IB messages during the job run got
lost and now the job is seemingly hung.

is there some mechanism i can put in place to detect this condition,
either in the code or on the system?  it's causing two problems at the
moment.  first and foremost, the user has no idea the job hung or for
what reason.  second, it's wasting system time.

i'm sure other people have come across wonky IB cards, i'm curious how
everyone else is detecting this condition and dealing with it.


[OMPI users] openmpi mgmt traffic

2017-10-11 Thread Michael Di Domenico
my cluster nodes are connected on 1g ethernet eth0/eth1 and via
infiniband rdma and ib0

my understanding is that openmpi will detect all these interfaces,
using eth0/eth1 for connection setup and rdma for msg passing

what would be the appropriate command line parameters to tell
openmpi to use ipoib for connection setup and rdma for message passing?

i effectively want to ignore the 1g ethernet connections entirely
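a hedged sketch of what that usually looks like (interface and component
names as in the post above; a.out stands in for the real binary):

  mpirun --mca oob_tcp_if_include ib0 \
         --mca btl_tcp_if_include ib0 \
         --mca btl self,vader,openib \
         -np 16 ./a.out

oob_tcp_if_include steers the out-of-band/setup traffic onto ipoib,
btl_tcp_if_include does the same for any tcp fallback, and restricting the
btl list to self,vader,openib keeps the actual message passing on shared
memory and rdma.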


[OMPI users] alltoallv

2017-10-10 Thread Michael Di Domenico
i'm getting stuck trying to run some fairly large IMB-MPI alltoall
tests under openmpi 2.0.2 on rhel 7.4

i have two different clusters, one running mellanox fdr10 and one
running qlogic qdr

if i issue

mpirun -n 1024 ./IMB-MPI1 -npmin 1024 -iter 1 -mem 2.001 alltoallv

the job just stalls after the "List of Benchmarks to run: Alltoallv"
line outputs from IMB-MPI

if i switch it to alltoall the test does progress

often when running various size alltoall's i'll get

"too many retries sending message to <>:<>, giving up

i'm able to use infiniband just fine (our lustre filesystem mounts
over it) and i have other mpi programs running

it only seems to happen when i run alltoall type primitives

any thoughts on debugging where the failures are?  i might just need to
turn up the debugging, but i'm not sure where
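for reference, a starting point with the usual verbosity knobs (the same
parameters suggested elsewhere on this list; the values are just suggestions):

  mpirun -n 1024 --mca pml_base_verbose 10 \
                 --mca mtl_base_verbose 10 \
                 --mca btl_base_verbose 100 \
         ./IMB-MPI1 -npmin 1024 -iter 1 -mem 2.001 alltoallv 2>&1 | tee alltoallv.log

that should at least show which pml/mtl/btl each rank settled on, which
narrows down whether the stall is in psm, openib, or the collective tuning
above them.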


Re: [OMPI users] disable slurm/munge from mpirun

2017-06-23 Thread Michael Di Domenico
On Thu, Jun 22, 2017 at 12:41 PM, r...@open-mpi.org  wrote:
> I gather you are using OMPI 2.x, yes? And you configured it 
> --with-pmi=, then moved the executables/libs to your 
> workstation?

correct

> I suppose I could state the obvious and say “don’t do that - just rebuild it”

correct...  but bummer...  so much for being lazy...

> and I fear that (after checking the 2.x code) you really have no choice. OMPI 
> v3.0 will have a way around the problem, but not the 2.x series.

Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
On Thu, Jun 22, 2017 at 10:43 AM, John Hearns via users
 wrote:
> Having had some problems with ssh launching (a few minutes ago) I can
> confirm that this works:
>
> --mca plm_rsh_agent "ssh -v"

this doesn't do anything for me

if i set OMPI_MCA_sec=^munge

i can clear the mca_sec_munge error

but the mca_pmix_pmix112 and opal_pmix_base_select errors still
exists.  the plm_rsh_agent switch/env var doesn't seem to affect that
error

down the road, i may still need the rsh_agent flag, but i think we're
still before that in the sequence of events


Re: [OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
that took care of one of the errors, but i missed a re-type on the second error

mca_base_component_repository_open: unable to open mca_pmix_pmix112:
libmunge missing

and the opal_pmix_base_select error is still there (which is what's
actually halting my job)



On Thu, Jun 22, 2017 at 10:35 AM, r...@open-mpi.org <r...@open-mpi.org> wrote:
> You can add "OMPI_MCA_plm=rsh OMPI_MCA_sec=^munge” to your environment
>
>
> On Jun 22, 2017, at 7:28 AM, John Hearns via users
> <users@lists.open-mpi.org> wrote:
>
> Michael,  try
>  --mca plm_rsh_agent ssh
>
> I've been fooling with this myself recently, in the contect of a PBS cluster
>
> On 22 June 2017 at 16:16, Michael Di Domenico <mdidomeni...@gmail.com>
> wrote:
>>
>> is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
>> command line or (better) using environment variables?
>>
>> i'd like to use the installed version of openmpi i have on a
>> workstation, but it's linked with slurm from one of my clusters.
>>
>> mpi/slurm work just fine on the cluster, but when i run it on a
>> workstation i get the below errors
>>
>> mca_base_component_repository_open: unable to open mca_sec_munge:
>> libmunge missing
>> ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
>> opal_pmix_base_select failed
>> returned value not found (-13) instead of orte_success
>>
>> there's probably a magical incantation of mca parameters, but i'm not
>> adept enough at determining what they are
>
>
>
>
>

[OMPI users] disable slurm/munge from mpirun

2017-06-22 Thread Michael Di Domenico
is it possible to disable slurm/munge/psm/pmi(x) from the mpirun
command line or (better) using environment variables?

i'd like to use the installed version of openmpi i have on a
workstation, but it's linked with slurm from one of my clusters.

mpi/slurm work just fine on the cluster, but when i run it on a
workstation i get the below errors

mca_base_component_repository_open: unable to open mca_sec_munge:
libmunge missing
ORTE_ERROR_LOG Not found in file ess_hnp_module.c at line 648
opal_pmix_base_select failed
returned value not found (-13) instead of orte_success

there's probably a magical incantation of mca parameters, but i'm not
adept enough at determining what they are


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-25 Thread Michael Di Domenico
On Mon, Jul 25, 2016 at 4:53 AM, Gilles Gouaillardet  wrote:
>
> as a workaround, you can configure without -noswitcherror.
>
> after you ran configure, you have to manually patch the generated 'libtool'
> file and add the line with pgcc*) and the next line like this :
>
> /* if pgcc is used, libtool does *not* pass -pthread to pgcc any more */
>
>
># Convert "-framework foo" to "foo.ltframework"
> # and "-pthread" to "-Wl,-pthread" if NAG compiler
> if test -n "$inherited_linker_flags"; then
>   case "$CC" in
> nagfor*)
>   tmp_inherited_linker_flags=`$ECHO "$inherited_linker_flags" |
> $SED 's/-framework \([^ $]*\)/\1.ltframework/g' | $SED
> 's/-pthread/-Wl,-pthread/g'`;;
> pgcc*)
>   tmp_inherited_linker_flags=`$ECHO "$inherited_linker_flags" |
> $SED 's/-framework \([^ $]*\)/\1.ltframework/g' | $SED 's/-pthread//g'`;;
> *)
>   tmp_inherited_linker_flags=`$ECHO "$inherited_linker_flags" |
> $SED 's/-framework \([^ $]*\)/\1.ltframework/g'`;;
>   esac
>
>
> i guess the right way is to patch libtool so it passes -noswitcherror to $CC
> and/or $LD, but i was not able to achieve that yet.


Thanks.  I managed to work around the issue, by hand compiling the
single module that failed during the build process.  but something is
definitely amiss in the openmpi compile system when it comes to pgi


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-22 Thread Michael Di Domenico
So, the -noswitcherror is partially working.  I added the switch into
my configure line's LDFLAGS param.  I can see the parameter being passed
to libtool, but for some reason libtool is refusing to pass it
along at compile time.

if i sh -x the libtool command line, i can see it set in a few
variables, but at the end, when it evals the compile line for pgcc, the
option is missing.

if i cut and paste the eval line and run it again by hand, the library
compiles with a pgcc warning instead of an error, which i believe is what
i want, but i'm not sure why libtool is dropping the switch



On Tue, Jul 19, 2016 at 5:27 AM, Sylvain Jeaugey <sjeau...@nvidia.com> wrote:
> As a workaround, you can also try adding -noswitcherror to PGCC flags.
>
> On 07/11/2016 03:52 PM, Åke Sandgren wrote:
>>
>> Looks like you are compiling with slurm support.
>>
>> If so, you need to remove the "-pthread" from libslurm.la and libpmi.la
>>
>> On 07/11/2016 02:54 PM, Michael Di Domenico wrote:
>>>
>>> I'm trying to get openmpi compiled using the PGI compiler.
>>>
>>> the configure goes through and the code starts to compile, but then
>>> gets hung up with
>>>
>>> entering: openmpi-1.10.2/opal/mca/common/pmi
>>> CC common_pmi.lo
>>> CCLD libmca_common_pmi.la
>>> pgcc-Error-Unknown switch: - pthread
>>>
>


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-14 Thread Michael Di Domenico
On Mon, Jul 11, 2016 at 9:52 AM, Åke Sandgren  wrote:
> Looks like you are compiling with slurm support.
>
> If so, you need to remove the "-pthread" from libslurm.la and libpmi.la

i don't see a configure option in slurm to disable pthreads, so i'm
not sure this is possible.
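to be clear, the suggestion above is to edit the installed libtool archive
files, not slurm's configure; a hedged sketch, with $SLURM_PREFIX standing in
for wherever slurm is installed:

  sed -i.bak 's/-pthread//g' $SLURM_PREFIX/lib/libslurm.la $SLURM_PREFIX/lib/libpmi.la

that strips the -pthread flag libtool recorded in those .la files (the
string pgcc later chokes on) and keeps .bak copies in case anything else
links against them.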


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-14 Thread Michael Di Domenico
On Thu, Jul 14, 2016 at 9:47 AM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> Have 1.10.3 unpacked, ran through the configure using the same command
> line options as 1.10.2
>
> but it fails even earlier in the make process at
>
> Entering openmpi-1.10.3/opal/asm
> CPPAS atomic-asm.lo
> This licensed Software was made available from Nvidia Corporation
> under a time-limited beta license the beta license expires on jun 1 2015
> any attempt to use this product after jun 1 2015 is a violation of the terms
> of the PGI end user license agreement.

sorry, i take this back, i accidentally used PGI 15.3 compiler instead of 15.9

using 15.9 i get the same -pthread error from the slurm_pmi library.


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-14 Thread Michael Di Domenico
Have 1.10.3 unpacked, ran through the configure using the same command
line options as 1.10.2

but it fails even earlier in the make process at

Entering openmpi-1.10.3/opal/asm
CPPAS atomic-asm.lo
This licensed Software was made available from Nvidia Corporation
under a time-limited beta license the beta license expires on jun 1 2015
any attempt to use this product after jun 1 2015 is a violation of the terms
of the PGI end user license agreement.





On Mon, Jul 11, 2016 at 9:11 AM, Gilles Gouaillardet
<gilles.gouaillar...@gmail.com> wrote:
> Can you try the latest 1.10.3 instead ?
>
> btw, do you have a license for the pgCC C++ compiler ?
> fwiw, FreePGI on OSX has no C++ license and PGI C and gnu g++ does not work
> together out of the box, hopefully I will have a fix ready sometimes this
> week
>
> Cheers,
>
> Gilles
>
>
> On Monday, July 11, 2016, Michael Di Domenico <mdidomeni...@gmail.com>
> wrote:
>>
>> I'm trying to get openmpi compiled using the PGI compiler.
>>
>> the configure goes through and the code starts to compile, but then
>> gets hung up with
>>
>> entering: openmpi-1.10.2/opal/mca/common/pmi
>> CC common_pmi.lo
>> CCLD libmca_common_pmi.la
>> pgcc-Error-Unknown switch: - pthread
>
>


Re: [OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-11 Thread Michael Di Domenico
On Mon, Jul 11, 2016 at 9:11 AM, Gilles Gouaillardet
 wrote:
> Can you try the latest 1.10.3 instead ?

i can but it'll take a few days to pull the software inside.

> btw, do you have a license for the pgCC C++ compiler ?
> fwiw, FreePGI on OSX has no C++ license and PGI C and gnu g++ does not work
> together out of the box, hopefully I will have a fix ready sometimes this
> week

we should, but i'm not positive.  we're running PGI on linux x64, we
typically buy the full suite, but i'll double check.


[OMPI users] openmpi 1.10.2 and PGI 15.9

2016-07-11 Thread Michael Di Domenico
I'm trying to get openmpi compiled using the PGI compiler.

the configure goes through and the code starts to compile, but then
gets hung up with

entering: openmpi-1.10.2/opal/mca/common/pmi
CC common_pmi.lo
CCLD libmca_common_pmi.la
pgcc-Error-Unknown switch: - pthread


Re: [OMPI users] locked memory and queue pairs

2016-03-17 Thread Michael Di Domenico
On Thu, Mar 17, 2016 at 12:15 PM, Cabral, Matias A
 wrote:
> I was looking for lines like "[nodexyz:17085] selected cm best priority 40"
> and "[nodexyz:17099] select: component psm selected"

this may have turned up more than i expected.  i recompiled openmpi
v1.8.4 as a test and reran the tests.  which seemed to run just fine.
looking at the debug output, i can clearly see a difference in the psm
calls.  i performed the same test using 1.10.2 and it works as well.

i've sent a msg off to the user to have him rerun and see where we're at.

i suspect my system level compile of openmpi might be all screwed up
with respect to psm.  i didn't see anything off in the configure
output, but i must have missed something.  i'll report back


Re: [OMPI users] locked memory and queue pairs

2016-03-17 Thread Michael Di Domenico
On Thu, Mar 17, 2016 at 12:52 PM, Jeff Squyres (jsquyres)
 wrote:
> Can you send all the information listed here?
>
> https://www.open-mpi.org/community/help/
>
> (including the full output from the run with the PML/BTL/MTL/etc. verbosity)
>
> This will allow Matias to look through all the relevant info, potentially 
> with fewer back-n-forth emails.

Understood, but unfortunately i cannot pull large dumps from the
system; it's isolated.


Re: [OMPI users] locked memory and queue pairs

2016-03-17 Thread Michael Di Domenico
On Thu, Mar 17, 2016 at 12:15 PM, Cabral, Matias A
 wrote:
> I was looking for lines like "[nodexyz:17085] selected cm best priority 40"
> and "[nodexyz:17099] select: component psm selected"

i see cm best priority 20, which seems to relate to ob1 being
selected.  i don't see a mention of psm anywhere (i am NOT doing --mca
mtl ^psm), but i did compile openmpi with psm support


Re: [OMPI users] locked memory and queue pairs

2016-03-17 Thread Michael Di Domenico
On Wed, Mar 16, 2016 at 4:49 PM, Cabral, Matias A
 wrote:
> I didn't go into the code to see who is actually calling this error message,
> but I suspect this may be a generic error for an "out of memory" kind of thing
> and not specific to the queue pair. To confirm, please add -mca
> pml_base_verbose 100 and -mca mtl_base_verbose 100 to see what is being
> selected.

this didn't spit out anything overly useful, just lots of lines

[node001:00909] mca: base: components_register: registering pml components
[node001:00909] mca: base: components_register: found loaded component v
[node001:00909] mca: base: components_register: component v register
function successful
[node001:00909] mca: base: components_register: found loaded component bfo
[node001:00909] mca: base: components_register: component bfo register
function successful
[node001:00909] mca: base: components_register: found loaded component cm
[node001:00909] mca: base: components_register: component cm register
function successful
[node001:00909] mca: base: components_register: found loaded component ob1
[node001:00909] mca: base: components_register: component ob1 register
function successful

> I'm trying to remember some details of IMB and alltoallv to see if it is
> indeed requiring more resources than the other micro benchmarks.

i'm using IMB for my tests, but this issue came up because a
researcher isn't able to run large alltoall codes, so i don't believe
it's specific to IMB

> BTW, did you confirm the limits setup? Also do the nodes have all the same 
> amount of mem?

yes, all nodes have the limits set to unlimited and each node has
256GB of memory
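
one more check that seems worth doing, since pam_limits only covers the
login path: the limit that actually matters is the one the ranks inherit
from slurmd, which can differ from what a login shell reports.  a quick
sketch (the node count is just an example):

srun -N4 --ntasks-per-node=1 bash -c 'ulimit -l'

if that doesn't come back unlimited everywhere, PropagateResourceLimits
in slurm.conf (or how slurmd itself was started) would be my next
suspect.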


Re: [OMPI users] locked memory and queue pairs

2016-03-16 Thread Michael Di Domenico
On Wed, Mar 16, 2016 at 3:37 PM, Cabral, Matias A
 wrote:
> Hi Michael,
>
> I may be missing some context, if you are using the qlogic cards you will 
> always want to use the psm mtl (-mca pml cm -mca mtl psm) and not openib btl. 
> As Tom suggest, confirm the limits are setup on every node: could it be the 
> alltoall is reaching a node that "others" are not? Please share the command 
> line and the error message.



Yes, under normal circumstances, I use PSM.  i only disabled it to see
if doing so made any difference.

the test i'm running is

mpirun -n 512 ./IMB-MPI1 alltoallv

when the system gets to 128 ranks, it freezes and errors out with

---

A process failed to create a queue pair. This usually means either
the device has run out of queue pairs (too many connections) or
there are insufficient resources available to allocate a queue pair
(out of memory). The latter can happen if either 1) insufficient
memory is available, or 2) no more physical memory can be registered
with the device.

For more information on memory registration see the Open MPI FAQs at:
http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages

Local host: node001
Local device:   qib0
Queue pair type:Reliable connected (RC)

---

i've also tried various nodes across the cluster (200+).  i think i
ruled out errant switch (qlogic single 12800-120) problems, bad
cables, and bad nodes.  that's not to say they may not be present,
i've just not been able to find them
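
one knob i still want to try while we're stuck on the openib btl, with
the caveat that the numbers below are my own guess and not validated:
as i understand it, each entry in btl_openib_receive_queues is a queue
pair per peer, so collapsing the default list down to a single shared
receive queue should cut the per-peer QP count several fold, e.g.

mpirun -n 512 -mca mtl ^psm \
    -mca btl_openib_receive_queues S,65536,1024,1008,64 \
    ./IMB-MPI1 alltoallv

even if that only moves the failure point, it would at least confirm
that QP exhaustion is really what we're hitting.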


Re: [OMPI users] locked memory and queue pairs

2016-03-16 Thread Michael Di Domenico
On Wed, Mar 16, 2016 at 12:12 PM, Elken, Tom  wrote:
> Hi Mike,
>
> In this file,
> $ cat /etc/security/limits.conf
> ...
> < do you see at the end ... >
>
> * hard memlock unlimited
> * soft memlock unlimited
> # -- All InfiniBand Settings End here --
> ?

Yes.  I double checked that it's set on all compute nodes in the
actual file and through the ulimit command


Re: [OMPI users] locked memory and queue pairs

2016-03-16 Thread Michael Di Domenico
On Thu, Mar 10, 2016 at 11:54 AM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> when i try to run an openmpi job with >128 ranks (16 ranks per node)
> using alltoall or alltoallv, i'm getting an error that the process was
> unable to get a queue pair.
>
> i've checked the max locked memory settings across my machines;
>
> using ulimit -l in and outside of mpirun and they're all set to unlimited
> pam modules to ensure pam_limits.so is loaded and working
> the /etc/security/limits.conf is set for soft/hard mem to unlimited
>
> i tried a couple of quick mpi config settings i could think of;
>
> -mca mtl ^psm no affect
> -mca btl_openib_flags 1 no affect
>
> the openmpi faq says to tweak some mtt values in /sys, but since i'm
> not on mellanox that doesn't apply to me
>
> the machines are rhel 6.7, kernel 2.6.32-573.12.1(with bundled ofed),
> running on qlogic single-port infiniband cards, psm is enabled
>
> other collectives seem to run okay, it seems to only be alltoall comms
> that fail and only at scale
>
> i believe (but can't prove) that this worked at one point, but i can't
> recall when i last tested it.  so it's reasonable to assume that some
> change to the system is preventing this.
>
> the question is, where should i start poking to find it?

bump?


[OMPI users] locked memory and queue pairs

2016-03-10 Thread Michael Di Domenico
when i try to run an openmpi job with >128 ranks (16 ranks per node)
using alltoall or alltoallv, i'm getting an error that the process was
unable to get a queue pair.

i've checked the max locked memory settings across my machines;

using ulimit -l in and outside of mpirun and they're all set to unlimited
pam modules to ensure pam_limits.so is loaded and working
the /etc/security/limits.conf is set for soft/hard mem to unlimited

i tried a couple of quick mpi config settings i could think of;

-mca mtl ^psm no effect
-mca btl_openib_flags 1 no effect

the openmpi faq says to tweak some mtt values in /sys, but since i'm
not on mellanox that doesn't apply to me

the machines are rhel 6.7, kernel 2.6.32-573.12.1(with bundled ofed),
running on qlogic single-port infiniband cards, psm is enabled

other collectives seem to run okay, it seems to only be alltoall comms
that fail and only at scale

i believe (but can't prove) that this worked at one point, but i can't
recall when i last tested it.  so it's reasonable to assume that some
change to the system is preventing this.

the question is, where should i start poking to find it?


[OMPI users] slurm openmpi 1.8.3 core bindings

2015-01-30 Thread Michael Di Domenico
I'm trying to get slurm and openmpi to cooperate when running
multi-threaded jobs.  i'm sure i'm doing something wrong, but i can't
figure out what

my node configuration is

2 nodes
2 sockets
6 cores per socket

i want to run

sbatch -N2 -n 8 --ntasks-per-node=4 --cpus-per-task=3 -w node1,node2
program.sbatch

inside the program.sbatch i'm calling openmpi

mpirun -n $SLURM_NTASKS --report-bindings program

when the binds report comes out i get

node1 rank 0 socket 0 core 0
node1 rank 1 socket 1 core 6
node1 rank 2 socket 0 core 1
node1 rank 3 socket 1 core 7
node2 rank 4 socket 0 core 0
node2 rank 5 socket 1 core 6
node2 rank 6 socket 0 core 1
node2 rank 7 socket 1 core 7

which is semi-fine, but when the job runs the resulting threads from
the program are locked (according to top) to those eight cores rather
than spreading themselves over the 24 cores available

i tried a few incantations of the map-by, bind-to, etc, but openmpi
basically complained about everything i tried for one reason or
another

my understanding is that slurm should be passing the requested config to
openmpi (or openmpi is pulling from the environment somehow) and it
should magically work

if i skip slurm and run

mpirun -n 8 --map-by node:pe=3 -bind-to core -host node1,node2
--report-bindings program

node1 rank 0 socket 0 core 0
node2 rank 1 socket 0 core 0
node1 rank 2 socket 0 core 3
node2 rank 3 socket 0 core 3
node1 rank 4 socket 1 core 6
node2 rank 5 socket 1 core 6
node1 rank 6 socket 1 core 9
node2 rank 7 socket 1 core 9

i do get the behavior i want (though i would prefer a -npernode switch
in there, but openmpi complains).  the bindings look better and the
threads are not locked to the particular cores

therefore i'm pretty sure this is a problem between openmpi and slurm
and not necessarily with either individually

i did compile openmpi with the slurm support switch and we're using
the cgroups taskplugin within slurm

i guess ancillary to this, is there a way to turn off core
binding/placement routines and control the placement manually?
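
the closest thing to "off" i've found in the 1.8 docs, though i haven't
proven it under slurm yet, is to keep the mapping but drop the binding
step entirely:

mpirun -n $SLURM_NTASKS --map-by ppr:4:node --bind-to none \
    --report-bindings program

with --bind-to none each rank should be reported as unbound, so its
threads are free to float across all 24 cores and any pinning is left
to the application or to slurm's cgroup plugin.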


Re: [OMPI users] ipath_userinit errors

2014-11-06 Thread Michael Di Domenico
Andrew,

Thanks.  We're using the RHEL version because it was less complicated
for our environment in the past, but sounds like we might want to
reconsider that decision.

Do you know why we don't see the message with lower node count
allocations?  It only seems to happen when the node count gets over a
certain point?

thanks

On Wed, Nov 5, 2014 at 5:51 PM, Friedley, Andrew
<andrew.fried...@intel.com> wrote:
> Hi Michael,
>
> From what I understand, this is an issue with the qib driver and PSM from 
> RHEL 6.5 and 6.6, and will be fixed for 6.7.  There is no functional change 
> between qib->PSM API versions 11 and 12, so the message is harmless.  I 
> presume you're using the RHEL sourced package for a reason, but using an IFS 
> release would fix the problem until RHEL 6.7 is ready.
>
> Andrew
>
>> -Original Message-
>> From: users [mailto:users-boun...@open-mpi.org] On Behalf Of Michael Di
>> Domenico
>> Sent: Tuesday, November 4, 2014 8:35 AM
>> To: Open MPI Users
>> Subject: [OMPI users] ipath_userinit errors
>>
>> I'm getting the below message on my cluster(s).  It seems to only happen
>> when I try to use more then 64 nodes (16-cores each).  The clusters are
>> running RHEL 6.5 with Slurm and Openmpi-1.6.5 with PSM.
>> I'm using the OFED versions included with RHEL for infiniband support.
>>
>> ipath_userinit: Mismatched user minor version (12) and driver minor version
>> (11) while context sharing. Ensure that driver and library are from the same
>> release
>>
>> I already realize this is a warning message and the jobs complete.
>> Another user a little over a year ago had a similar issue that was tracked to
>> mismatched ofed versions.  Since i have a diskless cluster all my nodes are
>> identical.
>>
>> I'm not adverse to thinking there might not be something unique about my
>> machine, but since i have two separate machines doing it, I'm not really sure
>> where to look to triage the issue and see what might be set incorrectly.
>>
>> Any thoughts on where to start checking would be helpful, thanks...
>> ___
>> users mailing list
>> us...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
>> Link to this post: http://www.open-
>> mpi.org/community/lists/users/2014/11/25667.php
> ___
> users mailing list
> us...@open-mpi.org
> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users
> Link to this post: 
> http://www.open-mpi.org/community/lists/users/2014/11/25694.php


[OMPI users] ipath_userinit errors

2014-11-04 Thread Michael Di Domenico
I'm getting the below message on my cluster(s).  It seems to only
happen when I try to use more than 64 nodes (16 cores each).  The
clusters are running RHEL 6.5 with Slurm and Openmpi-1.6.5 with PSM.
I'm using the OFED versions included with RHEL for infiniband support.

ipath_userinit: Mismatched user minor version (12) and driver minor
version (11) while context sharing. Ensure that driver and library are
from the same release

I already realize this is a warning message and the jobs complete.
Another user a little over a year ago had a similar issue that was
tracked to mismatched ofed versions.  Since i have a diskless cluster
all my nodes are identical.

I'm not averse to thinking there might be something unique about
my machine, but since i have two separate machines doing it, I'm not
really sure where to look to triage the issue and see what might be
set incorrectly.

Any thoughts on where to start checking would be helpful, thanks...
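
for completeness, the comparison i've been doing to convince myself the
nodes really are identical is roughly this (package names per the
RHEL-bundled stack, so adjust if you run the IFS packages instead):

modinfo ib_qib | grep -i version
rpm -q infinipath-psm infinipath-psm-devel

since the warning is just the library and driver minor versions
disagreeing, lining those two outputs up node by node seemed like the
obvious first step.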


Re: [OMPI users] debugs for jobs not starting

2012-10-12 Thread Michael Di Domenico
turned on the daemon debugs for orted and noticed this difference

 i get this on all the good nodes (ones that actually started xhpl)

Daemon was launched on node08 - beginning to initialize
[node08:21230] [[64354,0],1] orted_cmd: received add_local_procs
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],84]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],85]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],86]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],87]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],88]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],89]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],90]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],91]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],92]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],93]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],94]
[node08:21230] [[64354,0],0] orted_recv: received sync+nidmap from
local proc [[64354,1],95]
[node08:21230] [[64354,0],1] orted: up and running - waiting for commands!
[node08:21230] procdir: /tmp/openmpi-sessions-user@node08_0/28/1/1
[node08:21230] jobdir: /tmp/openmpi-sessions-user@node08_/44228/1
[node08:21230] top: openmpi-sessions-user@node08_0
[node08:21230] tmp: /tmp
[...repeats the above five lines a bunch of times...]

--- get this on the ones that do not start xhpl

Daemon was launched on node06 - beginning to initialize
[node06:11230] [[46344,0],1] orted: up and running - waiting for commands!
[node06:11230] procdir: /tmp/openmpi-sessions-user@node06_0/28/1/1
[node06:11230] jobdir: /tmp/openmpi-sessions-user@node06_/44228/1
[node06:11230] top: openmpi-sessions-user@node06_0
[node06:11230] tmp: /tmp
[...above lines only come out once...]

On Fri, Oct 12, 2012 at 9:27 AM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> what isn't working is when i fire off an MPI job with over 800 ranks,
> they don't all actually start up a process
>
> fe, if i do srun -n 1024 --ntasks-per-node 12 xhpl
>
> and then do a 'pgrep xhpl | wc -l', on all of the allocated nodes, not
> all of them have actually started xhpl
>
> most will read 12 started processes, but an inconsistent list of nodes
> will fail to actually start xhpl and stall the whole job
>
> if i look at all the nodes allocated to my job, it does start the orte
> process though
>
> what i need to figure out, is why the orte process starts, but fails
> to actually start xhpl on some of the nodes
>
> unfortunately, the list of nodes that don't start xhpl during my runs
> changes each time and no hardware errors are being detected.  if i
> cancel the job and restart the job over and over, eventually one will
> actually kick off and run to completion.
>
> if i run the process outside of slurm just using openmpi, it seems to
> behave correctly, so i'm leaning towards a slurm interacting with
> openmpi problem.
>
> what i'd like to do is instrument a debug in openmpi that will tell me
> what openmpi is waiting on in order to kick off the xhpl binary
>
> i'm testing to see whether it's a psm related problem now, i'll check
> back if i can narrow the scope a little more
>
> On Thu, Oct 11, 2012 at 10:21 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> I'm afraid I'm confused - I don't understand what is and isn't working. What
>> "next process" isn't starting?
>>
>>
>> On Thu, Oct 11, 2012 at 9:41 AM, Michael Di Domenico
>> <mdidomeni...@gmail.com> wrote:
>>>
>>> adding some additional info
>>>
>>> did an strace on an orted process where xhpl failed to start, i did
>>> this after the mpirun execution, so i probably missed some output, but
>>> it keeps scrolling
>>>
>>> poll([{fd=4, events=POLLIN},{fd=7, events=POLLIN},{fd=8,
>>> events=POLLIN},{fd=10, events=POLLIN},{fd=12, events=POLLIN},{fd=13,
>>> events=POLLIN},{fd=14, events=POLLIN},{fd=15, events=POLLIN},{fd=16,
>>> events=POLLIN}], 9, 1000) = 0 (Timeout)
>>>
>>> i didn't see anything useful in /proc under those file descriptors,
>>> but perhaps i missed something i don't know to look for
>>>
>>> On Thu, Oct 11, 2012 at 12:06 PM, Michael Di Domenico
>>> <mdidomeni...@gmail.com> wrote:
>>> > too add a little more detail, it looks like xhpl is not actually
>>> > starting on all nodes when i kick of

Re: [OMPI users] debugs for jobs not starting

2012-10-12 Thread Michael Di Domenico
what isn't working is when i fire off an MPI job with over 800 ranks,
they don't all actually start up a process

e.g., if i do srun -n 1024 --ntasks-per-node 12 xhpl

and then do a 'pgrep xhpl | wc -l' on all of the allocated nodes, not
all of them have actually started xhpl

most will read 12 started processes, but an inconsistent list of nodes
will fail to actually start xhpl and stall the whole job

if i look at all the nodes allocated to my job, it does start the orte
process though

what i need to figure out, is why the orte process starts, but fails
to actually start xhpl on some of the nodes

unfortunately, the list of nodes that don't start xhpl during my runs
changes each time and no hardware errors are being detected.  if i
cancel the job and restart the job over and over, eventually one will
actually kick off and run to completion.

if i run the process outside of slurm just using openmpi, it seems to
behave correctly, so i'm leaning towards a problem with how slurm and
openmpi interact.

what i'd like to do is instrument a debug in openmpi that will tell me
what openmpi is waiting on in order to kick off the xhpl binary

i'm testing to see whether it's a psm related problem now, i'll check
back if i can narrow the scope a little more
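
concretely, the debugs i'm planning to turn up for the next run are the
launcher and local-spawn frameworks on the daemons, something like:

mpirun -n 1024 --debug-daemons --leave-session-attached \
    -mca plm_base_verbose 5 -mca odls_base_verbose 5 xhpl

--leave-session-attached keeps the daemons' stderr visible, and the odls
verbosity should show whether each orted ever actually tries to fork
xhpl, which is the part i can't see right now.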

On Thu, Oct 11, 2012 at 10:21 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I'm afraid I'm confused - I don't understand what is and isn't working. What
> "next process" isn't starting?
>
>
> On Thu, Oct 11, 2012 at 9:41 AM, Michael Di Domenico
> <mdidomeni...@gmail.com> wrote:
>>
>> adding some additional info
>>
>> did an strace on an orted process where xhpl failed to start, i did
>> this after the mpirun execution, so i probably missed some output, but
>> it keeps scrolling
>>
>> poll([{fd=4, events=POLLIN},{fd=7, events=POLLIN},{fd=8,
>> events=POLLIN},{fd=10, events=POLLIN},{fd=12, events=POLLIN},{fd=13,
>> events=POLLIN},{fd=14, events=POLLIN},{fd=15, events=POLLIN},{fd=16,
>> events=POLLIN}], 9, 1000) = 0 (Timeout)
>>
>> i didn't see anything useful in /proc under those file descriptors,
>> but perhaps i missed something i don't know to look for
>>
>> On Thu, Oct 11, 2012 at 12:06 PM, Michael Di Domenico
>> <mdidomeni...@gmail.com> wrote:
>> > too add a little more detail, it looks like xhpl is not actually
>> > starting on all nodes when i kick off the mpirun
>> >
>> > each time i cancel and restart the job, the nodes that do not start
>> > change, so i can't call it a bad node
>> >
>> > if i disable infiniband with --mca btl self,sm,tcp on occasion i can
>> > get xhpl to actually run, but it's not consistent
>> >
>> > i'm going to check my ethernet network and make sure there's no
>> > problems there (could this be an OOB error with mpirun?), on the nodes
>> > that fail to start xhpl, i do see the orte process, but nothing in the
>> > logs about why it failed to launch xhpl
>> >
>> >
>> >
>> > On Thu, Oct 11, 2012 at 11:49 AM, Michael Di Domenico
>> > <mdidomeni...@gmail.com> wrote:
>> >> I'm trying to diagnose an MPI job (in this case xhpl), that fails to
>> >> start when the rank count gets fairly high into the thousands.
>> >>
>> >> My symptom is the jobs fires up via slurm, and I can see all the xhpl
>> >> processes on the nodes, but it never kicks over to the next process.
>> >>
>> >> My question is, what debugs should I turn on to tell me what the
>> >> system might be waiting on?
>> >>
>> >> I've checked a bunch of things, but I'm probably overlooking something
>> >> trivial (which is par for me).
>> >>
>> >> I'm using the Openmpi 1.6.1, Slurm 2.4.2 on CentOS 6.3, with
>> >> Infiniband/PSM
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Re: [OMPI users] debugs for jobs not starting

2012-10-11 Thread Michael Di Domenico
adding some additional info

did an strace on an orted process where xhpl failed to start, i did
this after the mpirun execution, so i probably missed some output, but
it keeps scrolling

poll([{fd=4, events=POLLIN},{fd=7, events=POLLIN},{fd=8,
events=POLLIN},{fd=10, events=POLLIN},{fd=12, events=POLLIN},{fd=13,
events=POLLIN},{fd=14, events=POLLIN},{fd=15, events=POLLIN},{fd=16,
events=POLLIN}], 9, 1000) = 0 (Timeout)

i didn't see anything useful in /proc under those file descriptors,
but perhaps i missed something i don't know to look for

On Thu, Oct 11, 2012 at 12:06 PM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> too add a little more detail, it looks like xhpl is not actually
> starting on all nodes when i kick off the mpirun
>
> each time i cancel and restart the job, the nodes that do not start
> change, so i can't call it a bad node
>
> if i disable infiniband with --mca btl self,sm,tcp on occasion i can
> get xhpl to actually run, but it's not consistent
>
> i'm going to check my ethernet network and make sure there's no
> problems there (could this be an OOB error with mpirun?), on the nodes
> that fail to start xhpl, i do see the orte process, but nothing in the
> logs about why it failed to launch xhpl
>
>
>
> On Thu, Oct 11, 2012 at 11:49 AM, Michael Di Domenico
> <mdidomeni...@gmail.com> wrote:
>> I'm trying to diagnose an MPI job (in this case xhpl), that fails to
>> start when the rank count gets fairly high into the thousands.
>>
>> My symptom is the jobs fires up via slurm, and I can see all the xhpl
>> processes on the nodes, but it never kicks over to the next process.
>>
>> My question is, what debugs should I turn on to tell me what the
>> system might be waiting on?
>>
>> I've checked a bunch of things, but I'm probably overlooking something
>> trivial (which is par for me).
>>
>> I'm using the Openmpi 1.6.1, Slurm 2.4.2 on CentOS 6.3, with Infiniband/PSM


Re: [OMPI users] debugs for jobs not starting

2012-10-11 Thread Michael Di Domenico
to add a little more detail, it looks like xhpl is not actually
starting on all nodes when i kick off the mpirun

each time i cancel and restart the job, the nodes that do not start
change, so i can't call it a bad node

if i disable infiniband with --mca btl self,sm,tcp on occasion i can
get xhpl to actually run, but it's not consistent

i'm going to check my ethernet network and make sure there's no
problems there (could this be an OOB error with mpirun?), on the nodes
that fail to start xhpl, i do see the orte process, but nothing in the
logs about why it failed to launch xhpl



On Thu, Oct 11, 2012 at 11:49 AM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> I'm trying to diagnose an MPI job (in this case xhpl), that fails to
> start when the rank count gets fairly high into the thousands.
>
> My symptom is the jobs fires up via slurm, and I can see all the xhpl
> processes on the nodes, but it never kicks over to the next process.
>
> My question is, what debugs should I turn on to tell me what the
> system might be waiting on?
>
> I've checked a bunch of things, but I'm probably overlooking something
> trivial (which is par for me).
>
> I'm using the Openmpi 1.6.1, Slurm 2.4.2 on CentOS 6.3, with Infiniband/PSM


[OMPI users] debugs for jobs not starting

2012-10-11 Thread Michael Di Domenico
I'm trying to diagnose an MPI job (in this case xhpl), that fails to
start when the rank count gets fairly high into the thousands.

My symptom is the job fires up via slurm, and I can see all the xhpl
processes on the nodes, but it never kicks over to the next process.

My question is, what debugs should I turn on to tell me what the
system might be waiting on?

I've checked a bunch of things, but I'm probably overlooking something
trivial (which is par for me).

I'm using the Openmpi 1.6.1, Slurm 2.4.2 on CentOS 6.3, with Infiniband/PSM


Re: [OMPI users] srun and openmpi

2011-04-29 Thread Michael Di Domenico
Certainly, i reached out to several contacts I have inside qlogic (i
used to work there)...

On Fri, Apr 29, 2011 at 10:30 AM, Ralph Castain  wrote:
> Hi Michael
>
> I'm told that the Qlogic contacts we used to have are no longer there. Since 
> you obviously are a customer, can you ping them and ask (a) what that error 
> message means, and (b) what's wrong with the values I computed?
>
> You can also just send them my way, if that would help. We just need someone 
> to explain the requirements on that precondition value.
>
> Thanks
> Ralph


Re: [OMPI users] srun and openmpi

2011-04-29 Thread Michael Di Domenico
On Fri, Apr 29, 2011 at 10:01 AM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> On Fri, Apr 29, 2011 at 4:52 AM, Ralph Castain <r...@open-mpi.org> wrote:
>> Hi Michael
>>
>> Please see the attached updated patch to try for 1.5.3. I mistakenly free'd 
>> the envar after adding it to the environ :-/
>
> The patch works great, i can now see the precondition environment
> variable if i do
>
> mpirun -n 2 -host node1 
>
> and my  runs just fine, However if i do
>
> srun --resv-ports -n 2 -w node1 
>
> I get
>
> [node1:16780] PSM EP connect error (unknown connect error):
> [node1:16780]  node1
> [node1:16780] PSM EP connect error (Endpoint could not be reached):
> [node1:16780]  node1
>
> PML add procs failed
> --> Returned "Error" (-1) instead of "Success" (0)
>
> I did notice a difference in the precondition env variable between the two 
> runs
>
> mpirun -n 2 -host node1 
>
> sets precondition_transports=fbc383997ee1b668-00d40f1401d2e827 (which
> changes with each run (aka random))
>
> srun --resv-ports -n 2 -w node1 

this should have been "srun --resv-ports -n 1 -w node1 ", i
can't run a 2 rank job, i get the PML error above

>
> sets precondition_transports=1845-0001 (which
> doesn't seem to change run to run)
>



Re: [OMPI users] srun and openmpi

2011-04-29 Thread Michael Di Domenico
On Fri, Apr 29, 2011 at 4:52 AM, Ralph Castain  wrote:
> Hi Michael
>
> Please see the attached updated patch to try for 1.5.3. I mistakenly free'd 
> the envar after adding it to the environ :-/

The patch works great, i can now see the precondition environment
variable if i do

mpirun -n 2 -host node1 

and my  runs just fine.  However if i do

srun --resv-ports -n 2 -w node1 

I get

[node1:16780] PSM EP connect error (unknown connect error):
[node1:16780]  node1
[node1:16780] PSM EP connect error (Endpoint could not be reached):
[node1:16780]  node1

PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)

I did notice a difference in the precondition env variable between the two runs

mpirun -n 2 -host node1 

sets precondition_transports=fbc383997ee1b668-00d40f1401d2e827 (which
changes with each run (aka random))

srun --resv-ports -n 2 -w node1 

sets precondition_transports=1845-0001 (which
doesn't seem to change run to run)



Re: [OMPI users] srun and openmpi

2011-04-28 Thread Michael Di Domenico
On Thu, Apr 28, 2011 at 9:03 AM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Apr 28, 2011, at 6:49 AM, Michael Di Domenico wrote:
>
>> On Wed, Apr 27, 2011 at 11:47 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>> On Apr 27, 2011, at 1:06 PM, Michael Di Domenico wrote:
>>>
>>>> On Wed, Apr 27, 2011 at 2:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>
>>>>> On Apr 27, 2011, at 12:38 PM, Michael Di Domenico wrote:
>>>>>
>>>>>> On Wed, Apr 27, 2011 at 2:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>>
>>>>>>> On Apr 27, 2011, at 10:09 AM, Michael Di Domenico wrote:
>>>>>>>
>>>>>>>> Was this ever committed to the OMPI src as something not having to be
>>>>>>>> run outside of OpenMPI, but as part of the PSM setup that OpenMPI
>>>>>>>> does?
>>>>>>>
>>>>>>> Not that I know of - I don't think the PSM developers ever looked at it.
>>>
>>> Thought about this some more and I believe I have a soln to the problem. 
>>> Will try to commit something to the devel trunk by the end of the week.
>>
>> Thanks
>
> Just to save me looking back thru the thread - what OMPI version are you 
> using? If it isn't the trunk, I'll send you a patch you can use.

I'm using OpenMPI v1.5.3 currently


Re: [OMPI users] srun and openmpi

2011-04-28 Thread Michael Di Domenico
On Wed, Apr 27, 2011 at 11:47 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Apr 27, 2011, at 1:06 PM, Michael Di Domenico wrote:
>
>> On Wed, Apr 27, 2011 at 2:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>> On Apr 27, 2011, at 12:38 PM, Michael Di Domenico wrote:
>>>
>>>> On Wed, Apr 27, 2011 at 2:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>
>>>>> On Apr 27, 2011, at 10:09 AM, Michael Di Domenico wrote:
>>>>>
>>>>>> Was this ever committed to the OMPI src as something not having to be
>>>>>> run outside of OpenMPI, but as part of the PSM setup that OpenMPI
>>>>>> does?
>>>>>
>>>>> Not that I know of - I don't think the PSM developers ever looked at it.
>
> Thought about this some more and I believe I have a soln to the problem. Will 
> try to commit something to the devel trunk by the end of the week.

Thanks


Re: [OMPI users] srun and openmpi

2011-04-27 Thread Michael Di Domenico
On Wed, Apr 27, 2011 at 2:46 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Apr 27, 2011, at 12:38 PM, Michael Di Domenico wrote:
>
>> On Wed, Apr 27, 2011 at 2:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>
>>> On Apr 27, 2011, at 10:09 AM, Michael Di Domenico wrote:
>>>
>>>> Was this ever committed to the OMPI src as something not having to be
>>>> run outside of OpenMPI, but as part of the PSM setup that OpenMPI
>>>> does?
>>>
>>> Not that I know of - I don't think the PSM developers ever looked at it.
>>>
>>>>
>>>> I'm having some trouble getting Slurm/OpenMPI to play nice with the
>>>> setup of this key.  Namely, with slurm you cannot export variables
>>>> from the --prolog of an srun, only from an --task-prolog,
>>>> unfortunately, if you use a task-prolog each rank gets a different
>>>> key, which doesn't work.
>>>>
>>>> I'm also guessing that each unique mpirun needs it's own psm key, not
>>>> one for the whole system, so i can't just make it a permanent
>>>> parameter somewhere else.
>>>>
>>>> Also, i recall reading somewhere that the --resv-ports parameter that
>>>> OMPI uses from slurm to choose a list of ports to use for TCP comm's,
>>>> tries to lock a port from the pool three times before giving up.
>>>
>>> Had to look back at the code - I think you misread this. I can find no 
>>> evidence in the code that we try to bind that port more than once.
>>
>> Perhaps i misstated, i don't believe you're trying to bind to the same
>> port twice during the same session.  i believe the code re-uses
>> similar ports from session to session.  what i believe happens (but
>> could be totally wrong) the previous session releases the port, but
>> linux isn't quite done with it when the new session tries to bind to
>> the port, in which case it tries three times and then fails the job
>
> Actually, I understood you correctly. I'm just saying that I find no evidence 
> in the code that we try three times before giving up. What I see is a single 
> attempt to bind the port - if it fails, then we abort. There is no parameter 
> to control that behavior.
>
> So if the OS hasn't released the port by the time a new job starts on that 
> node, then it will indeed abort if the job was unfortunately given the same 
> port reservation.

Oh, okay, sorry...



Re: [OMPI users] srun and openmpi

2011-04-27 Thread Michael Di Domenico
On Wed, Apr 27, 2011 at 2:25 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
> On Apr 27, 2011, at 10:09 AM, Michael Di Domenico wrote:
>
>> Was this ever committed to the OMPI src as something not having to be
>> run outside of OpenMPI, but as part of the PSM setup that OpenMPI
>> does?
>
> Not that I know of - I don't think the PSM developers ever looked at it.
>
>>
>> I'm having some trouble getting Slurm/OpenMPI to play nice with the
>> setup of this key.  Namely, with slurm you cannot export variables
>> from the --prolog of an srun, only from an --task-prolog,
>> unfortunately, if you use a task-prolog each rank gets a different
>> key, which doesn't work.
>>
>> I'm also guessing that each unique mpirun needs it's own psm key, not
>> one for the whole system, so i can't just make it a permanent
>> parameter somewhere else.
>>
>> Also, i recall reading somewhere that the --resv-ports parameter that
>> OMPI uses from slurm to choose a list of ports to use for TCP comm's,
>> tries to lock a port from the pool three times before giving up.
>
> Had to look back at the code - I think you misread this. I can find no 
> evidence in the code that we try to bind that port more than once.

Perhaps i misstated, i don't believe you're trying to bind to the same
port twice during the same session.  i believe the code re-uses
similar ports from session to session.  what i believe happens (but
could be totally wrong) is that the previous session releases the port,
but linux isn't quite done with it when the new session tries to bind
to the port, in which case it tries three times and then fails the job



Re: [OMPI users] srun and openmpi

2011-04-27 Thread Michael Di Domenico
Was this ever committed to the OMPI src as something not having to be
run outside of OpenMPI, but as part of the PSM setup that OpenMPI
does?

I'm having some trouble getting Slurm/OpenMPI to play nice with the
setup of this key.  Namely, with slurm you cannot export variables
from the --prolog of an srun, only from an --task-prolog,
unfortunately, if you use a task-prolog each rank gets a different
key, which doesn't work.

I'm also guessing that each unique mpirun needs its own psm key, not
one for the whole system, so i can't just make it a permanent
parameter somewhere else.
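
the direction i'm leaning, sketched below but not proven out yet, is to
make the key a pure function of the job id so every rank's task prolog
computes the identical value:

#!/bin/bash
# slurm task-prolog sketch: lines of the form "export NAME=value" printed
# on stdout get added to each task's environment
key=$(printf '%016x-%016x' "$SLURM_JOB_ID" "$SLURM_JOB_ID")
echo "export OMPI_MCA_orte_precondition_transports=$key"

since (per ralph, earlier in this thread) the value only has to match
across the ranks of a job and isn't otherwise magic, deriving it from
the job id ought to be good enough.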

Also, i recall reading somewhere that the --resv-ports parameter that
OMPI uses from slurm to choose a list of ports to use for TCP comms
tries to lock a port from the pool three times before giving up.

Can someone tell me where that parameter is set, i'd like to set it to
a higher value.  We're seeing issues where running a large number of
short srun's sequentially is causing some of the mpirun's in the
stream to be killed because they could not lock the ports.

I suspect that because the lag between when the port is actually closed
in linux and when ompi re-opens a new port is very short, we're trying
three times and giving up.  I have more than enough ports in the
resv-ports list, 30k, but i suspect there is some random re-use being
done and it's failing
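
one way i can think of to catch it in the act, untested so far, is to
dump the reservation each step gets and then look for leftovers in
TIME_WAIT on those ports between runs:

srun --resv-ports -n 2 -w node1 env | grep SLURM_STEP_RESV_PORTS
netstat -tan | grep TIME_WAIT

(the second command run on the node right after a step exits, filtering
by hand for the ports the previous step had reserved.)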

thanks


On Mon, Jan 3, 2011 at 10:00 AM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Yo Ralph --
>
> I see this was committed https://svn.open-mpi.org/trac/ompi/changeset/24197.  
> Do you want to add a blurb in README about it, and/or have this executable 
> compiled as part of the PSM MTL and then installed into $bindir (maybe named 
> ompi-psm-keygen)?
>
> Right now, it's only compiled as part of "make check" and not installed, 
> right?
>
> On Dec 30, 2010, at 5:07 PM, Ralph Castain wrote:
>
>> Run the program only once - it can be in the prolog of the job if you like. 
>> The output value needs to be in the env of every rank.
>>
>> You can reuse the value as many times as you like - it doesn't have to be 
>> unique for each job. There is nothing magic about the value itself.
>>
>> On Dec 30, 2010, at 2:11 PM, Michael Di Domenico wrote:
>>
>>> How early does this need to run? Can I run it as part of a task
>>> prolog, or does it need to be the shell env for each rank?  And does
>>> it need to run on one node or all the nodes in the job?
>>>
>>> On Thu, Dec 30, 2010 at 8:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Well, I couldn't do it as a patch - proved too complicated as the psm 
>>>> system looks for the value early in the boot procedure.
>>>>
>>>> What I can do is give you the attached key generator program. It outputs 
>>>> the envar required to run your program. So if you run the attached program 
>>>> and then export the output into your environment, you should be okay. 
>>>> Looks like this:
>>>>
>>>> $ ./psm_keygen
>>>> OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
>>>> $
>>>>
>>>> You compile the program with the usual mpicc.
>>>>
>>>> Let me know if this solves the problem (or not).



[OMPI users] Ofed v1.5.3?

2011-04-16 Thread Michael Di Domenico
Does OpenMPI v1.5.3 support Ofed v1.5.3.1?


Re: [OMPI users] alltoall messages > 2^26

2011-04-11 Thread Michael Di Domenico
Here's a chunk of code that reproduces the error every time on my cluster

If you call it with $((2**24)) as a parameter it should run fine; change it
to $((2**27)) and it will stall
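
(for reference, a minimal build/run of it looks like the following; the
file name is just whatever you saved the code as:

mpicc -O2 alltoall_repro.c -o alltoall_repro
mpirun -n 16 ./alltoall_repro $((2**27))

the rank count there is only an example.)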

On Tue, Apr 5, 2011 at 11:24 AM, Terry Dontje <terry.don...@oracle.com>wrote:

>  It was asked during the community concall whether the below may be related
> to ticket #2722 https://svn.open-mpi.org/trac/ompi/ticket/2722?
>
> --td
>
> On 04/04/2011 10:17 PM, David Zhang wrote:
>
> Any error messages?  Maybe the nodes ran out of memory?  I know MPI
> implement some kind of buffering under the hood, so even though you're
> sending array's over 2^26 in size, it may require more than that for MPI to
> actually send it.
>
> On Mon, Apr 4, 2011 at 2:16 PM, Michael Di Domenico <
> mdidomeni...@gmail.com> wrote:
>
>> Has anyone seen an issue where OpenMPI/Infiniband hangs when sending
>> messages over 2^26 in size?
>>
>> For a reason i have not determined just yet machines on my cluster
>> (OpenMPI v1.5 and Qlogic Stack/QDR IB Adapters) is failing to send
>> array's over 2^26 in size via the AllToAll collective. (user code)
>>
>> Further testing seems to indicate that an MPI message over 2^26 fails
>> (tested with IMB-MPI)
>>
>> Running the same test on a different older IB connected cluster seems
>> to work, which would seem to indicate a problem with the infiniband
>> drivers of some sort rather then openmpi (but i'm not sure).
>>
>> Any thoughts, directions, or tests?
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> David Zhang
> University of California, San Diego
>
>
> ___
> users mailing list
> users@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> --
> Terry D. Dontje | Principal Software Engineer
> Developer Tools Engineering | +1.781.442.2631
>  Oracle * - Performance Technologies*
>  95 Network Drive, Burlington, MA 01803
> Email terry.don...@oracle.com
>
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <assert.h>

typedef signed char	int8;
typedef unsigned char	uint8;
typedef short		int16;
typedef unsigned short	uint16;
typedef int		int32;
typedef unsigned int	uint32;
typedef long		int64;
typedef unsigned long	uint64;

#define	I64(c)		(c##L)
#define	UI64(c)		(c##uL)

#define	_BR_RUNUP_	128
#define	_BR_LG_TABSZ_	7
#define	_BR_TABSZ_	(I64(1) << _BR_LG_TABSZ_)

#define	_ZERO64		UI64(0x0)

#define	_maskl(x)	(((x) == 0) ? _ZERO64 : ((~_ZERO64) << (64-(x))))
#define	_maskr(x)	(((x) == 0) ? _ZERO64 : ((~_ZERO64) >> (64-(x))))

#define	_BR_64STEP_(H,L,A,B) {\
	uint64	x;\
	x = H ^ (H << A) ^ (L >> (64 - A));\
	H = L | (x >> (B - 64));\
	L = x << (128 - B);\
}

static uint64_t _rtc()
{
	unsigned hi, lo, tmp;
	asm volatile ("rdtsc" : "=a" (lo), "=d" (hi));
	return (uint64_t)hi << 32 | lo;
}

typedef struct
{
	uint64	hi, lo, ind;
	uint64	tab[_BR_TABSZ_];
} brand_t;

static uint64 brand (brand_t *p)
{
	uint64	hi=p->hi, lo=p->lo, i=p->ind, ret;
	
	ret = p->tab[i];

	// 64-step a primitive trinomial LRS:  0, 45, 118   
	_BR_64STEP_(hi,lo,45,118);

	p->tab[i] = ret + hi;
	p->hi  = hi;
	p->lo  = lo;
	p->ind = hi & _maskr(_BR_LG_TABSZ_);

	return ret;
}

static void brand_init (brand_t *p, uint64 val)
{
	int64	i;
	uint64	hi, lo;

	hi = UI64(0x9ccae22ed2c6e578) ^ val;
	lo = UI64(0xce4db5d70739bd22) & _maskl(118-64);

	// we 64-step 0, 33, 118 in the initialization   
	for (i = 0; i < 64; i++)
		_BR_64STEP_(hi,lo,33,118);

	for (i = 0; i < _BR_TABSZ_; i++) {
		_BR_64STEP_(hi,lo,33,118);
		p->tab[i] = hi;
	}
	p->ind = _BR_TABSZ_/2;
	p->hi  = hi;
	p->lo  = lo;

	for (i = 0; i < _BR_RUNUP_; i++)
		brand(p);
}

void rubbish(brand_t* state, uint64_t n_words, uint64_t array[])
{
	uint64_t	i;

	for (i = 0; i < n_words; i++)
		array[i] = brand(state);
}

void usage(const char* prog)
{
	int	me;

	MPI_Comm_rank(MPI_COMM_WORLD, &me);
	if (me == 0)
		fprintf(stderr, "usage: %s #bytes/process\n", prog);

	exit(2);
}

int main(int argc, char* argv[])
{
	brand_t		state;
	int		i_proc, n_procs, words_per_chunk, loop;
	size_t		array_size;
	uint64_t*	source;
	uint64_t*	dest;

	MPI_Init(&argc, &argv);
	MPI_Comm_size(MPI_COMM_WORLD, &n_procs);
	MPI_Comm_rank(MPI_COMM_WORLD, &i_proc);
	MPI_Datatype uint64_type = MPI_LONG;
	MPI_Aint foo = 0;
	MPI_Type_extent(uint64_type, &foo);

	/* the transfer buffers are uint64_t, so MPI_LONG must be 8 bytes here */
	assert(foo == 8);

	if (argc < 2)
		usage(argv[0]);

	arra

Re: [OMPI users] alltoall messages > 2^26

2011-04-05 Thread Michael Di Domenico
There are no messages being spit out, but i'm not sure i have all the
correct debugs turned on.  I turned on -debug-devel, -debug-daemons, and
mca_verbose, but it appears that the process just hangs.

If it's memory exhaustion, it's not from core memory; these nodes
have 48GB of memory.  it could be a buffer somewhere, but i'm not sure
where
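
one experiment i haven't run yet is to force a different alltoall
algorithm in the tuned coll component and see whether the hang follows
the algorithm or the message size, along the lines of:

mpirun -n 16 -mca coll_tuned_use_dynamic_rules 1 \
    -mca coll_tuned_alltoall_algorithm 2 ./IMB-MPI1 alltoall

(the algorithm numbering is whatever ompi_info reports for that
parameter; i'm not asserting which number maps to which exchange.)  if
one of the simpler exchanges completes where the default doesn't, that
points more at buffering than at the fabric.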

On Mon, Apr 4, 2011 at 10:17 PM, David Zhang <solarbik...@gmail.com> wrote:
> Any error messages?  Maybe the nodes ran out of memory?  I know MPI
> implement some kind of buffering under the hood, so even though you're
> sending array's over 2^26 in size, it may require more than that for MPI to
> actually send it.
>
> On Mon, Apr 4, 2011 at 2:16 PM, Michael Di Domenico <mdidomeni...@gmail.com>
> wrote:
>>
>> Has anyone seen an issue where OpenMPI/Infiniband hangs when sending
>> messages over 2^26 in size?
>>
>> For a reason i have not determined just yet machines on my cluster
>> (OpenMPI v1.5 and Qlogic Stack/QDR IB Adapters) is failing to send
>> array's over 2^26 in size via the AllToAll collective. (user code)
>>
>> Further testing seems to indicate that an MPI message over 2^26 fails
>> (tested with IMB-MPI)
>>
>> Running the same test on a different older IB connected cluster seems
>> to work, which would seem to indicate a problem with the infiniband
>> drivers of some sort rather then openmpi (but i'm not sure).
>>
>> Any thoughts, directions, or tests?
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> --
> David Zhang
> University of California, San Diego
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] srun and openmpi

2011-01-25 Thread Michael Di Domenico
Yes, i am setting the config correctly.  Our IB machines seem to run
just fine so far using srun and openmpi v1.5.

As another data point, we enabled mpi-threads in Openmpi and that also
seems to trigger the Srun/TCP behavior, but on the IB fabric.  Running
the program within an salloc rather than a straight srun makes the
problem seem to go away



On Tue, Jan 25, 2011 at 2:59 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
> We are seeing the similar problem with our infiniband machines. After some
> investigation I discovered that we were not setting our slurm environment
> correctly (ref:
> https://computing.llnl.gov/linux/slurm/mpi_guide.html#open_mpi). Are you
> setting the ports in your slurm.conf and executing srun with --resv-ports?
>
> I have yet to see if this fixes the problem for LANL. Waiting on a sysadmin
> to modify the slurm.conf.
>
> -Nathan
> HPC-3, LANL
>
> On Tue, 25 Jan 2011, Michael Di Domenico wrote:
>
>> Thanks.  We're only seeing it on machines with Ethernet only as the
>> interconnect.  fortunately for us that only equates to one small
>> machine, but it's still annoying.  unfortunately, i don't have enough
>> knowledge to dive into the code to help fix, but i can certainly help
>> test
>>
>> On Mon, Jan 24, 2011 at 1:41 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
>>>
>>> I am seeing similar issues on our slurm clusters. We are looking into the
>>> issue.
>>>
>>> -Nathan
>>> HPC-3, LANL
>>>
>>> On Tue, 11 Jan 2011, Michael Di Domenico wrote:
>>>
>>>> Any ideas on what might be causing this one?  Or atleast what
>>>> additional debug information someone might need?
>>>>
>>>> On Fri, Jan 7, 2011 at 4:03 PM, Michael Di Domenico
>>>> <mdidomeni...@gmail.com> wrote:
>>>>>
>>>>> I'm still testing the slurm integration, which seems to work fine so
>>>>> far.  However, i just upgraded another cluster to openmpi-1.5 and
>>>>> slurm 2.1.15 but this machine has no infiniband
>>>>>
>>>>> if i salloc the nodes and mpirun the command it seems to run and
>>>>> complete
>>>>> fine
>>>>> however if i srun the command i get
>>>>>
>>>>> [btl_tcp_endpoint:486] mca_btl_tcp_endpoint_recv_connect_ack received
>>>>> unexpected prcoess identifier
>>>>>
>>>>> the job does not seem to run, but exhibits two behaviors
>>>>> running a single process per node the job runs and does not present
>>>>> the error (srun -N40 --ntasks-per-node=1)
>>>>> running multiple processes per node, the job spits out the error but
>>>>> does not run (srun -n40 --ntasks-per-node=8)
>>>>>
>>>>> I copied the configs from the other machine, so (i think) everything
>>>>> should be configured correctly (but i can't rule it out)
>>>>>
>>>>> I saw (and reported) a similar error to above with the 1.4-dev branch
>>>>> (see mailing list) and slurm, I can't say whether they're related or
>>>>> not though
>>>>>
>>>>>
>>>>> On Mon, Jan 3, 2011 at 3:00 PM, Jeff Squyres <jsquy...@cisco.com>
>>>>> wrote:
>>>>>>
>>>>>> Yo Ralph --
>>>>>>
>>>>>> I see this was committed
>>>>>> https://svn.open-mpi.org/trac/ompi/changeset/24197.  Do you want to
>>>>>> add a
>>>>>> blurb in README about it, and/or have this executable compiled as part
>>>>>> of
>>>>>> the PSM MTL and then installed into $bindir (maybe named
>>>>>> ompi-psm-keygen)?
>>>>>>
>>>>>> Right now, it's only compiled as part of "make check" and not
>>>>>> installed,
>>>>>> right?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Dec 30, 2010, at 5:07 PM, Ralph Castain wrote:
>>>>>>
>>>>>>> Run the program only once - it can be in the prolog of the job if you
>>>>>>> like. The output value needs to be in the env of every rank.
>>>>>>>
>>>>>>> You can reuse the value as many times as you like - it doesn't have
>>>>>>> to
>>>>>>> be unique for each job. There is nothing magic about the value
>>>>>>> itself.
>>>>>>>
>>>>>>>

Re: [OMPI users] srun and openmpi

2011-01-25 Thread Michael Di Domenico
Thanks.  We're only seeing it on machines with Ethernet only as the
interconnect.  fortunately for us that only equates to one small
machine, but it's still annoying.  unfortunately, i don't have enough
knowledge to dive into the code to help fix, but i can certainly help
test

On Mon, Jan 24, 2011 at 1:41 PM, Nathan Hjelm <hje...@lanl.gov> wrote:
> I am seeing similar issues on our slurm clusters. We are looking into the
> issue.
>
> -Nathan
> HPC-3, LANL
>
> On Tue, 11 Jan 2011, Michael Di Domenico wrote:
>
>> Any ideas on what might be causing this one?  Or atleast what
>> additional debug information someone might need?
>>
>> On Fri, Jan 7, 2011 at 4:03 PM, Michael Di Domenico
>> <mdidomeni...@gmail.com> wrote:
>>>
>>> I'm still testing the slurm integration, which seems to work fine so
>>> far.  However, i just upgraded another cluster to openmpi-1.5 and
>>> slurm 2.1.15 but this machine has no infiniband
>>>
>>> if i salloc the nodes and mpirun the command it seems to run and complete
>>> fine
>>> however if i srun the command i get
>>>
>>> [btl_tcp_endpoint:486] mca_btl_tcp_endpoint_recv_connect_ack received
>>> unexpected prcoess identifier
>>>
>>> the job does not seem to run, but exhibits two behaviors
>>> running a single process per node the job runs and does not present
>>> the error (srun -N40 --ntasks-per-node=1)
>>> running multiple processes per node, the job spits out the error but
>>> does not run (srun -n40 --ntasks-per-node=8)
>>>
>>> I copied the configs from the other machine, so (i think) everything
>>> should be configured correctly (but i can't rule it out)
>>>
>>> I saw (and reported) a similar error to above with the 1.4-dev branch
>>> (see mailing list) and slurm, I can't say whether they're related or
>>> not though
>>>
>>>
>>> On Mon, Jan 3, 2011 at 3:00 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>>>
>>>> Yo Ralph --
>>>>
>>>> I see this was committed
>>>> https://svn.open-mpi.org/trac/ompi/changeset/24197.  Do you want to add a
>>>> blurb in README about it, and/or have this executable compiled as part of
>>>> the PSM MTL and then installed into $bindir (maybe named ompi-psm-keygen)?
>>>>
>>>> Right now, it's only compiled as part of "make check" and not installed,
>>>> right?
>>>>
>>>>
>>>>
>>>> On Dec 30, 2010, at 5:07 PM, Ralph Castain wrote:
>>>>
>>>>> Run the program only once - it can be in the prolog of the job if you
>>>>> like. The output value needs to be in the env of every rank.
>>>>>
>>>>> You can reuse the value as many times as you like - it doesn't have to
>>>>> be unique for each job. There is nothing magic about the value itself.
>>>>>
>>>>> On Dec 30, 2010, at 2:11 PM, Michael Di Domenico wrote:
>>>>>
>>>>>> How early does this need to run? Can I run it as part of a task
>>>>>> prolog, or does it need to be the shell env for each rank?  And does
>>>>>> it need to run on one node or all the nodes in the job?
>>>>>>
>>>>>> On Thu, Dec 30, 2010 at 8:54 PM, Ralph Castain <r...@open-mpi.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> Well, I couldn't do it as a patch - proved too complicated as the psm
>>>>>>> system looks for the value early in the boot procedure.
>>>>>>>
>>>>>>> What I can do is give you the attached key generator program. It
>>>>>>> outputs the envar required to run your program. So if you run the 
>>>>>>> attached
>>>>>>> program and then export the output into your environment, you should be
>>>>>>> okay. Looks like this:
>>>>>>>
>>>>>>> $ ./psm_keygen
>>>>>>>
>>>>>>> OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
>>>>>>> $
>>>>>>>
>>>>>>> You compile the program with the usual mpicc.
>>>>>>>
>>>>>>> Let me know if this solves the problem (or not).
>>>>>>> Ralph
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On

Re: [OMPI users] srun and openmpi

2011-01-11 Thread Michael Di Domenico
Any ideas on what might be causing this one?  Or at least what
additional debug information someone might need?

On Fri, Jan 7, 2011 at 4:03 PM, Michael Di Domenico
<mdidomeni...@gmail.com> wrote:
> I'm still testing the slurm integration, which seems to work fine so
> far.  However, i just upgraded another cluster to openmpi-1.5 and
> slurm 2.1.15 but this machine has no infiniband
>
> if i salloc the nodes and mpirun the command it seems to run and complete fine
> however if i srun the command i get
>
> [btl_tcp_endpoint:486] mca_btl_tcp_endpoint_recv_connect_ack received
> unexpected prcoess identifier
>
> the job does not seem to run, but exhibits two behaviors
> running a single process per node the job runs and does not present
> the error (srun -N40 --ntasks-per-node=1)
> running multiple processes per node, the job spits out the error but
> does not run (srun -n40 --ntasks-per-node=8)
>
> I copied the configs from the other machine, so (i think) everything
> should be configured correctly (but i can't rule it out)
>
> I saw (and reported) a similar error to above with the 1.4-dev branch
> (see mailing list) and slurm, I can't say whether they're related or
> not though
>
>
> On Mon, Jan 3, 2011 at 3:00 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>> Yo Ralph --
>>
>> I see this was committed https://svn.open-mpi.org/trac/ompi/changeset/24197. 
>>  Do you want to add a blurb in README about it, and/or have this executable 
>> compiled as part of the PSM MTL and then installed into $bindir (maybe named 
>> ompi-psm-keygen)?
>>
>> Right now, it's only compiled as part of "make check" and not installed, 
>> right?
>>
>>
>>
>> On Dec 30, 2010, at 5:07 PM, Ralph Castain wrote:
>>
>>> Run the program only once - it can be in the prolog of the job if you like. 
>>> The output value needs to be in the env of every rank.
>>>
>>> You can reuse the value as many times as you like - it doesn't have to be 
>>> unique for each job. There is nothing magic about the value itself.
>>>
>>> On Dec 30, 2010, at 2:11 PM, Michael Di Domenico wrote:
>>>
>>>> How early does this need to run? Can I run it as part of a task
>>>> prolog, or does it need to be the shell env for each rank?  And does
>>>> it need to run on one node or all the nodes in the job?
>>>>
>>>> On Thu, Dec 30, 2010 at 8:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Well, I couldn't do it as a patch - proved too complicated as the psm 
>>>>> system looks for the value early in the boot procedure.
>>>>>
>>>>> What I can do is give you the attached key generator program. It outputs 
>>>>> the envar required to run your program. So if you run the attached 
>>>>> program and then export the output into your environment, you should be 
>>>>> okay. Looks like this:
>>>>>
>>>>> $ ./psm_keygen
>>>>> OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
>>>>> $
>>>>>
>>>>> You compile the program with the usual mpicc.
>>>>>
>>>>> Let me know if this solves the problem (or not).
>>>>> Ralph
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Dec 30, 2010, at 11:18 AM, Michael Di Domenico wrote:
>>>>>
>>>>>> Sure, i'll give it a go
>>>>>>
>>>>>> On Thu, Dec 30, 2010 at 5:53 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>>> Ah, yes - that is going to be a problem. The PSM key gets generated by 
>>>>>>> mpirun as it is shared info - i.e., every proc has to get the same 
>>>>>>> value.
>>>>>>>
>>>>>>> I can create a patch that will do this for the srun direct-launch 
>>>>>>> scenario, if you want to try it. Would be later today, though.
>>>>>>>
>>>>>>>
>>>>>>> On Dec 30, 2010, at 10:31 AM, Michael Di Domenico wrote:
>>>>>>>
>>>>>>>> Well maybe not horray, yet.  I might have jumped the gun a bit, it's
>>>>>>>> looking like srun works in general, but perhaps not with PSM
>>>>>>>>
>>>>>>>> With PSM i get this error, (at least now i know what i changed)
>>>>>>>>
>>>>>>>> Error obtaining unique transport key from ORTE
&g

Re: [OMPI users] CQ errors

2011-01-10 Thread Michael Di Domenico
2011/1/10 Peter Kjellström <c...@nsc.liu.se>:
> On Monday, January 10, 2011 03:06:06 pm Michael Di Domenico wrote:
>> I'm not sure if these are being reported from OpenMPI or through
>> OpenMPI from OpenFabrics, but i figured this would be a good place to
>> start
>>
>> On one node we received the below errors, i'm not sure i under the
>> error sequence, hopefully someone can shed some light on what
>> happened.
>>
>> [[5691,1],49][btl_openib_component.c:3294:handle_wc] from node27 to:
> ...
>> network is qlogic qdr end to end, openmpi 1.5 and ofed 1.5.2 (q stack)
>
> Not really addressing your problem, but, with qlogic you should be using psm,
> not verbs (btl_openib).
>
> That said, openib should work (slowly).

Yes, you are correct.  We're running via verbs at the moment because
of a slurm interop issue.  I have a patch from ralph but have not
tested it yet.

So far the only noticeable effect of running non-psm is a 5usec hit
on each packet.  otherwise functionally we seem okay.



[OMPI users] CQ errors

2011-01-10 Thread Michael Di Domenico
I'm not sure if these are being reported from OpenMPI or through
OpenMPI from OpenFabrics, but i figured this would be a good place to
start

On one node we received the below errors.  i'm not sure i understand the
error sequence; hopefully someone can shed some light on what
happened.

[[5691,1],49][btl_openib_component.c:3294:handle_wc] from node27 to:
node28 error polling HP CQ with status WORK_REQUEST FLUSHED ERROR
status number 5 for wr_id c30b100 opcode 128 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from node26 to:
node28 error polling LP CQ with status RETRY EXCEEDED ERROR status
number 12 for wr_id 1755c900 opcode 1 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from (null) to:
node28 error polling HP CQ with status WORK_REQUEST FLUSHED ERROR
status number 5 for wr_id 1779b180 opcode 128 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from node20 to:
node28 error polling HP CQ with status WORK_REQUEST FLUSHED ERROR
status number 5 for wr_id 8e1aa80 opcode 128 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from node24 to:
node28 error polling LP CQ with status RETRY EXCEEDED ERROR status
number 12 for wr_id 1164b600 opcode 1 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from (null) to:
node28 error polling HP CQ with status WORK_REQUEST FLUSHED ERROR
status number 5 for wr_id 118c3f80 opcode 128 vendor error 0 qp_idx 0
[[5691,1],49][btl_openib_component.c:3294:handle_wc] from node12 to:
node28 error polling HP CQ with status WORK_REQUEST FLUSHED ERROR
status number 5 for wr_id 1b8f0080 opcode 128 vendor error 0 qp_idx 0

It was the only node out of a 75-node run that spit out the error.  I
rechecked the node: no symbol/link recovery errors on the network, and I
ran Pallas between it and several other machines with no errors

network is qlogic qdr end to end, openmpi 1.5 and ofed 1.5.2 (q stack)
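
(for anyone following along, the recheck above was nothing fancier than
dumping the per-port error counters with the stock infiniband-diags
tools, roughly:

ibcheckerrors
perfquery <lid> <port>

with the lid/port filled in by hand for the suspect node.)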

thanks


Re: [OMPI users] srun and openmpi

2011-01-07 Thread Michael Di Domenico
I'm still testing the slurm integration, which seems to work fine so
far.  However, i just upgraded another cluster to openmpi-1.5 and
slurm 2.1.15 but this machine has no infiniband

if i salloc the nodes and mpirun the command it seems to run and complete fine
however if i srun the command i get

[btl_tcp_endpoint:486] mca_btl_tcp_endpoint_recv_connect_ack received
unexpected process identifier

the job does not seem to run, but exhibits two behaviors:
- running a single process per node, the job runs and does not present
  the error (srun -N40 --ntasks-per-node=1)
- running multiple processes per node, the job spits out the error but
  does not run (srun -n40 --ntasks-per-node=8)

I copied the configs from the other machine, so (i think) everything
should be configured correctly (but i can't rule it out)

I saw (and reported) a similar error to the one above with the 1.4-dev
branch and slurm (see the mailing list archives), but I can't say
whether they're related or not.
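
One thing i still want to rule out is the tcp btl picking different
interfaces on different nodes; since there's no mpirun in the srun case,
the MCA param has to go in as an environment variable, something like
this (the interface name is just an example):

  export OMPI_MCA_btl_tcp_if_include=eth0
  srun -n40 --ntasks-per-node=8 ./hello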


On Mon, Jan 3, 2011 at 3:00 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> Yo Ralph --
>
> I see this was committed https://svn.open-mpi.org/trac/ompi/changeset/24197.  
> Do you want to add a blurb in README about it, and/or have this executable 
> compiled as part of the PSM MTL and then installed into $bindir (maybe named 
> ompi-psm-keygen)?
>
> Right now, it's only compiled as part of "make check" and not installed, 
> right?
>
>
>
> On Dec 30, 2010, at 5:07 PM, Ralph Castain wrote:
>
>> Run the program only once - it can be in the prolog of the job if you like. 
>> The output value needs to be in the env of every rank.
>>
>> You can reuse the value as many times as you like - it doesn't have to be 
>> unique for each job. There is nothing magic about the value itself.
>>
>> On Dec 30, 2010, at 2:11 PM, Michael Di Domenico wrote:
>>
>>> How early does this need to run? Can I run it as part of a task
>>> prolog, or does it need to be in the shell env for each rank?  And does
>>> it need to run on one node or all the nodes in the job?
>>>
>>> On Thu, Dec 30, 2010 at 8:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> Well, I couldn't do it as a patch - proved too complicated as the psm 
>>>> system looks for the value early in the boot procedure.
>>>>
>>>> What I can do is give you the attached key generator program. It outputs 
>>>> the envar required to run your program. So if you run the attached program 
>>>> and then export the output into your environment, you should be okay. 
>>>> Looks like this:
>>>>
>>>> $ ./psm_keygen
>>>> OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
>>>> $
>>>>
>>>> You compile the program with the usual mpicc.
>>>>
>>>> Let me know if this solves the problem (or not).
>>>> Ralph
>>>>
>>>>
>>>>
>>>>
>>>> On Dec 30, 2010, at 11:18 AM, Michael Di Domenico wrote:
>>>>
>>>>> Sure, i'll give it a go
>>>>>
>>>>> On Thu, Dec 30, 2010 at 5:53 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> Ah, yes - that is going to be a problem. The PSM key gets generated by 
>>>>>> mpirun as it is shared info - i.e., every proc has to get the same value.
>>>>>>
>>>>>> I can create a patch that will do this for the srun direct-launch 
>>>>>> scenario, if you want to try it. Would be later today, though.
>>>>>>
>>>>>>
>>>>>> On Dec 30, 2010, at 10:31 AM, Michael Di Domenico wrote:
>>>>>>
>>>>>>> Well maybe not hooray, yet.  I might have jumped the gun a bit, it's
>>>>>>> looking like srun works in general, but perhaps not with PSM
>>>>>>>
>>>>>>> With PSM i get this error, (at least now i know what i changed)
>>>>>>>
>>>>>>> Error obtaining unique transport key from ORTE
>>>>>>> (orte_precondition_transports not present in the environment)
>>>>>>> PML add procs failed
>>>>>>> --> Returned "Error" (-1) instead of "Success" (0)
>>>>>>>
>>>>>>> Turn off PSM and srun works fine
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Dec 30, 2010 at 5:13 PM, Ralph Castain <r...@open-mpi.org> 
>>>>>>> wrote:
>>>>>>>> Hooray!
>>>>>>>>

Re: [OMPI users] srun and openmpi

2010-12-30 Thread Michael Di Domenico
How early does this need to run? Can I run it as part of a task
prolog, or does it need to be in the shell env for each rank?  And does
it need to run on one node or all the nodes in the job?
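
What i have in mind is just running it once at the top of the job script
and letting srun carry the variable through, something like this (using
the psm_keygen name from your attached program; the application name is
a placeholder):

  export $(./psm_keygen)
  srun -n 16 ./my_mpi_app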

On Thu, Dec 30, 2010 at 8:54 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Well, I couldn't do it as a patch - proved too complicated as the psm system 
> looks for the value early in the boot procedure.
>
> What I can do is give you the attached key generator program. It outputs the 
> envar required to run your program. So if you run the attached program and 
> then export the output into your environment, you should be okay. Looks like 
> this:
>
> $ ./psm_keygen
> OMPI_MCA_orte_precondition_transports=0099b3eaa2c1547e-afb287789133a954
> $
>
> You compile the program with the usual mpicc.
>
> Let me know if this solves the problem (or not).
> Ralph
>
>
>
>
> On Dec 30, 2010, at 11:18 AM, Michael Di Domenico wrote:
>
>> Sure, i'll give it a go
>>
>> On Thu, Dec 30, 2010 at 5:53 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Ah, yes - that is going to be a problem. The PSM key gets generated by 
>>> mpirun as it is shared info - i.e., every proc has to get the same value.
>>>
>>> I can create a patch that will do this for the srun direct-launch scenario, 
>>> if you want to try it. Would be later today, though.
>>>
>>>
>>> On Dec 30, 2010, at 10:31 AM, Michael Di Domenico wrote:
>>>
>>>> Well maybe not hooray, yet.  I might have jumped the gun a bit, it's
>>>> looking like srun works in general, but perhaps not with PSM
>>>>
>>>> With PSM i get this error, (at least now i know what i changed)
>>>>
>>>> Error obtaining unique transport key from ORTE
>>>> (orte_precondition_transports not present in the environment)
>>>> PML add procs failed
>>>> --> Returned "Error" (-1) instead of "Success" (0)
>>>>
>>>> Turn off PSM and srun works fine
>>>>
>>>>
>>>> On Thu, Dec 30, 2010 at 5:13 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>> Hooray!
>>>>>
>>>>> On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:
>>>>>
>>>>>> I think i take it all back.  I just tried it again and it seems to
>>>>>> work now.  I'm not sure what I changed (between my first and this
>>>>>> msg), but it does appear to work now.
>>>>>>
>>>>>> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
>>>>>> <mdidomeni...@gmail.com> wrote:
>>>>>>> Yes that's true, error messages help.  I was hoping there was some
>>>>>>> documentation to see what i've done wrong.  I can't easily cut and
>>>>>>> paste errors from my cluster.
>>>>>>>
>>>>>>> Here's a snippet (hand typed) of the error message, but it does look
>>>>>>> like a rank communications error
>>>>>>>
>>>>>>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>>>>>>> contact information is unknown in file rml_oob_send.c at line 145.
>>>>>>> *** MPI_INIT failure message (snipped) ***
>>>>>>> orte_grpcomm_modex failed
>>>>>>> --> Returned "A message is attempting to be sent to a process whose
>>>>>>> contact information is unknown" (-117) instead of "Success" (0)
>>>>>>>
>>>>>>> This msg repeats for each rank, and ultimately hangs the srun which i
>>>>>>> have to Ctrl-C and terminate
>>>>>>>
>>>>>>> I have mpiports defined in my slurm config and running srun with
>>>>>>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>>>>>>> getting passed to the shell
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> 
>>>>>>> wrote:
>>>>>>>> I'm not sure there is any documentation yet - not much clamor for it. 
>>>>>>>> :-/
>>>>>>>>
>>>>>>>> It would really help if you included the error message. Otherwise, all 
>>>>>>>> I can do is guess, which wastes both of our time :-(
>>>>>>>>

Re: [OMPI users] srun and openmpi

2010-12-30 Thread Michael Di Domenico
Sure, i'll give it a go

On Thu, Dec 30, 2010 at 5:53 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Ah, yes - that is going to be a problem. The PSM key gets generated by mpirun 
> as it is shared info - i.e., every proc has to get the same value.
>
> I can create a patch that will do this for the srun direct-launch scenario, 
> if you want to try it. Would be later today, though.
>
>
> On Dec 30, 2010, at 10:31 AM, Michael Di Domenico wrote:
>
>> Well maybe not hooray, yet.  I might have jumped the gun a bit, it's
>> looking like srun works in general, but perhaps not with PSM
>>
>> With PSM i get this error, (at least now i know what i changed)
>>
>> Error obtaining unique transport key from ORTE
>> (orte_precondition_transports not present in the environment)
>> PML add procs failed
>> --> Returned "Error" (-1) instead of "Success" (0)
>>
>> Turn off PSM and srun works fine
>>
>>
>> On Thu, Dec 30, 2010 at 5:13 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>> Hooray!
>>>
>>> On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:
>>>
>>>> I think i take it all back.  I just tried it again and it seems to
>>>> work now.  I'm not sure what I changed (between my first and this
>>>> msg), but it does appear to work now.
>>>>
>>>> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
>>>> <mdidomeni...@gmail.com> wrote:
>>>>> Yes that's true, error messages help.  I was hoping there was some
>>>>> documentation to see what i've done wrong.  I can't easily cut and
>>>>> paste errors from my cluster.
>>>>>
>>>>> Here's a snippet (hand typed) of the error message, but it does look
>>>>> like a rank communications error
>>>>>
>>>>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>>>>> contact information is unknown in file rml_oob_send.c at line 145.
>>>>> *** MPI_INIT failure message (snipped) ***
>>>>> orte_grpcomm_modex failed
>>>>> --> Returned "A message is attempting to be sent to a process whose
>>>>> contact information is unknown" (-117) instead of "Success" (0)
>>>>>
>>>>> This msg repeats for each rank, and ultimately hangs the srun which i
>>>>> have to Ctrl-C and terminate
>>>>>
>>>>> I have mpiports defined in my slurm config and running srun with
>>>>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>>>>> getting passed to the shell
>>>>>
>>>>>
>>>>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>>>>>
>>>>>> It would really help if you included the error message. Otherwise, all I 
>>>>>> can do is guess, which wastes both of our time :-(
>>>>>>
>>>>>> My best guess is that the port reservation didn't get passed down to the 
>>>>>> MPI procs properly - but that's just a guess.
>>>>>>
>>>>>>
>>>>>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>>>>>
>>>>>>> Can anyone point me towards the most recent documentation for using
>>>>>>> srun and openmpi?
>>>>>>>
>>>>>>> I followed what i found on the web with enabling the MpiPorts config
>>>>>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>>>>>> from openmpi during setup.
>>>>>>>
>>>>>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>>>>>
>>>>>>> I'm sure I'm missing a step.
>>>>>>>
>>>>>>> Thanks
>>>>>>> ___
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>
>>>>>>
>>>>>> ___
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>
>>>>>
>>>>
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] srun and openmpi

2010-12-30 Thread Michael Di Domenico
Well, maybe not hooray yet.  I might have jumped the gun a bit; it's
looking like srun works in general, but perhaps not with PSM

With PSM i get this error, (at least now i know what i changed)

Error obtaining unique transport key from ORTE
(orte_precondition_transports not present in the environment)
PML add procs failed
--> Returned "Error" (-1) instead of "Success" (0)

Turn off PSM and srun works fine


On Thu, Dec 30, 2010 at 5:13 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Hooray!
>
> On Dec 30, 2010, at 9:57 AM, Michael Di Domenico wrote:
>
>> I think i take it all back.  I just tried it again and it seems to
>> work now.  I'm not sure what I changed (between my first and this
>> msg), but it does appear to work now.
>>
>> On Thu, Dec 30, 2010 at 4:31 PM, Michael Di Domenico
>> <mdidomeni...@gmail.com> wrote:
>>> Yes that's true, error messages help.  I was hoping there was some
>>> documentation to see what i've done wrong.  I can't easily cut and
>>> paste errors from my cluster.
>>>
>>> Here's a snippet (hand typed) of the error message, but it does look
>>> like a rank communications error
>>>
>>> ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
>>> contact information is unknown in file rml_oob_send.c at line 145.
>>> *** MPI_INIT failure message (snipped) ***
>>> orte_grpcomm_modex failed
>>> --> Returned "A message is attempting to be sent to a process whose
>>> contact information is unknown" (-117) instead of "Success" (0)
>>>
>>> This msg repeats for each rank, and ultimately hangs the srun which i
>>> have to Ctrl-C and terminate
>>>
>>> I have mpiports defined in my slurm config and running srun with
>>> --resv-ports does show the SLURM_RESV_PORTS environment variable
>>> getting passed to the shell
>>>
>>>
>>> On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
>>>> I'm not sure there is any documentation yet - not much clamor for it. :-/
>>>>
>>>> It would really help if you included the error message. Otherwise, all I 
>>>> can do is guess, which wastes both of our time :-(
>>>>
>>>> My best guess is that the port reservation didn't get passed down to the 
>>>> MPI procs properly - but that's just a guess.
>>>>
>>>>
>>>> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>>>>
>>>>> Can anyone point me towards the most recent documentation for using
>>>>> srun and openmpi?
>>>>>
>>>>> I followed what i found on the web with enabling the MpiPorts config
>>>>> in slurm and using the --resv-ports switch, but I'm getting an error
>>>>> from openmpi during setup.
>>>>>
>>>>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>>>>
>>>>> I'm sure I'm missing a step.
>>>>>
>>>>> Thanks
>>>>> ___
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>>
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>
>>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



Re: [OMPI users] srun and openmpi

2010-12-30 Thread Michael Di Domenico
Yes that's true, error messages help.  I was hoping there was some
documentation to see what i've done wrong.  I can't easily cut and
paste errors from my cluster.

Here's a snippet (hand typed) of the error message, but it does look
like a rank communications error

ORTE_ERROR_LOG: A message is attempting to be sent to a process whose
contact information is unknown in file rml_oob_send.c at line 145.
*** MPI_INIT failure message (snipped) ***
orte_grpcomm_modex failed
--> Returned "A messages is attempting to be sent to a process whose
contact information us uknown" (-117) instead of "Success" (0)

This msg repeats for each rank, an ultimately hangs the srun which i
have to Ctrl-C and terminate

I have mpiports defined in my slurm config and running srun with
-resv-ports does show the SLURM_RESV_PORTS environment variable
getting parts to the shell
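
For completeness, the relevant pieces are along these lines (the port
range is just what i picked):

  # slurm.conf
  MpiParams=ports=12000-12999

  srun --resv-ports -n 8 ./hello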


On Thu, Dec 23, 2010 at 8:09 PM, Ralph Castain <r...@open-mpi.org> wrote:
> I'm not sure there is any documentation yet - not much clamor for it. :-/
>
> It would really help if you included the error message. Otherwise, all I can 
> do is guess, which wastes both of our time :-(
>
> My best guess is that the port reservation didn't get passed down to the MPI 
> procs properly - but that's just a guess.
>
>
> On Dec 23, 2010, at 12:46 PM, Michael Di Domenico wrote:
>
>> Can anyone point me towards the most recent documentation for using
>> srun and openmpi?
>>
>> I followed what i found on the web with enabling the MpiPorts config
>> in slurm and using the --resv-ports switch, but I'm getting an error
>> from openmpi during setup.
>>
>> I'm using Slurm 2.1.15 and Openmpi 1.5 w/PSM
>>
>> I'm sure I'm missing a step.
>>
>> Thanks
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


[OMPI users] flex.exe

2010-01-21 Thread Michael Di Domenico
openmpi-1.4.1/contrib/platform/win32/bin/flex.exe

I understand this file might be required for building on windows.
Since I'm not building on windows, I can just delete the file without issue.

However, for those of us under import restrictions, where binaries are
not allowed in, this file causes me to open the tarball and delete the
file (not a big deal, i know, i know).
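
The repack is nothing more than something like this (version and
compression are just whatever the current tarball is):

  tar -xzf openmpi-1.4.1.tar.gz
  rm openmpi-1.4.1/contrib/platform/win32/bin/flex.exe
  tar -czf openmpi-1.4.1-nobin.tar.gz openmpi-1.4.1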

But, can I put up a vote for a pure source only tree?

Thanks...


Re: [OMPI users] openmpi 1.4 and barrier

2009-10-01 Thread Michael Di Domenico
Hmm, i don't recall seeing that...

On Thu, Oct 1, 2009 at 1:51 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> FWIW, I saw this bug to have race-condition-like behavior.  I could run a
> few times and then it would work.
>
> On Oct 1, 2009, at 1:42 PM, Michael Di Domenico wrote:
>
>> On Thu, Oct 1, 2009 at 1:37 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>> > On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:
>> >
>> >> I just upgraded to the devel snapshot of 1.4a1r22031
>> >>
>> >> when i run a simple hello world with a barrier i get
>> >>
>> >> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
>> >> unexpected process identifier
>> >
>> > I have seen this failure over the last day or three myself.  I'll file a
>> > trac ticket about it.
>> >
>> > (all's fair in love, war, and trunk development snapshots!)
>>
>> Okay, thanks...  Unfortunately i need the dev snap for slurm
>> integration... :(
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] openmpi 1.4 and barrier

2009-10-01 Thread Michael Di Domenico
On Thu, Oct 1, 2009 at 1:37 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
> On Oct 1, 2009, at 1:24 PM, Michael Di Domenico wrote:
>
>> I just upgraded to the devel snapshot of 1.4a1r22031
>>
>> when i run a simple hello world with a barrier i get
>>
>> btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
>> unexpected process identifier
>
> I have seen this failure over the last day or three myself.  I'll file a
> trac ticket about it.
>
> (all's fair in love, war, and trunk development snapshots!)

Okay, thanks...  Unfortunately i need the dev snapshot for slurm integration... :(


[OMPI users] openmpi 1.4 and barrier

2009-10-01 Thread Michael Di Domenico
I just upgraded to the devel snapshot of 1.4a1r22031

when i run a simple hello world with a barrier i get

btl_tcp_endpoint.c:484:mca_btl_tcp_endpoint_recv_connect_ack] received
unexpected process identifier

if i pull the barrier out, the hello world runs fine

interestingly enough, i can run IMB, which also uses barrier, and it
runs just fine

Any thoughts?
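
For reference, the test case is about as small as it gets, and the runs
look roughly like this (source and binary names are just my local ones):

  mpicc hello_barrier.c -o hello_barrier
  mpirun -np 4 ./hello_barrier
  mpirun -np 4 --mca btl self,sm ./hello_barrier

the last one is just to see whether excluding the tcp btl makes the
error go away.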


Re: [OMPI users] strange IMB runs

2009-08-14 Thread Michael Di Domenico
> One of the differences among MPI implementations is the default placement of
> processes within the node.  E.g., should processes by default be collocated
> on cores of the same socket or on cores of different sockets?  I don't know
> if that issue is applicable here (that is, HP MPI vs Open MPI or on
> Superdome architecture), but it's potentially an issue to look at.  With HP
> MPI, mpirun has a -cpu_bind switch for controlling placement.  With Open
> MPI, mpirun controls placement with -rankfile.
>
> E.g., what happens if you try
>
> % cat rf1
> rank 0=XX  slot=0
> rank 1=XX  slot=1
> % cat rf2
> rank 0=XX  slot=0
> rank 1=XX  slot=2
> % cat rf3
> rank 0=XX  slot=0
> rank 1=XX  slot=3
> [...etc...]
> % mpirun -np 2 --mca btl self,sm --host XX,XX -rf rf1 $PWD/IMB-MPI1 pingpong
> % mpirun -np 2 --mca btl self,sm --host XX,XX -rf rf2 $PWD/IMB-MPI1 pingpong
> % mpirun -np 2 --mca btl self,sm --host XX,XX -rf rf3 $PWD/IMB-MPI1 pingpong
> [...etc...]
>
> where XX is the name of your node and you march through all the cores on
> your Superdome node?

I tried this, but it didn't seem to make a difference either


Re: [OMPI users] strange IMB runs

2009-08-14 Thread Michael Di Domenico
On Thu, Aug 13, 2009 at 1:51 AM, Eugene Loh wrote:
> Also, I'm puzzled why you should see better results by changing
> btl_sm_eager_limit.  That shouldn't change long-message bandwidth, but only
> the message size at which one transitions from short to long messages.  If
> anything, tweaking btl_sm_max_send_size would be the variable to try.

Changing this value does tweak the 1, 2, and 4MB message sizes, but all
the others stay the same, and the effect isn't consistent.

I've also noticed that the latency is nearly 2usec higher on OpenMPI vs HP-MPI.
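
The runs where i tweaked it were of this form (the size shown is just
one of the values i tried):

  mpirun -np 2 -mca btl self,sm -mca mpi_paffinity_alone 1 \
      -mca btl_sm_max_send_size 65536 $PWD/IMB-MPI1 pingpong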


Re: [OMPI users] strange IMB runs

2009-08-13 Thread Michael Di Domenico
On Thu, Aug 13, 2009 at 1:51 AM, Eugene Loh wrote:
>>Is this behavior expected?  Are there any tunables to get the OpenMPI
>>sockets up near HP-MPI?
>
> First, I want to understand the configuration.  It's just a single node.  No
> interconnect (InfiniBand or Ethernet or anything).  Right?

Yes, the superdome architecture is itanium processors all connected
internally by some interconnect.  No ethernet or infiniband is in play.


Re: [OMPI users] strange IMB runs

2009-08-12 Thread Michael Di Domenico
So pushing this along a little more

running with openmpi-1.3 svn rev 20295

mpirun -np 2
  -mca btl sm,self
  -mca mpi_paffinity_alone 1
  -mca mpi_leave_pinned 1
  -mca btl_sm_eager_limit 8192
$PWD/IMB-MPI1 pingpong

Yields ~390MB/sec

So we're getting there, but still only about half speed


On Thu, Aug 6, 2009 at 9:30 AM, Michael Di
Domenico<mdidomeni...@gmail.com> wrote:
> Here's an interesting data point.  I installed the RHEL rpm version of
> OpenMPI 1.2.7-6 for ia64
>
> mpirun -np 2 -mca btl self,sm -mca mpi_paffinity_alone 1 -mca
> mpi_leave_pinned 1 $PWD/IMB-MPI1 pingpong
>
> With v1.3 and -mca btl self,sm i get ~150MB/sec
> With v1.3 and -mca btl self,tcp i get ~550MB/sec
>
> With v1.2.7-6 and -mca btl self,sm i get ~225MB/sec
> With v1.2.7-6 and -mca btl self,tcp i get ~650MB/sec
>
>
> On Fri, Jul 31, 2009 at 10:42 AM, Edgar Gabriel<gabr...@cs.uh.edu> wrote:
>> Michael Di Domenico wrote:
>>>
>>> mpi_leave_pinned didn't help still at ~145MB/sec
>>> btl_sm_eager_limit from 4096 to 8192 pushes me upto ~212MB/sec, but
>>> pushing it past that doesn't change it anymore
>>>
>>> Are there any intelligent programs that can go through and test all
>>> the different permutations of tunables for openmpi?  Outside of me
>>> just writing an ugly looping script...
>>
>> actually there is,
>>
>> http://svn.open-mpi.org/svn/otpo/trunk/
>>
>> this tool has been used to tune openib parameter, and I would guess that it
>> could be used without any modification to also run netpipe over sm...
>>
>> Thanks
>> Edgar
>>>
>>> On Wed, Jul 29, 2009 at 1:55 PM, Dorian Krause<doriankra...@web.de> wrote:
>>>>
>>>> Hi,
>>>>
>>>> --mca mpi_leave_pinned 1
>>>>
>>>> might help. Take a look at the FAQ for various tuning parameters.
>>>>
>>>>
>>>> Michael Di Domenico wrote:
>>>>>
>>>>> I'm not sure I understand what's actually happened here.  I'm running
>>>>> IMB on an HP superdome, just comparing the PingPong benchmark
>>>>>
>>>>> HP-MPI v2.3
>>>>> Max ~ 700-800MB/sec
>>>>>
>>>>> OpenMPI v1.3
>>>>> -mca btl self,sm - Max ~ 125-150MB/sec
>>>>> -mca btl self,tcp - Max ~ 500-550MB/sec
>>>>>
>>>>> Is this behavior expected?  Are there any tunables to get the OpenMPI
>>>>> sockets up near HP-MPI?
>


[OMPI users] x4100 with IB

2009-08-07 Thread Michael Di Domenico
I have several Sun x4100 with Infiniband which appear to be running at
400MB/sec instead of 800MB/sec.  It's a freshly reformatted cluster
converting from solaris to linux.  We also reset the bios settings
with "load optimal defaults".  Does anyone know which bios setting i
changed that dropped the bandwidth?

x4100
mellanox ib
ofed-1.4.1-rc6 w/ openmpi


Re: [OMPI users] strange IMB runs

2009-07-31 Thread Michael Di Domenico
mpi_leave_pinned didn't help; still at ~145MB/sec.
Raising btl_sm_eager_limit from 4096 to 8192 pushes me up to ~212MB/sec,
but pushing it past that doesn't change it anymore.

Are there any intelligent programs that can go through and test all
the different permutations of tunables for openmpi?  Outside of me
just writing an ugly looping script...
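
The ugly looping script would be something along these lines (values
picked arbitrarily):

  for sz in 2048 4096 8192 16384 32768; do
    echo "== btl_sm_eager_limit=$sz =="
    mpirun -np 2 -mca btl self,sm -mca btl_sm_eager_limit $sz \
        $PWD/IMB-MPI1 pingpong
  done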

On Wed, Jul 29, 2009 at 1:55 PM, Dorian Krause<doriankra...@web.de> wrote:
> Hi,
>
> --mca mpi_leave_pinned 1
>
> might help. Take a look at the FAQ for various tuning parameters.
>
>
> Michael Di Domenico wrote:
>>
>> I'm not sure I understand what's actually happened here.  I'm running
>> IMB on an HP superdome, just comparing the PingPong benchmark
>>
>> HP-MPI v2.3
>> Max ~ 700-800MB/sec
>>
>> OpenMPI v1.3
>> -mca btl self,sm - Max ~ 125-150MB/sec
>> -mca btl self,tcp - Max ~ 500-550MB/sec
>>
>> Is this behavior expected?  Are there any tunables to get the OpenMPI
>> sockets up near HP-MPI?
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] strange IMB runs

2009-07-30 Thread Michael Di Domenico
On Thu, Jul 30, 2009 at 10:08 AM, George Bosilca wrote:
> The leave pinned will not help in this context. It can only help for devices
> capable of real RMA operations and that require pinned memory, which
> unfortunately is not the case for TCP. What is [really] strange about your
> results is that you get a 4 times better bandwidth over TCP than over shared
> memory. Over TCP there are 2 extra memory copies (compared with sm) plus a
> bunch of syscalls, so there is absolutely no reason to get better
> performance.
>
> The Open MPI version is something you compiled or it came installed with the
> OS? If you compiled it can you please provide us the configure line?

OpenMPI was compiled from source v1.3 with only a --prefix line, no
other options.
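
i.e. nothing more elaborate than this (the prefix is just where i
installed it):

  ./configure --prefix=/opt/openmpi-1.3
  make all install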


[OMPI users] strange IMB runs

2009-07-29 Thread Michael Di Domenico
I'm not sure I understand what's actually happened here.  I'm running
IMB on an HP superdome, just comparing the PingPong benchmark

HP-MPI v2.3
Max ~ 700-800MB/sec

OpenMPI v1.3
-mca btl self,sm - Max ~ 125-150MB/sec
-mca btl self,tcp - Max ~ 500-550MB/sec

Is this behavior expected?  Are there any tunables to get the OpenMPI
sockets up near HP-MPI?


Re: [OMPI users] quadrics support?

2009-07-08 Thread Michael Di Domenico
On Wed, Jul 8, 2009 at 3:33 PM, Ashley Pittman wrote:
>> When i run tping i get:
>> ELAN_EXCEPTION @ --: 6 (Initialization error)
>> elan_init: Can't get capability from environment
>>
>> I am not using slurm or RMS at all, just trying to get openmpi to run
>> between two nodes.
>
> To attach to the elan a process has to have a "capability" which is a
> kernel attribute describing the size (number of nodes/ranks) of the job,
> without this you'll get errors like the one from tping.  The only way to
> generate these capabilities is by using RMS, Slurm or I believe pdsh
> which can generate one and push it into the kernel before calling fork()
> to create the user application.

I didn't realize it was an MPI-type program, so I ran it using the
QSNet version of mpirun and OpenMPI.  The process does start and runs
through 0: and 2:, which i assume are packet sizes, but freezes at
that point.

We have an existing XC cluster from HP, that we're trying to convert
from XC to standard RHEL5.3 w/ Slurm and OpenMPI.  All i want to be
able to do is load RHEL5 and the Quadrics NIC drivers, and run OpenMPI
jobs between these two nodes I yanked from the cluster before we
switch the whole thing over.


Re: [OMPI users] quadrics support?

2009-07-08 Thread Michael Di Domenico
On Wed, Jul 8, 2009 at 12:33 PM, Ashley Pittman wrote:
> Is the machine configured correctly to allow non OpenMPI QsNet programs
> to run, for example tping?
>
> Which resource manager are you running, I think slurm compiled for RMS
> is essential.

I can ping via TCP/IP using the eip0 ports.

When i run tping i get:
ELAN_EXCEPTION @ --: 6 (Initialization error)
elan_init: Can't get capability from environment

I am not using slurm or RMS at all, just trying to get openmpi to run
between two nodes.

Using -mca btl self,tcp -mca btl_tcp_if_include eip0 i can run the
jobs with no problem using sockets over the elan interface, but if i
run the job with -mca btl self,elan,tcp i get the output snipped
below:

Signal: Segmentation fault (11)
Signal code: Invalid permissions (2)
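
Spelled out, the two runs look roughly like this (hostnames and test
binary are placeholders); the first runs fine over sockets, the second
segfaults as above:

  mpirun -np 2 --host node1,node2 -mca btl self,tcp -mca btl_tcp_if_include eip0 ./hello_mpi
  mpirun -np 2 --host node1,node2 -mca btl self,elan,tcp ./hello_mpi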


Re: [OMPI users] quadrics support?

2009-07-07 Thread Michael Di Domenico
So, first run i seem to have run into a bit of an issue.  All the
Quadrics modules are compiled and loaded.  I can ping between nodes
over the quadrics interfaces.  But when i try to run one of the hello
mpi examples from openmpi, i get:

first run, the process hung - i killed it with ctrl-c,
though it doesn't seem to actually die and kill -9 doesn't work

second run, the process fails with
  failed elan4_attach  Device or resource busy
  
  elan_allocSleepDesc  Failed to allocate IRQ cookie 2a: 22 Invalid argument
all subsequent runs fail the same way and i have to reboot the box to
get the processes to go away

I'm not sure if this is a quadrics or openmpi issue at this point, but
i figured since there are quadrics people on the list it's a good place
to start

On Tue, Jul 7, 2009 at 3:30 PM, Michael Di
Domenico<mdidomeni...@gmail.com> wrote:
> Does OpenMPI/Quadrics require the Quadrics Kernel patches in order to
> operate?  Or operate at full speed or are the Quadrics modules
> sufficient?
>
> On Thu, Jul 2, 2009 at 1:52 PM, Ashley Pittman<ash...@pittman.co.uk> wrote:
>> On Thu, 2009-07-02 at 09:34 -0400, Michael Di Domenico wrote:
>>> Jeff,
>>>
>>> Okay, thanks.  I'll give it a shot and report back.  I can't
>>> contribute any code, but I can certainly do testing...
>>
>> I'm from the Quadrics stable so could certainty support a port should
>> you require it but I don't have access to hardware either currently.
>>
>> Ashley,
>>
>> --
>>
>> Ashley Pittman, Bath, UK.
>>
>> Padb - A parallel job inspection tool for cluster computing
>> http://padb.pittman.org.uk
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>


Re: [OMPI users] quadrics support?

2009-07-07 Thread Michael Di Domenico
Does OpenMPI/Quadrics require the Quadrics kernel patches in order to
operate, or to operate at full speed?  Or are the Quadrics modules
sufficient?

On Thu, Jul 2, 2009 at 1:52 PM, Ashley Pittman<ash...@pittman.co.uk> wrote:
> On Thu, 2009-07-02 at 09:34 -0400, Michael Di Domenico wrote:
>> Jeff,
>>
>> Okay, thanks.  I'll give it a shot and report back.  I can't
>> contribute any code, but I can certainly do testing...
>
> I'm from the Quadrics stable so could certainty support a port should
> you require it but I don't have access to hardware either currently.
>
> Ashley,
>
> --
>
> Ashley Pittman, Bath, UK.
>
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] quadrics support?

2009-07-02 Thread Michael Di Domenico
Jeff,

Okay, thanks.  I'll give it a shot and report back.  I can't
contribute any code, but I can certainly do testing...
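
The plan is roughly this (the install prefix and elan location are
guesses for our boxes; the ompi_info check is per your note):

  ./configure --prefix=/opt/openmpi-1.3 --with-elan=/usr
  make all install
  ompi_info | grep elan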

On Thu, Jul 2, 2009 at 9:23 AM, Jeff Squyres<jsquy...@cisco.com> wrote:
> I see ompi/mca/btl/elan in the OMPI SVN development trunk and in the 1.3
> tree (where elan = the quadrics interface).
>
> So actually, looking at the 1.3.x README, I see configure switches like
> "--with-elan" that specifies where the Elan (Quadrics) headers and libraries
> live.  I have no Quadrics networks and didn't pay attention to this
> development at all (obviously ;-) ) -- you might want to give it a shot and
> see how well it performs.  Meaning: I'm sure it works or UT wouldn't have
> pushed this stuff upstream, but I have no idea how well tuned it is.
>
> If you build OMPI properly, you should be able to tell if Quadrics support
> is included via
>
>ompi_info | grep elan
>
> You should see a BTL line for elan (i.e., a BTL plugin for "elan" is
> installed and functional).  Although OMPI should automatically pick elan for
> MPI communications, you can force OMPI to pick it via:
>
>mpirun --mca btl elan,self ...
>
> Quadrics networks should also qualify for Open MPI's "other" type of network
> support (the MTL, instead of the BTL).  MTL level support can typically give
> slightly better performance on some types of networks, but it doesn't look
> like anyone did any work in this area.  Contributions are always welcome, of
> course!  :-)
>
>
>
> On Jul 2, 2009, at 9:12 AM, Michael Di Domenico wrote:
>
>> Jeff,
>>
>> Thanks, honestly though if the patches haven't been pulled mainline,
>> we are not likely to bring it internally.  I was hoping that quadrics
>> support was mainline, but the documentation was out of date.
>>
>> On Thu, Jul 2, 2009 at 8:08 AM, Jeff Squyres<jsquy...@cisco.com> wrote:
>> > George --
>> >
>> > I know that U. Tennessee did some work in this area; did it ever
>> > materialize?
>> >
>> >
>> > On Jul 1, 2009, at 4:49 PM, Michael Di Domenico wrote:
>> >
>> >> Did the quadrics support for OpenMPI ever materialize?  I can't find
>> >> any documentation on the web about it and the few mailing list
>> >> messages I came across showed some hints that it might be in progress
>> >> but that was way back in 2007
>> >>
>> >> Thanks
>> >> ___
>> >> users mailing list
>> >> us...@open-mpi.org
>> >> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >>
>> >
>> >
>> > --
>> > Jeff Squyres
>> > Cisco Systems
>> >
>> > ___
>> > users mailing list
>> > us...@open-mpi.org
>> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>> >
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] quadrics support?

2009-07-02 Thread Michael Di Domenico
Jeff,

Thanks, honestly though if the patches haven't been pulled mainline,
we are not likely to bring it internally.  I was hoping that quadrics
support was mainline, but the documentation was out of date.

On Thu, Jul 2, 2009 at 8:08 AM, Jeff Squyres<jsquy...@cisco.com> wrote:
> George --
>
> I know that U. Tennessee did some work in this area; did it ever
> materialize?
>
>
> On Jul 1, 2009, at 4:49 PM, Michael Di Domenico wrote:
>
>> Did the quadrics support for OpenMPI ever materialize?  I can't find
>> any documentation on the web about it and the few mailing list
>> messages I came across showed some hints that it might be in progress
>> but that was way back in 2007
>>
>> Thanks
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Jeff Squyres
> Cisco Systems
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


[OMPI users] quadrics support?

2009-07-01 Thread Michael Di Domenico
Did the quadrics support for OpenMPI ever materialize?  I can't find
any documentation on the web about it, and the few mailing list
messages I came across showed some hints that it might be in progress,
but that was way back in 2007.

Thanks