Re: [OMPI users] Error with Open MPI 2.0.0: error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

2016-07-13 Thread Aaron Knister
Well son of a gun. I just compiled the code with pgcc (version 16.5.0) instead of gcc and lo and behold:
# pgcc -libverbs ./ib_verbs_q.c -o ib_verbs_q
# ./ib_verbs_q
error obtaining device attributes for mlx5_0 errno says Cannot allocate memory
# gcc -libverbs ./ib_verbs_q.c -o ib_verbs_q
#

Re: [OMPI users] Error with Open MPI 2.0.0: error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

2016-07-13 Thread Aaron Knister
Matt, you're far too kind :) I put together a test program that uses the block of code in question and... it works for me? I've attached the reproducer here. A compile should be just a "gcc -libverbs ib_verbs_q.c". I'm a little perplexed. I truthfully didn't expect it to work given that the same

Re: [OMPI users] Error with Open MPI 2.0.0: error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

2016-07-13 Thread Matt Thompson
On Wed, Jul 13, 2016 at 9:50 AM, Nathan Hjelm wrote:
> As of 2.0.0 we now support experimental verbs. It looks like one of the
> calls is failing:
>
> #if HAVE_DECL_IBV_EXP_QUERY_DEVICE
> device->ib_exp_dev_attr.comp_mask = IBV_EXP_DEVICE_ATTR_RESERVED - 1;
>

Re: [OMPI users] Error with Open MPI 2.0.0: error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

2016-07-13 Thread Nathan Hjelm
As of 2.0.0 we now support experimental verbs. It looks like one of the calls is failing:
#if HAVE_DECL_IBV_EXP_QUERY_DEVICE
    device->ib_exp_dev_attr.comp_mask = IBV_EXP_DEVICE_ATTR_RESERVED - 1;
    if(ibv_exp_query_device(device->ib_dev_context, &device->ib_exp_dev_attr)){
        BTL_ERROR(("error
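The message in the subject line reads like a fixed string filled in with a device name and the text for the current errno. The following is a minimal sketch under that assumption; `format_query_error` is a hypothetical helper written for illustration and is not part of Open MPI or libibverbs. On glibc, `strerror(ENOMEM)` yields "Cannot allocate memory", which suggests the failing call set errno to ENOMEM.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not Open MPI code): builds a message in the same
 * shape as the one reported in this thread, given a device name and an
 * errno value. The format string is inferred from the reported message.
 * Returns the value of snprintf: the length the full message would need. */
int format_query_error(char *buf, size_t len, const char *dev_name, int err)
{
    assert(buf != NULL && dev_name != NULL);
    return snprintf(buf, len,
                    "error obtaining device attributes for %s errno says %s",
                    dev_name, strerror(err));
}
```

Calling `format_query_error(buf, sizeof buf, "mlx5_0", ENOMEM)` fills `buf` with a message of the reported form; the exact errno text depends on the C library in use.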

[OMPI users] Error with Open MPI 2.0.0: error obtaining device attributes for mlx5_0 errno says Cannot allocate memory

2016-07-13 Thread Matt Thompson
All, I've been struggling here at NASA Goddard trying to get PGI 16.5 + Open MPI 1.10.3 working on the Discover cluster. What was happening was I'd run our climate model at, say, 4x24 and it would work sometimes. Most of the time. Every once in a while, it'd throw a segfault. If we changed the

Re: [OMPI users] MPI-RMA rget doesn't complete the communication after mpi_wait

2016-07-13 Thread Jeff Squyres (jsquyres)
I've filed https://github.com/open-mpi/ompi/issues/1869 to track the issue.
> On Jul 13, 2016, at 5:23 AM, Alfio Lazzaro wrote:
>
> Hi Jeff,
> thanks for your reply. We tried it and it still doesn't work...
>
> Alfio
>
> 2016-07-13 1:19 GMT+02:00 Jeff Squyres

Re: [OMPI users] MPI-RMA rget doesn't complete the communication after mpi_wait

2016-07-13 Thread Alfio Lazzaro
Hi Jeff,
thanks for your reply. We tried it and it still doesn't work...
Alfio
2016-07-13 1:19 GMT+02:00 Jeff Squyres (jsquyres):
> Alfio --
>
> We just released Open MPI v2.0.0, with lots of MPI RMA fixes. Would you
> mind testing there?
>
> > On Jul 12, 2016, at 1:33