Re: [OMPI users] Oversubscribing When Running Locally

2018-01-24 Thread Gilles Gouaillardet
Benjamin,

There was no need to open a new thread with the same title and a
slightly modified question; it just added some confusion.

If you want to allow oversubscription by default, you can insert the
following line in your /etc/openmpi-mca-params.conf (update the path
if needed):

rmaps_base_oversubscribe = true

FWIW, you can also do that on a per-user basis by adding the same line to
$HOME/.openmpi/mca-params.conf

Last but not least, that can also be achieved via an environment variable:
export OMPI_MCA_rmaps_base_oversubscribe=true

And, as already answered, via the command line:
mpirun --oversubscribe ...
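
As a rough sketch of the per-user option above: the helper below appends the
setting to a given MCA parameter file only if that exact line is not already
there, so it is safe to rerun. The function name enable_oversubscribe is made
up for illustration and is not part of Open MPI; only the parameter name and
the file path come from this thread.

```shell
# Hypothetical helper (not part of Open MPI): append the oversubscribe
# setting to an MCA parameter file unless that exact line already exists.
enable_oversubscribe() {
    conf_file="$1"
    line="rmaps_base_oversubscribe = true"

    # Create the directory and file if they are missing.
    mkdir -p "$(dirname "$conf_file")"
    touch "$conf_file"

    # -q: quiet, -x: match the whole line, -F: treat the pattern as a fixed string
    grep -qxF "$line" "$conf_file" || printf '%s\n' "$line" >> "$conf_file"
}

# Example call for the per-user file:
#   enable_oversubscribe "$HOME/.openmpi/mca-params.conf"
```

The same function could be pointed at the system-wide file instead, given
write access to it.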


Cheers,

Gilles

On Thu, Jan 25, 2018 at 7:57 AM, Jeff Squyres (jsquyres) wrote:
> Ben --
>
> Did you not see Jeff Hammond's reply earlier today?
>
> https://www.mail-archive.com/users@lists.open-mpi.org//msg31964.html
>
>
>> On Jan 24, 2018, at 5:40 PM, Benjamin Brock  wrote:
>>
>> Recently, when I try to run something locally with OpenMPI with more than 
>> two ranks (I have a dual-core machine), I get the friendly message
>>
>> --
>> There are not enough slots available in the system to satisfy the 3 slots
>> that were requested by the application:
>>   ./kmer_generic_hash
>>
>> Either request fewer slots for your application, or make more slots available
>> for use.
>> --
>>
>> Why is oversubscription now disabled by default when running without a 
>> hostfile?  And how can I turn this off?  Is the recommended way to do this 
>> editing /etc/openmpi/openmpi-default-hostfile?
>>
>> I'm using default OpenMPI 3.0.0 on Arch Linux.
>>
>> Cheers,
>>
>> Ben
>> ___
>> users mailing list
>> users@lists.open-mpi.org
>> https://lists.open-mpi.org/mailman/listinfo/users
>
>
> --
> Jeff Squyres
> jsquy...@cisco.com
>
>
>


Re: [OMPI users] Oversubscribing When Running Locally

2018-01-24 Thread Jeff Squyres (jsquyres)
Ben --

Did you not see Jeff Hammond's reply earlier today?

https://www.mail-archive.com/users@lists.open-mpi.org//msg31964.html


> On Jan 24, 2018, at 5:40 PM, Benjamin Brock  wrote:
> 
> Recently, when I try to run something locally with OpenMPI with more than two 
> ranks (I have a dual-core machine), I get the friendly message
> 
> --
> There are not enough slots available in the system to satisfy the 3 slots
> that were requested by the application:
>   ./kmer_generic_hash
> 
> Either request fewer slots for your application, or make more slots available
> for use.
> --
> 
> Why is oversubscription now disabled by default when running without a 
> hostfile?  And how can I turn this off?  Is the recommended way to do this 
> editing /etc/openmpi/openmpi-default-hostfile?
> 
> I'm using default OpenMPI 3.0.0 on Arch Linux.
> 
> Cheers,
> 
> Ben


-- 
Jeff Squyres
jsquy...@cisco.com





Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Brice Goglin
#2 and #3 are OK

About #1, we could rename pkgconfig and pkgdata prefixes into something
like hwlocpkgconfig and hwlocpkgdata. I don't think the actual prefix
value matters. I'll try that tomorrow.

Brice


Le 24/01/2018 à 23:46, Balaji, Pavan a écrit :
> Hi Brice,
>
> Here are the other patches that we are currently maintaining for hwloc.  Can 
> you see if these can be integrated upstream too:
>
> https://github.com/pmodels/hwloc/commit/44fe0a500e7828bcb2390fbd24656a7a26b450ed
> https://github.com/pmodels/hwloc/commit/5b6d776a1226148030dcf4e26bd13fe16cc885f9
> https://github.com/pmodels/hwloc/commit/9bf3ff256511ea4092928438f5718904875e65e1
>
> The first one is definitely not usable as-is, since that breaks standalone 
> builds.  But I'm interested in hearing about any better solution that you 
> might have.
>
> Thanks,
>
>   -- Pavan
>
>> On Jan 24, 2018, at 4:43 PM, Brice Goglin  wrote:
>>
>> Thanks, I am fixing this for rc2 tomorrow.
>>
>> Brice
>>
>>
>>
>> Le 24/01/2018 à 22:59, Balaji, Pavan a écrit :
>>> Folks,
>>>
>>> I'm seeing these warnings on macOS when building hwloc-2.0rc1 with
>>> clang:
>>>
>>> 8<
>>> CC   lstopo-lstopo.o
>>> lstopo.c: In function 'usage':
>>> lstopo.c:425:7: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates 
>>> to 0 [-Wundef]
>>> #elif CAIRO_HAS_XLIB_SURFACE && (defined HWLOC_HAVE_X11_KEYSYM)
>>>  ^~
>>> lstopo.c: In function 'main':
>>> lstopo.c:1041:5: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, 
>>> evaluates to 0 [-Wundef]
>>> #if CAIRO_HAS_XLIB_SURFACE && defined HWLOC_HAVE_X11_KEYSYM
>>> 8<
>>>
>>> 8<
>>> % clang --version
>>> Apple LLVM version 9.0.0 (clang-900.0.39.2)
>>> Target: x86_64-apple-darwin17.4.0
>>> Thread model: posix
>>> InstalledDir: 
>>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>>> 8<
>>>
>>> Thanks,
>>>
>>> -- Pavan
>>>
>>> ___
>>> hwloc-users mailing list
>>> hwloc-users@lists.open-mpi.org
>>> https://lists.open-mpi.org/mailman/listinfo/hwloc-users


Re: [hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Balaji, Pavan
Hi Brice,

Here are the other patches that we are currently maintaining for hwloc.  Can 
you see if these can be integrated upstream too:

https://github.com/pmodels/hwloc/commit/44fe0a500e7828bcb2390fbd24656a7a26b450ed
https://github.com/pmodels/hwloc/commit/5b6d776a1226148030dcf4e26bd13fe16cc885f9
https://github.com/pmodels/hwloc/commit/9bf3ff256511ea4092928438f5718904875e65e1

The first one is definitely not usable as-is, since that breaks standalone 
builds.  But I'm interested in hearing about any better solution that you might 
have.

Thanks,

  -- Pavan

> On Jan 24, 2018, at 4:43 PM, Brice Goglin  wrote:
> 
> Thanks, I am fixing this for rc2 tomorrow.
> 
> Brice
> 
> 
> 
> Le 24/01/2018 à 22:59, Balaji, Pavan a écrit :
>> Folks,
>> 
>> I'm seeing these warnings on macOS when building hwloc-2.0rc1 with
>> clang:
>> 
>> 8<
>> CC   lstopo-lstopo.o
>> lstopo.c: In function 'usage':
>> lstopo.c:425:7: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates 
>> to 0 [-Wundef]
>> #elif CAIRO_HAS_XLIB_SURFACE && (defined HWLOC_HAVE_X11_KEYSYM)
>>  ^~
>> lstopo.c: In function 'main':
>> lstopo.c:1041:5: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates 
>> to 0 [-Wundef]
>> #if CAIRO_HAS_XLIB_SURFACE && defined HWLOC_HAVE_X11_KEYSYM
>> 8<
>> 
>> 8<
>> % clang --version
>> Apple LLVM version 9.0.0 (clang-900.0.39.2)
>> Target: x86_64-apple-darwin17.4.0
>> Thread model: posix
>> InstalledDir: 
>> /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
>> 8<
>> 
>> Thanks,
>> 
>> -- Pavan
>> 

[OMPI users] Oversubscribing When Running Locally

2018-01-24 Thread Benjamin Brock
Recently, when I try to run something locally with OpenMPI with more than
two ranks (I have a dual-core machine), I get the friendly message

--
There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  ./kmer_generic_hash

Either request fewer slots for your application, or make more slots available
for use.
--

Why is oversubscription now disabled by default when running without a
hostfile?  And how can I turn this off?  Is the recommended way to do this
editing /etc/openmpi/openmpi-default-hostfile?

I'm using default OpenMPI 3.0.0 on Arch Linux.

Cheers,

Ben

Re: [hwloc-users] OFED requirements for netloc

2018-01-24 Thread Brice Goglin
OK

In the meantime, maybe you can diff the ibnetdiscover outputs to see if
anything obvious appears? You might need to sort the lines first if the
outputs aren't ordered the same.
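
A minimal sketch of that comparison, assuming the two ibnetdiscover outputs
were saved to files. The helper name diff_sorted and the file names in the
example call are placeholders, not anything shipped with netloc or OFED:

```shell
# Hypothetical helper: compare two text files while ignoring line order.
diff_sorted() {
    a_sorted=$(mktemp)
    b_sorted=$(mktemp)

    # Sort each input so that ordering differences do not show up in the diff.
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"

    diff "$a_sorted" "$b_sorted"
    status=$?

    rm -f "$a_sorted" "$b_sorted"
    # diff's exit status: 0 if the sorted files are identical, 1 if they differ
    return "$status"
}

# Example call (placeholder file names):
#   diff_sorted ibnetdiscover-old.txt ibnetdiscover-new.txt
```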

Brice




Le 24/01/2018 à 23:33, Craig West a écrit :
> Brice,
>
> The output isn't big, just a pair of IB switches and a dozen hosts,
> some with single, some dual connections.
> However, we would need to sanitise the data, or at least look at it in
> detail first to see what it contains.
>
> I can say that the ibnetdiscover and ibroute commands report version
> 1.6.5 on the system that segfaults, and 1.6.6 on the one that succeeds.
> The first looks to be the standard OFED release, and the 1.6.6
> version a Mellanox release of OFED.
>
> Craig.
>
> On Tue, 23 Jan 2018 at 17:10 Brice Goglin wrote:
>
> Hello,
>
> If the output isn't too big, could you put the files gathered by
> netloc_ib_gather_raw online so that we look at them and try to
> reproduce the crash?
>
> Thanks
>
> Brice
>
>
>
> Le 23/01/2018 à 03:54, Craig West a écrit :
>> Hi,
>>
>> I can't find the version requirements for netloc. I've tried it
>> on an older version of OFED and a newer version of Mellanox OFED.
>> The newer version worked, the older segfaults when running the
>> "netloc_ib_extract_dats" process.
>>
>> Thanks,
>> Craig.
>>
>>
>

Re: [hwloc-users] OFED requirements for netloc

2018-01-24 Thread Craig West
Brice,

The output isn't big, just a pair of IB switches and a dozen hosts, some
with single, some dual connections.
However, we would need to sanitise the data, or at least look at it in
detail first to see what it contains.

I can say that the ibnetdiscover and ibroute commands report version 1.6.5
on the system that segfaults, and 1.6.6 on the one that succeeds.
The first looks to be the standard OFED release, and the 1.6.6
version a Mellanox release of OFED.

Craig.

On Tue, 23 Jan 2018 at 17:10 Brice Goglin  wrote:

> Hello,
>
> If the output isn't too big, could you put the files gathered by
> netloc_ib_gather_raw online so that we look at them and try to reproduce
> the crash?
>
> Thanks
>
> Brice
>
>
>
> Le 23/01/2018 à 03:54, Craig West a écrit :
>
> Hi,
>
> I can't find the version requirements for netloc. I've tried it on an
> older version of OFED and a newer version of Mellanox OFED. The newer
> version worked, the older segfaults when running the
> "netloc_ib_extract_dats" process.
>
> Thanks,
> Craig.
>
>

[hwloc-users] hwloc-2.0rc1 build warnings

2018-01-24 Thread Balaji, Pavan
Folks,

I'm seeing these warnings on macOS when building hwloc-2.0rc1 with clang:

8<
 CC   lstopo-lstopo.o
lstopo.c: In function 'usage':
lstopo.c:425:7: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates to 
0 [-Wundef]
#elif CAIRO_HAS_XLIB_SURFACE && (defined HWLOC_HAVE_X11_KEYSYM)
  ^~
lstopo.c: In function 'main':
lstopo.c:1041:5: warning: "CAIRO_HAS_XLIB_SURFACE" is not defined, evaluates to 
0 [-Wundef]
#if CAIRO_HAS_XLIB_SURFACE && defined HWLOC_HAVE_X11_KEYSYM
8<

8<
% clang --version
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin17.4.0
Thread model: posix
InstalledDir: 
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
8<

Thanks,

 -- Pavan



Re: [OMPI users] Oversubscribing

2018-01-24 Thread Jeff Hammond
mpirun --oversubscribe $OTHER_ARGS

Jeff

On Wed, Jan 24, 2018 at 12:13 PM, Benjamin Brock wrote:
>
> Recently, when I try to run something locally with OpenMPI with more than
> two ranks (I have a dual-core machine), I get the friendly message
>
> --
> There are not enough slots available in the system to satisfy the 3 slots
> that were requested by the application:
>   ./kmer_generic_hash
>
> Either request fewer slots for your application, or make more slots available
> for use.
> --
>
> Why is oversubscription now disabled by default when running without a
> hostfile?  And how can I turn this off?
>
> I'm using default OpenMPI 3.0.0 on Arch Linux.
>
> Cheers,
>
> Ben
>




--
Jeff Hammond
jeff.scie...@gmail.com
http://jeffhammond.github.io/

[OMPI users] Oversubscribing

2018-01-24 Thread Benjamin Brock
Recently, when I try to run something locally with OpenMPI with more than
two ranks (I have a dual-core machine), I get the friendly message

--
There are not enough slots available in the system to satisfy the 3 slots
that were requested by the application:
  ./kmer_generic_hash

Either request fewer slots for your application, or make more slots available
for use.
--

Why is oversubscription now disabled by default when running without a
hostfile?  And how can I turn this off?

I'm using default OpenMPI 3.0.0 on Arch Linux.

Cheers,

Ben