Hi Daniel

Yeah, this is a known problem traced to updating OFED to 3.12. See this thread:

http://www.open-mpi.org/community/lists/users/2014/12/25924.php 
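
For context on the warning quoted below: the FAQ entry it links to gives, for mlx4 hardware, the registerable-memory limit as 2^log_num_mtt * 2^log_mtts_per_seg * page size, where log_num_mtt and log_mtts_per_seg are mlx4_core module parameters. Here is a rough sketch of that arithmetic. The function names, the 4 KiB page size, and the example parameter values (20 and 3) are my assumptions for illustration only, not values read from your nodes.

```python
MIB = 1024 * 1024


def max_registerable_bytes(log_num_mtt, log_mtts_per_seg, page_size=4096):
    """Registerable-memory limit per the Open MPI FAQ formula for mlx4."""
    return (1 << log_num_mtt) * (1 << log_mtts_per_seg) * page_size


def min_log_num_mtt(total_bytes, log_mtts_per_seg, page_size=4096):
    """Smallest log_num_mtt whose limit covers total_bytes (hypothetical helper)."""
    n = 0
    while max_registerable_bytes(n, log_mtts_per_seg, page_size) < total_bytes:
        n += 1
    return n


# Assumed example values: log_num_mtt=20, log_mtts_per_seg=3, 4 KiB pages.
print(max_registerable_bytes(20, 3) // MIB)  # 32768 (MiB)

# Figures taken from the warning: 24576 MiB registerable vs 65457 MiB total.
print(24576 >= 65457)  # False, which is why the warning fires

# Smallest log_num_mtt covering 65457 MiB, assuming log_mtts_per_seg=3.
print(min_log_num_mtt(65457 * MIB, 3))  # 21
```

The FAQ suggests allowing more than the bare minimum (on the order of twice physical memory), so treat the last number as a lower bound and check the real parameter values on your system first.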


> On Dec 9, 2014, at 7:16 AM, Faraj, Daniel A <daniel.a.fa...@intel.com> wrote:
> 
> I am having trouble running simple benchmarks like the OSU bidirectional 
> bandwidth test with recent OMPI (versions after 1.8.1) over MLX.
> All versions up to and including 1.8.1 seem to work.
> The issue is that FDR hangs frequently and complains that the physical 
> memory available for user runs is very low.
>  
> The bug starts in v1.8.2.
> I searched the src code for differences, but no luck.
>  
> I get the message below and hangs...
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory.  This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>  
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered.  You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>  
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
>  
>     http://www.open-mpi.org/faq/?category=openfabrics#ib-locked-pages 
>  
>   Local host:              sb-cn16
>   Registerable memory:     24576 MiB
>   Total memory:            65457 MiB
>  
> Your MPI job will continue, but may be behave poorly and/or hang.
>  
>  
> ---------------
> Daniel Faraj
>  
