IIRC, there used to be a bug in Open MPI leading to such a false positive, but I cannot remember the details. I recommend you use at least the latest 1.10 (which is really 1.8 plus a few more features and several bug fixes).

Another option is to simply increase one of the mtt parameters by 1 and see if it helps.
Cheers,

Gilles

On Sunday, March 26, 2017, Ilchenko Evgeniy <ilchenk...@gmail.com> wrote:

> Hi all!
>
> I installed an older version, Open MPI 1.8, and now get a different error.
> For the command
>
> mpirun -np 1 prog
>
> I get the following output:
>
> --------------------------------------------------------------------------
> WARNING: It appears that your OpenFabrics subsystem is configured to only
> allow registering part of your physical memory. This can cause MPI jobs to
> run with erratic performance, hang, and/or crash.
>
> This may be caused by your OpenFabrics vendor limiting the amount of
> physical memory that can be registered. You should investigate the
> relevant Linux kernel module parameters that control how much physical
> memory can be registered, and increase them to allow registering all
> physical memory on your machine.
>
> See this Open MPI FAQ item for more information on these Linux kernel module
> parameters:
> http://www.open-mpi.org/faq/?category=openfabrics#ib-..
>
> Local host: node107
> Registerable memory: 32768 MiB
> Total memory: 65459 MiB
>
> Your MPI job will continue, but may be behave poorly and/or hang.
> --------------------------------------------------------------------------
> hello from 0
> hello from 1
> [node107:48993] 1 more process has sent help message help-mpi-btl-openib.txt
> / reg mem limit low
> [node107:48993] Set MCA parameter "orte_base_help_aggregate" to 0 to see all
> help / error messages
>
> Other installed software (the Intel MPI library) works fine, without any
> errors, and uses all 64 GB of memory.
>
> With Open MPI I don't use any PBS manager (Torque, Slurm, etc.); I work on
> a single node. I get to the node with the command
>
> ssh node107
>
> For the command
>
> cat /etc/security/limits.conf
>
> I get the following output:
>
> ...
> * soft rss 2000000
> * soft stack 2000000
> * hard stack unlimited
> * soft data unlimited
> * hard data unlimited
> * soft memlock unlimited
> * hard memlock unlimited
> * soft nproc 10000
> * hard nproc 10000
> * soft nofile 10000
> * hard nofile 10000
> * hard cpu unlimited
> * soft cpu unlimited
> ...
>
> For the command
>
> cat /sys/module/mlx4_core/parameters/log_num_mtt
>
> I get the output:
>
> 0
>
> Command:
>
> cat /sys/module/mlx4_core/parameters/log_mtts_per_seg
>
> Output:
>
> 3
>
> Command:
>
> getconf PAGESIZE
>
> Output:
>
> 4096
>
> With these parameters and the formula
>
> max_reg_mem = (2^log_num_mtt) * (2^log_mtts_per_seg) * PAGE_SIZE
>
> I get max_reg_mem = 32768 bytes, not the 32 GB stated in the Open MPI warning.
>
> I think the cause of the errors in the different versions (1.8 and 2.1) is
> the same...
>
> What is the reason for this?
>
> What programs or settings may restrict memory for Open MPI?
>
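As a quick check of the arithmetic in the quoted message, here is a minimal Python sketch of the formula. The function names `max_reg_mem` and `min_log_num_mtt` are hypothetical helpers introduced only for illustration; the input values are the ones reported above. It only evaluates the formula and says nothing about how the mlx4_core driver interprets a value of 0.

```python
def max_reg_mem(log_num_mtt: int, log_mtts_per_seg: int, page_size: int) -> int:
    """Registerable memory in bytes, per the formula quoted above."""
    return (2 ** log_num_mtt) * (2 ** log_mtts_per_seg) * page_size


def min_log_num_mtt(total_bytes: int, log_mtts_per_seg: int, page_size: int) -> int:
    """Smallest log_num_mtt for which the formula covers total_bytes."""
    n = 0
    while max_reg_mem(n, log_mtts_per_seg, page_size) < total_bytes:
        n += 1
    return n


# Values reported in the quoted message.
reported = max_reg_mem(log_num_mtt=0, log_mtts_per_seg=3, page_size=4096)
print(reported)  # 32768 (bytes)

# Raising either exponent by 1 doubles the result.
print(max_reg_mem(1, 3, 4096) // reported)  # 2

# Total memory from the warning: 65459 MiB.
total = 65459 * 1024 * 1024
print(min_log_num_mtt(total, 3, 4096))  # 21
```

By this formula alone, with log_mtts_per_seg = 3 and 4096-byte pages, log_num_mtt would need to be at least 21 to cover the 65459 MiB reported as total memory.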
_______________________________________________
users mailing list
users@lists.open-mpi.org
https://rfd.newmexicoconsortium.org/mailman/listinfo/users