Brian,

I would calculate the maximum number of QPs needed for an all-to-all connection:

    4 * num_nodes * num_cores^2

and then compare it to the number reported by:

    ibv_devinfo -v | grep max_qp

If your theoretical maximum is close to the ibv_devinfo number, then I would suspect the QP limitation. The driver manages some internal QPs, so you can never reach the full maximum.
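[Editor's note: the following is a minimal sketch of that comparison in C against libibverbs, not part of the original message. It reads the formula as the per-node demand (each of the num_cores local ranks opening QPs to every remote rank); num_nodes and num_cores are placeholders for your job geometry, and the factor of 4 comes from Open MPI's default of four QPs per peer connection mentioned later in the thread. ibv_query_device() is the programmatic equivalent of the max_qp line in ibv_devinfo.]

/* qp_check.c - compare the theoretical all-to-all QP demand against the
 * limit reported by the first HCA.  Build with: gcc qp_check.c -libverbs
 * num_nodes / num_cores below are placeholders for your job geometry. */
#include <stdio.h>
#include <stdlib.h>
#include <infiniband/verbs.h>

int main(int argc, char **argv)
{
    long num_nodes = (argc > 1) ? atol(argv[1]) : 1;    /* nodes in the job */
    long num_cores = (argc > 2) ? atol(argv[2]) : 16;   /* ranks per node   */
    long needed = 4 * num_nodes * num_cores * num_cores; /* 4 QPs per peer pair */

    int ndev = 0;
    struct ibv_device **devs = ibv_get_device_list(&ndev);
    if (!devs || ndev == 0) {
        fprintf(stderr, "no IB devices found\n");
        return 1;
    }

    struct ibv_context *ctx = ibv_open_device(devs[0]);
    struct ibv_device_attr attr;
    if (!ctx || ibv_query_device(ctx, &attr)) {
        fprintf(stderr, "failed to query device\n");
        return 1;
    }

    printf("theoretical QPs needed per node: %ld\n", needed);
    printf("device max_qp:                   %d\n", attr.max_qp);
    if (needed > attr.max_qp)
        printf("-> above the device limit; QP exhaustion is plausible\n");

    ibv_close_device(ctx);
    ibv_free_device_list(devs);
    return 0;
}

Running it once per node type (e.g. ./qp_check 128 16) shows how much headroom remains below max_qp after accounting for the driver's internal QPs.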
For the memory limit, I do not have any good idea. If it happens in the early stages of the application, then the limit is probably really small and I would verify it with IT.

Regards,

Pavel (Pasha) Shamis
---
Application Performance Tools Group
Computer Science and Math Division
Oak Ridge National Laboratory


On Jan 27, 2011, at 8:09 PM, Barrett, Brian W wrote:

> Pasha -
>
> Is there a way to tell which of the two happened, or to check the number of
> QPs available per node? The app likely does talk to a large number of peers
> from each process, and the nodes are fairly "fat" - quad socket, quad core,
> running 16 MPI ranks per node.
>
> Brian
>
> On Jan 27, 2011, at 6:17 PM, Shamis, Pavel wrote:
>
>> Unfortunately, verbose error reports are not so friendly... Anyway, I can
>> think of two issues:
>>
>> 1. You are trying to open too many QPs. By default, IB devices support a
>> fairly large number of QPs and it is quite hard to push them into this
>> corner. But if your job is really huge, it may be the case. Or, for
>> example, if you share the compute nodes with some other processes that
>> create a lot of QPs. You can see the maximum number of supported QPs in
>> ibv_devinfo.
>>
>> 2. The memory limit for registered memory is too low; as a result, the
>> driver fails to allocate and register memory for the QP. This scenario is
>> the most common. It just happened to me recently: the system folks pushed
>> some bad settings into limits.conf.
>>
>> Regards,
>>
>> Pavel (Pasha) Shamis
>> ---
>> Application Performance Tools Group
>> Computer Science and Math Division
>> Oak Ridge National Laboratory
>>
>> On Jan 27, 2011, at 5:56 PM, Barrett, Brian W wrote:
>>
>>> All -
>>>
>>> On one of our clusters, we're seeing the following on one of our
>>> applications, I believe using Open MPI 1.4.3:
>>>
>>> [xxx:27545] *** An error occurred in MPI_Scatterv
>>> [xxx:27545] *** on communicator MPI COMMUNICATOR 5 DUP FROM 4
>>> [xxx:27545] *** MPI_ERR_OTHER: known error not in list
>>> [xxx:27545] *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
>>> [xxx][[31806,1],0][connect/btl_openib_connect_oob.c:857:qp_create_one]
>>> error creating qp errno says Resource temporarily unavailable
>>> --------------------------------------------------------------------------
>>> mpirun has exited due to process rank 0 with PID 27545 on
>>> node rs1891 exiting without calling "finalize". This may
>>> have caused other processes in the application to be
>>> terminated by signals sent by mpirun (as reported here).
>>> --------------------------------------------------------------------------
>>>
>>> The problem goes away if we modify the eager protocol msg sizes so that
>>> only two QPs are necessary instead of the default 4. Is there a way to
>>> bump up the number of QPs that can be created on a node, assuming the
>>> issue is just running out of available QPs? If not, any other thoughts on
>>> working around the problem?
>>>
>>> Thanks,
>>>
>>> Brian
>>>
>>> --
>>> Brian W. Barrett
>>> Dept. 1423: Scalable System Software
>>> Sandia National Laboratories
>
> --
> Brian W. Barrett
> Dept. 1423: Scalable System Software
> Sandia National Laboratories
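[Editor's note: for the second scenario in the quoted message (registered-memory limits clamped by limits.conf), a quick check is to print the locked-memory rlimit the MPI processes actually inherit. The sketch below is illustrative, not part of the original thread; the specific "small default" values mentioned in the comment are common limits.conf defaults, not values reported by this cluster.]

/* memlock_check.c - print the RLIMIT_MEMLOCK the process inherited.
 * A small value here (for example the 32 KB or 64 KB default seen in
 * some limits.conf setups) can make QP creation and memory registration
 * fail even when plenty of QPs are still available on the HCA.
 * Build with: gcc memlock_check.c -o memlock_check */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }

    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK (soft): unlimited\n");
    else
        printf("RLIMIT_MEMLOCK (soft): %llu bytes\n",
               (unsigned long long) rl.rlim_cur);

    if (rl.rlim_max == RLIM_INFINITY)
        printf("RLIMIT_MEMLOCK (hard): unlimited\n");
    else
        printf("RLIMIT_MEMLOCK (hard): %llu bytes\n",
               (unsigned long long) rl.rlim_max);

    return 0;
}

Launching it through the batch system (for example mpirun -np 1 ./memlock_check) shows the limit the MPI ranks really get, which may differ from what an interactive shell reports if limits.conf or the resource manager shrinks it for non-interactive jobs.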