Hi,

QP = Queue Pair (the pair of send/receive work queues that makes up an InfiniBand connection endpoint)

Here is a FAQ section that I find useful:

http://www.open-mpi.org/faq/?category=openfabrics
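In particular, see the entries on shrinking the receive queues. A sketch of what that looks like (the queue values below are only illustrative, not a recommendation -- the right numbers depend on your fabric, your memory, and your job size, so check the FAQ entry and your admin first):

```shell
# Illustrative only: smaller per-QP receive buffers mean each QP pins
# less registered memory, so a fully connected job can create more QPs
# before hitting "Cannot allocate memory" in qp_create.
# Each colon-separated spec is type,size,num_buffers,... where
# P = per-peer queue and S = shared receive queue.
mpirun --mca btl_openib_receive_queues \
       P,128,256,192,128:S,2048,256,128,32:S,12288,256,128,32:S,65536,256,128,32 \
       -np 256 ./my_solver
```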

And videos:

http://www.open-mpi.org/video/?category=openfabrics
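One more thing worth checking, since it is a common (if not the only) cause of "Cannot allocate memory" from qp_create: the locked-memory limit on the compute nodes. The openib BTL wants it to be unlimited, or at least very large:

```shell
# Run this on the compute nodes (e.g. from inside a batch job), not the
# login node -- the limits often differ between the two.
# Open MPI's FAQ recommends "unlimited" here.
ulimit -l
```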


--
Samuel K. Gutierrez
Los Alamos National Laboratory

On Jun 23, 2011, at 8:22 AM, Mathieu Gontier wrote:

> Hi,
> 
> Thanks for your answer. It makes sense. 
> Sorry if my question seems silly, but what does QP mean? It is difficult to 
> read the FAQ without knowing that!
> 
> Thanks. 
> 
> On 06/23/2011 04:00 PM, Ralph Castain wrote:
>> 
>> One possibility: if you increase the number of processes in the job, and 
>> they all interconnect, then the IB interface can (I believe) run out of 
>> memory at some point. IIRC, the answer was to reduce the size of the QPs so 
>> that you could support a larger number of them.
>> 
>> You should find info about controlling QP size in the IB FAQ area on the 
>> OMPI web site, I believe.
>> 
>> On Jun 23, 2011, at 7:56 AM, Mathieu Gontier wrote:
>> 
>>> Hello, 
>>> 
>>> Thanks for the answer.
>>> I am testing with OpenMPI-1.4.3: my computation is queued. But I did not 
>>> see anything obviously related to my issue. Have you come across something 
>>> that could solve it? 
>>> I am going to submit my computation with --mca mpi_leave_pinned 0, but do 
>>> you have any idea how it affects performance, compared to using 
>>> Ethernet? 
>>> 
>>> Many thanks for your support. 
>>> 
>>> On 06/23/2011 03:01 PM, Josh Hursey wrote:
>>>> 
>>>> I wonder if this is related to memory pinning. Can you try turning off
>>>> the leave pinned, and see if the problem persists (this may affect
>>>> performance, but should avoid the crash):
>>>>   mpirun ... --mca mpi_leave_pinned 0 ...
>>>> 
>>>> Also it looks like Smoky has a slightly newer version of the 1.4
>>>> branch that you should try to switch to if you can. The following
>>>> command will show you all of the available installs on that machine:
>>>>   shell$ module avail ompi
>>>> 
>>>> For a list of supported compilers for that version try the 'show' option:
>>>> shell$ module show ompi/1.4.3
>>>> -------------------------------------------------------------------
>>>> /sw/smoky/modulefiles-centos/ompi/1.4.3:
>>>> 
>>>> module-whatis       This module configures your environment to make Open
>>>> MPI 1.4.3 available.
>>>> Supported Compilers:
>>>>      pathscale/3.2.99
>>>>      pathscale/3.2
>>>>      pgi/10.9
>>>>      pgi/10.4
>>>>      intel/11.1.072
>>>>      gcc/4.4.4
>>>>      gcc/4.4.3
>>>> -------------------------------------------------------------------
>>>> 
>>>> Let me know if that helps.
>>>> 
>>>> Josh
>>>> 
>>>> 
>>>> On Wed, Jun 22, 2011 at 4:16 AM, Mathieu Gontier
>>>> <mathieu.gont...@gmail.com> wrote:
>>>>> Dear all,
>>>>> 
>>>>> First of all, my apologies for posting this message to both the bug
>>>>> and user mailing lists, but for the moment I do not know whether it is a bug!
>>>>> 
>>>>> I am running a structured CFD flow solver at ORNL, and I have access 
>>>>> to a
>>>>> small cluster (Smoky) using OpenMPI-1.4.2 with InfiniBand by default.
>>>>> Recently we increased the size of our models, and since then we have
>>>>> run into many InfiniBand-related problems. The most serious problem is a
>>>>> hard crash with the following error message:
>>>>> 
>>>>> [smoky45][[60998,1],32][/sw/sources/ompi/1.4.2/ompi/mca/btl/openib/connect/btl_openib_connect_oob.c:464:qp_create_one]
>>>>> error creating qp errno says Cannot allocate memory
>>>>> 
>>>>> If we force the solver to use Ethernet (mpirun -mca btl ^openib), the
>>>>> computation runs correctly, although very slowly (a single iteration 
>>>>> takes
>>>>> ages). Do you have any idea what could be causing these problems?
>>>>> 
>>>>> If it is due to a bug or a limitation in OpenMPI, do you think 
>>>>> version
>>>>> 1.4.3, the coming 1.4.4, or any 1.5 version could solve the problem? I read
>>>>> the release notes, but I did not see any obvious patch that would fix my
>>>>> problem. The system administrator is ready to compile a new package for 
>>>>> us,
>>>>> but I do not want to ask him to install too many of them.
>>>>> 
>>>>> Thanks.
>>>>> --
>>>>> 
>>>>> Mathieu Gontier
>>>>> skype: mathieu_gontier
>>>>> _______________________________________________
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>> 
>>> 
>> 
>> 
> 





