So, we (Mellanox) have observed this ourselves when no suitable CPC can be 
found. It seems the BTL associated with that port is not destroyed and the 
device's reference count is not decreased. I'm not sure why you don't see the 
problem in 1.7, but we have a patch that I'll CMR today. Please review our 
symptoms, diagnosis, and proposed change. Ralph, maybe I can list you as a 
reviewer of the patch? I've reviewed it myself and it looks fine, but I 
wouldn't mind having another set of eyes on it since I don't want to be 
responsible for breaking the openib BTL.

Thanks,

Josh Ladd


Reported by Yossi:
Hi,

There is a bug in Open MPI (the openib component) when one of the active 
ports is Ethernet.
The fix is attached; it probably needs to be reviewed and submitted to OMPI.

Error flow:
1.  The openib component creates a BTL instance for every active port 
    (including Ethernet ports).
2.  Every BTL holds a reference count on the device 
    (mca_btl_openib_device_t::btls).
3.  The openib component tries to create a "connection module" (CPC) for 
    every BTL.
4.  It fails to create a connection module for the Ethernet port.
5.  The BTL for the Ethernet port is not returned by the openib component 
    in the list of BTL modules.
6.  The BTL for the Ethernet port is therefore not destroyed during openib 
    component finalize.
7.  The device is not destroyed, because of the outstanding reference count.
8.  The memory pool created by the device is not destroyed.
9.  Later, the rdma mpool module cleans up remaining pools during its 
    finalize.
10. The memory pool created by openib is destroyed by the rdma mpool 
    component's finalize.
11. That memory pool points to a function (openib_dereg_mr) which has 
    already been unloaded from memory (because mca_btl_openib.so was 
    unloaded).
12. Segfault, because of a call through an invalid function pointer (a 
    minimal sketch of this failure mode follows the list).
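
The crux is steps 11 and 12: a function pointer that outlives the shared 
object it points into. The failure can be reproduced in miniature outside 
Open MPI; the standalone sketch below uses hypothetical file and symbol 
names throughout (plugin.so, plugin_dereg), not anything from OMPI:

    /* dangling.c -- call through a function pointer after the shared
     * object that provided it has been dlclose()d.  This mirrors
     * dereg_mem() invoking openib_dereg_mr after mca_btl_openib.so
     * has been unloaded.  Hypothetical example, not OMPI code.
     *
     * Build: gcc -shared -fPIC -o plugin.so plugin.c  (defines plugin_dereg)
     *        gcc -o dangling dangling.c -ldl
     */
    #include <dlfcn.h>
    #include <stdio.h>

    typedef void (*dereg_fn_t)(void);

    int main(void)
    {
        void *handle = dlopen("./plugin.so", RTLD_NOW);  /* load the "BTL" */
        if (NULL == handle) {
            fprintf(stderr, "%s\n", dlerror());
            return 1;
        }

        /* The mpool keeps this pointer long past component init. */
        dereg_fn_t dereg = (dereg_fn_t) dlsym(handle, "plugin_dereg");

        dlclose(handle);  /* component close: the plugin's text is unmapped */

        dereg();          /* finalize-time deregistration: segfaults here */
        return 0;
    }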

The fix:  If a btl module is not going to be returned from openib component 
init, destroy it.
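
The shape of that change, sketched against the component-init path. This is 
a reconstruction, not the literal patch: init_one_cpc() is a hypothetical 
stand-in for the CPC-selection call and the loop bookkeeping is simplified, 
while OBJ_RELEASE and the device reference count 
(mca_btl_openib_device_t::btls) are the real mechanisms named above.

    /* Sketch of the tail of btl_openib_component_init() -- simplified,
     * not the literal patch.  A module whose CPC creation failed is
     * released instead of silently leaked. */
    for (i = 0; i < num_ports; i++) {
        mca_btl_openib_module_t *openib_btl = openib_btls[i];

        if (OMPI_SUCCESS != init_one_cpc(openib_btl)) {  /* hypothetical */
            /* This module will never be returned from init, so release
             * it now.  Its destructor drops device->btls; when the
             * count hits zero, the device and the mpool it created are
             * torn down while mca_btl_openib.so (and openib_dereg_mr)
             * are still mapped. */
            OBJ_RELEASE(openib_btl);
            continue;
        }
        btl_list[num_returned++] = &openib_btl->super;
    }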

-----Original Message-----
From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On Behalf 
Of Ralph Castain
Sent: Monday, January 28, 2013 8:35 PM
To: Steve Wise
Cc: Open MPI Developers
Subject: Re: [OMPI devel] openib unloaded before last mem dereg

Out of curiosity, could you tell us how you configured OMPI?


On Jan 28, 2013, at 12:46 PM, Steve Wise <sw...@opengridcomputing.com> wrote:

> On 1/28/2013 2:04 PM, Ralph Castain wrote:
>> On Jan 28, 2013, at 11:55 AM, Steve Wise <sw...@opengridcomputing.com> wrote:
>> 
>>> Do you know if the rdmacm CPC is really being used for your connection 
>>> setup (vs other CPCs supported by IB)?  Cuz iWARP only supports rdmacm.  
>>> Maybe that's the difference?
>> Dunno for certain, but I expect it is using the OOB cm since I didn't direct 
>> it to do anything different. Like I said, I suspect the problem is that the 
>> cluster doesn't have iWARP on it.
> 
> Definitely, or it could be that the different CPC used for iWARP vs IB is 
> tickling the issue.
> 
>>> Steve.
>>> 
>>> On 1/28/2013 1:47 PM, Ralph Castain wrote:
>>>> Nope - still works just fine. I didn't receive that warning at all, and it 
>>>> ran to completion without problem.
>>>> 
>>>> I suspect the problem is that the system I can use just isn't 
>>>> configured like yours, and so I can't trigger the problem. Afraid I 
>>>> can't be of help after all... :-(
>>>> 
>>>> 
>>>> On Jan 28, 2013, at 11:25 AM, Steve Wise <sw...@opengridcomputing.com> 
>>>> wrote:
>>>> 
>>>>> On 1/28/2013 12:48 PM, Ralph Castain wrote:
>>>>>> Hmmm...afraid I cannot replicate this using the current state of the 1.6 
>>>>>> branch (which is the 1.6.4rcN) on the only IB-based cluster I can access.
>>>>>> 
>>>>>> Can you try it with a 1.6.4 tarball and see if you still see the 
>>>>>> problem? Could be someone already fixed it.
>>>>> I still hit it on 1.6.4rc2.
>>>>> 
>>>>> Note iWARP != IB so you may not have this issue on IB systems for various 
>>>>> reasons.  Did you use the same mpirun line? Namely using this:
>>>>> 
>>>>> --mca btl_openib_ipaddr_include "192.168.170.0/24"
>>>>> 
>>>>> (adjusted to your network config).
>>>>> 
>>>>> Because if I don't use ipaddr_include, then I don't see this issue on my 
>>>>> setup.
>>>>> 
>>>>> Also, did you see these logged:
>>>>> 
>>>>> Right after starting the job:
>>>>> 
>>>>> --------------------------------------------------------------------------
>>>>> No OpenFabrics connection schemes reported that they were able to be
>>>>> used on a specific port.  As such, the openib BTL (OpenFabrics
>>>>> support) will be disabled for this port.
>>>>> 
>>>>>  Local host:           hpc-hn1.ogc.int
>>>>>  Local device:         cxgb4_0
>>>>>  Local port:           2
>>>>>  CPCs attempted:       oob, xoob, rdmacm
>>>>> --------------------------------------------------------------------------
>>>>> ...
>>>>> 
>>>>> At the end of the job:
>>>>> 
>>>>> [hpc-hn1.ogc.int:07850] 5 more processes have sent help message 
>>>>> help-mpi-btl-openib-cpc-base.txt / no cpcs for port
>>>>> 
>>>>> 
>>>>> I think these are benign, but prolly indicate a bug: the mpirun is 
>>>>> restricting the job to use port 1 only, so the CPCs shouldn't be 
>>>>> attempting port 2...
>>>>> 
>>>>> Steve.
>>>>> 
>>>>> 
>>>>>> On Jan 28, 2013, at 10:03 AM, Steve Wise <sw...@opengridcomputing.com> 
>>>>>> wrote:
>>>>>> 
>>>>>>> On 1/28/2013 11:48 AM, Ralph Castain wrote:
>>>>>>>> On Jan 28, 2013, at 9:12 AM, Steve Wise <sw...@opengridcomputing.com> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> On 1/25/2013 12:19 PM, Steve Wise wrote:
>>>>>>>>>> Hello,
>>>>>>>>>> 
>>>>>>>>>> I'm tracking an issue I see in openmpi-1.6.3.  Running this command 
>>>>>>>>>> on my chelsio iwarp/rdma setup causes a seg fault every time:
>>>>>>>>>> 
>>>>>>>>>> /usr/mpi/gcc/openmpi-1.6.3-dbg/bin/mpirun --np 2 --host 
>>>>>>>>>> hpc-hn1,hpc-cn2 --mca btl openib,sm,self --mca 
>>>>>>>>>> btl_openib_ipaddr_include "192.168.170.0/24" 
>>>>>>>>>> /usr/mpi/gcc/openmpi-1.6.3/tests/IMB-3.2/IMB-MPI1 pingpong
>>>>>>>>>> 
>>>>>>>>>> The segfault is during finalization, and I've debugged this to the 
>>>>>>>>>> point where I see a call to dereg_mem() after the openib btl is 
>>>>>>>>>> unloaded via dlclose().  dereg_mem() dereferences a function pointer 
>>>>>>>>>> to call the btl-specific dereg function, in this case 
>>>>>>>>>> openib_dereg_mr().  However, since that btl has already been 
>>>>>>>>>> unloaded, the deref causes a seg fault.  Happens every time with the 
>>>>>>>>>> above mpi job.
>>>>>>>>>> 
>>>>>>>>>> Now, I tried this same experiment with openmpi-1.7rc6 and I don't 
>>>>>>>>>> see the seg fault, and I don't see a call to dereg_mem() after the 
>>>>>>>>>> openib btl is unloaded.  That's all well and good. :)  But I'd like 
>>>>>>>>>> to get this fix pushed into 1.6 since that is the current stable 
>>>>>>>>>> release.
>>>>>>>>>> 
>>>>>>>>>> Question:  Can someone point me to the fix in 1.7?
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> Steve.
>>>>>>>>> It appears that in ompi_mpi_finalize(), mca_pml_base_close() is 
>>>>>>>>> called, which unloads the openib btl.  Then, further down in 
>>>>>>>>> ompi_mpi_finalize(), mca_mpool_base_close() is called, which ends up 
>>>>>>>>> calling dereg_mem(), which seg faults trying to call into the 
>>>>>>>>> unloaded openib btl.
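
In outline, the ordering Steve describes (a simplified sketch, not verbatim 
1.6 source):

    /* ompi_mpi_finalize(), simplified per the debugging above: */
    mca_pml_base_close();    /* closes the BTL framework and dlclose()s
                              * mca_btl_openib.so                       */
    /* ... */
    mca_mpool_base_close();  /* sweeps leftover mpools; dereg_mem() calls
                              * openib_dereg_mr, whose code was just
                              * unmapped -> segfault                    */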
>>>>>>>>> 
>>>>>>>> That definitely sounds like a bug
>>>>>>>> 
>>>>>>>>> Anybody have thoughts?  Anybody care? :)
>>>>>>>> I care! It needs to be fixed - I'll take a look. Probably something 
>>>>>>>> that forgot to be cmr'd.
>>>>>>> Great!  If you want me to try out a fix or gather more debug, just 
>>>>>>> holler.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Steve.
>>>>>>> 
> 


_______________________________________________
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel
