It's on the ticket that I just assigned to you.  :-)

On Jan 29, 2013, at 10:03 AM, Steve Wise <sw...@opengridcomputing.com> wrote:

> Will do...once I get a patch.
> 
> Steve
> On 1/29/2013 7:40 AM, Jeff Squyres (jsquyres) wrote:
>> Thanks Josh.
>> 
>> Steve -- if you can confirm that this fixes your problem in the v1.6 series, 
>> we'll go ahead and commit the patch.
>> 
>> FWIW: the OpenFabrics startup code got a little cleanup/revamp on the 
>> trunk/v1.7 -- I suspect that's why you're not seeing the problem on 
>> trunk/v1.7 (e.g., look at the utility routines that were abstracted out to 
>> ompi/mca/common/verbs).
>> 
>> 
>> 
>> On Jan 29, 2013, at 2:41 AM, Joshua Ladd <josh...@mellanox.com> wrote:
>> 
>>> So, we (Mellanox) have observed this ourselves when no suitable CPC can be 
>>> found. It seems the BTL associated with this port is not destroyed and the 
>>> ref count is not decreased.  Not sure why you don't see the problem in 1.7, 
>>> but we have a patch that I'll CMR today. Please review our symptoms, 
>>> diagnosis, and proposed change. Ralph, maybe I can list you as a reviewer 
>>> of the patch? I've reviewed it myself and it looks fine, but wouldn't mind 
>>> having another set of eyes on it since I don't want to be responsible for 
>>> breaking the OpenIB BTL.
>>> 
>>> Thanks,
>>> 
>>> Josh Ladd
>>> 
>>> 
>>> Reported by Yossi:
>>> Hi,
>>> 
>>> There is a bug in Open MPI (the openib component) when one of the active 
>>> ports is Ethernet.
>>> The fix is attached; it probably needs to be reviewed and submitted to OMPI.
>>> 
>>> Error flow:
>>> 1.  The openib component creates a btl instance for every active port 
>>> (including Ethernet).
>>> 2.  Every btl holds a reference count on the device 
>>> (mca_btl_openib_device_t::btls).
>>> 3.  The openib component tries to create a "connection module" (CPC) for 
>>> every btl.
>>> 4.  It fails to create a connection module for the Ethernet port.
>>> 5.  The btl for the Ethernet port is not returned by the openib component 
>>> in the list of btl modules.
>>> 6.  The btl for the Ethernet port is not destroyed during openib component 
>>> finalize.
>>> 7.  The device is not destroyed, because of the outstanding reference count.
>>> 8.  The memory pool created by the device is not destroyed.
>>> 9.  Later, the rdma mpool module cleans up remaining pools during its 
>>> finalize.
>>> 10. The memory pool created by openib is destroyed by the rdma mpool 
>>> component finalize.
>>> 11. That memory pool points to a function (openib_dereg_mr) which has 
>>> already been unloaded from memory (because mca_btl_openib.so was unloaded).
>>> 12. Segfault because of a call through an invalid function pointer.
>>> 
>>> The fix: if a btl module is not going to be returned from the openib 
>>> component's init, destroy it.
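>>> 
>>> In rough C-like pseudo-code (illustrative only: the helper and array names 
>>> below are placeholders, not the contents of the attached patch), the idea 
>>> in btl_openib_component_init() is something like:
>>> 
>>>     for (int i = 0; i < num_candidate_btls; i++) {
>>>         mca_btl_openib_module_t *openib_btl = candidate_btls[i];
>>> 
>>>         if (!btl_has_usable_cpc(openib_btl)) {
>>>             /* No CPC could be created (e.g. a plain Ethernet port): do
>>>              * NOT put this module in the returned list.  Finalize it now
>>>              * so it drops its reference on the device; once the last
>>>              * reference is gone, the device and the mpool it created can
>>>              * be torn down while mca_btl_openib.so is still loaded. */
>>>             openib_btl->super.btl_finalize(&openib_btl->super);
>>>             continue;
>>>         }
>>>         returned_btls[num_returned++] = &openib_btl->super;
>>>     }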
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: devel-boun...@open-mpi.org [mailto:devel-boun...@open-mpi.org] On 
>>> Behalf Of Ralph Castain
>>> Sent: Monday, January 28, 2013 8:35 PM
>>> To: Steve Wise
>>> Cc: Open MPI Developers
>>> Subject: Re: [OMPI devel] openib unloaded before last mem dereg
>>> 
>>> Out of curiosity, could you tell us how you configured OMPI?
>>> 
>>> 
>>> On Jan 28, 2013, at 12:46 PM, Steve Wise <sw...@opengridcomputing.com> 
>>> wrote:
>>> 
>>>> On 1/28/2013 2:04 PM, Ralph Castain wrote:
>>>>> On Jan 28, 2013, at 11:55 AM, Steve Wise <sw...@opengridcomputing.com> 
>>>>> wrote:
>>>>> 
>>>>>> Do you know if the rdmacm CPC is really being used for your connection 
>>>>>> setup (vs other CPCs supported by IB)?  Cuz iwarp only supports rdmacm.  
>>>>>> Maybe that's the difference?
>>>>> Dunno for certain, but I expect it is using the OOB cm since I didn't 
>>>>> direct it to do anything different. Like I said, I suspect the problem is 
>>>>> that the cluster doesn't have iWARP on it.
>>>> Definitely, or it could be that the different CPC used for IW vs IB is 
>>>> tickling the issue.
>>>> 
>>>>>> Steve.
>>>>>> 
>>>>>> On 1/28/2013 1:47 PM, Ralph Castain wrote:
>>>>>>> Nope - still works just fine. I didn't receive that warning at all, and 
>>>>>>> it ran to completion without problem.
>>>>>>> 
>>>>>>> I suspect the problem is that the system I can use just isn't
>>>>>>> configured like yours, and so I can't trigger the problem. Afraid I
>>>>>>> can't be of help after all... :-(
>>>>>>> 
>>>>>>> 
>>>>>>> On Jan 28, 2013, at 11:25 AM, Steve Wise <sw...@opengridcomputing.com> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> On 1/28/2013 12:48 PM, Ralph Castain wrote:
>>>>>>>>> Hmmm...afraid I cannot replicate this using the current state of the 
>>>>>>>>> 1.6 branch (which is the 1.6.4rcN) on the only IB-based cluster I can 
>>>>>>>>> access.
>>>>>>>>> 
>>>>>>>>> Can you try it with a 1.6.4 tarball and see if you still see the 
>>>>>>>>> problem? Could be someone already fixed it.
>>>>>>>> I still hit it on 1.6.4rc2.
>>>>>>>> 
>>>>>>>> Note iWARP != IB so you may not have this issue on IB systems for 
>>>>>>>> various reasons.  Did you use the same mpirun line? Namely using this:
>>>>>>>> 
>>>>>>>> --mca btl_openib_ipaddr_include "192.168.170.0/24"
>>>>>>>> 
>>>>>>>> (adjusted to your network config).
>>>>>>>> 
>>>>>>>> Because if I don't use ipaddr_include, then I don't see this issue on 
>>>>>>>> my setup.
>>>>>>>> 
>>>>>>>> Also, did you see these logged:
>>>>>>>> 
>>>>>>>> Right after starting the job:
>>>>>>>> 
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> No OpenFabrics connection schemes reported that they were able to be
>>>>>>>> used on a specific port.  As such, the openib BTL (OpenFabrics
>>>>>>>> support) will be disabled for this port.
>>>>>>>> 
>>>>>>>> Local host:           hpc-hn1.ogc.int
>>>>>>>> Local device:         cxgb4_0
>>>>>>>> Local port:           2
>>>>>>>> CPCs attempted:       oob, xoob, rdmacm
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> ...
>>>>>>>> 
>>>>>>>> At the end of the job:
>>>>>>>> 
>>>>>>>> [hpc-hn1.ogc.int:07850] 5 more processes have sent help message
>>>>>>>> help-mpi-btl-openib-cpc-base.txt / no cpcs for port
>>>>>>>> 
>>>>>>>> 
>>>>>>>> I think these are benign, but probably indicate a bug: the mpirun is 
>>>>>>>> restricting the job to use port 1 only, so the CPCs shouldn't be 
>>>>>>>> attempting port 2...
>>>>>>>> 
>>>>>>>> Steve.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Jan 28, 2013, at 10:03 AM, Steve Wise 
>>>>>>>>> <sw...@opengridcomputing.com> wrote:
>>>>>>>>> 
>>>>>>>>>> On 1/28/2013 11:48 AM, Ralph Castain wrote:
>>>>>>>>>>> On Jan 28, 2013, at 9:12 AM, Steve Wise 
>>>>>>>>>>> <sw...@opengridcomputing.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> On 1/25/2013 12:19 PM, Steve Wise wrote:
>>>>>>>>>>>>> Hello,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'm tracking an issue I see in openmpi-1.6.3.  Running this 
>>>>>>>>>>>>> command on my chelsio iwarp/rdma setup causes a seg fault every 
>>>>>>>>>>>>> time:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> /usr/mpi/gcc/openmpi-1.6.3-dbg/bin/mpirun --np 2 --host
>>>>>>>>>>>>> hpc-hn1,hpc-cn2 --mca btl openib,sm,self --mca
>>>>>>>>>>>>> btl_openib_ipaddr_include "192.168.170.0/24"
>>>>>>>>>>>>> /usr/mpi/gcc/openmpi-1.6.3/tests/IMB-3.2/IMB-MPI1 pingpong
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The segfault is during finalization, and I've debugged this to 
>>>>>>>>>>>>> the point where I see a call to dereg_mem() after the openib btl 
>>>>>>>>>>>>> is unloaded via dlclose().  dereg_mem() dereferences a function 
>>>>>>>>>>>>> pointer to call the btl-specific dereg function, in this case it 
>>>>>>>>>>>>> is openib_dereg_mr().  However, since that btl has already been 
>>>>>>>>>>>>> unloaded, the deref causes a seg fault.  Happens every time with 
>>>>>>>>>>>>> the above mpi job.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Now, I tried this same experiment with openmpi-1.7rc6 and I don't 
>>>>>>>>>>>>> see the seg fault, and I don't see a call to dereg_mem() after 
>>>>>>>>>>>>> the openib btl is unloaded.  That's all well and good. :)  But I'd 
>>>>>>>>>>>>> like to get this fix pushed into 1.6 since that is the current 
>>>>>>>>>>>>> stable release.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Question:  Can someone point me to the fix in 1.7?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Steve.
>>>>>>>>>>>> It appears that in ompi_mpi_finalize(), mca_pml_base_close() is 
>>>>>>>>>>>> called, which unloads the openib btl.  Then further down in 
>>>>>>>>>>>> ompi_mpi_finalize(), mca_mpool_base_close() is called, which ends 
>>>>>>>>>>>> up calling dereg_mem(), which seg faults trying to call into the 
>>>>>>>>>>>> unloaded openib btl.
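>>>>>>>>>>>> 
>>>>>>>>>>>> In other words, this is the classic "call through a stale function 
>>>>>>>>>>>> pointer after dlclose()" failure.  A tiny standalone illustration of 
>>>>>>>>>>>> just that mechanism (not OMPI code; on a typical Linux/glibc box the 
>>>>>>>>>>>> second call faults because the plugin's text has been unmapped):
>>>>>>>>>>>> 
>>>>>>>>>>>>     /* plugin.c -- build with: gcc -shared -fPIC plugin.c -o plugin.so */
>>>>>>>>>>>>     #include <stdio.h>
>>>>>>>>>>>>     int fake_dereg(void) { puts("dereg"); return 0; }
>>>>>>>>>>>> 
>>>>>>>>>>>>     /* main.c -- build with: gcc main.c -ldl -o demo */
>>>>>>>>>>>>     #include <dlfcn.h>
>>>>>>>>>>>>     #include <stdio.h>
>>>>>>>>>>>> 
>>>>>>>>>>>>     int main(void)
>>>>>>>>>>>>     {
>>>>>>>>>>>>         void *h = dlopen("./plugin.so", RTLD_NOW | RTLD_LOCAL);
>>>>>>>>>>>>         if (NULL == h) { fprintf(stderr, "%s\n", dlerror()); return 1; }
>>>>>>>>>>>> 
>>>>>>>>>>>>         int (*dereg)(void) = (int (*)(void)) dlsym(h, "fake_dereg");
>>>>>>>>>>>>         dereg();       /* fine: plugin.so is still mapped */
>>>>>>>>>>>> 
>>>>>>>>>>>>         dlclose(h);    /* like mca_pml_base_close() unloading
>>>>>>>>>>>>                         * mca_btl_openib.so */
>>>>>>>>>>>> 
>>>>>>>>>>>>         dereg();       /* like mca_mpool_base_close() -> dereg_mem() ->
>>>>>>>>>>>>                         * openib_dereg_mr(): the pointer now targets
>>>>>>>>>>>>                         * unmapped text -> segfault */
>>>>>>>>>>>>         return 0;
>>>>>>>>>>>>     }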
>>>>>>>>>>>> 
>>>>>>>>>>> That definitely sounds like a bug
>>>>>>>>>>> 
>>>>>>>>>>>> Anybody have thoughts?  Anybody care? :)
>>>>>>>>>>> I care! It needs to be fixed - I'll take a look. Probably something 
>>>>>>>>>>> that forgot to be cmr'd.
>>>>>>>>>> Great!  If you want me to try out a fix or gather more debug, just 
>>>>>>>>>> holler.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> Steve.
>>>>>>>>>> 
>>> 
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>>> 
>> 
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

