Hello,
During system shutdown I have seen the IO work thread in a checked build of
ipoib_ndis6_cm fail in two ways:
1) The IO work thread is kept from running by scheduling priorities until after
the point at which port_destroy() wants to delete the port object
[cl_obj_destroy( &p_port->obj 0)]. The port object delete fails (ASSERT obj_ref
> 0 fires) because of the outstanding port references held by the remaining
posted recv buffers. The first 128 work requests have been pulled from the CQ by
__recv_cb_internal(), which then queues an IO work item to process the
remaining 384 recv work requests. That IO work item does not run before
port_destroy() is called.
2) The IO work thread does eventually run, but crashes (BSOD, invalid memory
reference) because the port structures it requires have already been freed.
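
The failure in 1) comes down to reference accounting. Below is a minimal user-mode sketch of it, assuming each posted recv buffer holds exactly one reference on the port. The names sim_obj, sim_obj_deref and sim_obj_try_destroy are hypothetical stand-ins for illustration and are not the complib cl_obj API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the port object's reference count. */
struct sim_obj {
    int  ref_cnt;     /* one reference per posted recv buffer (assumption) */
    bool destroyed;
};

/* Drop the references held by n completed recv work requests. */
static void sim_obj_deref(struct sim_obj *o, int n)
{
    assert(n >= 0 && n <= o->ref_cnt);
    o->ref_cnt -= n;
}

/* Destroy succeeds only when nothing references the object any more. */
static bool sim_obj_try_destroy(struct sim_obj *o)
{
    if (o->ref_cnt > 0)
        return false;   /* corresponds to the ASSERT seen in the report */
    o->destroyed = true;
    return true;
}
```

Starting from 512 references, releasing the first 128 leaves 384 outstanding, so sim_obj_try_destroy() refuses until the remaining 384 are released as well.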
The fix is to recognize when the port is not in the IB_QPS_RTS state. In that
case, do not schedule an IO work item; instead, continue to pull recv work
requests from the CQ until it is empty.
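
The intended control flow can be sketched as a small user-mode simulation. This is only an illustration under stated assumptions: the batch size of 128 is taken from the numbers above, a pass that returns fewer than a full batch is assumed to mean the CQ is empty, and every sim_* name is hypothetical rather than part of the driver.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_MAX_POLL 128   /* batch size per pass, taken from the report */

enum sim_qp_state { SIM_QPS_RTS, SIM_QPS_ERROR };  /* stand-ins for IB_QPS_* */

struct sim_port {
    enum sim_qp_state state;
    int cq_pending;   /* completions still sitting in the CQ */
    int refs;         /* references held by posted recv buffers */
    int deferred;     /* work items queued (stands in for IoQueueWorkItem) */
};

/* One polling pass; returns true when the caller should poll again. */
static bool sim_recv_pass(struct sim_port *p)
{
    int n = p->cq_pending < SIM_MAX_POLL ? p->cq_pending : SIM_MAX_POLL;

    assert(p->refs >= n);
    p->cq_pending -= n;
    p->refs -= n;

    if (n < SIM_MAX_POLL)
        return false;          /* short batch: CQ is drained */
    if (p->state == SIM_QPS_RTS) {
        p->deferred++;         /* normal path: hand the rest to a work item */
        return false;
    }
    return true;               /* not RTS: keep draining inline */
}

/* Callback wrapper: loop until a pass reports no more inline work. */
static void sim_recv_cb(struct sim_port *p)
{
    while (sim_recv_pass(p))
        ;
}
```

With 512 pending completions, a port in the RTS state stops after one pass with one deferred work item and 384 references still held. A port that has left RTS drains all 512 inline and defers nothing, so no references remain when the port is destroyed.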
Code snippets:
    } else {
        if ( h_cq && p_port->state == IB_QPS_RTS ) {
            // increment reference to ensure no one release the object while work iteam is queued
            ipoib_port_ref( p_port, ref_recv_cb );
            IoQueueWorkItem( p_port->pPoWorkItem,
                __iopoib_WorkItem, DelayedWorkQueue, p_port);
            WorkToDo = FALSE;
        } else {
            WorkToDo = TRUE;
        }
    }

__recv_cb(
    IN const ib_cq_handle_t h_cq,
    IN void *cq_context )
{
    uint32_t recv_cnt;
    boolean_t WorkToDo;

    do
    {
        WorkToDo = __recv_cb_internal(h_cq, cq_context, &recv_cnt);
    } while( WorkToDo );
}

--- A/ulp/ipoib_ndis6_cm/kernel/ipoib_port.cpp Mon Sep 13 15:58:08 2010
+++ B/ulp/ipoib_NDIS6_CM/kernel/ipoib_port.cpp Mon Sep 20 08:47:08 2010
@@ -2222,7 +2222,7 @@
CL_ASSERT( status == IB_SUCCESS );
} else {
- if (h_cq) {
+ if ( h_cq && p_port->state == IB_QPS_RTS ) {
// increment reference to ensure no one release the object while work iteam is queued
ipoib_port_ref( p_port, ref_recv_cb );
IoQueueWorkItem( p_port->pPoWorkItem,
__iopoib_WorkItem, DelayedWorkQueue, p_port);
@@ -2244,9 +2244,13 @@
IN const ib_cq_handle_t h_cq,
IN void *cq_context )
{
- uint32_t recv_cnt;
+ uint32_t recv_cnt;
+ boolean_t WorkToDo;
- __recv_cb_internal(h_cq, cq_context, &recv_cnt);
+ do
+ {
+ WorkToDo = __recv_cb_internal(h_cq, cq_context, &recv_cnt);
+ } while( WorkToDo );
}