On Thu, 2005-06-16 at 04:39, Itamar Rabenstein wrote:
> Hi Hal,
> I am trying to understand what is going on here and I still don't see how
> this happens.
>
> These prints are only enabled in UP mode. (Is your system UP?)

Yes.

> The code is (function: dapl_evd_connection_callback):
> spin_lock_irqsave(&ep->common.lock, ep->common.flags);
> case on the event type
> disconnect: dapl_ib_disconnect_clean(ep, TRUE);
> spin_unlock_irqrestore(&ep->common.lock, ep->common.flags);
>
> For some reason, in the middle between the lock and the unlock there is a
> call to a consumer function (dat_ep_disconnect) that tries to disconnect
> the same ep, and the lock fails.

Could this be a "local" disconnect race of some sort ?

> Either the evd_cb function is called from an interrupt from the CM, in
> which case I don't see how the consumer can call dat_ib_disconnect in the
> middle, or the user called dat_ib_disconnect twice on the same ep and your
> kernel allows preemption.

CONFIG_PREEMPT is not set in my kernel config.

> I don't understand either case (;-)
>
> Can you try to run it with some debug?
> At least to know who called dapl_evd_connection_callback?

All calls to dapl_evd_connection_callback are out of the CM except one case
in dapl_ep_disconnect. In the case of dapl_ep_disconnect, the lock is
obtained in dapl_ep_disconnect before the connection callback routine
would/might have been called.

One instance:

Jun 16 12:47:33 localhost kernel: dapl_ep_disconnect: dapl_evd_connection_callback EP 0xc91e5bf8 CM ID 0x00000000 EP common lock 0xc91e5c08
Jun 16 12:47:34 localhost kernel: dapl_ep_disconnect: dapl_evd_connection_callback EP 0xc987ebf8 CM ID 0x00000000 EP common lock 0xc987ec08
Jun 16 12:47:34 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_ep.c:1110: spin_lock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:ce609c08) already locked by drivers/infiniband/ulp/dat-provider/dapl_cr.c/501
Jun 16 12:47:34 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_cr.c:512: spin_unlock

Another instance:

Jun 16 12:55:11 localhost kernel: dapl_ep_disconnect: dapl_evd_connection_callback EP 0xc64b1bf8 CM ID 0x00000000 EP common lock 0xc64b1c08
Jun 16 12:55:12 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_ep.c:1110: spin_lock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:ce5a6c08) already locked by drivers/infiniband/ulp/dat-provider/dapl_cr.c/501
Jun 16 12:55:12 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_cr.c:512: spin_unlock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:ce5a6c08) not locked
Jun 16 12:55:12 localhost kernel: dapl_ep_disconnect: dapl_evd_connection_callback EP 0xc07debf8 CM ID 0x00000000 EP common lock 0xc07dec08

Yet another instance:

Jun 16 12:55:12 localhost kernel: dapl_cm_active_cb_handler: TIMEWAIT EXIT dapl_evd_connection_callback EP 0xce609bf8 CM ID 0xc82dcdf8 EP common lock 0xce609c08
Jun 16 12:55:12 localhost kernel: dapl_ep_disconnect: dapl_evd_connection_callback EP 0xcbeb8bf8 CM ID 0x00000000 EP common lock 0xcbeb8c08
Jun 16 12:55:12 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_ep.c:1110: spin_lock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:cf35bc08) already locked by drivers/infiniband/ulp/dat-provider/dapl_cr.c/501
Jun 16 12:55:12 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_cr.c:512: spin_unlock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:cf35bc08) not locked

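For reference, here is a minimal userspace sketch of one way an "already locked" / "not locked" pair like the above can be produced: a path takes a non-recursive lock and then calls a routine that takes the same lock again. This is only an illustration under stated assumptions, not the kDAPL code or the kernel's spinlock debug implementation. Every name in it (dbg_lock, endpoint, connection_callback, disconnect_nested, disconnect_flat, count_reports) is hypothetical, and the "lock" is pure bookkeeping that does no real locking.

```c
#include <stdio.h>

/* Hypothetical debug lock: it only records state and reports misuse. */
struct dbg_lock {
    int locked;             /* 1 while held */
    const char *owner_file; /* where it was last taken */
    int owner_line;
    int reports;            /* number of misuse messages printed */
};

static void dbg_lock_acquire(struct dbg_lock *l, const char *file, int line)
{
    if (l->locked) {
        printf("%s:%d: lock(%p) already locked by %s/%d\n",
               file, line, (void *)l, l->owner_file, l->owner_line);
        l->reports++;
    }
    l->locked = 1;
    l->owner_file = file;
    l->owner_line = line;
}

static void dbg_lock_release(struct dbg_lock *l, const char *file, int line)
{
    if (!l->locked) {
        printf("%s:%d: unlock(%p) not locked\n", file, line, (void *)l);
        l->reports++;
    }
    l->locked = 0;
}

#define LOCK(l)   dbg_lock_acquire((l), __FILE__, __LINE__)
#define UNLOCK(l) dbg_lock_release((l), __FILE__, __LINE__)

/* Hypothetical endpoint with a single lock. */
struct endpoint {
    struct dbg_lock lock;
};

/* Hypothetical callback that takes the endpoint lock itself. */
static void connection_callback(struct endpoint *ep)
{
    LOCK(&ep->lock);
    /* ... handle the event ... */
    UNLOCK(&ep->lock);
}

/* Calls the callback with the lock held: expect two reports. */
static void disconnect_nested(struct endpoint *ep)
{
    LOCK(&ep->lock);
    connection_callback(ep); /* second acquire: "already locked" */
    UNLOCK(&ep->lock);       /* callback already released: "not locked" */
}

/* Drops the lock before calling the callback: expect no reports. */
static void disconnect_flat(struct endpoint *ep)
{
    LOCK(&ep->lock);
    /* ... update state under the lock ... */
    UNLOCK(&ep->lock);
    connection_callback(ep);
}

/* Runs one path on a fresh endpoint and returns the number of reports. */
static int count_reports(void (*path)(struct endpoint *))
{
    struct endpoint ep = { { 0, NULL, 0, 0 } };
    path(&ep);
    return ep.lock.reports;
}
```

In this sketch, count_reports(disconnect_nested) yields two reports (one on the nested acquire, one on the final release), while count_reports(disconnect_flat) yields none. Whether the traces above come from this kind of nesting or from a genuine race is still the open question.
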
-- Hal

> Itamar
>
> > --Original Message--
> > From: Hal Rosenstock [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, June 14, 2005 8:37 PM
> > To: James Lentini
> > Cc: [email protected]
> > Subject: [openib-general] kdapl locking problem
> >
> > Hi,
> >
> > When running in loopback mode (client and server on same machine (x86)):
> > kdapltest -T T -s <IP addr> -D mthca0a -d -t 2 -w 8 -i 20 client SR server SR
> > I see the following locking problem:
> >
> > Jun 14 13:30:08 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_ep.c:1111: spin_lock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:c44b1c18) already locked by drivers/infiniband/ulp/dat-provider/dapl_evd.c/756
> > Jun 14 13:30:08 localhost kernel: drivers/infiniband/ulp/dat-provider/dapl_evd.c:797: spin_unlock(drivers/infiniband/ulp/dat-provider/dapl_ep.c:c44b1c18) not locked
> >
> > -- Hal
> >
> > _______________________________________________
> > openib-general mailing list
> > [email protected]
> > http://openib.org/mailman/listinfo/openib-general
> >
> > To unsubscribe, please visit
> > http://openib.org/mailman/listinfo/openib-general
