Re: [ewg] [PATCH] IPoIB: synchronize timer deletion with completion handler

2009-12-10 Thread Eli Cohen
On Wed, Dec 09, 2009 at 02:39:49PM -0800, Roland Dreier wrote:
 
   When calling del_timer_sync on priv-poll_timer, it is necessary to prevent
   farther arming of the timer which can be done by a completion handler. 
 Though
   it is harmless since the timer will eventually stop being rearmed, it is 
 better
   practice to avoid calling the timer function after it is deleted. This 
 patch
   handles this by using a new flag that is checked before arming the timer.
 
 have you seen this in practice?  If it can happen then it's not
 harmless, since the module could be unloaded with the timer pending.
 However I don't see how it could happen, since we only seem to delete
 the timer after we know that no more completions are coming (except for
 the case where we decide that the hardware is wedged but it really only
 takes a *long* time to respond at exactly the wrong time, and we somehow
 get a completion between the del_timer_sync and the modify QP to reset
 state -- which is so unlikely it seems not worth adding this extra
 complexity for -- maybe we could add the del_timer_sync to after we
 delete the CQ or something if you're really worried)

No, I haven't actually seen this happening but I think it could.
Consider this scenario:

1. ipoib_send() calls ib_req_notify_cq()
2. ipoib_ib_dev_stop() gets called
3. All pending WR complete; interrupt is generated but not yet
serviced.
4. ipoib_ib_dev_stop() calls del_timer_sync() and passes it.
5. Completion handler is finally serviced - it schedules
ipoib_ib_tx_timer_func()
6. IPoIB unloads and we're in trouble.

Not so likely to happen indeed. Your suggestion to defer deleting the
timer until after the CQ is destroyed will also solve this.
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


Re: [ewg] [PATCH] IPoIB: synchronize timer deletion with completion handler

2009-12-09 Thread Roland Dreier

  When calling del_timer_sync on priv-poll_timer, it is necessary to prevent
  farther arming of the timer which can be done by a completion handler. Though
  it is harmless since the timer will eventually stop being rearmed, it is 
  better
  practice to avoid calling the timer function after it is deleted. This patch
  handles this by using a new flag that is checked before arming the timer.

have you seen this in practice?  If it can happen then it's not
harmless, since the module could be unloaded with the timer pending.
However I don't see how it could happen, since we only seem to delete
the timer after we know that no more completions are coming (except for
the case where we decide that the hardware is wedged but it really only
takes a *long* time to respond at exactly the wrong time, and we somehow
get a completion between the del_timer_sync and the modify QP to reset
state -- which is so unlikely it seems not worth adding this extra
complexity for -- maybe we could add the del_timer_sync to after we
delete the CQ or something if you're really worried)
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg