Re: About a shortcoming of the verbs API

2010-08-09 Thread Bart Van Assche
On Mon, Aug 9, 2010 at 1:51 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] Further, the approach you outline in your follow on message for blkio, has problems.. Look at how IPOIB does NAPI to see how this must look. 1) ib_req_notify_cq must only be called if you are

Re: About a shortcoming of the verbs API

2010-08-09 Thread David Dillow
On Sun, 2010-08-08 at 20:19 +0200, Bart Van Assche wrote: On Sat, Aug 7, 2010 at 6:32 PM, Roland Dreier rdre...@cisco.com wrote: Not sure that I follow the problem you're worried about. A given tasklet can only be running on one CPU at any one time -- if an interrupt occurs and reschedules

Re: About a shortcoming of the verbs API

2010-08-09 Thread Vladislav Bolkhovitin
David Dillow, on 08/09/2010 06:49 PM wrote: On Sun, 2010-08-08 at 20:19 +0200, Bart Van Assche wrote: On Sat, Aug 7, 2010 at 6:32 PM, Roland Dreier rdre...@cisco.com wrote: Not sure that I follow the problem you're worried about. A given tasklet can only be running on one CPU at any one time

Re: About a shortcoming of the verbs API

2010-08-09 Thread David Dillow
On Mon, 2010-08-09 at 22:45 +0400, Vladislav Bolkhovitin wrote: David Dillow, on 08/09/2010 06:49 PM wrote: I'm not sure it makes sense to enable/disable this at runtime -- we don't do it for NAPI, why do it for block devices? I'm not even sure I'd want to see a config option for it in

Re: About a shortcoming of the verbs API

2010-08-08 Thread Bart Van Assche
On Sat, Aug 7, 2010 at 6:32 PM, Roland Dreier rdre...@cisco.com wrote: Not sure that I follow the problem you're worried about.  A given tasklet can only be running on one CPU at any one time -- if an interrupt occurs and reschedules the tasklet then it just runs again when it exits. Also

Re: About a shortcoming of the verbs API

2010-08-08 Thread Jason Gunthorpe
On Sun, Aug 08, 2010 at 08:16:55PM +0200, Bart Van Assche wrote: On Sun, Aug 8, 2010 at 3:38 AM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: [ ... ] No, all hardware pretty much works like this. The general flow is: IRQ happens (if level triggered 'ack' the IRQ to the HW,

Re: About a shortcoming of the verbs API

2010-08-07 Thread Bart Van Assche
On Tue, Jul 27, 2010 at 8:20 PM, Jason Gunthorpe jguntho...@obsidianresearch.com wrote: On Tue, Jul 27, 2010 at 08:03:25PM +0200, Bart Van Assche wrote: As far as I know it is not possible for a HCA to tell whether or not a CPU has finished executing the interrupt it triggered. So it is not

Re: About a shortcoming of the verbs API

2010-08-07 Thread Roland Dreier
The above implies that one must be careful when applying a common Linux practice, that is to defer interrupt handling from IRQ context to tasklet context. Since tasklets are executed with interrupts enabled, invoking ib_req_notify_cq(cq, IB_CQ_NEXT_COMP) from tasklet context may cause

Re: About a shortcoming of the verbs API

2010-08-07 Thread Jason Gunthorpe
On Sat, Aug 07, 2010 at 09:56:13AM +0200, Bart Van Assche wrote: The above implies that one must be careful when applying a common Linux practice, that is to defer interrupt handling from IRQ context to tasklet context. Since tasklets are executed with interrupts enabled, invoking

Re: About a shortcoming of the verbs API

2010-07-28 Thread Roland Dreier
- Some time ago I observed that the kernel reported soft lockups because of spin_lock() calls inside a completion handler. These spinlocks were not locked in any other context than the completion handler itself. And the lockups disappeared after having replaced the spin_lock() calls by

Re: About a shortcoming of the verbs API

2010-07-28 Thread Roland Dreier
Actually, I tried to implement the completion callback in a workqueue thread but ipoib_cm_handle_tx_wc() calls netif_tx_lock() which isn't safe unless it is called from an IRQ handler or netif_tx_lock_bh() is called first. Oh, sounds like a bug in IPoIB. I guess we could fix it by just

Re: About a shortcoming of the verbs API

2010-07-28 Thread Ralph Campbell
On Wed, 2010-07-28 at 11:05 -0700, Roland Dreier wrote: Actually, I tried to implement the completion callback in a workqueue thread but ipoib_cm_handle_tx_wc() calls netif_tx_lock() which isn't safe unless it is called from an IRQ handler or netif_tx_lock_bh() is called first. Oh,

Re: About a shortcoming of the verbs API

2010-07-28 Thread Ralph Campbell
On Wed, 2010-07-28 at 11:16 -0700, Roland Dreier wrote: Actually, I tried to implement the completion callback in a workqueue thread but ipoib_cm_handle_tx_wc() calls netif_tx_lock() which isn't safe unless it is called from an IRQ handler or netif_tx_lock_bh() is called first.

Re: About a shortcoming of the verbs API

2010-07-27 Thread Bart Van Assche
On Mon, Jul 26, 2010 at 9:22 PM, Roland Dreier rdre...@cisco.com wrote: [ ... ] Another approach is to just always run the completion processing for a given CQ on a single CPU and avoid locking entirely.  If you want more CPUs to spread the work, just use multiple CQs and multiple event

Re: About a shortcoming of the verbs API

2010-07-27 Thread Roland Dreier
In the applications I'm familiar with InfiniBand is being used not only because of its low latency but also because of its high throughput. Yes, I seem to recall hearing that people care about throughput as well. In order to handle such loads efficiently, interrupts have to be spread

Re: About a shortcoming of the verbs API

2010-07-27 Thread Bart Van Assche
On Tue, Jul 27, 2010 at 6:50 PM, Roland Dreier rdre...@cisco.com wrote: [ ... ] From Documentation/infiniband/core_locking.txt:  The low-level driver is responsible for ensuring that multiple  completion event handlers for the same CQ are not called  simultaneously.  The driver must

Re: About a shortcoming of the verbs API

2010-07-27 Thread Jason Gunthorpe
On Tue, Jul 27, 2010 at 08:03:25PM +0200, Bart Van Assche wrote: As far as I know it is not possible for a HCA to tell whether or not a CPU has finished executing the interrupt it triggered. So it is not possible for the HCA to implement the above requirement by delaying the generation of a

Re: About a shortcoming of the verbs API

2010-07-27 Thread Jason Gunthorpe
On Tue, Jul 27, 2010 at 09:28:54PM +0200, Bart Van Assche wrote: I have two more questions: - Some time ago I observed that the kernel reported soft lockups because of spin_lock() calls inside a completion handler. These spinlocks were not locked in any other context than the completion

Re: About a shortcoming of the verbs API

2010-07-26 Thread Bart Van Assche
On Mon, Jul 26, 2010 at 4:21 PM, Steve Wise sw...@opengridcomputing.com wrote: On 07/25/2010 01:54 PM, Bart Van Assche wrote: [ ... ] The only way I know of to prevent out-of-order completion processing with the current OFED verbs API is to protect the whole completion processing loop

Re: About a shortcoming of the verbs API

2010-07-26 Thread Roland Dreier
2. Double completion processing loop * Initialization: ib_req_notify_cq(cq, IB_CQ_NEXT_COMP); * Notification handler: struct ib_wc wc; do {     while (ib_poll_cq(cq, 1, &wc) > 0)         /* process wc */ } while (ib_req_notify_cq(cq, IB_CQ_NEXT_COMP |
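
The "double completion processing loop" quoted above can be illustrated with a small user-space mock. This is only a sketch under stated assumptions: struct mock_cq, mock_poll_cq(), mock_req_notify_cq() and mock_handle_notification() are hypothetical stand-ins invented for this example, not the kernel verbs API, and the quoted snippet is cut off after "IB_CQ_NEXT_COMP |", so the mock simply assumes the re-arm call returns a positive value when completions are still queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MOCK_CQ_DEPTH 16

/* Hypothetical stand-in for a completion queue; NOT the real struct ib_cq. */
struct mock_cq {
    int entries[MOCK_CQ_DEPTH]; /* pending completions, oldest first */
    size_t head, tail;          /* head == tail means the queue is empty */
    bool armed;                 /* true once a notification was requested */
    int late_arrival;           /* completion that races with the re-arm; -1 = none */
};

/* Dequeue at most one completion. Returns the number dequeued (0 or 1). */
static int mock_poll_cq(struct mock_cq *cq, int *wc)
{
    if (cq->head == cq->tail)
        return 0;
    *wc = cq->entries[cq->head++];
    return 1;
}

/*
 * Request the next notification. ASSUMPTION: returns > 0 when completions
 * are still queued, i.e. the caller would otherwise miss them.
 */
static int mock_req_notify_cq(struct mock_cq *cq)
{
    /* Simulate a completion that arrives just before the re-arm. */
    if (cq->late_arrival >= 0) {
        assert(cq->tail < MOCK_CQ_DEPTH);
        cq->entries[cq->tail++] = cq->late_arrival;
        cq->late_arrival = -1;
    }
    cq->armed = true;
    return cq->head != cq->tail;
}

/*
 * Notification handler: drain the queue, re-arm, and drain again for as
 * long as the re-arm reports leftover completions. Completions are copied
 * to out[] in dequeue order. If out[] fills up, the loop stops early
 * without re-arming.
 */
static size_t mock_handle_notification(struct mock_cq *cq, int *out, size_t cap)
{
    size_t n = 0;
    int wc;

    cq->armed = false; /* the notification that invoked us was one-shot */
    do {
        while (n < cap && mock_poll_cq(cq, &wc) > 0)
            out[n++] = wc; /* "process wc" */
    } while (n < cap && mock_req_notify_cq(cq) > 0);
    return n;
}
```

With three queued completions and one more injected during the re-arm, a single call to mock_handle_notification() returns all four in arrival order and leaves the queue armed. The mock is single-threaded, so it says nothing about the ordering problem between concurrent handlers that the thread is about.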

About a shortcoming of the verbs API

2010-07-25 Thread Bart Van Assche
One of the most common operations when using the verbs API is to dequeue and process completions. For many applications, e.g. storage protocols, processing completions in order is a correctness requirement. Unfortunately with the current IB verbs API it is not possible to process completions in