Hi James,

I will try to explain the reason behind this patch:

In IB, a “normal” working flow, for a consumer, is:
-       Receive a CQ notification callback
-       Wakeup polling thread
-       Poll for completion (empty the queue) 
-       Request completion notification

There is no problem here.

In kdapl, however, the consumer will keep getting upcalls, until he sets the 
upcall policy to disable. So a working flow will be:
-       Receive an evd upcall
-       Disable evd upcall policy
-       Wakeup polling thread
-       Dequeue all evd’s
-       Enable evd upcall policy

There is a race here:
A completion can come after the last dequeu and before the Enabling. The 
provider won’t call for the consumer (policy is disabled) and the consumer 
would not dequeu any more because he “knows” the queue is empty.

I think it is a very bad idea, to solve this race by adding another evd_dequeue 
after you enable the upcall policy.
If you do that you would have a polling thread (because while you dequeue one 
completion you can have many more following) and at the same time you will 
receive upcall from the dapl provider.
Beside the fact that this is an expensive and unnecessary context switch you 
have an upcall and a thread racing.
You will have a situation that the upcall has an event at hand and the thread 
has an event, both not handled yet - you will have to queue them again 
internally or something to keep the order. And I think that is only a partial 
list of the problems in this case.

SO…

My suggestion is simple, it solves the race, it saves the unnecessary context 
switch and it spares the complexity from the consumer side.
The solution is to notify the consumer when he tries to enable upcall policy, 
that the queue is actually not empty, and force him to continue polling (in the 
same thread context he is now).
dat_evd_modify_upcall is guarded by a spin_lock_irqsave,  when it checks the 
queue and so the race would not occur.

BTW,
I’m not sure if it is still the case, but I think that one of the ulps in 
openib, did not use a kernel thread for dequeu-ing. This is a very bad design, 
as the upcall can be polling for *long* periods of time, in a tasklet/interrupt 
context.

That’s it…
Sorry for the long mail – I hope It was not to blur …

Guy.





-----Original Message-----
From: James Lentini [mailto:[EMAIL PROTECTED]
Sent: Thu 8/18/2005 10:28 PM
To: Guy German
Cc: Openib
Subject: Re: [openib-general][PATCH][kdapl]: FMR and EVD patch
 


Hi Guy,

The one piece of this patch that remains unaccepted is:

Index: ib/dapl_evd.c
===================================================================
--- ib/dapl_evd.c       (revision 3136)
+++ ib/dapl_evd.c       (working copy)
@@ -1028,6 +1028,7 @@
 {
        struct dapl_evd *evd;
        int status = 0;
+       int pending_events;
 
        evd = (struct dapl_evd *)evd_handle;
        dapl_dbg_log (DAPL_DBG_TYPE_API, "%s: (evd=%p, upcall_policy=%d)\n",
@@ -1035,14 +1036,25 @@
 
        spin_lock_irqsave(&evd->common.lock, evd->common.flags);
        if ((upcall_policy != DAT_UPCALL_TEARDOWN) &&
-           (upcall_policy != DAT_UPCALL_DISABLE) &&
-           (evd->evd_flags & DAT_EVD_DTO_FLAG)) {
-               status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
-               if (status) {
-                       printk(KERN_ERR "%s: dapls_ib_completion_notify failed "
-                              "(status=0x%x)\n",__func__, status);
+           (upcall_policy != DAT_UPCALL_DISABLE)) {
+               pending_events = dapl_rbuf_count(&evd->pending_event_queue);
+               if (pending_events) {
+                       dapl_dbg_log(DAPL_DBG_TYPE_WARN,
+                                    "%s: (evd %p) there are still %d pending "
+                                    "events in the queue - policy stays 
disabled\n",
+                                    __func__, evd_handle, pending_events);
+                       status = -EBUSY;
                        goto bail;
                }
+               if (evd->evd_flags & DAT_EVD_DTO_FLAG) {
+                       status = ib_req_notify_cq(evd->cq, IB_CQ_NEXT_COMP);
+                       if (status) {
+                               printk(KERN_ERR "%s: dapls_ib_completion_notify"
+                                       " failed (status=0x%x) \n",__func__,
+                                       status);
+                               goto bail;
+                       }
+               }
        }
        evd->upcall_policy = upcall_policy;
        evd->upcall = *upcall;

The IB analog to this function, ib_req_notify_cq(), does not require 
that the CQ be empty. The kDAPL specification does not define an empty 
EVD as a requirement for modifying the upcall and previous 
implementations of the API have not made this requirement. 

_______________________________________________
openib-general mailing list
[email protected]
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general

Reply via email to