The race is that you've deleted the work queue element that is enqueued on the iwcm_wq. It's as simple as that. To prove it to yourself, apply your patch. Turn on memory debug support in the kernel and recompile your code. Then run rdma_krping clients in four different threads against your server with an I/O count of 1.
You'll hit the race and can look at it yourself. I don't know any better way to explain it... Sorry. On 11/13/06 10:44 PM, "Krishna Kumar2" <[EMAIL PROTECTED]> wrote: > Hi Tom, > >> No, to understand why go look at the implementation of queue_work. BTW, > this > > I was describing the implementation of queue_work() in my > previous mail. So sorry to be dense, but I do not understand > why this patch introduces a race. Can you explain the race > that you had found ? What I understood of queue_work() is : > > If cm_work_handler() is already running and processing the > last entry at the same time this new entry was added, it is > guaranteed to find this new entry in it's current run iteration, > and process it. The only issue is with the extra queue_work > by iwcm parallely on a different cpu for the same case. > > So if iwcm had done a redundant "queue_work" on this queue, > which, besides adding the new entry to the workqueue, also > does a wakeup of "worker_thread" (which is still running the > previous iteration of run_workqueue -> cm_work_handler). > I am assuming that the wake up function is > default_wake_function(), since I couldn't locate in wait* code > where this is initialized. > > When cm_work_handler finishes removing this new entry, it > returns to worker_thread, which will do a schedule() and > sleep till it is woken up again (since default_wake_function > found that the thread is already running and had done > nothing). Are you referring to a race where the queue_work > is done between the time cm_work_handler finished running > and before it gets back to schedule ? I feel that should not > matter as the run_workqueue() will find this entry in it's > cwq->worklist and continue processing instead of exiting to > worker_thread() and schedule(). > > Still confused about the race :) > > Thanks, > > - KK _______________________________________________ openib-general mailing list [email protected] http://openib.org/mailman/listinfo/openib-general To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general
