On Tue, Feb 2, 2010 at 6:59 PM, Mike Christie <[email protected]> wrote:
> On 02/02/2010 09:25 AM, Erez Zilber wrote:
>>
>> On Thu, Aug 6, 2009 at 5:32 PM, Mike Christie <[email protected]>
>> wrote:
>>>
>>> On 08/06/2009 05:26 AM, Erez Zilber wrote:
>>>>
>>>> On Wed, Aug 5, 2009 at 10:22 PM, Mike Christie <[email protected]>
>>>> wrote:
>>>>>
>>>>> On 08/05/2009 12:34 PM, Erez Zilber wrote:
>>>>>>>
>>>>>>> I found it. The problem is that we will send the signal if the xmit
>>>>>>> thread is running or not. If it is not running the workqueue code
>>>>>>> will keep getting woken up to handle the signal, but because we have
>>>>>>> not called queue_work the workqueue code will not let the thread run
>>>>>>> so we never get to flush the signal until we reconnect and send down
>>>>>>> a login pdu (the login pdu does a queue_work finally).
>>>>>>>
>>>>>> When you say "the xmit thread is running", I guess that you mean that
>>>>>> the xmit thread is busy with IO, right? Note that I said that this
>>>>>
>>>>> No. workqueue.c:worker_thread() is spinning. It is looping because
>>>>> there is a signal pending, but the iscsi work code which has the
>>>>> flush_signals is not getting run because there is no work queued.
>>>>>
>>>>> So you could add a
>>>>>
>>>>> if (signal_pending(current))
>>>>>         flush_signals(current)
>>>>>
>>>>> to worker_thread() "for" loop and I think this will fix the problem.
>>>>>
>>>>
>>>> Looks like this solves the problem. I've added the following patch to
>>>> the centos 5.3 kernel (2.6.18-128.1.6.el5):
>>>>
>>>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>>>> index 8594efb..e148ed8 100644
>>>> --- a/kernel/workqueue.c
>>>> +++ b/kernel/workqueue.c
>>>> @@ -253,6 +253,9 @@ static int worker_thread(void *__cwq)
>>>>
>>>>         set_current_state(TASK_INTERRUPTIBLE);
>>>>         while (!kthread_should_stop()) {
>>>> +               if (signal_pending(current))
>>>> +                       flush_signals(current);
>>>> +
>>>>                 add_wait_queue(&cwq->more_work, &wait);
>>>>                 if (list_empty(&cwq->worklist))
>>>>                         schedule();
>>>>
>>>> I'm running with open-iscsi.git + 2 commits from linux-2.6-iscsi.git
>>>> (9c302cc45b70ecc4b606d65a445902381066061b &
>>>> 75be23dc40ba2f215779d5ba60fda9a762271bbe).
>>>>
>>>> Will you push it upstream & into the RHEL kernel?
>>>>
>>>
>>> I am not sure. I was thinking that switching from a workqueue to a
>>> thread is the right thing to do. The drawback is that the workqueue is
>>> nice when there are multiple sessions for a host like is done with
>>> bnx2i, cxgb3i and be_iscsi. I can just queue_work and pass the
>>> connection to send on. If I switch to a work_queue I have to add my own
>>> code to do that.
>>>
>>> I am going to post a patch like you did to linux-kernel and see what
>>> people say is best. If it goes in then I will port to RHEL.
>>>
>>> --~--~---------~--~----~------------~-------~--~----~
>>> You received this message because you are subscribed to the Google Groups
>>> "open-iscsi" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/open-iscsi
>>> -~----------~----~----~----~------~----~------~--~---
>>>
>>
>> Mike,
>>
>> We had this discussion a long time ago. I don't remember what
>> eventually happened with it. Did you push the workqueue patch to the
>> kernel? What about the suspend-and-wake patch?
>>
>
> It looks like I posted it at Red Hat and never got a response, and I
> probably then forgot about it and never asked upstream. Will send mail
> upstream now.
>
> --

I encountered this problem ~6 months ago and found a workaround. Now that
I have moved to new (and faster) HW, I'm hitting it again and again in
scenarios with lots of I/O combined with killing the target machine.

Erez
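
For anyone reading this in the archive, here is a rough userspace model of the spin Mike described in the quoted thread. It is only a sketch, not kernel code: struct sim_worker, sim_schedule() and sim_worker_thread() are made-up names, and the only thing assumed about schedule() is that it returns immediately in TASK_INTERRUPTIBLE state when a signal is pending. The flush_at_top flag stands in for the signal_pending()/flush_signals() check from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one workqueue worker; not a kernel structure. */
struct sim_worker {
    bool signal_pending; /* stands in for signal_pending(current) */
    int queued_work;     /* stands in for entries on cwq->worklist */
};

/*
 * Stands in for schedule() called in TASK_INTERRUPTIBLE state.
 * Returns true if the task would really sleep, false if it would
 * return at once because a signal is pending.
 */
static bool sim_schedule(const struct sim_worker *w)
{
    return !w->signal_pending;
}

/*
 * Runs at most max_passes passes of the modeled worker loop and
 * returns how many passes woke up with nothing to do (the spin).
 * flush_at_top selects the patched loop.
 */
int sim_worker_thread(struct sim_worker *w, int max_passes, bool flush_at_top)
{
    int spins = 0;

    assert(max_passes > 0);
    for (int pass = 0; pass < max_passes; pass++) {
        /* The patch: drop a pending signal before deciding to sleep. */
        if (flush_at_top && w->signal_pending)
            w->signal_pending = false;

        if (w->queued_work == 0) {
            if (sim_schedule(w))
                break;  /* really asleep; nothing more to model */
            spins++;    /* woken immediately with an empty worklist */
            continue;
        }

        /* A work item runs; the iscsi xmit work flushes signals itself. */
        w->queued_work--;
        w->signal_pending = false;
    }
    return spins;
}
```

With a signal pending and nothing queued, the unpatched model spins on every pass, while the patched one goes to sleep on the first pass. Queuing a single work item (which is what the login pdu ends up doing) also ends the spin, because the work item flushes the signal.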
