On Tue, Feb 2, 2010 at 6:59 PM, Mike Christie <micha...@cs.wisc.edu> wrote:
> On 02/02/2010 09:25 AM, Erez Zilber wrote:
>> On Thu, Aug 6, 2009 at 5:32 PM, Mike Christie <micha...@cs.wisc.edu>
>>  wrote:
>>> On 08/06/2009 05:26 AM, Erez Zilber wrote:
>>>> On Wed, Aug 5, 2009 at 10:22 PM, Mike Christie <micha...@cs.wisc.edu>
>>>>  wrote:
>>>>> On 08/05/2009 12:34 PM, Erez Zilber wrote:
>>>>>>> I found it. The problem is that we send the signal whether the xmit
>>>>>>> thread is running or not. If it is not running, the workqueue code
>>>>>>> will keep getting woken up to handle the signal, but because we have
>>>>>>> not called queue_work, the workqueue code will not let the thread
>>>>>>> run, so we never get to flush the signal until we reconnect and send
>>>>>>> down a login PDU (the login PDU finally does a queue_work).
>>>>>> When you say "the xmit thread is running", I guess that you mean that
>>>>>> the xmit thread is busy with IO, right? Note that I said that this
>>>>> No. workqueue.c:worker_thread() is spinning. It is looping because
>>>>> there is a signal pending, but the iscsi work code that has the
>>>>> flush_signals is not getting run because there is no work queued.
>>>>> So you could add
>>>>> if (signal_pending(current))
>>>>>         flush_signals(current);
>>>>> to the worker_thread() loop, and I think this will fix the problem.
>>>> Looks like this solves the problem. I've added the following patch to
>>>> the centos 5.3 kernel (2.6.18-128.1.6.el5):
>>>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>>>> index 8594efb..e148ed8 100644
>>>> --- a/kernel/workqueue.c
>>>> +++ b/kernel/workqueue.c
>>>> @@ -253,6 +253,9 @@ static int worker_thread(void *__cwq)
>>>>          set_current_state(TASK_INTERRUPTIBLE);
>>>>          while (!kthread_should_stop()) {
>>>> +               if (signal_pending(current))
>>>> +                       flush_signals(current);
>>>> +
>>>>                  add_wait_queue(&cwq->more_work, &wait);
>>>>                  if (list_empty(&cwq->worklist))
>>>>                          schedule();
>>>> I'm running with open-iscsi.git + 2 commits from linux-2.6-iscsi.git
>>>> (9c302cc45b70ecc4b606d65a445902381066061b &
>>>> 75be23dc40ba2f215779d5ba60fda9a762271bbe).
>>>> Will you push it upstream and into the RHEL kernel?
>>> I am not sure. I was thinking that switching from a workqueue to a
>>> thread is the right thing to do. The drawback is that the workqueue is
>>> nice when there are multiple sessions for a host, as is done with
>>> bnx2i, cxgb3i and be_iscsi: I can just queue_work and pass in the
>>> connection to send on. If I switch to a thread, I have to add my own
>>> code to do that.
>>> I am going to post a patch like you did to linux-kernel and see what
>>> people say is best. If it goes in then I will port to RHEL.
>>> --~--~---------~--~----~------------~-------~--~----~
>>> You received this message because you are subscribed to the Google Groups
>>> "open-iscsi" group.
>>> To post to this group, send email to open-iscsi@googlegroups.com
>>> To unsubscribe from this group, send email to
>>> open-iscsi+unsubscr...@googlegroups.com
>>> For more options, visit this group at
>>> http://groups.google.com/group/open-iscsi
>>> -~----------~----~----~----~------~----~------~--~---
>> Mike,
>> We had this discussion a long time ago. I don't remember what
>> eventually happened with it. Did you push the workqueue patch to the
>> kernel? What about the suspend-and-wake patch?
> It looks like I posted it at Red Hat and never got a response, and I
> probably then forgot about it and never asked upstream. Will send mail
> upstream now.
> --

I encountered this problem ~6 months ago and found a workaround.
Now I have moved to new (and faster) hardware, and I'm hitting this
again and again in scenarios with lots of I/O combined with killing the
target machine.

