Re: [PATCH] decrease sndtmo

Erez Zilber Tue, 02 Feb 2010 10:56:41 -0800

On Tue, Feb 2, 2010 at 6:59 PM, Mike Christie <[email protected]> wrote:
> On 02/02/2010 09:25 AM, Erez Zilber wrote:
>>
>> On Thu, Aug 6, 2009 at 5:32 PM, Mike Christie <[email protected]>
>> wrote:
>>>
>>> On 08/06/2009 05:26 AM, Erez Zilber wrote:
>>>>
>>>> On Wed, Aug 5, 2009 at 10:22 PM, Mike Christie <[email protected]>
>>>> wrote:
>>>>>
>>>>> On 08/05/2009 12:34 PM, Erez Zilber wrote:
>>>>>>>
>>>>>>> I found it. The problem is that we will send the signal whether the xmit
>>>>>>> thread is running or not. If it is not running the workqueue code
>>>>>>> will
>>>>>>> keep getting woken up to handle the signal, but because we have not
>>>>>>> called queue_work the workqueue code will not let the thread run so
>>>>>>> we
>>>>>>> never get to flush the signal until we reconnect and send down a
>>>>>>> login
>>>>>>> pdu (the login pdu does a queue_work finally).
>>>>>>>
>>>>>> When you say "the xmit thread is running", I guess that you mean that
>>>>>> the xmit thread is busy with IO, right? Note that I said that this
>>>>>
>>>>> No. workqueue.c:worker_thread() is spinning. It is looping because
>>>>> there
>>>>> is a signal pending, but the iscsi work code which has the
>>>>> flush_signals
>>>>> is not getting run because there is no work queued.
>>>>>
>>>>> So you could add a
>>>>>
>>>>> if (signal_pending(current))
>>>>>         flush_signals(current)
>>>>>
>>>>> to worker_thread() "for" loop and I think this will fix the problem.
>>>>>
>>>>
>>>> Looks like this solves the problem. I've added the following patch to
>>>> the centos 5.3 kernel (2.6.18-128.1.6.el5):
>>>>
>>>> diff --git a/kernel/workqueue.c b/kernel/workqueue.c
>>>> index 8594efb..e148ed8 100644
>>>> --- a/kernel/workqueue.c
>>>> +++ b/kernel/workqueue.c
>>>> @@ -253,6 +253,9 @@ static int worker_thread(void *__cwq)
>>>>
>>>>          set_current_state(TASK_INTERRUPTIBLE);
>>>>          while (!kthread_should_stop()) {
>>>> +               if (signal_pending(current))
>>>> +                       flush_signals(current);
>>>> +
>>>>                  add_wait_queue(&cwq->more_work, &wait);
>>>>                  if (list_empty(&cwq->worklist))
>>>>                          schedule();
>>>>
>>>> I'm running with open-iscsi.git + 2 commits from linux-2.6-iscsi.git
>>>> (9c302cc45b70ecc4b606d65a445902381066061b &
>>>> 75be23dc40ba2f215779d5ba60fda9a762271bbe).
>>>>
>>>> Will you push it upstream & into the RHEL kernel?
>>>>
>>>
>>> I am not sure. I was thinking that switching from a workqueue to a
>>> thread is the right thing to do. The drawback is that the workqueue is
>>> nice when there are multiple sessions for a host like is done with
>>> bnx2i, cxgb3i and be_iscsi. I can just queue_work and pass the
>>> connection to send on. If I switch to a thread I have to add my own
>>> code to do that.
>>>
>>> I am going to post a patch like you did to linux-kernel and see what
>>> people say is best. If it goes in then I will port to RHEL.
>>>
>>> --~--~---------~--~----~------------~-------~--~----~
>>> You received this message because you are subscribed to the Google Groups
>>> "open-iscsi" group.
>>> To post to this group, send email to [email protected]
>>> To unsubscribe from this group, send email to
>>> [email protected]
>>> For more options, visit this group at
>>> http://groups.google.com/group/open-iscsi
>>> -~----------~----~----~----~------~----~------~--~---
>>>
>>>
>>
>> Mike,
>>
>> We had this discussion a long time ago. I don't remember what
>> eventually happened with it. Did you push the workqueue patch to the
>> kernel? What about the suspend-and-wake patch?
>>
>
> It looks like I posted it at Red Hat and never got a response, and I
> probably then forgot about it and never asked upstream. Will send mail
> upstream now.
>
> --


I encountered this problem ~6 months ago and found a workaround.
Now I have moved to new (and faster) HW, and I'm hitting this again and
again in scenarios with lots of I/O + killing the target machine.
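
For anyone who wants to see the spin without a kernel build, here is a
rough user-space model of the loop discussed above. It is only a sketch,
not kernel code: struct sim_worker, sim_schedule() and sim_worker_loop()
are names made up for this illustration, and the only behavior it assumes
is what was described in the thread, namely that a sleep in
TASK_INTERRUPTIBLE returns right away while a signal is pending, and that
the signal is only flushed when some work actually runs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for a workqueue thread (not kernel code). */
struct sim_worker {
    bool signal_pending;   /* models signal_pending(current) */
    size_t queued_work;    /* models the number of entries on cwq->worklist */
    unsigned long wakeups; /* times the simulated sleep returned immediately */
};

/*
 * Models schedule() for a task in TASK_INTERRUPTIBLE.
 * Returns true if the task would really go to sleep, false if a pending
 * signal makes the call return immediately.
 */
static bool sim_schedule(struct sim_worker *w)
{
    assert(w != NULL);
    if (w->signal_pending) {
        w->wakeups++;
        return false;
    }
    return true;
}

/*
 * Runs at most max_iters iterations of the simulated worker loop.
 * Returns the number of iterations completed before the worker went to
 * sleep, or max_iters if it never managed to sleep (the spin).
 */
static unsigned long sim_worker_loop(struct sim_worker *w, bool flush_at_top,
                                     unsigned long max_iters)
{
    unsigned long i;

    assert(w != NULL);
    for (i = 0; i < max_iters; i++) {
        /* The proposed fix: drop a stale signal before trying to sleep. */
        if (flush_at_top && w->signal_pending)
            w->signal_pending = false; /* models flush_signals(current) */

        if (w->queued_work == 0) {
            if (sim_schedule(w))
                return i;  /* really asleep, loop is quiet */
            continue;      /* woke at once, but there is nothing to run */
        }

        /* Run one work item; the iscsi xmit work flushes signals itself. */
        w->queued_work--;
        w->signal_pending = false;
    }
    return i;
}
```

With a pending signal and an empty work list, the unpatched variant
(flush_at_top = false) uses up every iteration it is given, while the
patched variant sleeps on the first pass. Queueing a single work item also
ends the spin in this model, which matches the observation that things
recover once the login pdu does a queue_work.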

Erez
