On 08/05/2009 11:26 AM, Mike Christie wrote:
> On 08/05/2009 11:01 AM, Erez Zilber wrote:
>> On Wed, Aug 5, 2009 at 6:19 PM, Mike Christie<micha...@cs.wisc.edu> wrote:
>>> On 08/04/2009 01:12 PM, Erez Zilber wrote:
>>>> On Tue, Aug 4, 2009 at 8:17 PM, Mike Christie<micha...@cs.wisc.edu> wrote:
>>>>> Erez Zilber wrote:
>>>>>> I'm running with open-iscsi.git HEAD + the check suspend bit patch +
>>>>>> the wake xmit on error patch. If I disconnect the cable on the
>>>>>> initiator side (even while not running IO), I see that after sending
>>>>>> the signal, the iscsi_q_XX thread reaches 100% CPU. I ran it over
>>>>>> several 1GB/10GB drivers and got the same results.
>>>>>> If I remove the wake xmit on error patch, I don't see this behavior.
>>>>> Shoot, I have been running the xmit wakeup and suspend bit patch here
>>>>> fine. Let me do some more testing.
>>>>> Is this something you always hit? Could you send me the final patch you
>>>>> ended up using?
>>>> I see this every time. Note that I'm not running with
>>>> linux-2.6-iscsi.git. I'm using the open-iscsi.git tree + the 2 patches
>>>> that I took without any change (using git-show) from the
>>>> linux-2.6-iscsi.git tree. Which tree did you test it on?
>>>> I added some printks to the code and saw that the signal does get sent
>>>> from iscsi_sw_tcp_conn_stop, but I never saw the (rc == -EINTR || rc
>>>> == -EAGAIN) condition hit in iscsi_sw_tcp_xmit(), even when I ran IO on
>>>> that session.
>>> Does r in iscsi_sw_tcp_xmit_segment == 0?
>> No, it is never zero.
>>> If not, I think you need a different patch. In one of the patch versions
>>> iscsi_sw_tcp_xmit_segment could return -ENODATA (this is when I had a
>>> check for suspend_tx in there). iscsi_sw_tcp_xmit did not check this and
>>> so I think we can loop.
>>> Could you try the attached patch. It was made over open-iscsi.git for
>>> you. I dropped the suspend bit check in iscsi_sw_tcp_xmit_segment,
>>> because it is not needed. If we end up blocking the signal will wake us.
>> I ran it and got the same 100% cpu usage. Did you try to run it on
>> your machines with open-iscsi.git? Did you see a different behavior?
> I just ran it. Maybe I am looking for the wrong thing though.
> For your problem, when the signal is sent does the recovery go ok and we
> end up reconnecting? But the problem is just that the xmit thread takes
> up 100% of the cpu?

Ignore this. I see the problem now. I was thinking you did not 
reconnect. I see the cpu usage. Let me do some digging.
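
For anyone following along, here is a rough user-space sketch of the
failure mode suspected earlier in the thread: a transmit helper returns an
error code that the caller does not treat as a reason to stop, so the
caller keeps retrying and burns CPU. This is not the kernel code and it is
not confirmed as the cause of the 100% CPU usage reported here. All names
below (fake_xmit_segment, xmit_unchecked, xmit_checked, max_calls) are
hypothetical, and the retry cap exists only so the demo terminates.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a segment transmit helper. It always fails
 * with -ENODATA, as if transmission had been suspended. */
static int fake_xmit_segment(void)
{
    return -ENODATA;
}

/* Caller that only stops on -EINTR or -EAGAIN. Any other negative return
 * is ignored, so the loop keeps retrying. Returns the number of calls
 * made; max_calls caps the loop so this demo cannot spin forever. */
static int xmit_unchecked(int max_calls)
{
    int calls = 0;

    assert(max_calls > 0);
    while (calls < max_calls) {
        int rc = fake_xmit_segment();

        calls++;
        if (rc == -EINTR || rc == -EAGAIN)
            break;      /* the only errors this caller knows about */
        if (rc > 0)
            break;      /* made progress, treat the segment as sent */
    }
    return calls;
}

/* Caller that bails out on any negative return, including -ENODATA. */
static int xmit_checked(int max_calls)
{
    int calls = 0;

    assert(max_calls > 0);
    while (calls < max_calls) {
        int rc = fake_xmit_segment();

        calls++;
        if (rc < 0)
            break;      /* stop on every error, not just two of them */
        if (rc > 0)
            break;
    }
    return calls;
}
```

With a cap of 1000, xmit_unchecked() uses every one of its 1000 attempts,
while xmit_checked() gives up after the first failed call. In a real xmit
thread there is no cap, which is how an unhandled return code could show
up as a thread pinned at 100% CPU.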
