On 08/05/2009 11:26 AM, Mike Christie wrote:
> On 08/05/2009 11:01 AM, Erez Zilber wrote:
>> On Wed, Aug 5, 2009 at 6:19 PM, Mike Christie<micha...@cs.wisc.edu> wrote:
>>> On 08/04/2009 01:12 PM, Erez Zilber wrote:
>>>> On Tue, Aug 4, 2009 at 8:17 PM, Mike Christie<micha...@cs.wisc.edu>
>>>> wrote:
>>>>> Erez Zilber wrote:
>>>>>> I'm running with open-iscsi.git HEAD + the check suspend bit patch +
>>>>>> the wake xmit on error patch. If I disconnect the cable on the
>>>>>> initiator side (even while not running IO), I see that after sending
>>>>>> the signal, the iscsi_q_XX thread reaches 100% cpu. I ran it over
>>>>>> several 1GB/ 10 GB drivers and got the same results.
>>>>>>
>>>>>> If I remove the wake xmit on error patch, I don't see this behavior.
>>>>>>
>>>>> Shoot, I have been running the xmit wakeup and suspend bit patch here
>>>>> fine. Let me do some more testing.
>>>>>
>>>>> Is this something you always hit? Could you send me the final patch you
>>>>> ended up using?
>>>> I see this every time. Note that I'm not running with
>>>> linux-2.6-iscsi.git. I'm using the open-iscsi.git tree + the 2 patches
>>>> that I took without any change (using git-show) from the
>>>> linux-2.6-iscsi.git tree. Which tree did you test it on?
>>>>
>>>> I added some printks to the code and saw that the signal does get sent
>>>> from iscsi_sw_tcp_conn_stop, but I didn't see that (rc == -EINTR || rc
>>>> == -EAGAIN) in iscsi_sw_tcp_xmit (), even when I ran IO on that
>>>> session.
>>>>
>>> Does r in iscsi_sw_tcp_xmit_segment == 0?
>>>
>> No, it is never zero.
>>
>>> If not I think you need a different patch. In one of the patch versions
>>> iscsi_sw_tcp_xmit_segment could return -ENODATA (this is when I had a
>>> check for suspend_tx in there). iscsi_sw_tcp_xmit did not check this and
>>> so I think we can loop.
>>>
>>> Could you try the attached patch. It was made over open-iscsi.git for
>>> you. I dropped the suspend bit check in iscsi_sw_tcp_xmit_segment,
>>> because it is not needed. If we end up blocking the signal will wake us.
>> I ran it and got the same 100% cpu usage. Did you try to run it on
>> your machines with open-iscsi.git? Did you see a different behavior?
>>
>
> I just ran it. Maybe I am looking for the wrong thing though.
>
> For your problem, when the signal is sent does the recovery go ok and we
> end up reconnecting? But the problem is just that the xmit thread takes
> up 100% of the cpu?
>
Ignore this. I see the problem now. I was thinking you did not reconnect. I see the cpu usage. Let me do some digging.