On 08/03/2009 12:37 PM, Erez Zilber wrote:
> On Mon, Aug 3, 2009 at 7:27 PM, Mike Christie <[email protected]> wrote:
>> On 08/03/2009 02:31 AM, Erez Zilber wrote:
>>> On Sat, Aug 1, 2009 at 6:31 AM, Mike Christie <[email protected]>
>>> wrote:
>>>> Mike Christie wrote:
>>>>> Mike Christie wrote:
>>>>>> Mike Christie wrote:
>>>>>>> Mike Christie wrote:
>>>>>>>> On 07/31/2009 07:43 AM, Erez Zilber wrote:
>>>>>>>>> I thought that this patch just reduces the timeout from 15 to 3. Does
>>>>>>>>> it also fix the 3 sndtmo periods or is it another patch? What was the
>>>>>>>> Another patch. You should have it.
>>>>>>>>
>>>>>>>>> bug that caused the 3 sndtmo periods waiting?
>>>>>>>>>
>>>>>>>> I think I meant two.
>>>>>>>>
>>>>>>>> If iscsi_tcp_xmit_segment sent some data, then got EAGAIN, it would drop
>>>>>>>> the EAGAIN. iscsi_xmit would then retry the operation so we would wait
>>>>>>>> again.
>>>>>>>>
>>>>>>> There is one more bug.
>>>>>>>
>>>>>>> Could you try the 15->3 sec snd tmo patch plus the attached patch?
>>>>>>>
>>>>>>> Another problem is that if we had multiple tasks on the cmd or requeue
>>>>>>> lists, and iscsi_tcp returns an error, the write_space function can still
>>>>>>> run and queue iscsi_data_xmit. If it was a legitimate problem and
>>>>>>> iscsi_conn_failure was run but we raced and iscsi_data_xmit was run
>>>>>>> first it could miss the suspend bit checks, and start trying to send
>>>>>>> data again and hit another timeout.
>>>>>>>
>>>>>> Here is an updated patch that also fixes the problem for cxgb3i and
>>>>>> iscsi_tcp.
>>>>>>
>>>>> Sorry. This one fixes a possible leak.
>>>>>
>>>> And here is a patch to signal the xmit thread. I am not sure what I was
>>>> doing wrong before. For some reason the thread would not break out of
>>>> the wait. With this patch it is working. It is built over the
>>>> check-suspend2.patch.
>>>>
>>> I've updated my tree to the open-iscsi.git head (commit
>>> f10c7942ad0dd26388eed0b46c44bad429fce0ad) and applied the following 3
>>> patches on top of it:
>>> 1. iscsi_tcp-reduce-sk-sndtmo.patch
>>> 2. check-suspend2
>>> 3. wake-xmit-on-err - this one breaks because I applied
>>> iscsi_tcp-reduce-sk-sndtmo.patch. I had to change the following hunk:
>>>
>>> @@ -304,7 +304,7 @@ static int iscsi_sw_tcp_xmit(struct iscsi_conn *conn)
>>>           * is getting stopped. libiscsi will know so propogate err
>>>           * for it to do the right thing.
>>>           */
>>> -        if (rc == -EAGAIN)
>>> +        if (rc == -EAGAIN || rc == -EINTR || rc == -ENODATA)
>>>                  return rc;
>>>          else if (rc < 0) {
>>>                  rc = ISCSI_ERR_XMIT_FAILED;
>>>
>>> to:
>>>
>>> diff --git a/kernel/iscsi_tcp.c b/kernel/iscsi_tcp.c
>>> index af02499..65492e4 100644
>>> --- a/kernel/iscsi_tcp.c
>>> +++ b/kernel/iscsi_tcp.c
>>> @@ -283,7 +283,7 @@ static int iscsi_sw_tcp_xmit(struct iscsi_conn *conn)
>>>           * is getting stopped. libiscsi will know so propogate err
>>>           * for it to do the right thing.
>>>           */
>>> -        if (rc == -EAGAIN)
>>> +        if (rc == -EAGAIN || rc == -EINTR || rc == -ENODATA)
>>>                  return -ENOBUFS;
>>>          else if (rc < 0) {
>>>                  rc = ISCSI_ERR_XMIT_FAILED;
>>>
>>> I'm not sure if this is what you meant. Anyway, I saw on another
>> It was.
>>
>>> thread that you plan to modify iscsi_tcp-reduce-sk-sndtmo so it will
>>> depend on the transport class timeout.
>> We actually do not need to do that when we have wake-xmit-on-err,
>> because that patch should wake the xmit thread right away and we do not
>> have to worry about waiting in sendpage/sendmsg.
>>
>
> So, the sk sndtmo patch is not required anymore, right?
>
Yeah. I do not think it is needed for this problem.
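
For anyone following along, here is a rough user-space model of the dropped-EAGAIN problem described in the quoted text above: a segment send makes partial progress, hits EAGAIN, hides it, and the caller retries and waits a second time. This is only a sketch, not the kernel code. `fake_sock`, `fake_send`, `xmit_segment`, `xmit` and `count_timeouts` are all made-up names, and a "timeout" here is just a counter standing in for one sndtmo wait.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a socket whose send buffer fills up. */
struct fake_sock {
	size_t room;	/* bytes accepted before the buffer is full */
	int timeouts;	/* number of sends that "waited" a full sndtmo */
};

/* Accept up to sk->room bytes; once full, count one wait and fail. */
static int fake_send(struct fake_sock *sk, size_t len)
{
	size_t n;

	if (sk->room == 0) {
		sk->timeouts++;
		return -EAGAIN;
	}
	n = len < sk->room ? len : sk->room;
	sk->room -= n;
	return (int)n;
}

/* Send one segment. Returns bytes sent or a negative errno. */
static int xmit_segment(struct fake_sock *sk, size_t len, int propagate_err)
{
	size_t copied = 0;

	assert(len > 0);
	while (copied < len) {
		int rc = fake_send(sk, len - copied);

		if (rc < 0) {
			/* Modeled old behaviour: partial progress hides the error. */
			if (copied > 0 && !propagate_err)
				return (int)copied;
			return rc;
		}
		copied += (size_t)rc;
	}
	return (int)copied;
}

/* Caller keeps retrying while the segment code reports progress. */
static int xmit(struct fake_sock *sk, size_t len, int propagate_err)
{
	size_t done = 0;

	while (done < len) {
		int rc = xmit_segment(sk, len - done, propagate_err);

		if (rc < 0)
			break;
		done += (size_t)rc;
	}
	return sk->timeouts;
}

/* Returns how many sndtmo waits a 4096-byte send costs in this model. */
int count_timeouts(int propagate_err)
{
	struct fake_sock sk = { .room = 100, .timeouts = 0 };

	return xmit(&sk, 4096, propagate_err);
}
```

In this model `count_timeouts(0)` (error dropped) gives 2 waits and `count_timeouts(1)` (error propagated) gives 1, which matches the "I think I meant two" remark in the quoted text.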
