Dev, Vasu wrote:
>> -----Original Message-----
>> From: Mike Christie [mailto:[EMAIL PROTECTED]
>> Sent: Tuesday, August 12, 2008 12:45 PM
>> To: Dev, Vasu
>> Cc: [email protected]
>> Subject: Re: [Open-FCoE] [RFC PATCH 1/2] libfc: removed checks for
>> timer_pending and then updating exch ref count.
>>
>> Dev, Vasu wrote:
>>>>> exch_done would then need to be modified so that instead of
>>>>> fc_exch_release doing
>>>>>
>>>>>                  spin_lock_bh(&mp->em_lock);
>>>>>                  WARN_ON(mp->total_exches <= 0);
>>>>>                  mp->total_exches--;
>>>>>                  mp->exches[ep->xid - mp->min_xid] = NULL;
>>>>>                  list_del(&ep->ex_list);
>>>>>                  spin_unlock_bh(&mp->em_lock);
>>>>>
>>>>> exch_done would do that, then release its hold. And actually exch_done
>>>>> would need to check that if there were multiple ep->resp functions (one
>>>>> from fc_fcp_send_cmd and one from the abort)
>>>>> then we do not do the mp
>>>>> manipulation above until the exch_done has been called for the last
>>>>> ep->resp.
>>>>>
>>> I don't understand calling exch_done() twice from upper layer of
>> fc_exch.c
>>> and checking for multiple ep->resp functions to fix above described issue
>> Do you mean you do not understand how moving the clearing of the mp
>> exches ref removes the need for the hack to add the extra check refcount
>> check in the original patch?
> 
> I thought you were talking about issue of fcp exch abort and extra exch done
> issue and so called hack is different thing. The hack is for receiving any
> frame when we are about free exch, that can be from jammer or mal formed
> corrupt frame. So I think two issues are mixed here.

My suggestion for moving the mp clearing is for the same thing you 
describe above. It is to prevent the race where one process is trying to 
do a hold while the other is trying to do the final release.

With your patch, if there was a valid exchange that we are trying to 
free and we get invalid frames then your patch will allow the invalid 
frame to get a hold on the ep instead of accessing freed memory. Or if 
it gets the lock after the release function has cleared ep from the mp 
array then we drop it.

With my suggestion if there was a valid ecxhange that we are trying to 
free and we get invalid frames my patch will drop them if they get the 
mp lock after exch_done had got it and cleared the array. If they get 
the lock before exch_done does then it will get a ref instead of 
accessing free mem.



> 
> I gathered that you were talking about this issue with fcp and abort :-
> 
> 1. fcp issued the abort but abort response pending and no extra exch ref for 
> abort.
> 2. scsi response status comes back it is about to call exch done
> 3. Abort respone comes that cleanup exchange.
> 4. Now exch done from 2 will release the same exchange which is already 
> release by 3. So this will cause extra exch ref decrement.
> 
> So how so called hack is going to prevent this issue?

Wait the hack is the extra refcount check in the release function right? 
I am not saying that hack will prevent that.

The issue about getting a extra ref on the exch when we do an abort is 
becuase in my original idea I had fc_exch.c just pass everything up like 
is done today for normal processing, and we were calling ep->resp for 
the abts response too. So if we called a release for the normal 
processing (this is for the hold when it was created), then we would 
need a ref for the abort response so the ep is not freed.


> 
> You were talking about "multiple ep->resp functions" and one of them from
> abort but currently fc_exch.c doesn't call upper layer for abort respone.
> If you meant to add calling upper layer for abort response as

Yes, I did. Maybe that is where we were missing each other. I have been 
trying to fix the code so the pkt abort funciton gets a response and 
that is how I got into this mess :)

I was calling it for the normal IO and for the abort though, so fc_fcp 
would have had to sort everything out.

Sorry about that. I meant to send a email to you last night telling you 
not to worry about responding because I would try to send a patch to 
make it clear and fix issues I had found while working on it.


> I also mentioned in my point 2 below then I misunderstood your notes.
> 
>> Or do you not understand what I meant by how this will actually be
>> implemented?
>>
>>
>>> along with extra exch ref for abort in fc_exch.c. How are you planning to
>>> set these multiple exch resp functions and how that is going to change
>> We do not need multiple functions. In my patch I was telling you about
> 
> I guess that patch will level set us on solution since now this lengthy
> discussion is creating more confusion. I'll wait for that patch to
> understand your solution.
> 
>> the response handler for the function that started the exchange will be
>> able to handle normal completion of the IO and abort response, since it
>> is the same object being worked on (same fsp in fc_fcp.c's case). The
>> fc_fcp_recv and other completion paths in fc_fcp.c will be smart enough
>> to know that if it has sent an abort and has got status for a command
>> then it needs to wait for the abort response before calling exch_done.
>> And actually are we not supposed to call the upper layers if there is an
>> abort in progress in csae we get a recovery qualifier.
>>
> 
> The fc_exch.c does recovery qualifier processing but we should call
> fc_fcp.c for abort response to sync exch done issue describe above.
> I also suggested in point 2 below to add call to upper layer for
> abort response as I said before also.
>

I did not mean to call it at all :) My fault it was not clear. My 
question was that I thought fcp4 or fc-fs said that once we send an 
abort we should not process any other sequences, and we have to wait for 
the abort response to see if there was a recovery qual before we can 
process the normal IO. So I just mean that for those normal IO sequences 
are we supposed to be calling the ep->resp function when a abts is in 
flight?


snip other crap that was a result of me misinterpreting your mail. For 
some reason, when you did not want the abort to grab a ref I thought you 
were going to just add some code in fc_exch to do what I think both of 
us were saying fc_fcp.c was going to do.
_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel

Reply via email to