Re: [Open-FCoE] ELS time-outs - increase to 20 seconds?

James Smart Fri, 16 Oct 2009 07:20:19 -0700

Joe Eykholt wrote:
> James Smart wrote:
>> FC-LS-2 (r2.11 section 4.2.2 item e) requires ELS timeouts to be 2*R_A_TOV
> 
> Thanks, I thought I saw that somewhere!
> 
> It says the originator "shall detect an Exchange error ... if the Reply
> Sequence is not received within a timeout interval of 2 X R_A_TOV."
> 
> I read that as saying the maximum timeout is 2 * R_A_TOV.  It doesn't
> say we can't detect an error earlier, though that is probably the intent.

Sure - all depends on what kind of interoperability you want.  The more on the 
edge you are, the more you (or the admin really) must tightly control the 
environment.

Just keep in mind - designs see things like this, and design to it - e.g. they 
may still be initializing, will receive it, and know they have a window in 
which they don't have to respond immediately. Parallel SCSI had a spec rule 
that said you must accept a command within something like 100ms after coming 
alive on the bus, but didn't have to send a response to it for up to 10s. You 
would be surprised how much this practice continues.  It's also what I'm 
guessing the IVR tried to take advantage of here.

(Granted, 20s seems huge in today's network times...)

> 
> That would apply to FLOGI as well, but I'd like to retry FLOGIs
> faster, partially because of the auto-mode in FIP.  I don't think that
> causes a problem.
>
> The other alternative is for FIP not to allow libfc to start fabric
> logins until it has either selected an FCF or decided it's time to
> try a non-FIP FLOGI.   That would make increasing the libfc FLOGI
> timeout irrelevant, but it's simpler just to timeout FLOGIs faster.

Yes - it would apply to all ELSs.  But you're right on FLOGIs. They're the one 
exception to cut corners on.  As it's only between you and the switch, there's 
not a full round trip delay in this path, and if the link is live, the switch 
should be alive too. You should allow some "bridging delay" to the F_Port, but 
this doesn't exist with the current fcoe connections. Your 2 seconds is 
probably good (responses should take milliseconds).

-- james

> 
>       Joe
> 
> 
>> Joe Eykholt wrote:
>>> Hi all,
>>>
>>> We noticed a problem in a complex fabric with IVR - inter-VSAN routing
>>> with certain targets.
>>>
>>> IVR provides a way for a shared device such as a tape drive to appear
>>> in multiple VSANs by a sort of proxying/NAT setup.  It's available on
>>> some MDS switches.
>>>
>>> What happens is that the switches can delay the PLOGI to the target
>>> for longer than the current E_D_TOV (2-second) timeout.  When libfc
>>> aborts the PLOGI, the abort causes the target to send back a LOGO.  This may
>>> not cause a problem, when we retry, but it would be nicer if we didn't
>>> retry so quickly.
>>>
>>> Other HBAs use a timeout of 2 * R_A_TOV = 20 seconds, I'm told.
>>>
>>> Does anyone see a problem with changing libfc accordingly?
>>>
>>> This shouldn't cause a problem because PLOGI retries should be rare,
>>> but if they do occur, the change would delay target setup by a lot.
>>>
>>> I would keep the FLOGI retry period at 2 seconds, however.
>>>
>>>     Thanks,
>>>     Joe
>>> _______________________________________________
>>> devel mailing list
>>> [email protected]
>>> http://www.open-fcoe.org/mailman/listinfo/devel
>>>
>>>   
> 
> 
