Re: [Open-FCoE] libfc: RFC: rport / discovery fixes

Joe Eykholt Mon, 01 Jun 2009 14:28:47 -0700

Robert Love wrote:
> On Mon, 2009-06-01 at 11:10 -0700, Joe Eykholt wrote:
>> Hi all,
>>
>> In last week's discussion I said I was working on using the lport mutex
>> instead of the disc mutex to serialize discovery, and having the rport
>> list be under the lport mutex, and maintained by the rport module.
>>
>> I'm backing away from those aspects of my changes.  Partly because of
>> objections, and partly because I had trouble making it work.
>>
>> It turns out discovery needs its own locks in order to cancel its work
>> queue items while the lport lock is held, e.g., from lport_enter_reset().
>> Having the rports list under the discovery lock is convenient as well.
>>
>> Now for a new snag.  Rogue rports.   These raise a problem when
>> we're converting from a rogue to a real rport and an ELS such as LOGO
>> arrives.  Since we need to drop the rport lock while doing 
>> fc_remote_port_add(),
>> the LOGO finds the rogue rport, and re-queues work to delete it.  We end
>> up deleting it twice, once in the CREATE event, and once in the LOGO event.
>>
>> This usually didn't happen before because logout would flush any outstanding
>> work, but that required logo to drop the lock, during which the rport state
>> could change, which seems totally unacceptable to me.  We shouldn't drop
>> the lock until we're done with the rport in any of the ELS handlers.
>>
> I agree, dropping the lock there is really bad.
> 
>> Back to the work handler problem:  I suppose this could be fixed up
>> after doing the fc_remote_port_add() by re-evaluating the rogue's state,
>> but I'm not sure that would completely work.  It might also be more
> 
> Yeah, seems a bit hacky still.
> 
>> easily fixed if we could do the fc_remote_port_add() while holding the
>> rport mutex, but that's not currently allowed.  It might work anyway
>> if the add were done with roles set to UNKNOWN and then changed later
>> after dropping the mutex.
>>
> This doesn't sound too good either.
> 
>> Things would be a lot simpler if we went back to having the
>> rdata structure separately allocated from the fc_rport, and with a life of
>> its own, and perhaps its own kref to keep it around while in use.
>> That would get rid of the need for rogues, and we would just have a
>> pointer to the transport fc_rport, and it would have a pointer back to our 
>> rdata.
>>
> This is basically the opposite approach to making rogues a part of the
> transport. It's allowing libfc to manage rports on its own and then
> showing them to the OS when they're *ready*. I think this is how the
> other LLDs do it.
> 
> I don't have a problem with this either. I think that we made too many
> assumptions about what the transport provided, regarding rports, when we
> started using the fc_rport structure for everything.
> 
> I guess there's the possibility that we could end up with both changes
> as well although with your approach the transport changes may not be
> necessary.
> 
>> This also solves another problem.  In rport_lookup(), we hold the disc_mutex
>> to protect the list, and do a get_device() on the rport to keep it from going
>> away, then drop the disc_mutex.  However, the rport lock isn't held, so the
>> rport could be about to be deleted.  I have rport_lookup() checking the 
>> state,
>> but without grabbing the lock.  The caller of lookup (e.g., ELS receive 
>> handling)
>> must grab the rport mutex, check the state, and if it's NONE (I'm renaming 
>> that
>> DELETED), then drop the mutex and repeat the lookup.   That will be
>> unnecessary if we have our own rdata structure separately allocated.
>>
>> It would change many of our interfaces to use our struct instead of fc_rport.
>> Does this sound OK?
>>
> Fine with me. Just keep in mind that there are EM changes and NPIV that
> you're likely to conflict with by making a lot of changes.
> 
>> One suggestion has been to make scsi_transport_fc's rport allow these other
>> states as well, allowing them to be created earlier and change identity
>> (WWPN, WWNN) after being created.  Any changes along these lines would seem
>> to involve the other drivers that use them, however, and I'm not prepared to
>> test those.  It seems like way too much of a change.  Maybe that's a longer
>> term project someone else would be willing to do.
>>
> I should have read the whole mail before commenting earlier. :)
> 
>> Along with this, I'd like to rename struct fc_rport_libfc_priv to something 
>> else.
>> Any suggestions?  How about fc_peer?  or fc_pport?  I want something short 
>> like that,
>> not specifically a target or initiator, not necessarily an N port or
>> a discovered port, maybe just a port?
>>
> Of the ones you list I think I like fc_pport best, although it's a bit
> strange. fc_peer doesn't mean that much to me since it's missing "port"
> and fc_port would falsely indicate some type of inheritance (that
> fc_lport should be a child of fc_port) or something (to me at least).


I have the same misgivings, and after starting to code this up, it seems
really strange to have a function called fc_rport_login that takes a pport
as an arg, ... that would make us want to call it fc_pport_login, and then
to rename fc_rport.c to fc_pport.c ...etc. ... it get's too messy.

Also, despite the name, fnic is accessing fields in fc_rport_libfc_priv,
to find out whether to do retry, etc., on offloaded ops.

So, to keep things sane, I'd like to keep the name the same, or maybe
just drop the libfc portion. (fc_rport_priv).  Less code changes.

        Joe


_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel

Re: [Open-FCoE] libfc: RFC: rport / discovery fixes

Reply via email to