Re: [Open-FCoE] libfc: RFC: rport / discovery fixes

Robert Love Mon, 01 Jun 2009 11:47:17 -0700

On Mon, 2009-06-01 at 11:10 -0700, Joe Eykholt wrote:
> Hi all,
> 
> In last week's discussion I said I was working on using the lport mutex
> instead of the disc mutex to serialize discovery, and having the rport
> list be under the lport mutex, and maintained by the rport module.
> 
> I'm backing away from those aspects of my changes.  Partly because of
> objections, and partly because I had trouble making it work.
> 
> It turns out discovery needs its own locks in order to cancel its work
> queue items while the lport lock is held, e.g., from lport_enter_reset().
> Having the rports list under the discovery lock is convenient as well.
> 
> Now for a new snag.  Rogue rports.   These raise a problem when
> we're converting from a rogue to a real rport and an ELS such as LOGO
> arrives.  Since we need to drop the rport lock while doing 
> fc_remote_port_add(),
> the LOGO finds the rogue rport, and re-queues work to delete it.  We end
> up deleting it twice, once in the CREATE event, and once in the LOGO event.
> 
> This usually didn't happen before because logout would flush any outstanding
> work, but that required logo to drop the lock, during which the rport state
> could change, which seems totally unacceptable to me.  We shouldn't drop
> the lock until we're done with the rport in any of the ELS handlers.
> 
I agree, dropping the lock there is really bad.


> Back to the work handler problem:  I suppose this could be fixed up
> after doing the fc_remote_port_add() by re-evaluating the rogue's state,
> but I'm not sure that would completely work.  It might also be more

Yeah, seems a bit hacky still.

> easily fixed if we could do the fc_remote_port_add() while holding the
> rport mutex, but that's not currently allowed.  It might work anyway
> if the add were done with roles set to UNKNOWN and then changed later
> after dropping the mutex.
> 
This doesn't sound too good either.

> Things would be a lot simpler if we went back to having the
> rdata structure separately allocated from the fc_rport, and with a life of
> its own, and perhaps its own kref to keep it around while in use.
> That would get rid of the need for rogues, and we would just have a
> pointer to the transport fc_rport, and it would have a pointer back to our 
> rdata.
> 
This is basically the opposite approach to making rogues a part of the
transport. It's allowing libfc to manage rports on its own and then
showing them to the OS when they're *ready*. I think this is how the
other LLDs do it.

I don't have a problem with this either. I think that we made too many
assumptions about what the transport provided, regarding rports, when we
started using the fc_rport structure for everything.

I guess there's the possibility that we could end up with both changes
as well although with your approach the transport changes may not be
necessary.

> This also solves another problem.  In rport_lookup(), we hold the disc_mutex
> to protect the list, and do a get_device() on the rport to keep it from going
> away, then drop the disc_mutex.  However, the rport lock isn't held, so the
> rport could be about to be deleted.  I have rport_lookup() checking the state,
> but without grabbing the lock.  The caller of lookup (e.g., ELS receive 
> handling)
> must grab the rport mutex, check the state, and if it's NONE (I'm renaming 
> that
> DELETED), then drop the mutex and repeat the lookup.   That will be
> unnecessary if we have our own rdata structure separately allocated.
> 
> It would change many of our interfaces to use our struct instead of fc_rport.
> Does this sound OK?
> 
Fine with me. Just keep in mind that there are EM changes and NPIV that
you're likely to conflict with by making a lot of changes.

> One suggestion has been to make scsi_transport_fc's rport allow these other
> states as well, allowing them to be created earlier and change identity
> (WWPN, WWNN) after being created.  Any changes along these lines would seem
> to involve the other drivers that use them, however, and I'm not prepared to
> test those.  It seems like way too much of a change.  Maybe that's a longer
> term project someone else would be willing to do.
> 
I should have read the whole mail before commenting earlier. :)

> Along with this, I'd like to rename struct fc_rport_libfc_priv to something 
> else.
> Any suggestions?  How about fc_peer?  or fc_pport?  I want something short 
> like that,
> not specifically a target or initiator, not necessarily an N port or
> a discovered port, maybe just a port?
> 
Of the ones you list I think I like fc_pport best, although it's a bit
strange. fc_peer doesn't mean that much to me since it's missing "port"
and fc_port would falsely indicate some type of inheritance (that
fc_lport should be a child of fc_port) or something (to me at least).

_______________________________________________
devel mailing list
[email protected]
http://www.open-fcoe.org/mailman/listinfo/devel

Re: [Open-FCoE] libfc: RFC: rport / discovery fixes

Reply via email to