On Wed, 2008-09-10 at 12:41 -0500, Mike Christie wrote: > Robert Love wrote: > > On Wed, 2008-09-10 at 10:03 -0500, Mike Christie wrote: > >> Robert Love wrote: > >>> On Tue, 2008-09-09 at 10:22 -0700, Robert Love wrote: > >>>> The following series implements a change to the rport state machine and > >>>> applies a locking strategy to that state machine. > >>>> > >>>> 1) A callback function is added to the LP for RP events (READY, FAILED, > >>>> LOGO) > >>>> 2) GNN_ID and GPN_ID are removed, rport state machine starts at PLOGI > >>>> - uses a dummy rport until PLOGI response > >>>> 3) A READY state is added to the RP state machine > >>>> 4) The RP state machine locking has been given a locking strategy > >>>> 5) RP only has one error/retry handling routine > >>>> 6) Naming conventions are changed > >>>> rport (remote port) = fc_rport pointer > >>>> rdata (remote port private) = fc_rport_libfc_priv pointer > >>>> > >>>> I think these patches still need a little cleaning up, but they're > >>>> mostly good. > >>>> I'm curious what everyone thinks about the folowing changes to: > >>>> > >>>> a) the locking policy > >>>> b) the event callback mechanism > >>>> c) the consolidation of fc_rport_error and fc_rport_retry. > >>>> > >>>> All comments are welcome. If there are no major issues I'll do a little > >>>> cleanup, > >>>> drop the RFC, resubmit and then commit. > >>> I have three problems with this patch-set right now. > >>> > >>> 1. Need to delete dummy rports when there is a timeout scenario since > >>> they're not tracked by the FC transport class. > >>> > >>> 2. Should delay the retry a bit if there is a resource allocation error. > >>> > >>> 3. Can't hold locks while calling some functions. > >>> > >>> It's this last one that's going to be difficult. My problem is that I > >>> want to call fc_remote_port_add/delete while I'm holding the rport lock. > >>> Those functions expect to be called without a lock held. Also, I haven't > >>> finished the lport locking, but I expect that the event_callback > >>> function will want to lock the lport so that it can change the lport's > >>> state. > >>> > >>> In all of these cases I want to call a function while I'm holding the > >>> rport lock and those functions expect to be called without the lock > >>> held. Right now my patches drop the lock momentarily before calling one > >>> of these functions and then re-grab the lock when the function returns. > >>> This is not ideal. If I keep it this way I'd need to re-check the state > >>> once I re-grab the lock and it'll get ugly. > >>> > >> For the fc class rport issues the problem is that those functions can > >> use GFP_KERNEL or flush a work queue? Do we have to use spin locks for > > > > I believe so. The fc_remote_port_add() function uses GFP_KERNEL. > > > >> the rport and lrport locking? Why not use a mutex or semaphore, since we > >> have process context and it is not performance critical? > > > > That's a good idea. I changed my patches to use a semaphore and it's > > working just fine. However, I don't understand the contexts that we > > might be called in from SCSI. Does SCSI always call us from process > > context? For recv and sysfs we should always be in process context, but > > SCSI I'm not sure... > > > > Did you change the rport and lport or just the rport locking? > Well, currently I just changed the rport code. I'm trying to get through that fist and then I'll do the lport stuff (reorg and locking). You're right that it's all tied together though.
> For normal IO paths we can be called from a softirq or thread. This path > never takes the rport or lport lock though. > Excellent, I was just about to go through those paths to make sure. > The scsi eh paths (abort, lun reset, host reset, etc) will always be > called from a thread. host reset will call lport_reset. I think in the Yeah, I just saw this too. > code with your patches that code might be missing a lock. the I'm sure that it is. :) I'll get to it. I've got a start on the lport stuff, but it still needs a fair amount of work. > tt->lport_reset is set to fc_lport_enter_reset and I thought you want to > hold the lport lock. > > If you are changing the rport and lrport locking then don't forget about > checking out the lrport code. I think you want to change that to use a > workqueue like how the rport code is, because fc_rport_timeout is run > from a work queue can be use a semaphore, but fc_lport_timeout is run > from a timer (softirq/bottom-half) context so that cannot sleep. > Good point. I'll make this change... > I think actually my comment about having process context is wrong for > the fc_exch.c code paths too. fc_exch_timeout would need to be converted > to use the workqueue like the rport code too. fc_exch_timeout is run > from a timer and can call into lport/rport/ns code that takes a lock. I'll look into this too. Thanks for the comments. _______________________________________________ devel mailing list [email protected] http://www.open-fcoe.org/mailman/listinfo/devel
