On 02/25/2010 07:20 AM, Or Gerlitz wrote:
Mike Christie wrote:
The nop watchdog is more of a generic catch-all for problems. I think I was
saying offlist that iscsi_tcp is not able to detect many transport
problems quickly. For example if you just pulled a cable somewhere in
the network, iscsi_tcp may not get some event telling us this. If nops
are off, then we would have to wait for the scsi cmd timer to fire. To
speed this type of failure up, we do the nops.


If for iser you can figure out every transport problem with some sort of
event notification system then you could turn off nops.

understood, both. Here's the thing: nops serve us well, so I don't want to 
disable them. We do get event notifications and we can't ignore them.

As it stands, the problem I face now is that on one of the error flows, e.g. if 
you pull the cable and then ep_disconnect is called because of a nop-out timeout, 
it takes the IB CM (Connection Manager) about 1 minute under the params used by 
iser to invoke the iser callback, which in turn signals ep_disconnect.

iser doesn't call the IB CM directly and currently has no control over 
the timeout setting of the IB CM. Generally speaking, this 1 minute value is 
good for other possible problems, so at this point I don't think I want to 
change that.

Suppose I have a way to make ep_disconnect not block for more than a few 
milliseconds, e.g. by adding a polling api, the same as or similar to ep_poll as you 
suggested, OR by changing iser to let ep_disconnect return the context to iscsid 
so that only later, after (say) a minute, 
iscsi_destroy_endpoint is called. Am I back in business?


I think we should forget the polling business. My idea sucked. I was only thinking about fixing the issue with blocking all of iscsid. I was not thinking of speeding up failover like you are. I think I like where you are going with the refcounting.



If the answer is YES, keep reading below, but before everything that comes next, let's 
make sure to clarify this!

--------------------------------------------

Let me know if you need something like an ep_disconnect_poll() where the
ep_disconnect() starts the disconnect, then the new ep_disconnect_poll
would wait for it to complete.

the poll option is simple and maybe even trivial to implement from the iser point 
of view, but it needs changes in user space and in the kernel iscsi transport code, 
quite heavy in volume I think.

I started to look at the 2nd option and it gets quite complicated. I don't want 
to load you with the iser details, but I'd be happy to check with you that it's 
a valid option. Valid in the sense that it's legal for ep_disconnect to return 
with the ep not being destroyed.

I looked at commit b40977d95fb3a1898ace6a7d97e4ed1a33a440a4 "iser: fix handling of 
scsi cmnds during recovery" and saw the refcounting you added to the iser end point 
(struct iser_conn) handling flow. Currently it's incremented in one place (bind) and 
decremented in any of three places: iscsi_iser_conn_{stop, destroy} and 
iser_conn_terminate, which is called from iser ep disconnect.

iser_conn_terminate blocks on the event of the IB connection being down, and this 
sometimes takes these 60 seconds. If it is legal to really clean up the ep after ep 
disconnect returns, I can try to use the refcount that you added and 
increment/decrement it two more times, one for the IB CM and one for the QP buffers, such 
that only when the CM has delivered the event && all the buffers were flushed by 
the HW will iser_conn_put sense the refcount hitting zero and call 
iser_conn_release, which will call iscsi_destroy_endpoint. Not that I have a robust 
implementation for this, nor is it free of possible problems... Which of 
the two do you think is the way to go, or should I try a 3rd method?


The refcounting method sounds good. If iser has cleaned up what gets set in iscsi_iser_conn_bind once its ep_disconnect has completed, then you should be ok. So iser_conn->ib_conn has to be set to NULL so that later, when iscsi_iser_conn_bind is called for the new conn, it can be set again. And we will have to watch out for an rmmod while there are ib_conns left to completely destroy.
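
For readers following the thread, the refcounting idea can be sketched roughly as below. This is only a user-space illustration, not the iser driver code: every name in it (sketch_conn, sketch_conn_get, sketch_conn_put, sketch_ep_disconnect, sketch_demo) is hypothetical, the counter is a plain int where kernel code would need an atomic one, and the "released" flag merely stands in for iser_conn_release calling iscsi_destroy_endpoint.

```c
/* User-space sketch of the proposed teardown refcounting.
 * All identifiers are hypothetical; none are taken from the iser sources. */
#include <assert.h>
#include <stdbool.h>

struct sketch_conn {
    int  refcount;  /* one reference per holder: bind, IB CM, QP buffers */
    bool bound;     /* stands in for the state set up at bind time */
    bool released;  /* stands in for the final release having run */
};

static void sketch_conn_init(struct sketch_conn *c)
{
    c->refcount = 0;
    c->bound = false;
    c->released = false;
}

static void sketch_conn_get(struct sketch_conn *c)
{
    assert(!c->released);  /* never take a reference on a released conn */
    c->refcount++;
}

/* Drop one reference. The final put performs the release.
 * Returns true if this call released the connection. */
static bool sketch_conn_put(struct sketch_conn *c)
{
    assert(c->refcount > 0);
    if (--c->refcount == 0) {
        c->released = true;  /* point where the endpoint would be destroyed */
        return true;
    }
    return false;
}

/* Non-blocking disconnect: clear the bind state and drop the bind
 * reference, without waiting for the CM event or the buffer flush. */
static void sketch_ep_disconnect(struct sketch_conn *c)
{
    c->bound = false;
    sketch_conn_put(c);
}

/* Walks one teardown. Returns true only if the release happened on the
 * last put and not earlier. */
static bool sketch_demo(void)
{
    struct sketch_conn c;

    sketch_conn_init(&c);
    sketch_conn_get(&c);            /* reference taken at bind */
    c.bound = true;
    sketch_conn_get(&c);            /* extra reference for the IB CM */
    sketch_conn_get(&c);            /* extra reference for the QP buffers */

    sketch_ep_disconnect(&c);       /* returns at once, two refs remain */
    if (c.released || c.bound)
        return false;
    if (sketch_conn_put(&c))        /* CM delivered its event */
        return false;
    return sketch_conn_put(&c);     /* buffers flushed: final put releases */
}
```

The point of the sketch is the ordering: the disconnect call returns as soon as the bind state is cleared, and the release runs only after both remaining holders have dropped their references, however long the CM event takes.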
