David Dillow wrote:
On Thu, 2009-10-22 at 20:04 -0400, Vu Pham wrote:
David Dillow wrote:
Yes, and you cannot disable it entirely. I'm still looking at the
benefits/advantages of disabling it entirely.
To me, the advantage is I have a perfectly viable backup path to the
storage, and can immediately start issuing commands to it rather than
waiting for any timeout. On my systems, 1 second can be up to 1500 MB
transferred and a _huge_ number of compute cycles. And I expect those
numbers to grow.
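To put the cost of waiting in concrete terms, here is a minimal sketch of
the arithmetic, assuming the peak rate quoted above (1500 MB per second)
holds for the whole stall. The function name and parameters are
hypothetical and only illustrate the calculation.

```c
/*
 * Upper bound on the data (in MB) that could have been transferred
 * during a stall, assuming a constant peak rate. Illustrative only;
 * the name and signature are not from any real driver.
 */
unsigned long stalled_megabytes(unsigned long rate_mb_per_sec,
                                unsigned long stall_secs)
{
    return rate_mb_per_sec * stall_secs;
}
```

For example, stalled_megabytes(1500, 5) gives 7500, i.e. a 5 second wait
at that rate is up to 7500 MB of transfer not done.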
You can still do so with these patches applied by using the right device
name (i.e. /dev/sdXXX).
You also don't seem to use the user-supplied setting, but hard-code the
time to 5 seconds?
I use the user-supplied setting for the local async event on a port
error, where the link is broken between the host and the switch.
Perhaps that part should be in the patch that adds that support, then?
That's patch #4
For the case where the link is broken between the target port and the
switch: we detect this by receiving a connection-closed or WQE error, and
by the time that happens an unknown number of seconds has already passed;
therefore, I sleep 5 seconds instead of using the user-supplied value.
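The two cases described above could be sketched roughly as follows. This
is not the code from the patch; the enum, the constant, and the function
name are hypothetical, and only the rule itself (user-supplied value for
a local port error, a fixed 5 seconds for a target-side loss) is taken
from the description.

```c
/* Hypothetical event kinds; these are not the srp driver's own names. */
enum loss_event {
    LOCAL_PORT_ERROR,    /* local async event: host-to-switch link down */
    REMOTE_CONN_LOST     /* connection closed or WQE error: target side */
};

/* Fixed delay used when the loss is only noticed after the fact. */
#define REMOTE_LOSS_DELAY_SECS 5

/*
 * Return how many seconds to wait before tearing down the connection.
 * A local port error is seen immediately, so the user-supplied timeout
 * applies. A target-side loss is detected an unknown time after it
 * happened, so a fixed delay is used instead.
 */
unsigned int teardown_delay_secs(enum loss_event ev,
                                 unsigned int user_timeout_secs)
{
    if (ev == LOCAL_PORT_ERROR)
        return user_timeout_secs;
    return REMOTE_LOSS_DELAY_SECS;
}
```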
This makes a certain amount of sense; I was confused by the two
unrelated changes in this patch. I'm still not all that happy about a
hard-coded 5 seconds, especially with no explanation about the magic
number.
As I said above, it's not magic at all; it's just that an unknown number
of seconds has already passed, so I just pick X seconds to sleep.
To really sleep the user-supplied number of seconds, we need to register
for traps with the SM and receive the trap for a node leaving the fabric.
That requires a lot of changes in srp_daemon (registering for the trap,
passing the event down to the srp driver) and in the srp driver (handling
this event).
Well, if this were done, then you wouldn't need to sleep at all would
you? Just wait for the trap telling you the target rejoined the fabric?
Perhaps you'd want a delay before tearing down the target connection,
but then that could be part of the user settings above?
Not that I'm sure it is worth it, though.
If it's done, you still need to sleep target->device_loss_timeout
(instead of some unknown number of seconds + 5) before tearing down the
connection so that dm-multipath can fail over.
srp_daemon gets the trap right away when a target port joins or leaves
the fabric, it passes these events down to the srp driver, and the srp
driver needs to sleep target->device_loss_timeout.
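A rough sketch of how the trap-driven variant might behave, modelled as a
small state machine rather than real driver code. Everything here is an
assumption for illustration: the struct, its fields, and the three
handlers are hypothetical, time is passed in explicitly as seconds, and
cancelling the teardown when the port rejoins before the timeout is one
possible policy, not something the patches implement.

```c
#include <stdbool.h>

/* Hypothetical per-target state; field names are illustrative only. */
struct target_state {
    bool in_fabric;                    /* last trap said port is present */
    bool torn_down;                    /* connection has been torn down */
    unsigned int loss_time;            /* when the "out" trap arrived */
    unsigned int device_loss_timeout;  /* user-supplied, in seconds */
};

/* Trap: target port left the fabric. Start counting from now. */
void on_trap_out(struct target_state *t, unsigned int now)
{
    t->in_fabric = false;
    t->loss_time = now;
}

/* Trap: target port rejoined. Only helps if we have not torn down yet. */
void on_trap_in(struct target_state *t)
{
    if (!t->torn_down)
        t->in_fabric = true;
}

/*
 * Periodic check: tear down once device_loss_timeout seconds have
 * elapsed since the port left, so that multipath can fail over.
 */
void on_tick(struct target_state *t, unsigned int now)
{
    if (!t->in_fabric && !t->torn_down &&
        now - t->loss_time >= t->device_loss_timeout)
        t->torn_down = true;
}
```

Because the "out" trap arrives right away, the wait here is measured from
the actual loss, which is what lets the user-supplied timeout replace the
fixed 5 seconds.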