David Dillow wrote:
On Thu, 2009-10-22 at 20:04 -0400, Vu Pham wrote:
David Dillow wrote:
Yes, and you cannot disable it entirely. I'm still looking at the benefits of being able to disable it entirely.

To me, the advantage is I have a perfectly viable backup path to the
storage, and can immediately start issuing commands to it rather than
waiting for any timeout. On my systems, 1 second can be up to 1500 MB
transferred and a _huge_ number of compute cycles. And I expect those
numbers to grow.

You can still do so with these patches applied by using the right device name (i.e. /dev/sdXXX).

You also don't seem to use the user supplied setting, but hard code the
time to 5 seconds?
I use the user-supplied setting for the local async event on a port error, where the link is broken between the host and the switch.

Perhaps that part should be in the patch that adds that support, then?

That's patch #4.
The other case is a broken link between the target port and the switch. We detect this case by receiving a connection-closed or WQE error, and by the time that happens an unknown number of seconds has already passed; therefore, I sleep 5 seconds instead of using the user-supplied value.
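To make the two cases concrete, here is a minimal sketch of the delay selection described above. It is not the patch code: the enum, the macro, and the function name are hypothetical, and it only assumes what is stated in this thread (user-supplied value for a local port error, a fixed 5 seconds for a remote connection error).

```c
/* Hypothetical event kinds; the names are illustrative, not from the patch. */
enum loss_event {
	LOSS_LOCAL_PORT_ERR,	/* async port error: host-to-switch link is down */
	LOSS_REMOTE_CONN_ERR	/* connection closed or WQE error: target-to-switch link is down */
};

/* The fixed value discussed in this thread. */
#define REMOTE_LOSS_DELAY_SECS 5U

/*
 * Return how many seconds to wait before tearing down the connection.
 * For a local port error the event arrives promptly, so the user-supplied
 * timeout is meaningful. For a remote error an unknown amount of time has
 * already elapsed, so a fixed delay is used instead.
 */
static unsigned int teardown_delay_secs(enum loss_event ev,
					unsigned int user_timeout_secs)
{
	if (ev == LOSS_LOCAL_PORT_ERR)
		return user_timeout_secs;
	return REMOTE_LOSS_DELAY_SECS;
}
```

With a user setting of 30 seconds, a local port error would wait 30 seconds, while a connection-closed error would wait 5 seconds regardless of the setting.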

This makes a certain amount of sense; I was confused by the two
unrelated changes in this patch. I'm still not all that happy about a
hard-coded 5 seconds, especially with no explanation about the magic
number.
As I said above, it's not magic at all; it's just that an unknown number of seconds has already passed, so I just picked X seconds to sleep.
To really sleep the user-supplied number of seconds, we would need to register for traps with the SM and receive the trap for a node leaving the fabric. That requires a lot of changes in srp_daemon (registering for the trap, passing the event down to the srp driver) and in the srp driver (handling this event).

Well, if this were done, then you wouldn't need to sleep at all would
you? Just wait for the trap telling you the target rejoined the fabric?
Perhaps you'd want a delay before tearing down the target connection,
but then that could be part of the user settings above?

Not that I'm sure it is worth it, though.
If it's done, you still need to sleep target->device_loss_timeout (instead of some unknown number of seconds + 5) before tearing down the connection, so that dm-multipath can fail over.

srp_daemon gets the trap right away when a target port joins or leaves the fabric; it passes these events down to the srp driver, and the srp driver needs to sleep target->device_loss_timeout.
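A rough sketch of that trap-driven flow, as a plain state machine rather than driver code. Everything here is hypothetical (the struct, the event names, the one-second tick); only device_loss_timeout comes from this discussion. Cancelling the timer when the target rejoins is an assumption, not something the patches do.

```c
/* Hypothetical events passed down from srp_daemon to the driver. */
enum trap_event {
	TRAP_TARGET_OUT,	/* target port left the fabric */
	TRAP_TARGET_IN		/* target port rejoined the fabric */
};

/* Hypothetical per-target state; not the real struct srp_target_port. */
struct target_state {
	unsigned int device_loss_timeout;	/* seconds, user supplied */
	unsigned int secs_left;			/* remaining time while armed */
	int timer_armed;
	int torn_down;
};

/* Handle a trap: arm the loss timer on leave, cancel it on rejoin. */
static void on_trap(struct target_state *t, enum trap_event ev)
{
	if (ev == TRAP_TARGET_OUT && !t->torn_down) {
		t->timer_armed = 1;
		t->secs_left = t->device_loss_timeout;
	} else if (ev == TRAP_TARGET_IN) {
		t->timer_armed = 0;	/* assumption: rejoin cancels teardown */
	}
}

/* Called once per elapsed second; tears down when the timer expires. */
static void on_tick(struct target_state *t)
{
	if (!t->timer_armed)
		return;
	if (t->secs_left > 0)
		t->secs_left--;
	if (t->secs_left == 0) {
		t->timer_armed = 0;
		t->torn_down = 1;	/* dm-multipath can now fail over */
	}
}
```

Because the trap arrives right away, the wait before teardown is exactly device_loss_timeout seconds, rather than an unknown delay plus a fixed 5 seconds.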

