I was able to get my failover time down to about 25-30 seconds:

Mar  1 12:32:37 bentCluster-1 kernel: tg3: eth0: Link is down.

Mar  1 12:33:03 bentCluster-1 multipathd: checker failed path 8:224 in
map mpath0
Mar  1 12:33:03 bentCluster-1 kernel: end_request: I/O error, dev sdo,
sector 1249431
Mar  1 12:33:03 bentCluster-1 multipathd: mpath0: remaining active
paths: 1

I ended up setting:

[r...@bentcluster-1 ~]# echo noop > /sys/block/sdn/queue/scheduler
[r...@bentcluster-1 ~]# echo noop > /sys/block/sdo/queue/scheduler

[r...@bentcluster-1 ~]# echo 64 > /sys/block/sdn/queue/max_sectors_kb
[r...@bentcluster-1 ~]# echo 64 > /sys/block/sdo/queue/max_sectors_kb

[r...@bentcluster-1 ~]# echo "5" > /sys/block/sdn/device/timeout
[r...@bentcluster-1 ~]# echo "5" > /sys/block/sdo/device/timeout
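
For anyone wanting to repeat this, here's a rough sketch of the same tuning as a small shell helper so it can be applied to each path device in one go. The function name (tune_path), the SYSFS_ROOT override, and the sdn/sdo device list are my own assumptions for illustration; adjust the device list to match the sd* paths behind your multipath map (multipath -ll will show them), and note these sysfs writes need root:

```shell
#!/bin/sh
# SYSFS_ROOT is overridable so the function can be exercised against a
# fake sysfs tree; it defaults to the real /sys.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

# Apply the three tunables from above to one block device:
# noop elevator, 64 KB max request size, 5 s SCSI command timeout.
tune_path() {
    dev="$1"
    echo noop > "$SYSFS_ROOT/block/$dev/queue/scheduler"
    echo 64   > "$SYSFS_ROOT/block/$dev/queue/max_sectors_kb"
    echo 5    > "$SYSFS_ROOT/block/$dev/device/timeout"
}

# Example (run as root, with your own path devices):
#   for dev in sdn sdo; do tune_path "$dev"; done
```

Keep in mind these settings don't survive a reboot, so they'd need to go in a boot script or udev rule to be permanent.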

I couldn't get failover under 90 seconds without setting "/sys/block/sdn/
device/timeout", and in my best test I hit 26 seconds.  I have a
couple of questions:

1.  Do I need the SCSI timeout to be turned down, or could I be hitting
the bug Mike mentioned?

2.  For the patch that Mike attached to this thread, is there a Red Hat
BZ associated with it so I can track its progress?  If not, should I
open a BZ?

3.  In a best-case scenario, what kind of failover time can I expect
with multipath and iSCSI?  I see about 25-30 seconds; is that
realistic?  I saw 3-second failover times using bonded NICs instead of
dm-multipath, so is there any specific reason to use multipathd
instead of channel bonding?

Thanks for all the help everyone!

-Ben

-- 
You received this message because you are subscribed to the Google Groups 
"open-iscsi" group.
To post to this group, send email to [email protected].
To unsubscribe from this group, send email to 
[email protected].
For more options, visit this group at 
http://groups.google.com/group/open-iscsi?hl=en.