I did some testing today on the Linux kernel 2.6.27.19 aoe
implementation using the mpio vblade patch I posted previously, and the
results are good. The kernel driver from Coraid does not look like it
will work correctly for this: I couldn't find any rerouting of packets
to another target in it, as there is in the Linux source. The specific
lines in Linux kernel 2.6.27.19:

aoecmd.c line 550:
        if (n > HELPWAIT /* see if another target can help */
        && (tt != d->targets || d->targets[1]))
                d->htgt = tt;

aoe.h line 97:
        HELPWAIT = 20,

My test setup:

Target machine (tgt) with two network interfaces running a vblade for
one file advertised on eth0 and eth1:

    vbladed -i eth1 1 0 eth0 /mnt/d2.img

Initiator (intr) just using one interface, eth0:

    intr:/# aoe-stat
          e1.0        10.485GB   eth0 up
    intr:/# mkfs.xfs /dev/etherd/e1.0
    ...
    intr:/# mount /dev/etherd/e1.0 /mnt -o _netdev

Run something I know will take a while:

    intr:/# cd /mnt/; dd if=/dev/zero of=1 bs=24k count=120000 oflag=sync

Network stats on the target while dd runs:

    tgt:/# ifstat
           eth0                 eth1
     KB/s in  KB/s out   KB/s in  KB/s out
    20963.33   1203.90  20961.22   1203.71
    21828.00   1252.98  21828.47   1252.90
    22015.32   1263.72  22014.22   1263.52
    23466.55   1347.02  23468.52   1346.94
    ....

Bring down an interface on the target and check stats:

    tgt:/# ifconfig eth1 0 down; ifstat
           eth0
     KB/s in  KB/s out
        0.06      0.14
        0.12      0.13
        0.93      0.34
        0.18      0.13
        0.36      0.13
        0.06      0.13
        0.06      0.13
        0.18      0.17
        0.18      0.13
        0.47      0.41
        3.52      3.44
        0.15      0.13
        4.51      0.52
        0.21      0.13
        7.72      6.38
        1.98      1.88
        0.12      0.13
        0.06      0.13
        0.06      0.13
        0.06      0.13
        0.18      0.13
        0.18      0.13
    36904.46   2088.69
    43033.43   2436.36

Since HELPWAIT is set to 20 seconds and ifstat prints about one sample
per second, the stall before eth0 started receiving packets again lines
up with that setting. While this was happening the initiator was
sending retransmits (failing):

    intr:/# cat /dev/etherd/err
    retransmit e1.0 oldtag=09ed1...@1004e1971 newtag=09f11971
    ...
    ....

When HELPWAIT is reached, the retransmit errors stop. And eventually
the dd finishes _after_ failing eth1 on the target:

    2949316608 bytes (2.9 GB) copied, 121.712 seconds, 24.2 MB/s

Another note: dmesg is clear of xfs/aoe errors. Can somebody else try
testing?

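For anyone repeating the test, here is a rough sketch (Python, not part of the patch) of how the stall length can be read off the ifstat output instead of counting rows by eye. The sample values are the eth0 rows quoted above. The function names and the 100 KB/s "idle" threshold are my own assumptions for illustration, and the one-sample-per-second interval is approximate.

```python
# eth0 samples (KB/s in, KB/s out) copied from the ifstat output above,
# taken right after eth1 was brought down on the target.
SAMPLES = """
0.06 0.14
0.12 0.13
0.93 0.34
0.18 0.13
0.36 0.13
0.06 0.13
0.06 0.13
0.18 0.17
0.18 0.13
0.47 0.41
3.52 3.44
0.15 0.13
4.51 0.52
0.21 0.13
7.72 6.38
1.98 1.88
0.12 0.13
0.06 0.13
0.06 0.13
0.06 0.13
0.18 0.13
0.18 0.13
36904.46 2088.69
43033.43 2436.36
"""

# Assumption: an inbound rate below this is treated as "stalled".
IDLE_THRESHOLD_KBPS = 100.0


def parse_samples(text):
    """Turn 'in out' rows into a list of (kb_in, kb_out) float pairs."""
    rows = []
    for line in text.split("\n"):
        fields = line.split()
        if len(fields) == 2:
            rows.append((float(fields[0]), float(fields[1])))
    return rows


def stall_length(rows, threshold=IDLE_THRESHOLD_KBPS):
    """Count the leading samples whose inbound rate is below threshold."""
    count = 0
    for kb_in, _kb_out in rows:
        if kb_in >= threshold:
            break
        count += 1
    return count


if __name__ == "__main__":
    rows = parse_samples(SAMPLES)
    print(stall_length(rows), "idle samples before traffic resumed")
```

On the samples above this counts 22 idle rows before traffic resumes on eth0. At roughly one row per second, that is in line with the n > HELPWAIT check with HELPWAIT = 20.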
--
Matth Ingersoll

On Feb 27, 2009, at 5:32 PM, Matthew Ingersoll wrote:

> Finally had some time to read through my own patch and fixed a bug
> (static size set that should be dynamic for pollfd), cleaned up
> mallocs and made naming more consistent. I also realized that it
> was pasted inline and seems to have been corrupted at some point,
> this time I'm trying it as an attachment - is there a recommended
> method?
>
> --
> Matth Ingersoll
>
> <vblade-19-mpio-1.diff>
>
> On Feb 25, 2009, at 10:26 PM, Matthew Ingersoll wrote:
>
>> Just finished a vblade-19 MPIO patch. So far it seems to work great
>> for throughput and disables an interface on poll() error. I'm not
>> sure how the change will be handle on the initiator - seems like
>> aoe-revalidate works ok. Hopefully somebody can give me some feedback.