We were plagued by this problem a while ago. "closewait" status means the driver sees the device but is waiting for the block device to close before automatically revalidating it.
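If you catch a device in that state, a manual revalidate brings it back immediately, as you've seen. Assuming the stock aoetools userland, and substituting whichever slot aoe-stat reports as down:

    aoe-stat                # look for the slot marked closewait,down
    aoe-revalidate e0.3     # e0.3 here is just the slot from your report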
What version of the initiator (aoe.ko) are you using?

After some inspection of the aoe driver source, I now understand that the RTT calculations do not take into account packets that are permanently lost. So it is possible for the driver to get into a state in which the network is flooded with request packets, resent after each TTL expiration, while the TTL is not adjusted. After aoe_deadsecs seconds elapse (300 by default, IIRC) the device is marked down.

The aoe_maxout defaults are flawed. With a single target (say, e1.1) and a single initiator, the device is queried for its buffer count (in its reply to a "query config" request) and returns, for example, 64. The aoe initiator then uses this number as the default value of aoe_maxout and will send up to 64 requests to the target before receiving a response. Now suppose there are 2 ethernet links from the initiator to the target (multipath). The aoe initiator will send up to 2*64, or 128, requests before receiving a response, which can overwhelm the target. It gets worse than that: if the shelf has 3 different slots (e.g. e1.1, e1.2, e1.3), the Linux aoe initiator will queue up to 64 requests per slot per interface (3 * 2 * 64 = 384). And if there are 4 different hosts all connecting to the same target, multiply this by 4 (1536). That's far more outstanding requests than the target can safely handle, and intermediate switch buffers are likely to get flooded as well. (There's a back-of-the-envelope version of this arithmetic after the list below.)

Here's how we handled it:

- Enable hardware flow control on all Ethernet devices carrying AoE traffic. The usual wisdom with hardware flow control is to leave it off, since TCP has pretty good congestion control. However, AoE is not TCP, and there is ample evidence that AoE performs better with it enabled. (See the ethtool example after this list.)

- Ensure network buffers are large enough to store outstanding packets. This is particularly important if you are running jumbo frames. In our sysctl.conf I have (commands to apply these follow the list):

    net.core.rmem_default = 262144
    net.core.rmem_max = 16777216
    net.core.wmem_default = 262144
    net.core.wmem_max = 16777216

- Lower the aoe_maxout parameter of the aoe module as much as necessary to preserve the stability of the storage network. As mentioned above, the default aoe_maxout is obtained by querying the device. Cut this in half, or less, and run some performance tests. We've lowered it all the way to 8 without much sacrifice in performance. (See the modprobe example after this list.)

- Buy good network switches, if you haven't done so already. The network is only as good as its weakest component. Switches are not a good place to save money, I've found, and not all are made the same. Try a few different models if you have the luxury.
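To make the arithmetic above concrete, here is a back-of-the-envelope sketch; the figures are the hypothetical ones from the example, not measurements from any particular setup:

    # worst-case outstanding AoE requests a single target may face
    buffer_count=64   # buffer count reported in the query config reply
    paths=2           # ethernet links between each initiator and the target
    slots=3           # slots exported by the shelf (e1.1, e1.2, e1.3)
    hosts=4           # initiators talking to the same target
    echo $(( buffer_count * paths * slots * hosts ))   # prints 1536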
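For the flow-control item, something along these lines should do it; eth0 and eth1 are placeholders for whatever interfaces carry the AoE traffic, and the switch ports have to negotiate pause frames as well:

    # enable pause-frame (hardware) flow control for rx and tx
    ethtool -A eth0 rx on tx on
    ethtool -A eth1 rx on tx on
    # confirm the current pause settings
    ethtool -a eth0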
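The buffer settings can be applied on the fly with sysctl and then kept in /etc/sysctl.conf so they survive a reboot:

    sysctl -w net.core.rmem_default=262144
    sysctl -w net.core.rmem_max=16777216
    sysctl -w net.core.wmem_default=262144
    sysctl -w net.core.wmem_max=16777216
    sysctl -p    # reload /etc/sysctl.conf after editing it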
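And for aoe_maxout, the cap can be pinned when the module loads; on Debian a modprobe.d snippet along these lines should work (the module has to be reloaded with no aoe devices in use for it to take effect):

    # cap outstanding requests per target at 8
    echo "options aoe aoe_maxout=8" > /etc/modprobe.d/aoe.conf
    modprobe -r aoe && modprobe aoe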
Good luck,
-Jeff

From: Lachlan Evans [mailto:aoetools-disc...@conf.net.au]
Sent: Thursday, March 03, 2011 12:50 AM
To: aoetools-discuss@lists.sourceforge.net
Subject: [Aoetools-discuss] Down,closewait under load

Hi list,

I've encountered a recurring issue where a single AoE device goes into the closewait,down state. I'm hoping someone here might be able to point me in the right direction of where to look to find the underlying cause.

A little about the setup: two hosts, one acting as a SAN, the other as a Xen host. Both are running Debian Squeeze using the Debian-distributed AoE packages. A 5-disk RAID-6 array is configured using md and LVM on the SAN. LVM volumes are then exported via AoE using vblade.

There are 5 volumes exported from the SAN to the Xen host:

    e0.0 171.798GB bond0 up
    e0.1 268.435GB bond0 up
    e0.2 53.687GB bond0 up
    e0.3 128.849GB bond0 up
    e0.4 53.687GB bond0 up

which are then used by the Windows 2003 Server Xen DomU as its disk devices.

The issue first occurred on February 17th at 19:13, when this was recorded:

    Feb 17 19:13:30 vmsrv kernel: [456093.648028] VBD Resize: new size 0

I believe this log entry originates from Xen's VBD driver reporting the change. At the time, aoe-stat on the Xen host was displaying:

    e0.0 171.798GB bond0 up
    e0.1 268.435GB bond0 up
    e0.2 53.687GB bond0 up
    e0.3 128.849GB bond0 closewait,down
    e0.4 53.687GB bond0 up

Overnight last night:

    Mar 2 20:28:23 vmsrv kernel: [900000.336023] VBD Resize: new size 0

with aoe-stat displaying:

    e0.0 171.798GB bond0 up
    e0.1 268.435GB bond0 closewait,down
    e0.2 53.687GB bond0 up
    e0.3 128.849GB bond0 up
    e0.4 53.687GB bond0 up

An aoe-revalidate instantly resolves the issue, but in the meantime the disks are unavailable.

What leads me to believe this is a load-related issue is that both occurrences have happened within our backup schedule, which generates a large amount of load, particularly on the SAN. Up until about a month ago we were running a combination of IET+open-iscsi, and the backup schedule (which has not changed since) didn't seem to affect that combination.

Any pointers would be greatly appreciated.

Cheers,
Lachlan