We were plagued by this problem a while ago.  "closewait" status means
the driver sees the device but is waiting for the block device to close
before automatically revalidating it.
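
If you need to clear it by hand, aoe-revalidate does the trick (e1.1 here is just an example device name); fuser can sometimes show a userspace process that still has the device open, though kernel-side users won't show up:

# ask the driver to revalidate the device
aoe-revalidate e1.1

# list userspace processes holding the block device open, if any
fuser -v /dev/etherd/e1.1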

 

What version of the initiator (aoe.ko) are you using?
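
If you're not sure, the loaded module will usually tell you, for example:

# module version, if the driver exports one
modinfo aoe | grep -i version

# the driver also logs its version when it loads, e.g. "aoe: AoE vNN"
dmesg | grep -i aoe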

 

After some inspection of the aoe driver source, I now understand that
the RTT calculations do not take into account packets that are
permanently lost.  So it is possible for the driver to get into a state
in which the network is flooded with request packets, resent after each
TTL expiration, while the TTL is not adjusted.  After aoe_deadsecs
seconds elapse (300 by default, IIRC) the device is marked down.
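
You can check aoe_deadsecs on a running system, and on most driver versions change it, assuming the parameter is exposed under /sys on your kernel:

# current number of seconds before an unresponsive device is marked down
cat /sys/module/aoe/parameters/aoe_deadsecs

# raise it (until the next module reload), e.g. to 720 seconds
echo 720 > /sys/module/aoe/parameters/aoe_deadsecs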

 

The aoe_maxout defaults are flawed.  With a single target (say, e1.1)
and a single initiator, the device is queried for its "buffer count"
(in its "query config" response) and returns, for example, 64.  The aoe
initiator then uses this number as the default value of aoe_maxout and
will keep up to 64 requests outstanding to the target.  Now suppose
there are 2 Ethernet links from the initiator to the target (multipath).
The aoe initiator will then keep up to 2*64, or 128, requests
outstanding, which can overwhelm the target.

 

It gets worse than that.  If the shelf has 3 different slots (e.g. e1.1,
e1.2, e1.3), the Linux aoe initiator will queue up to 64 requests per
slot per interface (3 * 2 * 64 = 384).  And if there are 4 different
hosts all connecting to the same target, multiply this by 4
(4 * 384 = 1536).  That's far more outstanding requests than the target
can safely handle, and intermediate switch buffers are likely to get
flooded as well.
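
Putting a number on it, the worst case is roughly
hosts * slots * interfaces * advertised buffer count:

# worst-case outstanding requests for the example above
echo $((4 * 3 * 2 * 64))    # 1536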

 

Here's how we handled it:

 

-      Enable hardware flow control on all Ethernet devices carrying aoe
traffic.  The usual wisdom with hardware flow control is to leave it
off, since TCP has pretty good congestion control.  However, AoE is not
TCP, and there is ample evidence that AoE performs better with flow
control enabled.
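
Assuming your NIC driver supports pause frames, ethtool will show and
set this; set it on the physical interfaces (e.g. eth0, not the bond),
and make sure the switch ports have flow control enabled as well:

# show current pause (flow control) settings for eth0
ethtool -a eth0

# enable rx/tx pause frames on eth0
ethtool -A eth0 rx on tx on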

 

-      Ensure network buffers are large enough to store outstanding
packets.  This is particularly important if you are running jumbo
frames.  In our sysctl.conf I have:

 

net.core.rmem_default = 262144
net.core.rmem_max = 16777216
net.core.wmem_default = 262144
net.core.wmem_max = 16777216
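
These take effect after a reboot or a sysctl -p:

# apply sysctl.conf without rebooting, then confirm the new values
sysctl -p
sysctl net.core.rmem_max net.core.wmem_max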

 

-      Lower the aoe_maxout parameter of the aoe module as much as
necessary to preserve the stability of the storage network.  As
mentioned above, the default aoe_maxout is obtained by querying the
device.  Cut this in half, or less, and run some performance tests.
We've lowered it all the way to 8 without much sacrifice in performance.
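
A persistent way to do that on Debian-style systems is a modprobe
options file (the file name is just a convention); the parameter takes
effect when the module is next loaded:

# /etc/modprobe.d/aoe.conf
options aoe aoe_maxout=8

You can confirm what the running driver is using with:

cat /sys/module/aoe/parameters/aoe_maxout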

 

-      Buy good network switches, if you haven't done so already.  The
network is only as good as its weakest component.  Switches are not a
good place to save money, I've found, and not all are made the same.
Try a few different models if you have the luxury.

 

Good luck,

 

-Jeff

 

From: Lachlan Evans [mailto:aoetools-disc...@conf.net.au] 
Sent: Thursday, March 03, 2011 12:50 AM
To: aoetools-discuss@lists.sourceforge.net
Subject: [Aoetools-discuss] Down,closewait under load

 

Hi list,

I've encountered a recurring issue where a single AoE device goes
into the closewait,down state.  I'm hoping someone here might be able to
point me in the right direction to find the underlying cause.

A little about the setup: two hosts, one acting as a SAN and the other
as a Xen host.  Both run Debian Squeeze with the Debian-distributed AoE
packages.  The SAN has a 5-disk RAID-6 array configured using md and
LVM.  LVM volumes are then exported via AoE using vblade.  There are 5
volumes exported from the SAN to the Xen host:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 up
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 up
      e0.4        53.687GB  bond0 up

which are then used by the Windows 2003 Server Xen DomU as its disk
devices.

The issue first occurred on February 17th at 19:13, when this was recorded:

Feb 17 19:13:30 vmsrv kernel: [456093.648028] VBD Resize: new size 0

I believe this log entry originates from Xen's VBD driver reporting the
change.

At the same time, aoe-stat on the Xen host was displaying:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 up
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 closewait,down
      e0.4        53.687GB  bond0 up

Overnight last night:

Mar  2 20:28:23 vmsrv kernel: [900000.336023] VBD Resize: new size 0

and aoe-stat displaying:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 closewait,down
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 up
      e0.4        53.687GB  bond0 up

An aoe-revalidate instantly resolves the issue, but in the meantime the
disks are unavailable.

What leads me to believe this is a load-related issue is that both
occurrences have happened within our backup schedule, which generates a
large amount of load, particularly on the SAN.  Up until about a month
ago we were running a combination of IET+open-iscsi, and the backup
schedule (which has not changed since) didn't seem to impact that
combination.

Any pointers would be greatly appreciated.

Cheers,

Lachlan



