Hi list,

I've run into a recurring issue where a single AoE device goes into the
closewait,down state. I'm hoping someone here can point me in the right
direction for finding the underlying cause.

A little about the setup: two hosts, one acting as a SAN, the other as a Xen
host. Both run Debian Squeeze with the Debian-distributed AoE packages. The
SAN has a 5-disk RAID-6 array configured using md and LVM. LVM volumes are
then exported via AoE using vblade. There are 5 volumes exported from the SAN
to the Xen host:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 up
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 up
      e0.4        53.687GB  bond0 up

which are then used by the Windows 2003 Server Xen DomU as its disk devices.

The issue first occurred on February 17th at 19:13, when this was recorded:

Feb 17 19:13:30 vmsrv kernel: [456093.648028] VBD Resize: new size 0

I believe this log entry originates from Xen's VBD driver reporting the
change.

At that point aoe-stat on the Xen host displayed:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 up
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 closewait,down
      e0.4        53.687GB  bond0 up

It happened again overnight last night:

Mar  2 20:28:23 vmsrv kernel: [900000.336023] VBD Resize: new size 0

and aoe-stat displayed:

      e0.0       171.798GB  bond0 up
      e0.1       268.435GB  bond0 closewait,down
      e0.2        53.687GB  bond0 up
      e0.3       128.849GB  bond0 up
      e0.4        53.687GB  bond0 up

An aoe-revalidate instantly resolves the issue, but in the meantime the
affected disk is unavailable.
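
As a stopgap, a small watchdog could spot the closewait,down state in the
aoe-stat output and run aoe-revalidate on the affected device. The following
is only a minimal sketch in Python, not something I have running. The
function names (find_down_devices, revalidate, check_once) are made up for
illustration, and it assumes aoe-stat prints the four columns shown above
(device, size, interface, state) and that aoe-revalidate takes the device
name as its argument.

```python
import subprocess


def find_down_devices(aoe_stat_output):
    """Return the names of AoE devices flagged 'down' or 'closewait'."""
    down = []
    for line in aoe_stat_output.splitlines():
        fields = line.split()
        # Assumed columns: device, size, interface, state (e.g. "closewait,down")
        if len(fields) < 4 or not fields[0].startswith("e"):
            continue
        device, state = fields[0], fields[3]
        flags = state.split(",")
        if "down" in flags or "closewait" in flags:
            down.append(device)
    return down


def revalidate(devices, dry_run=True):
    """Run aoe-revalidate for each device; only print the command if dry_run."""
    for device in devices:
        cmd = ["aoe-revalidate", device]
        if dry_run:
            print("would run:", " ".join(cmd))
        else:
            subprocess.run(cmd, check=True)


def check_once(dry_run=True):
    """Poll aoe-stat once and revalidate anything that is not up."""
    result = subprocess.run(
        ["aoe-stat"], capture_output=True, text=True, check=True
    )
    devices = find_down_devices(result.stdout)
    revalidate(devices, dry_run=dry_run)
    return devices
```

Calling check_once() from cron every minute or so would shorten the outage,
but it only masks the symptom and does nothing about the underlying cause.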

What leads me to believe this is load-related is that both occurrences fell
within our backup schedule, which generates a large amount of load,
particularly on the SAN. Up until about a month ago we were running a
combination of IET+open-iscsi, and the backup schedule (which has not changed
since) didn't seem to affect that combination.

Any pointers would be greatly appreciated.

Cheers,

Lachlan
_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss