On 16 Nov 2009, at 13:07, Jesse Becker wrote:

> On Mon, Nov 16, 2009 at 03:33:46AM -0500, Simon Kirby wrote:
>> However, I am seeing a lot of cases (depending on the load) where this
>> algorithm is being confused by the time it takes for the response to come
>> back from the target. Since there is no network-level ACK in the AOE
>> protocol, the actual time it takes for a response to come back depends
>> entirely on how long it gets wedged in the target buffer queues, disk
>> speed, etc. A request that happens to hit cache can come back very
>> quickly, while other requests can be quite slow.
>
> I'll just chime in here with a simple "me too." I see what appears to
> be the same symptoms, although apparently a lot more in the way of
> performance degradation.

+1

We've seen lots of this too, specifically under high concurrency, against
vblade, qaoed and Coraid targets.

Maybe an obvious observation, but we've found that a good switch really
helps: a fast one with plenty of buffers. Procurve switches have worked well
for us. Beware the 2800 series, which won't do flow control with jumbo frames
(but will still let you configure it!). The 2900 series are excellent; we
went for the 5406zl for expandability, and it has been awesome. Investing in
good switches and NICs has reduced our retransmit rates to a fairly low
level, even under high concurrency and load.

As for the timeout/retransmit issue, without looking at the source, perhaps
the retransmit algorithm is just being a bit dumb? Lustre does some very
clever stuff with adaptive timeouts; maybe some of these ideas would also be
useful in aoe. It might be possible to detect flow control on an interface,
or to monitor for a correlation between retransmit rates and
unexpected-response rates. Perhaps the target could provide in-band feedback
on the health of the real block device; iowait could be a useful metric here.

I'd be interested in discussing/testing this further.

Mark

_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss
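
To make the adaptive-timeout suggestion above a little more concrete, here is a rough sketch of one way it could work. It is not taken from the aoe driver or from Lustre; it assumes a TCP-style estimator (smoothed response time plus a multiple of its mean deviation) and ignores samples from retransmitted requests, since a response to a retransmitted request cannot be matched to a particular transmission. The class name, method names and default values are all hypothetical.

```python
class AdaptiveTimeout:
    """Hypothetical retransmit-timeout estimator (TCP-style, not aoe's own)."""

    def __init__(self, initial_ms=100.0, min_ms=1.0, max_ms=10000.0):
        self.srtt = None      # smoothed response time in ms
        self.rttvar = None    # smoothed mean deviation in ms
        self.initial_ms = initial_ms
        self.min_ms = min_ms
        self.max_ms = max_ms

    def observe(self, rtt_ms, retransmitted=False):
        """Feed in one measured response time."""
        # Ambiguous sample: we cannot tell which transmission was answered.
        if retransmitted:
            return
        if self.srtt is None:
            # First sample seeds both estimates.
            self.srtt = rtt_ms
            self.rttvar = rtt_ms / 2.0
        else:
            # Update the deviation first, using the old smoothed value.
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - rtt_ms)
            self.srtt = 0.875 * self.srtt + 0.125 * rtt_ms

    def timeout(self):
        """Current retransmit timeout in ms, clamped to [min_ms, max_ms]."""
        if self.srtt is None:
            return self.initial_ms
        rto = self.srtt + 4.0 * self.rttvar
        return max(self.min_ms, min(self.max_ms, rto))


if __name__ == "__main__":
    t = AdaptiveTimeout()
    for sample in (10.0, 50.0, 12.0, 200.0):  # made-up response times in ms
        t.observe(sample)
        print(f"sample={sample:6.1f} ms  timeout={t.timeout():7.2f} ms")
```

The deviation term is the relevant part for the cache-hit versus slow-disk case described in the quoted message: when response times are widely spread, the timeout grows with the spread instead of tracking only the average, so a slow but healthy request is less likely to be retransmitted early.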