On 16 Nov 2009, at 13:07, Jesse Becker wrote:

> On Mon, Nov 16, 2009 at 03:33:46AM -0500, Simon Kirby wrote:
>> However, I am seeing a lot of cases (depending on the load) where this
>> algorithm is being confused by the time it takes for the response to come
>> back from the target.  Since there is no network-level ACK in the AOE
>> protocol, the actual time it takes for a response to come back depends
>> entirely on how long it gets wedged in the target buffer queues, disk
>> speed, etc.  A request that happens to hit cache can come back very
>> quickly, while other requests can be quite slow.
>
> I'll just chime in here with a simple "me too."  I see what appear to
> be the same symptoms, although apparently a lot more in the way of
> performance degradation.

+1

We've seen lots of this too, specifically under high concurrency,  
against vblade, qaoed and Coraid targets.

Maybe an obvious observation, but we've found that a good switch really
helps: a fast one with plenty of buffers. ProCurve switches have worked
well for us. Beware the 2800 series, which won't do flow control with
jumbo frames (even though it lets you configure it!). The 2900 series is
excellent; we went for the 5406zl for its expandability, and it has been
awesome.

Investing in good switches and NICs has reduced our retransmit rates to
a fairly low level, even under high concurrency and load.

As for the timeout/retransmit issue: without having looked at the
source, perhaps the retransmit algorithm is just being a bit dumb?
Lustre does some very clever stuff with adaptive timeouts, and maybe
some of those ideas would also be useful in aoe. It might be possible to
detect flow control on an interface, or to monitor for a correlation
between retransmit and unexpected-response rates. Perhaps the target
could also provide in-band feedback on the health of the real block
device; iowait could be a useful metric here.
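
To make the adaptive timeout idea a bit more concrete, here is a rough
sketch of the kind of estimator I have in mind. It is not taken from the
aoe driver or from Lustre; it borrows the smoothed RTT plus variance
scheme TCP uses (RFC 6298), and the class name, method names and default
values are all made up for illustration. Samples from retransmitted
requests are ignored, since you can't tell which transmission the
response belongs to.

```python
class AdaptiveTimeout:
    """Hypothetical retransmit timer: smoothed RTT + k * RTT variance.

    Illustrative only; this is not what the aoe driver does.
    """

    def __init__(self, initial_ms=100.0, min_ms=1.0, max_ms=10000.0,
                 alpha=0.125, beta=0.25, k=4.0):
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.alpha = alpha      # gain for the smoothed RTT
        self.beta = beta        # gain for the RTT variance
        self.k = k              # how many "variances" of headroom to allow
        self.srtt = None        # smoothed round-trip time, in ms
        self.rttvar = None      # smoothed mean deviation, in ms
        self.timeout_ms = initial_ms

    def _clamp(self, value):
        return max(self.min_ms, min(value, self.max_ms))

    def on_response(self, rtt_ms, retransmitted=False):
        """Feed in the measured response time of one request."""
        if retransmitted:
            # Ambiguous sample: the response may match either transmission.
            return
        if self.srtt is None:
            # First sample seeds both estimates.
            self.srtt = rtt_ms
            self.rttvar = rtt_ms / 2.0
        else:
            self.rttvar = ((1.0 - self.beta) * self.rttvar
                           + self.beta * abs(self.srtt - rtt_ms))
            self.srtt = ((1.0 - self.alpha) * self.srtt
                         + self.alpha * rtt_ms)
        self.timeout_ms = self._clamp(self.srtt + self.k * self.rttvar)

    def on_timeout(self):
        """Back off exponentially when a request times out."""
        self.timeout_ms = self._clamp(self.timeout_ms * 2.0)
```

The point of the variance term is that a mix of fast cache hits and slow
disk reads widens the timeout instead of triggering retransmits every
time a slow response follows a run of fast ones.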

I'd be interested in discussing/testing this further.

Mark

_______________________________________________
Aoetools-discuss mailing list
Aoetools-discuss@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/aoetools-discuss