At 03:28 PM 5/6/2005 +0200, Eric Auer wrote:
> The optimization is the one that tries to align EDI to an eight-byte boundary before the main REP MOVSD. And that optimization only makes sense because once you align EDI, you commonly align ESI along with it, at least in the three areas to be optimized.
Very interesting. I think you cannot optimize for that (even though it would allow fast access bursts) because that would require the move distance to be a multiple of 8 bytes. However, if the distance is FOUND to already be a multiple of 8 bytes, extra code (in the EMM386 int1587 handler and in the HIMEM memory copy function) could take care to do up to 7 MOVSB before doing the main REP MOVSD.
In case you wondered why people might not pay as much attention as you want when you talk about optimization, this is a good example.
Let's do the basic arithmetic. Assume EDI and ESI have the same alignment, as frequently occurs. If there is a memory move of 334 bytes to perform with EDI at alignment 7 modulo 8, what do I do?
Answer: I move one byte to get EDI alignment to eight-byte boundary, with 333 bytes left to move. I then move 82 DWORDs that are 8-byte aligned via REP MOVSD for the cache line optimization. Done, 5 bytes to go. I move 5 bytes to clear up the remainder. Result: 328/334 or 98% of all bytes are moved in an optimal pattern at 3x speed. Overhead? A few instructions. Design decisions? Movement over three transfers.
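The split described above can be sketched in C. This is only an illustration of the arithmetic, not the EMM386 or HIMEM code: the names copy_plan and plan_copy are hypothetical, and the sketch assumes the bulk REP MOVSD covers whole 8-byte units while the remainder goes out as single bytes, as in the 334-byte example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: how a copy of `len` bytes is split up. */
struct copy_plan {
    size_t head_bytes;  /* single bytes moved first (MOVSB), 0..7 */
    size_t dwords;      /* DWORDs moved by the main REP MOVSD, always even */
    size_t tail_bytes;  /* leftover bytes moved last, 0..7 */
};

/* Split a copy so that the bulk part starts with the destination (EDI)
 * on an 8-byte boundary and covers whole 8-byte units. */
static struct copy_plan plan_copy(uint32_t dst, size_t len)
{
    struct copy_plan p;
    size_t head = (8u - (dst & 7u)) & 7u;  /* bytes needed to reach alignment */
    size_t rest;

    if (head > len)
        head = len;                        /* short copy: everything is head */
    rest = len - head;

    p.head_bytes = head;
    p.dwords = (rest / 8u) * 2u;           /* two DWORDs per 8-byte unit */
    p.tail_bytes = rest % 8u;

    /* The three transfers must add up to the requested length. */
    assert(p.head_bytes + p.dwords * 4u + p.tail_bytes == len);
    return p;
}
```

For a destination at 7 modulo 8 and a length of 334, plan_copy gives 1 head byte, 82 DWORDs (328 bytes) and 5 tail bytes, matching the figures above.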
Worst case with large moves: assume ESI and EDI are random to each other (which is often untrue), and can be odd (far more unlikely to be true). Aligning EDI is a smaller performance optimization even without cache line moves, but I won't count that. 12.5% of the time (one relative alignment in eight) a worst case transfer will enjoy a huge performance gain of almost three times normal, for all typical transfers.
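The 12.5% figure is just one matching offset out of eight. A minimal C sketch of that check follows; the function names co_aligned and co_aligned_count are hypothetical, and it assumes source offsets are uniformly spread over 0..7.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 when source and destination share the same offset modulo 8,
 * so that aligning EDI to an 8-byte boundary also aligns ESI. */
static int co_aligned(uint32_t src, uint32_t dst)
{
    return ((src ^ dst) & 7u) == 0;
}

/* Of the 8 possible source offsets, count how many are co-aligned
 * with the given destination address. */
static unsigned co_aligned_count(uint32_t dst)
{
    unsigned hits = 0;
    uint32_t s;

    for (s = 0; s < 8; s++)
        hits += (unsigned)co_aligned(s, dst);

    assert(hits == 1);  /* exactly one offset in eight matches */
    return hits;
}
```

Whatever the destination address, one source offset in eight matches, which is where 1/8 = 12.5% comes from.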
Normal case, the gain happens 100% of the time. Everybody with a Pentium Pro, II, or III dances a jig of joy. Pentium 4? Maybe, optimization docs are unclear.
All right, that's enough for me. I'm not spending more time and attention to talk about additional optimizations with you.
_______________________________________________
Freedos-devel mailing list
https://lists.sourceforge.net/lists/listinfo/freedos-devel
