Hi Michael,

> <massive snippage>
> Is there a question in all that?

No. This is why I conclude my mail with:

> maybe we better move the EMM386 optimization stuff to off-list?

(directed to Arkady)

> You have missed the point of proper optimization. Algorithms optimized
> first, only where they matter.

Yes. Sure. In particular, the allocator might be affected by that.
Looping through 1000+ table entries and checking the 48 byte bit string
of a non-small number of them can take quite a while, in particular if
you have to do it for each of 1000+ "get VCPI 4k page" calls from a DOS
extender which for one reason or another prefers VCPI over XMS for
allocation. But it is hard to tell how many table entries are scanned
per alloc call on average if you call VCPI alloc page 1000 times in a
row. Yet again, that can make the difference between a linear and a
quadratic relation of "number of VCPI pages allocated" to "time
consumed". I believe that my relatively slow 500 MHz K6-2 does show some
noticeable CPU load for alloc (e.g. small pause / fans going faster),
but I do not understand the alloc algorithm well enough to really tell.
This is why I am tempting Arkady the optimization expert to have a
look ;-).

> The optimization is the one that tries to align EDI to an
> eight-byte boundary before the main REP MOVSD. And that optimization only
> makes sense because once you align EDI, you commonly align ESI along with
> it, at least in the three areas to be optimized.

Very interesting. I think you cannot optimize for that (even though it
would allow fast access bursts) because that would require the move
distance to be a multiple of 8 bytes. However, if the distance is FOUND
to already be a multiple of 8 bytes, extra code (in the EMM386 int1587
handler and in the HIMEM memory copy function) could take care to do up
to 7 MOVSB before doing the main REP MOVSD. Reasonable overhead and even
quite good chances that it will often be used!

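To make the "up to 7 MOVSB first" idea concrete, here is a rough sketch in C. It is not the EMM386 or HIMEM code (which is assembly); the function name copy_with_align_prefix is made up for illustration, and memcpy only stands in for the main REP MOVSD plus trailing bytes. It assumes non-overlapping buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from EMM386 or HIMEM.
 * Copies len bytes from src to dst (buffers must not overlap) and
 * returns how many single-byte moves were done before the bulk copy. */
size_t copy_with_align_prefix(unsigned char *dst, const unsigned char *src,
                              size_t len)
{
    size_t prefix = 0;
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    /* Aligning the destination only aligns the source as well when the
     * move distance (dst - src) is a multiple of 8 bytes. */
    if (((d - s) & 7u) == 0) {
        prefix = (size_t)((8u - (d & 7u)) & 7u);  /* 0..7 bytes */
        if (prefix > len)
            prefix = len;                          /* short copies */
        for (size_t i = 0; i < prefix; i++)        /* the "MOVSB" part */
            dst[i] = src[i];
    }

    /* Stand-in for the main REP MOVSD and any trailing bytes. */
    memcpy(dst + prefix, src + prefix, len - prefix);
    return prefix;
}
```

The distance test costs one subtract and one mask per call, so the overhead stays small when the distance turns out not to be a multiple of 8 and the prefix is skipped.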
All allocations are at multiples of 1kB (XMS, EMS, VCPI), and many
programs like DOS extenders and RAM disks grab and move contents in
chunks which are aligned to N*16 byte boundaries and also have a size of
M*16 bytes... But then, that case already DOES use the fastest REP MOVSD
without any extra work from EMM386/HIMEM! So you are right. Optimization
would be much ado about nothing here: it cannot really boost performance
if the caller wants a movement that is not aligned in all ways, and
performance already is optimal if the caller wants a perfectly aligned
movement anyway :-|.

QUESTION related to that: Does the FreeDOS kernel make sure that BUFFERs
and the deblocking buffer (1 sector buffer in low DOS RAM to avoid
having to transfer to/from UMB or HMA) are nicely aligned?

> Note that carrying around multiple memory copy functions in EMM386 and
> testing CPUs with dynamic configuration to the appropriate version of the
> memory copy isn't worth the hassle and extra EMM386 code.

Almost agreed - but aligning the rep movsd instruction itself to a
multiple (IP wise) of 4 or 8 would still be a good idea, to make sure
that it does not straddle a cache line boundary. This affects the VDS
TRANSFER_BUFF, CHK_CHANNEL, the EMM386 SIMULATE_INT1587, the EMS 4.0
(less important) ems4_memory_region, HIMEM xms_move_xms, and would be
easy to implement and test (for whether it improves speed) for all the
TASM owners out there ;-). I know that chances are relatively low (rep
movsd is 3 bytes of opcode if you are in a 16bit CS, but cache lines are
N*16 bytes big), but you know Murphy's Law - what can go wrong will go
wrong. 5 explicit ALIGN 4 or ALIGN 8 commands in the source code give
nice and straightforward protection here.

Eric

_______________________________________________
Freedos-devel mailing list
https://lists.sourceforge.net/lists/listinfo/freedos-devel
