Hi developers!

It seems that our XCOPY is noticeably slower than MS-DOS XCOPY.
(Ours is rxcopy, updated around 2005 to compile not only in Turbo
C++ 3 but also in Turbo C 2 and OpenWatcom, using Pat's light and
fast PRF printf, KITTEN localization etc.) Loading NANSI to speed
up screen writes also makes a noticeable difference, but the more
interesting difference is in disk I/O strategies.

Looking at the source, XCOPY does many checks per file, which is
good as-is, but there may be TWO places which could be optimized:

copy_file() has a 16 kB local buffer for copying, e.g. on stack.

Allocating a FIXED, ALIGNED 16 kB or 32 kB buffer which does not
cross a multiple of 64 kB in linear address space (a 64 kB DMA
boundary) should speed things up. Align it to a linear address
of the form ????0 or ???00 hex, i.e. to 16 or 256 bytes.
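
To make the buffer idea concrete, here is a minimal sketch of the
address arithmetic only. It is not taken from the XCOPY source:
linear_addr(), crosses_64k() and place_buffer() are hypothetical
helper names, and it assumes you over-allocate a raw block once
and then pick an aligned start inside it. The raw block's linear
address would come from the segment:offset of whatever the real
allocator returns.

```c
#include <assert.h>

#define BOUNDARY_64K 0x10000UL   /* 64 kB in linear address space */

/* Linear address of a real-mode segment:offset pair. */
unsigned long linear_addr(unsigned seg, unsigned off)
{
    return ((unsigned long)seg << 4) + off;
}

/* Nonzero if the range [lin, lin+len) crosses a multiple of 64 kB. */
int crosses_64k(unsigned long lin, unsigned long len)
{
    assert(len > 0);
    return (lin / BOUNDARY_64K) != ((lin + len - 1) / BOUNDARY_64K);
}

/*
 * Hypothetical helper: inside a raw block [raw_lin, raw_lin+raw_len),
 * find the first start address that is a multiple of align and lets
 * a buffer of len bytes sit without crossing a 64 kB multiple.
 * Returns 1 and stores the linear address in *out_lin on success,
 * 0 if the raw block is too small.
 */
int place_buffer(unsigned long raw_lin, unsigned long raw_len,
                 unsigned long len, unsigned long align,
                 unsigned long *out_lin)
{
    unsigned long start;

    /* align must be a power of two no larger than 64 kB */
    assert(align != 0 && align <= BOUNDARY_64K);
    assert((align & (align - 1)) == 0);
    assert(len > 0 && len <= BOUNDARY_64K);

    /* round up to the requested alignment (16 or 256 bytes, say) */
    start = (raw_lin + align - 1) & ~(align - 1);

    /* if the buffer would cross, start at the next 64 kB multiple,
       which is automatically aligned as well */
    if (crosses_64k(start, len))
        start = (start / BOUNDARY_64K + 1) * BOUNDARY_64K;

    if (start + len > raw_lin + raw_len)
        return 0;

    *out_lin = start;
    return 1;
}
```

With this scheme a raw block of 2 * len + align bytes is always
big enough, because skipping to the next boundary wastes less
than len bytes and the alignment wastes less than align bytes.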

xcopy_file() which calls copy_file() checks getdfree() for EACH
file, which might be slow (may have to count free FAT clusters).

Calling getdfree() less often and keeping track of changes could
speed up things here, BUT would have to deliberately err on the
cautious side by assuming file and directory creation to cost a
whole cluster each time: When the estimate predicts not enough
space to be free, XCOPY can call getdfree() again to get a more
accurate idea of disk space and re-sync the estimate with reality
before checking the space again and really giving up if no space.
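
The estimate-and-resync idea could look roughly like the sketch
below. Everything in it is an assumption for illustration, not
code from XCOPY: struct space_est, est_init() and est_reserve()
are made-up names, the real getdfree() call is hidden behind a
function pointer so the logic can be tried without a DOS disk,
and fake_query() is only a stand-in for it. The "one whole extra
cluster per file" rule is my reading of "err on the cautious
side".

```c
#include <assert.h>

/* Returns the number of free clusters. In XCOPY this would be a
   small wrapper around getdfree(); here it is a hypothetical hook. */
typedef unsigned long (*free_query_fn)(void);

struct space_est {
    unsigned long free_clusters;  /* pessimistic running estimate  */
    unsigned long cluster_bytes;  /* bytes per cluster             */
    unsigned long queries;        /* how often the disk was asked  */
    free_query_fn query;
};

/* Stand-in for the real disk query, for experiments only. */
unsigned long fake_free_clusters = 0;
unsigned long fake_query(void) { return fake_free_clusters; }

/* Data clusters rounded up, plus one whole cluster assumed for
   directory growth, so the estimate can only be too low. */
unsigned long clusters_needed(const struct space_est *e,
                              unsigned long size)
{
    unsigned long n = size / e->cluster_bytes;
    if (size % e->cluster_bytes != 0)
        n++;
    return n + 1;
}

void est_init(struct space_est *e, unsigned long cluster_bytes,
              free_query_fn query)
{
    assert(cluster_bytes > 0);
    e->cluster_bytes = cluster_bytes;
    e->query = query;
    e->free_clusters = query();   /* one real query up front */
    e->queries = 1;
}

/* Returns 1 and books the clusters if the file should fit.
   Only when the estimate says "no" is the disk asked again;
   returns 0 if even the fresh value is too small. */
int est_reserve(struct space_est *e, unsigned long size)
{
    unsigned long need = clusters_needed(e, size);

    if (need > e->free_clusters) {
        e->free_clusters = e->query();   /* re-sync with reality */
        e->queries++;
        if (need > e->free_clusters)
            return 0;                    /* really out of space  */
    }
    e->free_clusters -= need;
    return 1;
}
```

Because every file is booked with one cluster more than its data
needs, the estimate drifts below the real free space over time,
and each re-sync pulls it back up.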

As you see, those things are not THAT trivial, but not extremely
hard either. One problem is that you would need a REAL harddisk
to test that, on RAW hardware, as otherwise caches of your host
operating system would bend the benchmark results when you test
XCOPY speed in some sort of DOS window or emulator. A few 100 MB
of test data should be enough, you can use RUNTIME to check speed
and of course you should use a bit of CACHE, e.g. 5, 20 or 250 MB
of UHDD cache or 5 or 20 MB of LBACACHE, which should be flushed
before each benchmark run.

For comparison, FORMAT 0.91w contains 2 buffers of 10.5 kB size
each and uses whichever of the two happens to not cross a 64 kB
DMA boundary (see driveio.h and init.c) which is not so elegant,
but works quite okay.
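
For illustration, the "pick whichever of two buffers is safe"
approach boils down to something like the sketch below. This is
NOT copied from driveio.h or init.c; spans_dma_boundary() and
pick_dma_safe() are hypothetical names and the linear addresses
are passed in as plain numbers.

```c
#include <assert.h>

#define BUF_BYTES 10752UL   /* 10.5 kB, the size quoted above */

/* Nonzero if [lin, lin+len) contains a multiple of 64 kB inside it. */
int spans_dma_boundary(unsigned long lin, unsigned long len)
{
    assert(len > 0);
    return (lin >> 16) != ((lin + len - 1) >> 16);
}

/* Returns 0 if the first buffer is usable, 1 if only the second
   one is, and -1 if both cross a boundary. */
int pick_dma_safe(unsigned long lin_a, unsigned long lin_b,
                  unsigned long len)
{
    if (!spans_dma_boundary(lin_a, len))
        return 0;
    if (!spans_dma_boundary(lin_b, len))
        return 1;
    return -1;
}
```

If the two buffers sit back to back, they cover 21 kB in total,
so at most one 64 kB multiple can fall inside them and at least
one of the two is always usable. The price is 10.5 kB of memory
that is never used.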

So... Anybody with the right hardware who would enjoy doing a
bit of testing and cautious (#) tuning? Thanks in advance! :-)

Regards, Eric

(#) no full rewrites of XCOPY please :-D

PS: I got some odd results with DISKCOMP, does it work OK for you?
It tries to read and md5sum each disk, to compare them without the
need to swap disks often and without the need to store content :-)



_______________________________________________
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel
