(Replying to multiple mails again.)
On at 2023-04-21 19:03 -0400, jer...@shidel.net wrote:
Hi Tom,
On Apr 21, 2023, at 2:01 PM, tom ehlert <t...@drivesnapshot.de> wrote:
Hi ecm,
[1]:
https://gitlab.com/DOSx86/logger/-/blob/aae3dfddcdacfea18950a96ce9449767c20b2d66/source/logger/common.inc#L267
this got me looking into this 'too slow' detection method.
and it is indeed slow. as in molasses. let me explain.
a) isn't
%%Comparing:
        inc di
        lodsb
        cmp al, [es:di]
        jne %%Next
        loop %%Comparing
more or less the definition of
repe cmpsw
??
Yes, more or less it was a “repe cmpsb”
That was a while ago and I forget the reasoning I did it that way. Possibly,
not requiring a case-specific match.
b) decompiling the actual code, it's basically
for (seg = 10; seg < 0xa000; seg++)
{
    /* fmemcmp returns 0 when the bytes match */
    if (fmemcmp(MK_FP(seg, 9), "LOGGER", DriverLength) == 0)
    {
        return found_our_driver_at(seg);
    }
}
return failure;
that's indeed slow, as it compares all memory up to 0xa000 to look up the driver,
or up to the driver's address (which is much better most of the time).
Yup, it is a very slow way to locate the driver. It was only meant to be
temporary until a better method was implemented.
The better method will most likely be INT 0x2D (AMIS).
During the development stage, locating the driver needed to be done quickly so
other things could be developed and tested.
That process went something like this:
1) We could walk the MCB chain…. That will require some overhead and
complexity. Too big of a pain for now (see below).
Commented by me below.
2) We could walk the device driver chain. That's fairly straightforward and
easy enough to implement. Let's do that… Hmmm, not all device drivers are
showing up in the chain and logger is not being seen. Let’s not worry about
that for now and do something else.
Don't know how you managed to do that, it is indeed straightforward.
3) We could hook an interrupt somewhere. Yeah, that will be good and reliable.
Let's do that. But which one will not cause any conflicts or collisions? Hmmm,
let's worry about it later and just get something that works for now.
4) Well a brute force search will work. It is slower than a glacier and as
clever as a stone. But, it will work well enough that I can get to work on
writing and testing the important stuff.
5) Brute force search it is then, with very few optimizations… For now, good
enough.
Even though the delay incurred when launching the interface program is not very
noticeable, it is way too slow.
Now that the important stuff has been written and is being tested, the brute
force search needs to go.
when I mentioned microseconds, I had the DOS memory chain in mind, where you
would have to compare 20-50 locations to your driver's name.
I did consider walking the MCB chain to find it. But, that comes with its own
set of problems. Some blocks contain sub-blocks.
SD "system data" blocks with sub-MCBs are a fact on most kernels, but
they're not terribly complicated to detect and handle. First two name
bytes are "SD" and the owner is 8. Sub-MCBs have signature letters
indicating the type of the block rather than "M" link / "Z" no link, and
they always span the entire space up to the end of the container SD block.
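To illustrate the SD handling described above, here is a rough C sketch that walks an MCB chain in a simulated memory image and descends into SD containers. It is only a model, not code from Logger or from my TSR example: put_mcb and find_block are hypothetical names, the image is a plain byte array rather than real memory, and it assumes that the sub-MCB name field holds the name you are looking for.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PARA = 16, NAME_OFF = 8, NAME_LEN = 8 };

/* Read a little-endian word from the simulated memory image. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Test helper (hypothetical): write one MCB or sub-MCB header at seg. */
void put_mcb(uint8_t *mem, uint16_t seg, char sig,
             uint16_t owner, uint16_t size, const char *name)
{
    uint8_t *m = mem + (size_t)seg * PARA;
    size_t n = strlen(name);

    memset(m, 0, PARA);
    m[0] = (uint8_t)sig;
    m[1] = (uint8_t)(owner & 0xFF);
    m[2] = (uint8_t)(owner >> 8);
    m[3] = (uint8_t)(size & 0xFF);
    m[4] = (uint8_t)(size >> 8);
    memcpy(m + NAME_OFF, name, n < NAME_LEN ? n : NAME_LEN);
}

/* An SD container has owner 8 and a name starting with "SD". */
static int is_sd_container(const uint8_t *mcb)
{
    return rd16(mcb + 1) == 8
        && mcb[NAME_OFF] == 'S' && mcb[NAME_OFF + 1] == 'D';
}

/*
 * Return the segment of the first MCB or sub-MCB whose name field starts
 * with name, or 0 if the walk ends without a match. limit is the first
 * paragraph past the simulated memory.
 */
uint16_t find_block(const uint8_t *mem, uint16_t first_mcb,
                    uint16_t limit, const char *name)
{
    size_t n = strlen(name);
    uint32_t seg = first_mcb;

    if (n == 0 || n > NAME_LEN)
        return 0;
    while (seg < limit) {
        const uint8_t *mcb = mem + seg * PARA;
        uint32_t next = seg + 1 + rd16(mcb + 3);

        if (mcb[0] != 'M' && mcb[0] != 'Z')
            return 0;                   /* corrupt chain */
        if (is_sd_container(mcb)) {
            /* Sub-MCBs tile the container from seg + 1 up to next. */
            uint32_t sub = seg + 1;
            while (sub < next && sub < limit) {
                const uint8_t *s = mem + sub * PARA;
                if (memcmp(s + NAME_OFF, name, n) == 0)
                    return (uint16_t)sub;
                sub += 1 + rd16(s + 3);
            }
        } else if (memcmp(mcb + NAME_OFF, name, n) == 0) {
            return (uint16_t)seg;
        }
        if (mcb[0] == 'Z')
            return 0;                   /* end of chain */
        seg = next;
    }
    return 0;
}
```

The sub-MCB loop does not look at the signature letter at all; it relies only on the sub-blocks spanning the container exactly.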
The upper memory blocks may or may not be linked into the primary MCB chain.
This is more of a problem. You can ignore a "Z" (ie, treat it as if it
is an "M") if you know that the next block after the "Z" block is the
first UMCB. You can find the first UMCB by trying to call interrupt 2Fh,
function 1261h, with CY (carry flag set) on entry; if it returns NC then
ax = first UMCB. Or if the kernel does not support that function, switch
off the UMB link, walk
until "Z", switch on the UMB link, walk until you're past the MCB that
was "Z" prior to this (but is now "M"). Of course, you want to preserve
the original UMB link state around this. Example code is in my TSR
example [1].
For a single search, it may be more efficient to just do the UMB link
state enabling around your MCB walker as is, and honor any "Z" as the
final end marker then.
[1]: https://hg.pushbx.org/ecm/tsr/file/749e0f25364c/transien.asm#l96
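The "ignore a Z if the next block is the first UMCB" rule can be sketched in C against a simulated memory image. This is a model under assumptions, not the code from [1]: walk_chain and put_mcb are hypothetical names, NO_UMCB is a sentinel chosen for this sketch, and how first_umcb is obtained (interrupt 2Fh or the link toggling described above) is left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { PARA = 16 };
#define NO_UMCB 0xFFFFu  /* sentinel for this sketch: first UMCB unknown */

/* Read a little-endian word from the simulated memory image. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Test helper (hypothetical): write a minimal MCB header at seg. */
void put_mcb(uint8_t *mem, uint16_t seg, char sig, uint16_t size)
{
    uint8_t *m = mem + (size_t)seg * PARA;

    memset(m, 0, PARA);
    m[0] = (uint8_t)sig;
    m[3] = (uint8_t)(size & 0xFF);
    m[4] = (uint8_t)(size >> 8);
}

/*
 * Store the segments of all visited MCBs in out (at most max entries)
 * and return how many were visited. A "Z" block only ends the walk if
 * the block following it is not the known first UMCB.
 */
size_t walk_chain(const uint8_t *mem, uint16_t limit, uint16_t first_mcb,
                  uint16_t first_umcb, uint16_t *out, size_t max)
{
    size_t count = 0;
    uint32_t seg = first_mcb;

    while (seg < limit && count < max) {
        const uint8_t *mcb = mem + seg * PARA;
        uint32_t next = seg + 1 + rd16(mcb + 3);

        if (mcb[0] != 'M' && mcb[0] != 'Z')
            break;                      /* corrupt chain */
        out[count++] = (uint16_t)seg;
        if (mcb[0] == 'Z'
            && (first_umcb == NO_UMCB || next != first_umcb))
            break;                      /* genuine end of chain */
        seg = next;                     /* "M", or "Z" treated as "M" */
    }
    return count;
}
```

With first_umcb set to NO_UMCB the walker honors the first "Z" as the end marker, which matches the simpler single-search approach.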
There are other aspects involving interaction between the Driver, Interface and
Log that are also just “good enough for now” and will probably be changed a
great deal.
It is an alpha version for a reason.
:-)
On at 2023-04-23 19:17 -0400, jer...@shidel.net wrote:
For “fun”, I implemented initial versions of both the 0x2b interrupt hook and a
partial 0x2d implementation to locate the Logger device driver.
The 0x2d version is very bare bones and does not include several functions
required to be “fully compliant”. It only responds to the install check
function for whichever multiplexer it allocates. Other function requests only
respond with AL=0x00 which I guess is “not implemented”.
Yes, setting al to zero is the way to indicate a function is not
implemented.
As such, 0x2d uses 35 bytes more resident memory than 0x2b. (actually 49 bytes
more if you count a “title” string that is not required). If I spend the time
making it fully compliant, it will require even more. It really would need
function 04 (determine chained ints) and should have 06 (device driver info).
Why not show your examples? Perhaps we can point out some optimisations
to shave off a few bytes here and there. I have some ideas about how to
write a very small int 2Dh handler, eg:
int_2D:
        ; IISP header here
        ; magic byte sequence common to Ralf Brown and ecm programs:
        cmp ah, 0
.multiplex_number: equ $ - 1
                ; SMC, storing the multiplex number in the instruction immediate
        je .handle
        ; magic byte sequence end
        jmp far [cs:.next]
.handle:
        cmp al, 0       ; check if installation check
        mov al, 0       ; prepare "function not supported" return code
        jne .iret       ; jump to iret if not installation check -->
                        ; (flags still as set from the compare instruction)
        mov al, 0FFh    ; al = FFh: multiplex number is in use
        mov cx, 0100h   ; cx = version 1.00
        mov di, amis_signature
        mov dx, cs      ; dx:di -> AMIS signature
.iret:
        iret
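For anyone who prefers to read the register contract rather than the assembly, the same handler can be modelled in C. This is only an illustration of what the caller observes; struct regs, int2d_model and the two return constants are names made up for this sketch, and the real handler of course runs as an interrupt routine.

```c
#include <stdint.h>

/* The registers that matter for this handler. */
struct regs {
    uint8_t ah, al;
    uint16_t cx, dx, di;
};

enum { AMIS_CHAINED = 0, AMIS_HANDLED = 1 };

/*
 * Model of the minimal int 2Dh handler: returns AMIS_CHAINED when the
 * call is passed on to the next handler, AMIS_HANDLED otherwise.
 */
int int2d_model(struct regs *r, uint8_t multiplex_number,
                uint16_t cs, uint16_t signature_offset)
{
    if (r->ah != multiplex_number)
        return AMIS_CHAINED;        /* jmp far [cs:.next] */
    if (r->al != 0) {
        r->al = 0x00;               /* function not supported */
        return AMIS_HANDLED;
    }
    r->al = 0xFF;                   /* multiplex number is in use */
    r->cx = 0x0100;                 /* version 1.00 */
    r->dx = cs;                     /* dx:di -> AMIS signature */
    r->di = signature_offset;
    return AMIS_HANDLED;
}
```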
ETA: I found that you actually already uploaded your work! Nicely done.
Here's what I am referring to [2].
No need to reserve 15 bytes for the AMIS signature description, just let
the assembler do its job and allocate exactly as many bytes as needed.
The multiplex number, as in my example above, can be embedded into the
code of the handler. (I do not call this the "multiplexer" as you do
here, rather, "multiplexer" in my use refers to the resident program.
The number is just the "multiplex number".) This is what both Ralf
Brown's examples and my resident programs do.
If you do put the multiplex number there at the end of the signatures,
it may save (very little) space to initialise it as zero rather than
minus one, if one were to compress your binary.
Preserving bx around the scan call is not needed; if someone uses int
2Dh in a way incompatible with AMIS then this may not suffice to save you.
At least in my opinion, I just assume that only AMIS users handle it.
The %%InvalidResponse branching is a good idea, as that indicates the
multiplex number is not "in use" by AMIS but also not "free" by AMIS.
However, I would have it branch to %%Next rather than to %%NotFound.
The cld is probably not needed here. A DOS application executable can
assume it is started with UP (direction flag clear). A DOS device driver
usually can assume as much too; if you are really afraid of DOS passing
DN (direction flag set) then put the cld at the beginning of your init
handler.
The actual interrupt handler does not have to preserve the flags it gets
from its entrypoint (your pushf / popf). If you drop those instructions,
then after the xor you can just put a single iret instruction instead of
a short jump.
The comment in the InstallHook macro wrongly refers to "int 0x2b" here,
should be 2Dh.
In the common.inc file [3], in the macro CheckCompatible you can just
change this code sequence:
        je %%Compatible
        stc
        jmp %%Done
%%Compatible:
        clc
%%Done:
To this:
        je %%Done       ; flags are ZR NC if branching here
        stc             ; ensure CY if we didn't branch
%%Done:
In the log-sys.asm file [4], you can optimise the device entrypoints
somewhat:
First, do not use a "Driver.Strategy" that saves away the request header
address. Instead, move all your processing (from "Driver.Routine") into
the strategy entrypoint and access the request header using es:bx. Your
"interrupt" entrypoint is just a single retf then (which you can share
with the resident handler's instruction that returns from your strategy
entrypoint). This is compatible with all common DOS kernels, certainly
with the FreeDOS kernel.
Second, you can write the resident device entrypoint to just always
return the status 8103h to the request header's status field. Make your
entrypoint code so that "Initialize" is called only initially, when
command 0 is sent, and later in the installation code patch your
entrypoint to run the minimal resident handler that only returns the error.
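To spell out what the status 8103h means, here is a small C sketch that composes the word from its parts and stores it in a request header. The constant and function names are made up for this sketch; the layout it assumes is the usual one, with the status word at offset 3 of the request header, bit 15 as the error flag, bit 8 as the done flag and the low byte as the error code (03h, unknown command).

```c
#include <stdint.h>

enum {
    DEV_STATUS_ERROR = 0x8000,        /* bit 15: error */
    DEV_STATUS_DONE = 0x0100,         /* bit 8: done */
    DEV_ERR_UNKNOWN_COMMAND = 0x03,   /* low byte: error code */
    REQ_STATUS_OFFSET = 3             /* status word in request header */
};

/* Compose the "error, done, unknown command" status word. */
uint16_t unknown_command_status(void)
{
    return (uint16_t)(DEV_STATUS_ERROR | DEV_STATUS_DONE
                      | DEV_ERR_UNKNOWN_COMMAND);
}

/* What the minimal resident handler does with every request. */
void reject_request(uint8_t *request_header)
{
    uint16_t status = unknown_command_status();

    request_header[REQ_STATUS_OFFSET] = (uint8_t)(status & 0xFF);
    request_header[REQ_STATUS_OFFSET + 1] = (uint8_t)(status >> 8);
}
```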
Finally, if you do *not* use the IISP headers (though I strongly
recommend that you do use them), you can write the chain instruction as "jmp
0:0" and then "BIOSInt10 equ $ - 4" after that instruction, so as to do
SMC that patches the immediate far jump instruction. That saves 4 bytes
per chain instruction.
I didn't check all the other resident code for optimisations yet, but I
can do so if you're interested.
[2]:
https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/hookamis.inc
[3]:
https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/common.inc
[4]:
https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/log-sys.asm
I think overall, it might be better to figure out why walking the device driver
chain was not working correctly. Most likely, it was some dumb thing I was
doing wrong. It would have the smallest resident footprint of the locating
schemes.
That is true.
I’ll probably take another look at that. Or maybe, I’ll just stick with the
0x2d multiplexer despite its larger footprint.
All are better (and much faster) than the temporary brute force search in
previous versions.
:-)
On at 2023-04-24 08:24 -0400, jer...@shidel.net wrote:
I implemented walking the device driver chain to locate the Logging Driver. Not
sure what dumb thing I did wrong the first time I looked at it. But, it seems to
work fine now.
After writing the 3 different methods to locate the driver, here are the
results for the current ALPHA version.
(COM=LOGGER interface binary, SYS=LOGGER device driver)
1) When using the 0x2b interrupt hook, COM size 4460, SYS size 3006, SYS
resident size 1216. This is definitely the fastest method to locate the driver.
It is a single interrupt call with verification and it is done. But as discussed
before, it is a “new standard” that most likely should be avoided.
2) When using the partial implementation of the AMIS interrupt 0x2d
multiplexer, COM size 4476, SYS size 3055, SYS resident size 1232.
So the resident size is only 16 bytes more for the minimal AMIS
multiplexer? The 49 bytes you mentioned before seem to be the SYS
executable's file size rather than resident. I think optimising the
resident size is more important.
If it is made “fully compliant” with the AMIS protocol, it will add a minimum
of 22 more bytes. Most likely, it will eventually be double or triple that. There
are advantages to using AMIS. But, it requires calling the interrupt 255 times to
check every multiplexer.
Up to 256 times, actually.
If you do not expect to install multiple instances of the same
multiplexer program, then you can start your searches at the same end,
both for detecting the resident instance as well as for determining a
free multiplex number during installation. (Which you already do it
turns out, so let's just consider this a comment on my part.) Eg start
both loops at 00h and iterate until you have tried all numbers past 0FFh.
That means *if* the resident is already installed, it will be found
early within the first few allocated multiplex numbers. (Immediately on
the first installation check, if you do not have any other AMIS
multiplexers installed currently.) It doesn't matter in what particular
order you search, you just should search in the same order for both cases.
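A small C model of the search order argument, in case it helps. The table of slots stands in for the 256 installation checks (NULL meaning the check returned AL=00h, free); find_resident, find_free and the signatures in the usage are hypothetical and only serve the illustration.

```c
#include <stddef.h>
#include <string.h>

enum { AMIS_NUMBERS = 256, AMIS_SIG_LEN = 16 };

/*
 * slots[n] models the installation check for multiplex number n:
 * NULL means "free", otherwise it points to the 16-byte signature
 * (8 bytes vendor, 8 bytes product) of the resident program.
 */
int find_resident(const char *const *slots, const char *signature,
                  unsigned *checks)
{
    unsigned n;

    *checks = 0;
    for (n = 0; n < AMIS_NUMBERS; n++) {    /* 00h up to FFh */
        ++*checks;
        if (slots[n] != NULL
            && memcmp(slots[n], signature, AMIS_SIG_LEN) == 0)
            return (int)n;
    }
    return -1;                              /* not installed */
}

/* Pick a free multiplex number, searching in the same order. */
int find_free(const char *const *slots)
{
    unsigned n;

    for (n = 0; n < AMIS_NUMBERS; n++)
        if (slots[n] == NULL)
            return (int)n;
    return -1;                              /* all 256 in use */
}
```

With one other multiplexer at 00h and the driver installed at the first free number, the lookup is done after two checks; only the "not installed" case costs all 256.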
Other than that, I would urge you to use the IISP headers even if you do
not include any other part of AMIS. It is helpful for everyone to make
your downlinks available. (This includes all interrupts you're hooking,
such as your interrupt 10h hook.)
If fully implemented, it would also increase the resident footprint by roughly
10% of the current requirements. If additional functionality is added to the
driver (things like screen capture, hot keys, 3rd party driver control
interface, etc) the larger size would be less important. Plus, it would provide
a “semi-official” interface to that additional driver functionality.
Indeed.
3) When walking the device driver chain, COM size 4480, SYS size 2966, SYS
resident size 1168. This is also very quick and should only require checking a
couple links in the device driver chain. It requires no extra code in the driver
and has the smallest memory resident footprint of the three methods. It seems to
work great under FreeDOS, MS-DOS and PC-DOS. There could be compatibility issues
under other DOS distributions. However, there are some safeguards in place to
prevent getting stuck if walking an invalid device driver chain. Although it
would be nice, I do not really care if it is not compatible with other DOS
platforms.
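For reference, a guarded device chain walk might look roughly like this when modelled in C over a simulated memory image. This is a sketch and not the Logger code: find_device, put_dev, the hop limit and the device name "LOGGERxx" are all assumptions made for the example. It uses the standard device header layout (far pointer to the next header at offset 0, attribute word at offset 4, 8-byte name at offset 10) and stops at a next offset of FFFFh.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NOT_FOUND UINT32_MAX

enum {
    DEV_HDR_SIZE = 18,
    DEV_ATTR_CHAR = 0x8000,     /* attribute bit 15: character device */
    DEV_NAME_OFF = 10,
    DEV_NAME_LEN = 8
};

/* Read a little-endian word from the simulated memory image. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Test helper (hypothetical): write a character device header. */
void put_dev(uint8_t *mem, uint16_t seg, uint16_t off,
             uint16_t next_seg, uint16_t next_off, const char *name8)
{
    uint8_t *h = mem + (size_t)seg * 16 + off;

    memset(h, 0, DEV_HDR_SIZE);
    h[0] = (uint8_t)(next_off & 0xFF);
    h[1] = (uint8_t)(next_off >> 8);
    h[2] = (uint8_t)(next_seg & 0xFF);
    h[3] = (uint8_t)(next_seg >> 8);
    h[5] = 0x80;                            /* attributes = 8000h */
    memcpy(h + DEV_NAME_OFF, name8, DEV_NAME_LEN);
}

/*
 * Walk the chain starting at seg:off and return the linear address of
 * the character device named name8 (8 bytes, blank padded), or
 * NOT_FOUND. max_hops guards against cyclic or corrupt chains.
 */
uint32_t find_device(const uint8_t *mem, size_t mem_size,
                     uint16_t seg, uint16_t off,
                     const char *name8, unsigned max_hops)
{
    unsigned hops;

    for (hops = 0; hops < max_hops; hops++) {
        uint32_t lin = (uint32_t)seg * 16 + off;
        const uint8_t *h;

        if (off == 0xFFFF || lin + DEV_HDR_SIZE > mem_size)
            return NOT_FOUND;               /* end of chain or bad link */
        h = mem + lin;
        if ((rd16(h + 4) & DEV_ATTR_CHAR)
            && memcmp(h + DEV_NAME_OFF, name8, DEV_NAME_LEN) == 0)
            return lin;
        off = rd16(h);                      /* follow the next pointer */
        seg = rd16(h + 2);
    }
    return NOT_FOUND;                       /* gave up after max_hops */
}
```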
I think for the time being, walking the driver chain will be the way to go.
Eventually, it will probably be moved to AMIS in a future version.
As for using a single binary that doubles as both the device driver and driver
interface, I’m still undecided. There are good reasons for doing that. First,
it would be just a single binary to worry about. Second, the overall size would
be reduced by not duplicating the code shared between them.
I agree. That's why, for example, the lDebug executable is a triple-mode
program: it can be loaded either as a DOS application ("EXE mode"), DOS
device driver (device mode), or instead of a kernel (boot loaded mode).
While building specific special-purpose builds for just one mode is
possible using build options, and that does save some disk space and a
bit of resident memory use, the default is to build the "go everywhere,
do everything" binary.
But, there are some drawbacks as well. Although the total size would be
smaller, the single binary would be much larger than the two individual
binaries. Even though the resident footprint of the driver would remain
unchanged, it would require a larger free memory block when initially loaded.
I think this concern is of very little value.
It would also require loading everything each time the interface program is
used to directly append the log. The extra couple KB won’t matter much when
running from a hard disk or if caching is used. But if run from a floppy, the
loading delay would add up with repeated execution. To be fair, that probably
will not matter very much. Finally if space is critical on a boot floppy,
using two binaries allows having just the small driver on the boot diskette and
putting the interface program on a separate diskette.
Easy, then: Provide build options to build the device-driver-only
binary, or the all-in-one executable, from the same sources.
I think the advantages of a single binary probably outweigh having two separate
ones. But like using AMIS and providing a 3rd party log control interface, I
feel that would be better suited for a version 2.x release. For version 1.x, I
think it may be better to use the SYS+COM pair of binaries. I’m still undecided.
:-)
In any case, thanks for your work!
Regards,
ecm
_______________________________________________
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel