(Replying to multiple mails again.)

On at 2023-04-21 19:03 -0400, jer...@shidel.net wrote:
Hi Tom,

On Apr 21, 2023, at 2:01 PM, tom ehlert <t...@drivesnapshot.de> wrote:

Hi ecm,


[1]: 
https://gitlab.com/DOSx86/logger/-/blob/aae3dfddcdacfea18950a96ce9449767c20b2d66/source/logger/common.inc#L267

this got me looking into this 'too slow' detection method.
and it is indeed slow. as in molasses. let me explain.


a) isn't

  %%Comparing:
        inc             di
        lodsb
        cmp             al, [es:di]
        jne             %%Next
        loop            %%Comparing

more or less the definition of

        repe cmpsw

??

Yes, more or less it was a “repe cmpsb”

That was a while ago and I forget why I did it that way. Possibly, 
to avoid requiring a case-specific match.

b) decompiling the actual code, it's basically


  for (seg = 10; seg < 0xa000; seg++)
        {
        if (fmemcmp(MK_FP(seg, 9), "LOGGER", DriverLength) == 0)
                {
                return found_our_driver_at(seg);
                }
        }
   return failure;

that's indeed slow as it compares all memory up to 0xa000 to look up the driver,
or up to the driver's address (which is much better most of the time).

Yup, it is a very slow way to locate the driver. It was only meant to be 
temporary until a better method was implemented.

The better method will most likely be INT 0x2D (AMIS).

During the development stage, locating the driver needed to be done quickly so 
other things could be developed and tested.

That process went something like this:

1) We could walk the MCB chain…. That will require some overhead and 
complexity. Too big of a pain for now (see below).

Commented by me below.

2) We could walk the device driver chain. That’s fairly straightforward and 
easy enough to implement. Let’s do that… Hmmm, not all device drivers are 
showing up in the chain and logger is not being seen. Let’s not worry about 
that for now and do something else.

Don't know how you managed to do that, it is indeed straightforward.

3) We could hook an interrupt somewhere. Yeah, that will be good and reliable. 
Let’s do that. But which one will not cause any conflicts or collisions? Hmmm, 
let’s worry about it later and just get something that works for now.
4) Well a brute force search will work. It is slower than a glacier and as 
clever as a stone. But, it will work well enough that I can get to work on 
writing and testing the important stuff.
5) Brute force search it is then with a very few optimizations… For now, good 
enough.

Even though the delay incurred when launching the interface program is not very 
noticeable, it is way too slow.

Now that the important stuff has been written and is being tested, the brute 
force search needs to go.

when I mentioned microseconds, I had the DOS memory chain in mind where you 
would have to compare 20-50 locations to your driver's name.

I did consider walking the MCB chain to find it. But, that comes with its own 
set of problems. Some blocks contain sub-blocks.

SD "system data" blocks with sub-MCBs are a fact on most kernels, but they're not terribly complicated to detect and handle. First two name bytes are "SD" and the owner is 8. Sub-MCBs have signature letters indicating the type of the block rather than "M" link / "Z" no link, and they always span the entire space up to the end of the container SD block.
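The SD handling described above can be sketched in C. This is only an illustration under stated assumptions, not code from Logger or from my TSR example: it runs against a simulated memory image (a byte array indexed by paragraph) instead of real far pointers, it assumes the first sub-MCB sits in the paragraph directly after the SD block's own MCB, and the names find_block, rd16 and name_is are made up for this sketch. The 8-byte name being searched for is likewise a placeholder.

```c
#include <stdint.h>
#include <string.h>

/* Read a little-endian 16-bit word from the simulated memory image. */
static uint16_t rd16(const uint8_t *mem, uint32_t off)
{
    return (uint16_t)(mem[off] | (mem[off + 1] << 8));
}

/* Nonzero if the 8-byte name field (offset 8) of the block header at
   paragraph seg equals name8. */
static int name_is(const uint8_t *mem, uint32_t seg, const char *name8)
{
    return memcmp(mem + seg * 16 + 8, name8, 8) == 0;
}

/* Walk the MCB chain starting at paragraph `first`. `paras` is the size of
   the simulated image in paragraphs. Returns the paragraph of the first MCB
   or sub-MCB whose name matches, or 0 if none is found. */
uint16_t find_block(const uint8_t *mem, uint32_t paras, uint16_t first,
                    const char *name8)
{
    uint32_t seg = first;

    while (seg < paras) {
        uint32_t off = seg * 16;
        uint8_t sig = mem[off];                /* 'M' = link, 'Z' = last  */
        uint16_t owner = rd16(mem, off + 1);   /* owner, 8 = system       */
        uint32_t size = rd16(mem, off + 3);    /* size in paragraphs      */

        if (sig != 'M' && sig != 'Z')
            return 0;                          /* not a valid MCB: stop   */

        if (owner == 8 && mem[off + 8] == 'S' && mem[off + 9] == 'D') {
            /* SD container: sub-MCBs fill it up to its end (assumption:
               the first one starts right after the container's MCB). */
            uint32_t sub = seg + 1;
            uint32_t end = seg + 1 + size;
            while (sub < end && sub < paras) {
                if (name_is(mem, sub, name8))
                    return (uint16_t)sub;
                sub += 1 + rd16(mem, sub * 16 + 3);
            }
        } else if (name_is(mem, seg, name8)) {
            return (uint16_t)seg;
        }

        if (sig == 'Z')
            return 0;                          /* end of this chain       */
        seg += 1 + size;
    }
    return 0;
}
```

This version honours any "Z" as the final end marker, so on a real system it would only see the UMBs if they are linked into the chain at the time of the walk.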

The upper memory blocks may or may not be linked into the primary MCB chain.

This is more of a problem. You can ignore a "Z" (ie, treat it as if it is an "M") if you know that the next block after the "Z" block is the first UMCB. You can find the first UMCB by calling interrupt 2Fh, function 1261h, with CY set on entry; if it returns NC then ax = first UMCB. Or if the kernel does not support that function, switch off the UMB link, walk until "Z", switch on the UMB link, walk until you're past the MCB that was "Z" prior to this (but is now "M"). Of course, you want to preserve the original UMB link state around this. Example code is in my TSR example [1].

For a single search, it may be more efficient to just do the UMB link state enabling around your MCB walker as is, and honor any "Z" as the final end marker then.


[1]: https://hg.pushbx.org/ecm/tsr/file/749e0f25364c/transien.asm#l96


There are other aspects involving interaction between the Driver, Interface and 
Log that are also just “good enough for now” and will probably be changed a 
great deal.

It is an alpha version for a reason.

:-)


On at 2023-04-23 19:17 -0400, jer...@shidel.net wrote:
For “fun”, I implemented initial versions of both the 0x2b interrupt hook and a partial 0x2d implementation to locate the Logger device driver.
The 0x2d version is very bare bones and does not include several functions 
required to be “fully compliant”. It only responds to the install check 
function for whichever multiplexer it allocates. Other function requests only 
respond with AL=0x00 which I guess is “not implemented”.

Yes, setting al to zero is the way to indicate a function is not implemented.

As such, 0x2d uses 35 bytes more resident memory than 0x2b. (actually 49 bytes 
more if you count a “title” string that is not required). If I spend the time 
making it fully compliant, it will require even more. It really would need 
function 04 (determine chained ints) and should have 06 (device driver info).

Why not show your examples? Perhaps we can point out some optimisations to shave off a few bytes here and there. I have some ideas about how to write a very small int 2Dh handler, eg:

 int_2D:
  ; IISP header here

  ; magic byte sequence common to Ralf Brown and ecm programs:
  cmp ah, 0
.multiplex_number: equ $ - 1
  ; SMC, storing the multiplex number in the instruction immediate
  je .handle
  ; magic byte sequence end
  jmp far [cs:.next]
.handle:
  cmp al, 0
  ; check if installation check
  mov al, 0
  ; prepare "function not supported" return code
  jne .iret
  ; jump to iret if not installation check -->
  ; (flags still as set from the compare instruction)
  mov al, 0FFh
  mov cx, 0100h
  mov di, amis_signature
  mov dx, cs
.iret:
  iret


ETA: I found that you actually already uploaded your work! Nicely done. Here's what I am referring to [2].

No need to reserve 15 bytes for the AMIS signature description, just let the assembler do its job and allocate exactly as many bytes as needed.

The multiplex number, as in my example above, can be embedded into the code of the handler. (I do not call this the "multiplexer" as you do here, rather, "multiplexer" in my use refers to the resident program. The number is just the "multiplex number".) This is what both Ralf Brown's examples and my resident programs do.

If you do put the multiplex number there at the end of the signatures, it may save (very little) space to initialise it as zero rather than minus one, if one were to compress your binary.

Preserving bx around the scan call is not needed; if someone uses int 2Dh in a way incompatible to AMIS then this may not suffice to save you. At least in my opinion, I just assume that only AMIS users handle it.

The %%InvalidResponse branching is a good idea, as that indicates the multiplex number is not "in use" by AMIS but also not "free" by AMIS. However, I would have it branch to %%Next rather than to %%NotFound.

The cld is probably not needed here. A DOS application executable can assume it is started with UP (direction flag clear). A DOS device driver usually can assume as much too; if you are really afraid of DOS passing DN (direction flag set) then put the cld at the beginning of your init handler.

The actual interrupt handler does not have to preserve the flags it gets from its entrypoint (your pushf / popf). If you drop those instructions, then after the xor you can just put a single iret instruction instead of a short jump.

The comment in the InstallHook macro wrongly refers to "int 0x2b" here, should be 2Dh.


In the common.inc file [3], in the macro CheckCompatible you can just change this code sequence:


        je      %%Compatible
        stc
        jmp     %%Done
%%Compatible:
        clc
%%Done:

To this:

  je %%Done
  ; flags are ZR NC if branching here
  stc
  ; ensure CY if we didn't branch
%%Done:


In the log-sys.asm file [4], you can optimise the device entrypoints somewhat:

First, do not use a "Driver.Strategy" that saves away the request header address. Instead, move all your processing (from "Driver.Routine") into the strategy entrypoint and access the request header using es:bx. Your "interrupt" entrypoint is just a single retf then (which you can share with the resident handler's instruction that returns from your strategy entrypoint). This is compatible with all common DOS kernels, certainly with the FreeDOS kernel.

Second, you can write the resident device entrypoint to just always return the status 8103h to the request header's status field. Make your entrypoint code so that "Initialize" is called only initially, when command 0 is sent, and later in the installation code patch your entrypoint to run the minimal resident handler that only returns the error.

Finally, if you do *not* use the IISP headers (which I strongly recommend you do use them), you can write the chain instruction as "jmp 0:0" and then "BIOSInt10 equ $ - 4" after that instruction, so as to do SMC that patches the immediate far jump instruction. That saves 4 bytes per chain instruction.
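To spell out what the second suggestion stores, here is a small C model of the status word. It is only a sketch: the request header is modelled as a plain byte buffer rather than es:bx, and the names resident_strategy and the constants are invented for this example. The layout it relies on is the usual one, with the status word at offset 3 of the request header, bit 15 as the error flag, bit 8 as the done flag and the low byte as the error code (03h, unknown command).

```c
#include <stdint.h>

#define REQ_STATUS_OFFSET   3        /* status word in the request header */
#define STATUS_ERROR        0x8000u  /* bit 15: error                     */
#define STATUS_DONE         0x0100u  /* bit 8: done                       */
#define ERR_UNKNOWN_COMMAND 0x03u    /* low byte: error code              */

/* Minimal resident entrypoint model: whatever command was requested, report
   "error, done, unknown command" (8103h) in the request header. */
void resident_strategy(uint8_t *request_header)
{
    uint16_t status = STATUS_ERROR | STATUS_DONE | ERR_UNKNOWN_COMMAND;

    /* The status word is stored little-endian. */
    request_header[REQ_STATUS_OFFSET] = (uint8_t)(status & 0xFF);
    request_header[REQ_STATUS_OFFSET + 1] = (uint8_t)(status >> 8);
}
```

In the actual driver this would be a handful of instructions writing the word through es:bx followed by the shared retf.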

I didn't check all the other resident code for optimisations yet, but I can do so if you're interested.


[2]: https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/hookamis.inc
[3]: https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/common.inc
[4]: https://gitlab.com/DOSx86/logger/-/blob/677389417b24f12548c58ea1a0cfc96510a0377f/source/logger/log-sys.asm


I think overall, it might be better to figure out why walking the device driver 
chain was not working correctly. Most likely, it was some dumb thing I was 
doing wrong. It would have the smallest resident footprint of the locating 
schemes.

That is true.

I’ll probably take another look at that. Or maybe, I’ll just stick with the 0x2d multiplexer despite its larger footprint.
All are better (and much faster) than the temporary brute force search in 
previous versions.

:-)


On at 2023-04-24 08:24 -0400, jer...@shidel.net wrote:
I implemented walking the device driver chain to locate the Logging Driver. Not sure what dumb thing I did wrong the first time I looked at it. But, it seems to work fine now.
After writing the 3 different methods to locate the driver, here are the 
results for the current ALPHA version.

(COM=LOGGER interface binary, SYS=LOGGER device driver)

1) When using the 0x2b interrupt hook, COM size 4460, SYS size 3006, SYS resident size 1216. This is definitely the fastest method to locate the driver. It is a single interrupt call with verification and it is done. But as discussed before, it is a “new standard” that most likely should be avoided.
2) When using the partial implementation of the AMIS interrupt 0x2d 
multiplexer, COM size 4476, SYS size 3055, SYS resident size 1232.

So the resident size is only 16 bytes more for the minimal AMIS multiplexer? The 49 bytes you mentioned before seem to be the SYS executable's file size rather than resident. I think optimising the resident size is more important.

If it is made “fully compliant” with the AMIS protocol, it will add a minimum 
of 22 more bytes. Most likely, it will eventually be double or triple that. There 
are advantages to using AMIS. But, it requires calling the interrupt 255 times to 
check every multiplexer.

Up to 256 times, actually.

If you do not expect to install multiple instances of the same multiplexer program, then you can start your searches at the same end, both for detecting the resident instance as well as for determining a free multiplex number during installation. (Which you already do it turns out, so let's just consider this a comment on my part.) Eg start both loops at 00h and iterate until you have tried all numbers past 0FFh.

That means *if* the resident is already installed, it will be found early within the first few allocated multiplex numbers. (Immediately on the first installation check, if you do not have any other AMIS multiplexers installed currently.) It doesn't matter in what particular order you search, you just should search in the same order for both cases.
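The effect of searching in the same order can be shown with a small C simulation. This is a sketch under assumptions, not the Logger code: the 256 multiplex numbers are modelled as a table of signature pointers (NULL meaning the installation check reports the number as free) instead of real int 2Dh calls, the 16-byte signatures used with it are placeholders, and the function names are made up.

```c
#include <stddef.h>
#include <string.h>

#define MPX_COUNT 256   /* multiplex numbers 00h..FFh */

/* Installer side: return the first free multiplex number, scanning upwards
   from 00h, or -1 if all 256 are taken. */
int amis_find_free(const char *const table[MPX_COUNT])
{
    for (int mpx = 0; mpx < MPX_COUNT; mpx++)
        if (table[mpx] == NULL)
            return mpx;
    return -1;
}

/* Detection side: scan in the same order and compare the 16-byte signature
   (8 bytes vendor, 8 bytes product). *checks receives the number of
   simulated installation checks performed. Returns the multiplex number of
   the resident instance, or -1 if it is not installed. */
int amis_find_resident(const char *const table[MPX_COUNT],
                       const char *sig16, int *checks)
{
    *checks = 0;
    for (int mpx = 0; mpx < MPX_COUNT; mpx++) {
        (*checks)++;
        if (table[mpx] != NULL && memcmp(table[mpx], sig16, 16) == 0)
            return mpx;
    }
    return -1;
}
```

With one other multiplexer resident at 00h, the installer picks 01h and a later detection run finds the instance on its second check. Only the "not installed" case has to go through all 256 numbers.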

Other than that, I would urge you to use the IISP headers even if you do not include any other part of AMIS. It is helpful for everyone to make your downlinks available. (This includes all interrupts you're hooking, such as your interrupt 10h hook.)


If fully implemented, it would also increase the resident footprint by roughly 
10% of the current requirements. If additional functionality is added to the 
driver (things like screen capture, hot keys, 3rd party driver control 
interface, etc) the larger size would be less important. Plus, it would provide 
a “semi-official” interface to that additional driver functionality.

Indeed.

3) When walking the device driver chain, COM size 4480, SYS size 2966, SYS resident size 1168. This is also very quick and should only require checking a couple links in the device driver chain. It requires no extra code in the driver and has the smallest memory resident footprint of the three methods. It seems to work great under FreeDOS, MS-DOS and PC-DOS. There could be compatibility issues under other DOS distributions. However, there are some safeguards in place to prevent getting stuck if walking an invalid device driver chain. Although it would be nice, I do not really care if it is not compatible with other DOS platforms.
I think for the time being, walking the driver chain will be the way to go. 
Eventually, it will probably be moved to AMIS in a future version.
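For readers following along, the device chain walk with a safeguard against a broken chain might look roughly like the C sketch below. It is not the Logger implementation: memory is simulated as a flat byte array, the hop limit MAX_HOPS is an arbitrary assumption, and find_device, farptr and the device name passed in are invented for illustration. In a real program the starting pointer would come from DOS (typically the NUL device header). The header layout used is the standard one: next pointer at offset 0 (offset word, then segment word), attribute word at offset 4, 8-byte name at offset 10, and an offset of FFFFh marking the end of the chain.

```c
#include <stdint.h>
#include <string.h>

#define MAX_HOPS 128    /* assumption: give up on chains longer than this */

typedef struct { uint16_t off, seg; } farptr;

/* Real-mode segment:offset to a linear index into the simulated image. */
static uint32_t linear(farptr p)
{
    return (uint32_t)p.seg * 16u + p.off;
}

/* Read a little-endian 16-bit word from the simulated memory image. */
static uint16_t rd16(const uint8_t *mem, uint32_t off)
{
    return (uint16_t)(mem[off] | (mem[off + 1] << 8));
}

/* Walk the device driver chain starting at `head`. Returns 1 and stores the
   header address in *found if a character device named name8 exists,
   otherwise returns 0. */
int find_device(const uint8_t *mem, uint32_t mem_size, farptr head,
                const char *name8, farptr *found)
{
    farptr cur = head;

    for (int hops = 0; hops < MAX_HOPS; hops++) {
        uint32_t lin = linear(cur);
        if (lin + 18 > mem_size)
            return 0;                     /* header outside the image     */

        /* Only character devices (attribute bit 15) carry a name. */
        if ((rd16(mem, lin + 4) & 0x8000u) != 0 &&
            memcmp(mem + lin + 10, name8, 8) == 0) {
            *found = cur;
            return 1;
        }

        farptr next = { rd16(mem, lin), rd16(mem, lin + 2) };
        if (next.off == 0xFFFFu)
            return 0;                     /* regular end of the chain     */
        cur = next;
    }
    return 0;                             /* too many hops: likely a loop */
}
```

The hop limit is what keeps the walk from getting stuck if a header points back into the chain.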

As for using a single binary that doubles as both the device driver and driver 
interface, I’m still undecided. There are good reasons for doing that. First, 
it would be just a single binary to worry about. Second, the overall size would 
be reduced by not duplicating the code shared between them.

I agree. That's why, for example, the lDebug executable is a triple-mode program, it can be loaded either as a DOS application ("EXE mode"), DOS device driver (device mode), or instead of a kernel (boot loaded mode). While building specific special-purpose builds for just one mode is possible using build options, and that does save some disk space and a bit of resident memory use, the default is to build the "go everywhere, do everything" binary.

But, there are some drawbacks as well. Although the total size would be 
smaller, the single binary would be much larger than the two individual 
binaries. Even though the resident footprint of the driver would remain 
unchanged, it would require a larger free memory block when initially loaded.

I think this concern is of very little value.

It would also require loading everything each time the interface program is 
used to directly append the log. The extra couple KB won’t matter much when 
running from a hard disk or if caching is used. But if run from a floppy, the 
loading delay would add up with repeated execution. To be fair, that probably 
will not matter very much. Finally, if space is critical on a boot floppy, 
using two binaries allows having just the small driver on the boot diskette and 
putting the interface program on a separate diskette.

Easy, then: Provide build options to build the device-driver-only binary, or the all-in-one executable, from the same sources.

I think the advantages of a single binary probably outweigh having two separate 
ones. But like using AMIS and providing a 3rd party log control interface, I 
feel that would be better suited for a version 2.x release. For version 1.x, I 
think it may be better to use the SYS+COM pair of binaries. I’m still undecided.

:-)

In any case, thanks for your work!

Regards,
ecm



_______________________________________________
Freedos-devel mailing list
Freedos-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/freedos-devel
