I didn't previously note this, but this nop suggests that this code was
not written with an assembler. nop-padding allows you to expand and
shrink code sections without recalculating addresses.

This is why I'm currently extremely biased towards straight-line code
at the expense of program size: I've allowed myself an editor, but
not an assembler. Knuth claims that a single-pass bootstrapping
assembler such as MIXAL should be an afternoon project[0]; I hope
that he's referring to the time he took for his sixth and not for his
first.
(just checked some hand assembly: did 36 instructions in 27 minutes,
about 46 s/instruction to assemble and 75 s/address for the
relocations. That would say I should be willing to use many more
jumps, except that the mistakes I made during the hand assembly were:
(1x) wrong opcode due to poor handwriting, (2x) registers switched in
mov r,r, and (1x) my jump offset was completely bogus; the first two
sorts of mistakes I normally minimize by heavily copy-pasting working
code segments, but the offsets must be recalculated every time --at
least, as you noted, modulo the nop-padding!-- something changes in
size.)
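
The offset bookkeeping described above can be illustrated with a small sketch. It assumes the 8086 short-jump encoding (two bytes: opcode plus a signed 8-bit displacement counted from the end of the instruction); the function name `short_jump_offset` and the sample addresses are invented for the example.

```python
def short_jump_offset(jump_addr: int, target_addr: int) -> int:
    """Return the displacement byte for a 2-byte short jump (opcode + rel8).

    The displacement is relative to the address of the *next* instruction,
    i.e. jump_addr + 2, and must fit in a signed 8-bit value.
    """
    disp = target_addr - (jump_addr + 2)
    if not -128 <= disp <= 127:
        raise ValueError(f"target out of short-jump range: {disp}")
    return disp & 0xFF  # two's-complement encoding of the displacement


# Hypothetical addresses: a backwards branch from 0x0120 to 0x0110.
print(hex(short_jump_offset(0x0120, 0x0110)))
```

Any byte inserted or removed between the jump and its target changes `disp`, which is why every size change forces the offsets to be redone unless nop-padding absorbs it.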

Whew. That's pretty tricky. At the end, %dx is
mem[label + 0x200] - (0x101 + %cx). The last part is presumably the
address where we're currently assembling, and the first part is the
value of the referenced label. I wonder if there's a simpler way to
write the above.

There is indeed. Current code uses XLATB (just after setting ' ' to
dot) and is therefore down to the minimal single backwards branch per
pass.
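
For readers following the arithmetic, the quoted expression %dx = mem[label + 0x200] - (0x101 + %cx) can be modelled in a few lines. This is only a sketch of the formula, not of the actual assembler: representing memory as a dict, truncating to 16 bits, and the sample label and values are all assumptions made for illustration.

```python
def fixup(mem: dict, label: int, cx: int) -> int:
    """Model of: %dx = mem[label + 0x200] - (0x101 + %cx), as a 16-bit value.

    mem   -- hypothetical memory image, mapping address -> stored word
    label -- the label's byte value, used as an index into the table at 0x200
    cx    -- the counter register at the point of the fixup
    """
    return (mem[label + 0x200] - (0x101 + cx)) & 0xFFFF


# Hypothetical data: label 'a' (0x61) was recorded with the value 0x0110,
# and the fixup is computed while %cx holds 0x20.
mem = {0x200 + ord("a"): 0x0110}
print(hex(fixup(mem, ord("a"), 0x20)))
```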

This is presumably a very clever way to convert hexadecimal digits into
a binary nibble. I've never learned enough about AAM and AAD to make use
of them.

I posted it a while ago[1] on kragen-discuss. It relies on linear
algebra that apparently isn't implemented in some recent non-Intel
HW, but should have worked "in period".
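
As a point of comparison, here is a branch-free hex-digit conversion written out in Python. It is not the AAM/AAD sequence discussed above (that code is in the linked post); it is a different, commonly used formulation that leans on the ASCII layout, and the function name `hex_nibble` is made up for the example.

```python
def hex_nibble(ch: str) -> int:
    """Convert one ASCII hex digit ('0'-'9', 'a'-'f', 'A'-'F') to 0..15.

    Digits are 0x30-0x39, so bit 6 is clear and the low nibble is the value.
    Letters are 0x41-0x46 or 0x61-0x66, so bit 6 is set and the low nibble
    is 1..6; adding 9 gives 10..15. No input validation is performed.
    """
    c = ord(ch)
    return (c & 0x0F) + 9 * (c >> 6)


print([hex_nibble(ch) for ch in "09afAF"])
```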

Which is where int 21h function 02h writes it to stdout from. The
interrupt list I found warns that %dl=0x09 will get converted to some
0x20s, but that can't be right, or this program would fail to assemble
itself.

That is indeed correct: I just retried self-reassembly and indeed
have 0x20's here (along with an easily observed incorrect
executable!), which should teach me something about sending mail
after roughly eyeballing the output to see that the fixups came out
instead of testing (or even just dd'ing the disk sector back to unix
for diff'ing).
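
The kind of byte-for-byte check mentioned above can be sketched as follows. The helper name `first_mismatch` and the sample byte strings are hypothetical; in particular, the number of 0x20 bytes that replace a 0x09 is not modelled here and is chosen arbitrarily.

```python
from typing import Optional


def first_mismatch(expected: bytes, actual: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one image is a strict prefix of the other, the mismatch is reported
    at the length of the shorter image.
    """
    for offset, (e, a) in enumerate(zip(expected, actual)):
        if e != a:
            return offset
    if len(expected) != len(actual):
        return min(len(expected), len(actual))
    return None


# Made-up sample: a 0x09 byte in the original shows up as spaces in the copy.
original = bytes([0xB2, 0x09, 0xCD, 0x21])
reassembled = bytes([0xB2, 0x20, 0x20, 0xCD, 0x21])
print(first_mismatch(original, reassembled))
```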
A workaround should simply be to "sub dl,0x99" at the very end
instead of for each nybble, but because:
- there's also some bx-relative addressing that runs afoul of this, and
- I'm suffering intermittent disk corruption (which, however much it
adds to the verisimilitude, isn't very enjoyable)
I think I'll switch to C-like writes instead for the time being
(although they probably weren't around in 1.0 either)
Maybe a real dosbox would help, as your octalinput.com also redirects
the typescript to the output file under my setup.

My temptation to write a one-pass stack-based octal version is even
greater now :). It probably can't be under 40 bytes, but maybe it could
be under 70. And it could be used to write the above.

Under 70 sounds plausible.
I'm currently taking:
 76 - straight octal input (about half of which is various
      cargo-culted int 21's)
 96 - one-pass; labels only good after definition
111 - two-pass; labels also good for forward branches
(add 2 bytes to each to decode hex instead of octal)
So I'd bet you'll easily have it in roughly 40+20 = 60 bytes.
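
To make the baseline concrete, here is a rough model of what a "straight octal input" loader has to do. It assumes each byte is written as three octal digits and every other character is skipped; the real octalinput.com may well differ, and the name `decode_octal` is invented for the example.

```python
def decode_octal(text: str) -> bytes:
    """Decode a stream of octal digits into bytes, three digits per byte.

    Non-octal characters (spaces, newlines, comments) are ignored. Values
    above 0o377 are truncated to their low 8 bits, as a byte store would do.
    Trailing digits that do not complete a group of three are dropped.
    """
    out = bytearray()
    value = 0
    ndigits = 0
    for ch in text:
        if ch in "01234567":
            value = value * 8 + int(ch)
            ndigits += 1
            if ndigits == 3:
                out.append(value & 0xFF)
                value = 0
                ndigits = 0
    return bytes(out)


# Hypothetical input: four bytes written as octal triplets.
print(decode_octal("264 011 315 041").hex())
```

Labels, whether one-pass or two-pass, would be layered on top of a loop like this, which is where the extra 20 and 15 bytes in the table above go.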
-Dave
[0] http://sbel.wisc.edu/Courses/ME964/Literature/knuthProgramming1974.pdf
Knuth, Structured Programming with go to Statements, Computing
Surveys, Dec 1974
... MIXAL is a good example of a "quick and dirty assembler", a
genre of software which will always be useful in its proper role.
Such an assembler is characterized by language restrictions that
make simple one-pass assembly possible, and it has several
noteworthy advantages when we are first preparing programs for a
new machine: a) it is a great improvement over numeric machine
code; b) its rules are easy to state; and c) it can be implemented
in an afternoon or so, thus getting an efficient assembler working
quickly on what may be very primitive equipment. So far I have
implemented six such assemblers ...
[1] http://lists.canonical.org/pipermail/kragen-discuss/2011-May/001167.html