Hi Tomas,

> Also, it would allow for macros/shortcuts to automate common patterns.

I think I have to try to explain some of the reasons that drive me. When
I program in assembly, I switch to a completely different mindset.

While writing a core function in assembly, I want it to be as close to
the "optimum" as possible. It does not matter how often I re-arrange the
code, or how long I need for that, because this function will be written
just once, but called very often later. That is also where the fun lies.

This is completely different from using Lisp to write applications.
There I want to be as productive as possible, to have an easier life,
and to be able to abstract as much as possible.

On the assembly level, I want to know each individual bit personally.
Try not to hide anything. I want to keep my mind within the data model,
as described e.g. in "doc64/structures".

For example, I had a long fight with myself whether I should introduce
and use constants like 'BIG', 'CDR', or 'TAIL'. They are defined in
"src64/defs.l":

   (equ BIG 4)    # Rest of a bignum + bignum tag
   (equ CDR 8)    # CDR part of a list cell
   (equ TAIL -8)  # Tail of a symbol

A completely straightforward, good and "normal" thing in daily
programming. It is used like this:

   ld E (E CDR)  # Take CDR

What is my problem with that? It hides the "true" nature of the
underlying data structures. It could instead be written as

   ld E (E 8)  # Take CDR

which results in the x86-64 code

   mov 8(%rbx), %rbx

Now when I use an opaque constant 'CDR' instead of '8', I easily forget
what goes on at the low level, and have a higher concept of a "CDR" in
mind. This makes it more difficult in some situations to stay in control
of the lower levels, and to recognize common patterns. If I use 'TAIL'
instead of '-8', I easily forget how I am accessing the pointers of a
cell, how they are related to the pointers of neighboring cells etc.
Then I have to keep both concepts in mind at the same time, and
constantly switch between them. Awareness of the nature of constants
like 4 and -8 is necessary to connect them to the pointer tags in the
lowest four bits

   cnt   ... S010
   big   ... S100
   sym   ... 1000
   cell  ... 0000

and I constantly need to keep in mind that "cnt is 2", "big is 4" and
so on.
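
To make the tag table concrete, here is a small sketch in Python (not
PicoLisp assembly, and not taken from the PicoLisp sources). It assumes
only the four bit patterns listed above; the function name `classify`
and the order of the checks are choices made for this illustration.

```python
# Tag masks, read off the table above (lowest four bits of a word).
CNT = 2   # ... S010  short number
BIG = 4   # ... S100  bignum
SYM = 8   # ... 1000  symbol

def classify(ptr):
    """Return the type name implied by the tag bits of a 64-bit word."""
    # Numbers are tested first: their bit 3 is the sign bit 'S', so a
    # negative short (1010) would otherwise be mistaken for a symbol.
    if ptr & CNT:
        return "cnt"
    if ptr & BIG:
        return "big"
    if ptr & SYM:
        return "sym"
    return "cell"  # 0000: plain cell pointer
```

With these masks, `classify(0x612)` gives "cnt" and
`classify(0x619438)` gives "sym".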

For the same reasons I was reluctant to introduce macros like

   cnt A  # A short?

which could equally be written as

   test A 2  # A short?

or, in this case

   test B 2  # A short?

Was I able to explain what I mean? Although each higher-level
abstraction makes the code easier, more readable and (the important
point) more searchable, it takes me further and further away from the
real model.

This becomes very obvious when I debug the code with 'gdb'. There you see
only pointer structures, identified by the tag bits, and numeric
constant offsets like -8, 4 and 8. By now I'm used to immediately seeing
the type of a data object in the debugger, knowing that if it ends with
an '8' it is a symbol, and if it ends with a '2' it is a short number or
string. If I see in the debugger

   $rbx = 0x2b484d2d6538

(contents of the register 'E'), it is a symbol, because it ends with an
'8'. So I can inspect it by replacing the 8 with a 0 to get the cell
pointer:

   (gdb) x/2g 0x2b484d2d6530
   0x2b484d2d6530: 0x0000000000000612      0x0000000000619438

You see that the 'TAIL' (the CAR of that cell) ends with a '2', so this
is a name. The hex code "61" is the ASCII char "a". What we have here is
the symbol 'a'.

The value '0x0000000000619438' of that symbol is itself a symbol (which
turns out to be NIL):

   (gdb) x/2g 0x0000000000619430
   0x619430 <data_start+560>:      0x0000000004c494e2      0x0000000000619438

("4e" is 'N', "49" is 'I' and "4c" is 'L')
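
The manual inspection above can be replayed in a few lines of Python.
This is only a sketch: the dictionary `mem` is a hypothetical stand-in
for the process memory and holds just the four words from the two gdb
dumps, and the helper names are made up for this example. It assumes,
as the two examples suggest, that a short name is stored one byte per
character above the four tag bits, lowest byte first, and that the
value of a symbol sits at offset 0 from the symbol pointer.

```python
TAIL = -8  # offset from a symbol pointer to its tail, as in "src64/defs.l"

# The four words shown by gdb, keyed by address (hypothetical memory model).
mem = {
    0x2b484d2d6530: 0x612,      0x2b484d2d6538: 0x619438,
    0x619430:       0x4c494e2,  0x619438:       0x619438,
}

def short_name(word):
    """Decode a name packed into a short number (tag in the low 4 bits)."""
    bits = word >> 4                    # drop the tag nibble
    chars = []
    while bits:
        chars.append(chr(bits & 0xFF))  # lowest byte is the first character
        bits >>= 8
    return "".join(chars)

def sym_name(sym):
    """Name of a symbol: the word at offset TAIL, decoded as a short name."""
    return short_name(mem[sym + TAIL])

def sym_value(sym):
    """Value of a symbol: the word the symbol pointer itself points to."""
    return mem[sym]

print(sym_name(0x2b484d2d6538))             # a
print(sym_name(sym_value(0x2b484d2d6538)))  # NIL
```

The sketch only covers ASCII names that fit into a single word, which
is all the two dumps show.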


Well, as you see, I stayed with 'TAIL' and 'CDR', use the test macros
'cnt', 'big', 'sym' etc., and also implemented flow macros like 'if' and
'while'. I always try to keep their dual nature in mind. But there are
limits on how far I want to go.

> E.g. push/pop: there are lots of places where I can see patterns like

This would be too far IMHO.

> ...
> (asmFn 'apply 2 (X Y Z)
>    ... )

This would tempt the programmer to write it for every function, and he
would cease to optimize the flow locally. For example, many functions
will do a "push X" in the beginning, and push other registers only when
a certain condition arises.

To be sure, such optimizations have no measurable impact on the
performance of the code. But still they are important for me, as they
bring it closer to the (to be defined) "optimum". Let's say, they make
up the fun ;-)


> Maybe the question should be whether there are ways of building the
> assembly code programmatically rather than manually?

For core functions, this would defeat the described purpose. But for
application level libraries (like your "ffi.l") it might be a valuable
option.

Cheers,
- Alex
