Hi Alex,

Thanks for the explanation.  I haven't seriously programmed in
assembly since 1986. However, the mental transitions that you mention
are very familiar to me.  I remember starting to write macros for
everything and very soon it didn't matter that there were often
unnecessary instructions hidden in the macros.  Performance was
not important then, code readability was.  The goal soon became to
write everything in C and use assembler as little as possible. Your
decisions here to write picolisp64 in assembler and define an optimum
virtual assembler should lead to a highly optimized system.

Cheers,
 - Rand


On Thu, Jul 23, 2009 at 8:21 AM, Alexander Burger<a...@software-lab.de> wrote:
> Hi Tomas,
>
>> Also, it would allow for macros/shortcuts to automate common patterns.
>
> I think I have to try to explain some of the reasons that drive me. When
> I program in assembly, I switch to a completely different mindset.
>
> While writing a core function in assembly, I want it to be as close to
> the "optimum" as possible. It does not matter how often I re-arrange the
> code, or how long I need for that, because this function will be written
> just once, but called very often later. Therein also lies the fun of it.
>
> This is completely different from using Lisp to write applications.
> There I want to be as productive as possible, to have an easier life,
> and to be able to abstract as much as possible.
>
> On the assembly level, I want to know each individual bit personally.
> Try not to hide anything. I want to keep my mind within the data model,
> as described e.g. in "doc64/structures".
>
> For example, I had a long fight with myself whether I should introduce
> and use constants like 'BIG', 'CDR', or 'TAIL'. They are defined in
> "src64/defs.l":
>
>   (equ BIG 4)    # Rest of a bignum + bignum tag
>   (equ CDR 8)    # CDR part of a list cell
>   (equ TAIL -8)  # Tail of a symbol
>
> A completely straightforward, good and "normal" thing in daily
> programming. It is used like this:
>
>   ld E (E CDR)  # Take CDR
>
> What is my problem with that? It hides the "true" nature of the
> underlying data structures. It could instead be written as
>
>   ld E (E 8)  # Take CDR
>
> which results in the x86-64 code
>
>   mov 8(%rbx), %rbx
>
> Now when I use an opaque constant 'CDR' instead of '8', I easily forget
> what goes on at the low level, and have a higher concept of a "CDR" in
> mind. This makes it more difficult in some situations to stay in control
> of the lower levels, and to recognize common patterns. If I use 'TAIL'
> instead of '-8', I easily forget how I am accessing the pointers of a
> cell, how they are related to the pointers of neighboring cells etc.
> Then I have to keep both concepts in mind at the same time, and
> constantly switch between them. Awareness of the nature of constants
> like 4 and -8 is necessary to connect them to the pointer tags in the
> lowest four bits:
>
>   cnt   ... S010
>   big   ... S100
>   sym   ... 1000
>   cell  ... 0000
>
> and I need to constantly juggle with knowing that "cnt is 2", "big is 4"
> and so on.
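
An aside on the tag table above: the following is a minimal sketch in Python (not PicoLisp code) of how the lowest bits classify a tagged word. The name `classify` and the order of the tests are assumptions made for illustration; only the bit patterns come from the table, and "S" is assumed here to be a sign bit that the tests ignore.

```python
# Illustrative only: masks follow the table ("cnt is 2", "big is 4", sym is 8).
CNT, BIG, SYM = 0x2, 0x4, 0x8

def classify(word):
    """Return the type name implied by the lowest four bits of a tagged word."""
    if word & CNT:
        return "cnt"   # ... S010: short number
    if word & BIG:
        return "big"   # ... S100: bignum
    if word & SYM:
        return "sym"   # ... 1000: symbol
    return "cell"      # ... 0000: list cell

for w in (0x12, 0x1A, 0x14, 0x38, 0x30):
    print(hex(w), classify(w))
```

Because the S bit shares a position with the symbol tag, the sketch checks for 'cnt' and 'big' before 'sym'.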
>
> For the same reasons I was reluctant to introduce macros like
>
>   cnt A  # A short?
>
> which could equally be written as
>
>   test A 2  # A short?
>
> or, in this case
>
>   test B 2  # A short?
>
> Could I explain what I mean? Though each higher-level abstraction makes
> the code easier, more readable and (the important point) better
> searchable, it takes me further and further from the real model.
>
> This becomes very obvious when I debug the code with 'gdb'. Then you see
> only pointer structures, identified by the tag bits, and numeric
> constant offsets like -8, 4 and 8. Now I'm used to immediately seeing the
> type of a data object in the debugger, knowing that if it ends with a
> '8' it is a symbol, and if it ends with a '2' it is a short number or
> string. If I see in the debugger
>
>   $rbx = 0x2b484d2d6538
>
> (contents of the register 'E'), it is a symbol, because it ends with a
> '8'. So I can inspect it by replacing the 8 with a 0 to get the cell
> pointer:
>
>   (gdb) x/2g 0x2b484d2d6530
>   0x2b484d2d6530: 0x0000000000000612      0x0000000000619438
>
> You see that the 'TAIL' (the CAR of that cell) ends with a '2', so this
> is a name. The hex code "61" is the ASCII char "a". What we have here is
> the symbol 'a'.
>
> The value '0x0000000000619438' of that is a symbol again (turns out
> to be NIL):
>
>   (gdb) x/2g 0x0000000000619430
>   0x619430 <data_start+560>:      0x0000000004c494e2      0x0000000000619438
>
> ("4e" is 'N', "49" is 'I' and "4c" is 'L')
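
As an aside, the manual inspection above can be mirrored in a small Python sketch (again not PicoLisp code). The helper name `short_name` is hypothetical, and the layout it assumes (four tag bits, then one byte per character with the first character in the lowest byte) is inferred only from the two dumps quoted here, so treat it as an assumption rather than a description of the real implementation.

```python
TAIL = -8  # offset quoted from "src64/defs.l" earlier in this mail

def short_name(word):
    """Decode a short name packed into a cnt-tagged word (assumed layout)."""
    if not word & 2:
        raise ValueError("not a cnt-tagged word")
    bits = word >> 4                    # drop the four tag bits
    chars = []
    while bits:
        chars.append(chr(bits & 0xFF))  # lowest byte holds the first character
        bits >>= 8
    return "".join(chars)

sym = 0x2b484d2d6538            # symbol pointer seen in $rbx
print(hex(sym + TAIL))          # address passed to x/2g
print(short_name(0x612))        # tail of the first symbol
print(short_name(0x4c494e2))    # tail of the second symbol
```

Adding 'TAIL' to the symbol pointer is the arithmetic form of "replacing the 8 with a 0".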
>
>
> Well, as you see, I stayed with 'TAIL' and 'CDR', use the test macros
> 'cnt', 'big', 'sym' etc., and also implemented flow macros like 'if' and
> 'while'. I always try to keep their dual nature in mind. But there are
> limits on how far I want to go.
>
>> E.g. push/pop: there are lots of places where I can see patterns like
>
> This would be too far IMHO.
>
>> ...
>> (asmFn 'apply 2 (X Y Z)
>>     ... )
>
> This would tempt the programmer to write it for every function, and he
> would cease to optimize the flow locally. For example, many functions
> will do a "push X" in the beginning, and push other registers only when
> a certain condition arises.
>
> To be sure, such optimizations have no measurable impact on the
> performance of the code. But still they are important for me, as they
> bring it closer to the (to be defined) "optimum". Let's say, they make
> up the fun ;-)
>
>
>> Maybe the question should be whether there are ways of building the
>> assembly code programmatically rather than manually?
>
> For core functions, this would defeat the described purpose. But for
> application level libraries (like your "ffi.l") it might be a valuable
> option.
>
> Cheers,
> - Alex
