Re: An assembly question from the past

2022-04-03 Thread Alexander Burger
Hi Kevin,

> Previously, I've gotten pil21 (and originally pil64) to run bare metal on
> the RPI4 4/8 GB models.
> ...
> functionality, and adding some C/ASM for bootup and setting up the MMU and
> exception vectors, before branching to the pil21 interpreter loop (instead
> 
> I wrote a small UART/RS-232 driver to allow REPL functionality, with the
> ...
> was at a stage where it could be bootstrapped from the REPL, i.e. an entire
> "kernel" could be written dynamically from the REPL - similar to PilOS. Of
> ...
> time to create the PilPhone :)

Wow! That is an impressive list of achievements! Please continue to keep us
up-to-date!

☺/ A!ex

-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: An assembly question from the past

2022-04-02 Thread Kevin Ednalino
Hello,

To answer your question without going into gory details: assuming the
program was compiled for a 32-bit arch, where the pointer size is typically
4 bytes (vs 8 bytes on 64-bit), any code relying on that size would need to
be addressed, especially bit operations. Generally, it's more straightforward
to run 32-bit code on 64-bit than the reverse (64-bit on 32-bit), because the
latter needs more registers, has address space differences, etc.
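
As a small illustration of the kind of assumption meant here (a sketch only,
not code from any PicoLisp version): tag bits kept in the low bits of a
pointer should go through `uintptr_t`, which is pointer-sized on both 32-bit
and 64-bit targets, rather than through `unsigned int`. The names `TAG_MASK`,
`tag_ptr`, `untag_ptr` and `tag_of` are made up for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 2-bit tag stored in the low bits of an aligned pointer. */
#define TAG_MASK ((uintptr_t)3)

/* Attach a tag; the pointer must be at least 4-byte aligned. */
static uintptr_t tag_ptr(void *p, unsigned tag)
{
    uintptr_t bits = (uintptr_t)p;      /* pointer-sized on 32- and 64-bit */
    assert((bits & TAG_MASK) == 0);     /* low bits must be free */
    assert(tag <= TAG_MASK);
    return bits | tag;
}

/* Recover the original pointer by clearing the tag bits. */
static void *untag_ptr(uintptr_t bits)
{
    return (void *)(bits & ~TAG_MASK);
}

/* Extract the tag. */
static unsigned tag_of(uintptr_t bits)
{
    return (unsigned)(bits & TAG_MASK);
}
```

Had the bits been held in a 32-bit `unsigned int`, the upper half of a 64-bit
pointer would be lost in `tag_ptr`.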

Previously, I've gotten pil21 (and originally pil64) to run bare metal on
the RPI4 4/8 GB models. Qemu has an emulator for the RPI3 only (still?) so
it wasn't terribly useful, which meant replacing the SD card a bunch of
times until I could get a proper REPL booted.

Similar to PilOS, it required factoring out C libraries, POSIX and I/O
functionality, and adding some C/ASM for bootup and setting up the MMU and
exception vectors, before branching to the pil21 interpreter loop (instead
of a kernel for example). The interpreter was modified to allow access to
system registers from Lisp to handle interrupts and other low level
functionality. Everything ran at EL1 (equivalent to ring level 0 on x86)
and in a single-address space, so it had access to the entire memory space
including the peripherals. What's cool is I could optimize the TLB to
reduce misses, which meant Lisp code could potentially run faster, along
with less OS overhead like context-switching.

I wrote a small UART/RS-232 driver to allow REPL functionality, with the
REPL implemented in Lisp (rather than compiled into the interpreter). It
was an interesting experience to say the least. As a proof of concept, it
was at a stage where it could be bootstrapped from the REPL, i.e. an entire
"kernel" could be written dynamically from the REPL - similar to PilOS. Of
course, there are other design issues to consider including a
multi-core/interpreter design, namespacing security and C support. I'm
hoping to attempt it again for the PinePhone or ROCKPro64, when I have the
time to create the PilPhone :)
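
The driver itself is not shown in this thread. Purely as a sketch of what
polled output on a PL011-style UART usually looks like: the register offsets
below follow the ARM PL011 layout (data register at 0x00, flag register at
0x18, TXFF = bit 5), while the function names and the idea of passing the
base address as a parameter are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PL011 register indices in 32-bit words (byte offsets 0x00 and 0x18). */
enum { UART_DR = 0x00 / 4, UART_FR = 0x18 / 4 };
#define UART_FR_TXFF (1u << 5)   /* transmit FIFO full */

/* Write one byte, busy-waiting while the transmit FIFO is full.
   'base' is the UART's MMIO base (board specific, assumed to be mapped). */
static void uart_putc(volatile uint32_t *base, char c)
{
    assert(base != NULL);
    while (base[UART_FR] & UART_FR_TXFF)
        ;                        /* spin until there is room */
    base[UART_DR] = (uint32_t)(unsigned char)c;
}

/* Write a NUL-terminated string, e.g. a REPL prompt. */
static void uart_puts(volatile uint32_t *base, const char *s)
{
    while (*s)
        uart_putc(base, *s++);
}
```

Taking the base as a parameter keeps the routine testable against a plain
array standing in for the hardware registers.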

Adding to the list, there was also a minimal PicoLisp implementation on JS:
https://github.com/Grahack/EmuLisp . It could act as a cross-platform
implementation of PicoLisp, albeit with different characteristics, like the
ersatz version or pil64 emulator; I wonder if the JIT would make the
interpreter faster in certain situations, like with numbers, similar to
PyPy. Unfortunately, WASM isn't feasible due to stack limitations which
makes implementing a GC difficult. Plenty of possibilities!

On Thu, Mar 31, 2022 at 11:31 AM Henry Baker  wrote:

> Actually, 64-bits is also interesting.  I only asked about 32-bits, since
> I had lots of old 32-bit machines
> around; bare metal on a Raspberry Pi (32 or 64 bits) would also be
> interesting. RPi4's are hard to get
> just now, but RPi3's run 64 bits (albeit more slowly).
>
> I haven't played with Arm VM's but I presume that they are just as good as
> Intel VM's.
>
> An aside: if one is executing on a 64-bit machine, how hard is it to
> execute a 32-bit 'application'? Can one easily start up a 32-bit thread
> inside a 64-bit machine?
>
> This 32 inside 64 question is purely theoretical (for the moment)...
>
> -Original Message-
> From:
> Sent: Mar 31, 2022 2:59 AM
> To:
> Subject: Re: An assembly question from the past
>
> On Wed, Mar 30, 2022 at 05:13:20PM -0700, C K Kashyap wrote:
> > This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS
>
> Right, but it seems Henry looks for a 32-bit machine, but PilOS needs 64
> bits.
>
> ☺/ A!ex


Re: An assembly question from the past

2022-04-01 Thread Tomas Hlavaty
On Thu 31 Mar 2022 at 15:25, Henry Baker  wrote:
> An aside: if one is executing on a 64-bit machine, how hard is it to
> execute a 32-bit 'application'? Can one easily start up a 32-bit
> thread inside a 64-bit machine?    This 32 inside 64 question is
> purely theoretical (for the moment)...

I use the 32 bit picolisp written in C.
It runs fine on amd64 machines with 64 bit GNU Linux
and also runs fine on 32 bit and 64 bit raspberry pi.

Alex wrote several picolisp implementations which have different
dependencies, for example:

- minipicolisp: minimal dependencies, less functionality

- 32 bit picolisp in C (the one I am using)

- pil64: 64 bit picolisp in assembly

- picolisp in java

- pil21: picolisp using llvm

There are other projects like bare metal picolisp or picolisp in fpga
but I haven't kept track of those.



Re: An assembly question from the past

2022-03-31 Thread Henry Baker
Actually, 64-bits is also interesting.  I only asked about 32-bits, since I had
lots of old 32-bit machines around; bare metal on a Raspberry Pi (32 or 64 bits)
would also be interesting. RPi4's are hard to get just now, but RPi3's run
64 bits (albeit more slowly).

I haven't played with Arm VM's but I presume that they are just as good as
Intel VM's.

An aside: if one is executing on a 64-bit machine, how hard is it to execute a
32-bit 'application'? Can one easily start up a 32-bit thread inside a 64-bit
machine?
 
This 32 inside 64 question is purely theoretical (for the moment)...
 
-Original Message-
From: 
Sent: Mar 31, 2022 2:59 AM
To: 
Subject: Re: An assembly question from the past
 
On Wed, Mar 30, 2022 at 05:13:20PM -0700, C K Kashyap wrote:
> This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS
 
Right, but it seems Henry looks for a 32-bit machine, but PilOS needs 64 bits.
 
☺/ A!ex
 


Re: An assembly question from the past

2022-03-31 Thread Alexander Burger
On Wed, Mar 30, 2022 at 05:13:20PM -0700, C K Kashyap wrote:
> This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS

Right, but it seems Henry looks for a 32-bit machine, but PilOS needs 64 bits.

☺/ A!ex



Re: An assembly question from the past

2022-03-30 Thread C K Kashyap
This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS

On Wed, Mar 30, 2022 at 4:42 PM Henry Baker  wrote:

> I haven't been following this thread terribly closely, so I hope this
> question isn't off-base.
>
> Is there a version of picolisp that runs on 80386/80486/80586 'bare metal'
> (or at least 'bare VM') -- talking directly to a HW serial port and reading
> from a FAT file system?
>
> -Original Message-
> From:
> Sent: Mar 30, 2022 10:38 AM
> To:
> Subject: Re: An assembly question from the past
>
> On Wed, Mar 30, 2022 at 08:13:00AM -0700, C K Kashyap wrote:
> > Just to give some background - I've been working on the attempt to port
> > miniPicoLisp to windows (more like making vanilla C as the only
> > dependency).
>
> Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only
> stdio
> library functions.
>
>
> > For the stack - I believe that Pil
> > successfully existed without coroutines for decades right :)
>
> Yes. Coroutines are very nice in some situations, but with more programming
> effort you can always implement a conventional solution instead.
>
> > Somehow llvm - even though it's "industry standard" now - I feel that it
> > imposes too much as a dependency - the very fact that it's written in c++
> > is a turn off for me :)
>
> I agree with both statements.
>
> ☺/ A!ex


Re: An assembly question from the past

2022-03-30 Thread Henry Baker
I haven't been following this thread terribly closely, so I hope this question 
isn't off-base.
 
Is there a version of picolisp that runs on 80386/80486/80586 'bare metal' (or 
at least 'bare VM') -- talking directly to a HW serial port and reading from a 
FAT file system?
 
-Original Message-
From: 
Sent: Mar 30, 2022 10:38 AM
To: 
Subject: Re: An assembly question from the past
 
On Wed, Mar 30, 2022 at 08:13:00AM -0700, C K Kashyap wrote:
> Just to give some background - I've been working on the attempt to port
> miniPicoLisp to windows (more like making vanilla C as the only
> dependency).
 
Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only stdio
library functions.
 
 
> For the stack - I believe that Pil
> successfully existed without coroutines for decades right :)
 
Yes. Coroutines are very nice in some situations, but with more programming
effort you can always implement a conventional solution instead.
 
> Somehow llvm - even though it's "industry standard" now - I feel that it
> imposes too much as a dependency - the very fact that it's written in c++
> is a turn off for me :)
 
I agree with both statements.
 
☺/ A!ex
 


Re: An assembly question from the past

2022-03-30 Thread C K Kashyap
> Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only
> stdio library functions.

Thanks Alex :) ... almost vanilla C I think - with some gcc toppings (VLAs
in particular) ;) I also moved away from pointer tagging in favor of an
extra "part" in the cell. This takes away any alignment requirement as well.
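
To make the idea concrete, here is a minimal sketch of a cell with an
explicit type field instead of tag bits in the pointers. The field and
function names (`CellType`, `type`, `is_pair`, `cons`) are invented for
illustration and are not taken from the actual port.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cell kinds; the names are illustrative only. */
typedef enum { T_NUM, T_SYM, T_PAIR } CellType;

typedef struct Cell {
    struct Cell *car;
    struct Cell *cdr;
    CellType type;      /* the extra "part": explicit type, no tag bits */
} Cell;

/* Type test without masking any pointer bits. */
static int is_pair(const Cell *c)
{
    assert(c != NULL);
    return c->type == T_PAIR;
}

/* Build a pair in caller-provided storage. */
static Cell *cons(Cell *storage, Cell *car, Cell *cdr)
{
    storage->car = car;
    storage->cdr = cdr;
    storage->type = T_PAIR;
    return storage;
}
```

The trade-off is extra memory per cell in exchange for pointers that can be
used as-is, with no alignment guarantee needed to keep low bits free.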

Re: An assembly question from the past

2022-03-30 Thread Alexander Burger
On Wed, Mar 30, 2022 at 08:13:00AM -0700, C K Kashyap wrote:
> Just to give some background - I've been working on the attempt to port
> miniPicoLisp to windows (more like making vanilla C as the only
> dependency).

Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only stdio
library functions.


> For the stack - I believe that Pil
> successfully existed without coroutines for decades right :)

Yes. Coroutines are very nice in some situations, but with more programming
effort you can always implement a conventional solution instead.
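
As an illustration of such a conventional solution (a generic sketch in C,
not PicoLisp code): what a coroutine would keep on its own stack between
`yield`s is moved into an explicit state struct, and each "resume" becomes an
ordinary function call. `Counter`, `counter_init` and `counter_next` are
hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* State that a coroutine would keep in local variables across yields. */
typedef struct {
    int next;    /* next value to produce */
    int limit;   /* stop once 'next' reaches this */
} Counter;

static void counter_init(Counter *c, int from, int limit)
{
    assert(c != NULL);
    c->next = from;
    c->limit = limit;
}

/* Produce the next value through 'out'; return 0 when exhausted. */
static int counter_next(Counter *c, int *out)
{
    if (c->next >= c->limit)
        return 0;
    *out = c->next++;
    return 1;
}
```

The extra programming effort shows up as soon as the generator has nested
loops or several yield points, since each resumption point then needs its own
explicit state.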


> Somehow llvm - even though it's "industry standard" now - I feel that it
> imposes too much as a dependency - the very fact that it's written in c++
> is a turn off for me :)

I agree with both statements.

☺/ A!ex



Re: An assembly question from the past

2022-03-30 Thread C K Kashyap
Thanks for the clear explanation Alex,
Just to give some background - I've been working on the attempt to port
miniPicoLisp to windows (more like making vanilla C as the only
dependency). I wanted to make sure that I understood the cost of not going
with assembly. Since I use https://github.com/libtom/libtommath for BigNum
I think I am okay on the flags front. For the stack - I believe that Pil
successfully existed without coroutines for decades right :) and I can see
how I could mimic coroutines in the "user space".
Somehow llvm - even though it's "industry standard" now - I feel that it
imposes too much as a dependency - the very fact that it's written in c++
is a turn off for me :)
Regards,
Kashyap

On Wed, Mar 30, 2022 at 12:13 AM Alexander Burger 
wrote:

> Hi Kashyap,
>
> > I can see how you would have to end up writing the whole thing in assembly
> > - in the example you shared. Would it be right to say that it's only the
> > carry flag that you need or is it just an example and there are other flags
> > too?
>
> Pil64 used three flags (zero, sign and carry). CPUs usually have a lot more
> of them, e.g. overflow, but I decided to go without them.
>
> Some functions returned values in one or more registers, plus some flags.
> This is much more powerful than the single return value supported by C.
>
> > Can I say that the need is restricted to the use of BigNum?
>
> On the machine instruction level, the carry is used in a lot more
> situations, like comparisons or arithmetic shifts.
>
>
> > The ability to set/get the stack I presume needs to be compared with
> > setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not
> > efficient enough?
>
> No, setjmp/longjmp is fine. Pil21 uses it too. But in some situations you
> need to set the stack pointer explicitly (e.g. when allocating coroutine
> stack areas) or read it (stack overflow checks).
>
> ☺/ A!ex


Re: An assembly question from the past

2022-03-30 Thread Alexander Burger
Hi Kashyap,

> I can see how you would have to end up writing the whole thing in assembly
> - in the example you shared. Would it be right to say that it's only the
> carry flag that you need or is it just an example and there are other flags
> too?

Pil64 used three flags (zero, sign and carry). CPUs usually have a lot more
of them, e.g. overflow, but I decided to go without them.

Some functions returned values in one or more registers, plus some flags. This
is much more powerful than the single return value supported by C.

> Can I say that the need is restricted to the use of BigNum?

On the machine instruction level, the carry is used in a lot more situations,
like comparisons or arithmetic shifts.


> The ability to set/get the stack I presume needs to be compared with
> setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not
> efficient enough?

No, setjmp/longjmp is fine. Pil21 uses it too. But in some situations you need
to set the stack pointer explicitly (e.g. when allocating coroutine stack areas)
or read it (stack overflow checks).
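
For reference, a minimal sketch of the setjmp/longjmp pattern in plain C
(not taken from Pil21; `catch_env`, `thrown_code`, `throw_code`, `descend`
and `try_descend` are hypothetical names). It unwinds several frames in one
step, but it does not let you choose where the stack lives:

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf catch_env;      /* hypothetical single catch frame */
static int thrown_code;        /* value handed from thrower to catcher */

/* Unwind to the matching setjmp() in try_descend(). */
static void throw_code(int code)
{
    assert(code != 0);         /* 0 means "nothing thrown" in this sketch */
    thrown_code = code;
    longjmp(catch_env, 1);
}

/* Walk down 'depth' frames; throw from the innermost one if code != 0. */
static void descend(int depth, int code)
{
    if (depth > 0)
        descend(depth - 1, code);
    else if (code != 0)
        throw_code(code);
}

/* Returns 0 if nothing was thrown, else the thrown code. */
static int try_descend(int depth, int code)
{
    if (setjmp(catch_env) != 0)
        return thrown_code;    /* arrived here via longjmp */
    descend(depth, code);
    return 0;                  /* normal return path */
}
```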

☺/ A!ex



Re: An assembly question from the past

2022-03-30 Thread Alexander Burger
Hi Tomas,

> >    carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
> [...]
> > Concerning the stack, assembly code can handle the hardware stack pointer
> > just like any other register.
> 
> interesting
> 
> Did you consider GCC inline assembly?

Yes. But I found it much cleaner to write the whole system in a generic
assembler.

☺/ A!ex



Re: An assembly question from the past

2022-03-29 Thread C K Kashyap
Thank you Alex,
I can see how you would have to end up writing the whole thing in assembly
- in the example you shared. Would it be right to say that it's only the
carry flag that you need or is it just an example and there are other flags
too? Can I say that the need is restricted to the use of BigNum?

The ability to set/get the stack I presume needs to be compared with
setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not
efficient enough?

Regards,
Kashyap


On Tue, Mar 29, 2022 at 12:31 PM Tomas Hlavaty  wrote:

> On Tue 29 Mar 2022 at 18:49, Alexander Burger  wrote:
> > As C does not allow access to the carry bit, you have to do ugly and
> > inefficient tricks, by looking at the most significant bit of the result
> > and trying to detect an overflow. For example, in bigAdd() in pil32's
> > src/big.c:
> >
> >    carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
> [...]
> > Concerning the stack, assembly code can handle the hardware stack pointer
> > just like any other register.
>
> interesting
>
> Did you consider GCC inline assembly?
> What were the reasons you did not use it?


Re: An assembly question from the past

2022-03-29 Thread Tomas Hlavaty
On Tue 29 Mar 2022 at 18:49, Alexander Burger  wrote:
> As C does not allow access to the carry bit, you have to do ugly and
> inefficient tricks, by looking at the most significant bit of the result and
> trying to detect an overflow. For example, in bigAdd() in pil32's src/big.c:
>
>    carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
[...]
> Concerning the stack, assembly code can handle the hardware stack pointer just
> like any other register.

interesting

Did you consider GCC inline assembly?
What were the reasons you did not use it?



Re: An assembly question from the past

2022-03-29 Thread Alexander Burger
Hi Kashyap,

> 
> >> Pil32 and miniPicoLisp are written in C, and C does not support calling 
> >> other functions in a generic way. This is one of the reasons pil64 was 
> >> written in assembly (in addition to stack control and CPU status bits).
> 
> Could you please throw some more light on the "stack control" and "CPU
> status bits". I imagine that "CPU status bits" should be easy with
> inline assembly right? (although potentially introducing a bunch of
> #ifdef's for individual platform/compiler). I am not sure of the stack
> control though.

Yes, with inline assembly you gain access to the CPU status bits. But it does
not help much, as these bits need to be handled by the code all over the
function (and even across function calls, see below), so you probably end up
writing *all* in inline assembly.

As an example, take the addition of multi-word numbers (bignums). At each step,
you add two single words from each number, *plus* the carry bit (a CPU status
register bit), and you get a new carry bit for the next step.

As C does not allow access to the carry bit, you have to do ugly and inefficient
tricks, by looking at the most significant bit of the result and trying to
detect an overflow. For example, in bigAdd() in pil32's src/big.c:

   carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
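
To show the general shape of this trick outside pil32's cell representation,
here is a simplified sketch on plain arrays of 32-bit digits. It is not the
pil32 code; `big_add` and its layout (little-endian digits, in-place sum) are
assumptions for the example. The carry is recovered by checking whether the
unsigned sum wrapped around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst[i] += src[i] for n little-endian 32-bit digits.
   Returns the final carry (0 or 1). */
static unsigned big_add(uint32_t *dst, const uint32_t *src, size_t n)
{
    unsigned carry = 0;
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t sum = dst[i] + src[i];   /* wraps modulo 2^32 */
        unsigned c1 = sum < src[i];       /* wrapped, so carry out */
        sum += carry;
        unsigned c2 = sum < carry;        /* adding the carry-in wrapped */
        dst[i] = sum;
        carry = c1 | c2;
    }
    return carry;
}
```

Each digit needs two comparisons to reconstruct what the CPU already had in
its carry flag after the addition.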


The assembly code in Pil64 does not only use the carry bit, but also the zero-
and sign bits. This makes it possible to pass and return these bits to/from
functions, and have them tested by the caller directly:

   call fun  # fun returns the zero bit set or unset
   jz bar    # Conditional jump

C code does not support this. Instead, a number is returned in a register, this
number needs in turn to be *compared* to zero by the caller, to obtain the same
zero-bit as was directly returned in the assembly version.
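
A rough C-side picture of the difference, as a sketch only: to hand back
"flags" together with a value, C has to materialize them as data, for
instance in a small struct. `AddResult` and `add_flags` are hypothetical
names, not part of any PicoLisp source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: a value plus explicit "flags". */
typedef struct {
    uint32_t value;
    bool zero;      /* result was zero */
    bool carry;     /* unsigned addition wrapped around */
} AddResult;

static AddResult add_flags(uint32_t a, uint32_t b)
{
    AddResult r;
    r.value = a + b;                 /* wraps modulo 2^32 */
    r.carry = r.value < a;           /* recomputed, not read from the CPU */
    r.zero = (r.value == 0);
    /* Sanity check: a zero result without carry implies both inputs were 0. */
    assert(!r.zero || r.carry || (a == 0 && b == 0));
    return r;
}
```

The caller then tests `r.zero` or `r.carry`, which is a load and compare on
stored data rather than a conditional jump on a flag that is already set.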


Concerning the stack, assembly code can handle the hardware stack pointer just
like any other register. You can read it

   ld A S  # Get stack pointer into A

or set it

   ld S A  # Store A in stack pointer

Such operations are needed for example to set and restore the stacks in
coroutine switching, or to check for stack overflows.
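
In plain C there is no direct equivalent. A common approximation for the
*read* side, sketched here under the assumption that locals live on a
contiguous hardware stack, is to use the address of a local variable as a
stand-in for the stack pointer. All names (`stack_base`, `stack_mark_base`,
`stack_used`, `stack_ok`, `probe`) are hypothetical, and setting the stack
pointer remains impossible this way.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static uintptr_t stack_base;   /* recorded near the top of the call chain */

/* Remember roughly where the stack is right now. */
static void stack_mark_base(void)
{
    volatile char marker = 0;
    stack_base = (uintptr_t)&marker;
}

/* Approximate bytes of stack in use since stack_mark_base().
   Works whether the stack grows down or up. */
static size_t stack_used(void)
{
    volatile char marker = 0;
    uintptr_t here = (uintptr_t)&marker;
    return here > stack_base ? here - stack_base : stack_base - here;
}

/* Overflow check: nonzero while fewer than 'limit' bytes are in use. */
static int stack_ok(size_t limit)
{
    assert(limit > 0);
    return stack_used() < limit;
}

/* Recurse 'depth' times and report the usage seen at the bottom. */
static size_t probe(int depth)
{
    volatile char pad[256];          /* force a visible frame size */
    pad[0] = (char)depth;
    size_t used = (depth == 0) ? stack_used() : probe(depth - 1);
    return used + (size_t)(pad[0] != (char)depth);  /* keeps pad live; adds 0 */
}
```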


LLVM has advantages here, as it at least supports a kind of carry bit (though not
other CPU flags), and has operators to get and set the stack pointer. Thus,
Pil21, which compiles to LLVM-IR, is a kind of compromise between an
implementation in C and a full-control assembly implementation.
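
From the C side, the closest thing to that "kind of carry bit" is the
GCC/Clang extension `__builtin_add_overflow`. The sketch below assumes one of
those compilers; `adc64` is a made-up helper, and whether Pil21's generated
IR looks like what this compiles to is not stated here.

```c
#include <assert.h>
#include <stdint.h>

/* Add with carry-in and carry-out, using a GCC/Clang builtin that reports
   unsigned wrap-around instead of re-deriving it with comparisons. */
static uint64_t adc64(uint64_t a, uint64_t b, unsigned carry_in,
                      unsigned *carry_out)
{
    uint64_t sum;
    assert(carry_in <= 1);
    unsigned c1 = __builtin_add_overflow(a, b, &sum);
    unsigned c2 = __builtin_add_overflow(sum, (uint64_t)carry_in, &sum);
    *carry_out = c1 | c2;    /* at most one of the two additions can wrap */
    return sum;
}
```

This is not standard C, so it trades portability for access to the overflow
information.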

☺/ A!ex



An assembly question from the past

2022-03-29 Thread C K Kashyap
Hey Alex,

I had reached out to you about the need for assembly in the past and
you had mentioned the following -


> 'c' implementation of pil32?

>> Pil32 and miniPicoLisp are written in C, and C does not support calling 
>> other functions in a generic way. This is one of the reasons pil64 was 
>> written in assembly (in addition to stack control and CPU status bits).


Could you please throw some more light on the "stack control" and "CPU
status bits". I imagine that "CPU status bits" should be easy with
inline assembly right? (although potentially introducing a bunch of
#ifdef's for individual platform/compiler). I am not sure of the stack
control though.


Regards,

Kashyap