Re: An assembly question from the past
Hi Kevin,

> Previously, I've gotten pil21 (and originally pil64) to run bare metal on
> the RPI4 4/8 GB models.
> ...
> functionality, and adding some C/ASM for bootup and setting up the MMU and
> exception vectors, before branching to the pil21 interpreter loop (instead
>
> I wrote a small UART/RS-232 driver to allow REPL functionality, with the
> ...
> was at a stage where it could be bootstrapped from the REPL, i.e. an entire
> "kernel" could be written dynamically from the REPL - similar to PilOS. Of
> ...
> time to create the PilPhone :)

Wow! That is an impressive list of achievements!

Please continue to keep us up-to-date!

☺/ A!ex

--
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
Re: An assembly question from the past
Hello,

To answer your question without going into gory details: assuming the program was compiled for a 32-bit arch, where the pointer size is typically 4 bytes (vs 8 bytes on 64-bit), any code that relies on this size would need to be addressed, especially bit operations. Generally, it's more straightforward to run 32-bit code on a 64-bit machine than 64-bit code on a 32-bit machine, because 64-bit code needs more registers, a larger address space etc.

Previously, I've gotten pil21 (and originally pil64) to run bare metal on the RPI4 4/8 GB models. Qemu has an emulator for the RPI3 only (still?) so it wasn't terribly useful, which meant replacing the SD card a bunch of times until I could get a proper REPL booted.

Similar to PilOS, it required factoring out C libraries, POSIX and I/O functionality, and adding some C/ASM for bootup and setting up the MMU and exception vectors, before branching to the pil21 interpreter loop (instead of a kernel for example). The interpreter was modified to allow access to system registers from Lisp to handle interrupts and other low level functionality. Everything ran at EL1 (equivalent to ring level 0 on x86) and in a single address space, so it had access to the entire memory space including the peripherals. What's cool is I could optimize the TLB to reduce misses, which meant Lisp code could potentially run faster, along with less OS overhead like context-switching.

I wrote a small UART/RS-232 driver to allow REPL functionality, with the REPL implemented in Lisp (rather than compiled into the interpreter). It was an interesting experience to say the least. As a proof of concept, it was at a stage where it could be bootstrapped from the REPL, i.e. an entire "kernel" could be written dynamically from the REPL - similar to PilOS. Of course, there are other design issues to consider including a multi-core/interpreter design, namespacing security and C support.
I'm hoping to attempt it again for the PinePhone or ROCKPro64, when I have the time to create the PilPhone :)

Adding to the list, there was also a minimal PicoLisp implementation in JS: https://github.com/Grahack/EmuLisp . It could act as a cross-platform implementation of PicoLisp, albeit with different characteristics, like the ersatz version or pil64 emulator; I wonder if the JIT would make the interpreter faster in certain situations, like with numbers, similar to PyPy. Unfortunately, WASM isn't feasible due to stack limitations which make implementing a GC difficult.

Plenty of possibilities!

On Thu, Mar 31, 2022 at 11:31 AM Henry Baker wrote:

> Actually, 64-bits is also interesting. I only asked about 32-bits, since
> I had lots of old 32-bit machines around; bare metal on a Raspberry Pi
> (32 or 64 bits) would also be interesting. RPi4's are hard to get just
> now, but RPi3's run 64 bits (albeit more slowly).
>
> I haven't played with Arm VM's but I presume that they are just as good as
> Intel VM's.
>
> An aside: if one is executing on a 64-bit machine, how hard is it to
> execute a 32-bit 'application'? Can one easily start up a 32-bit thread
> inside a 64-bit machine?
>
> This 32 inside 64 question is purely theoretical (for the moment)...
Re: An assembly question from the past
On Thu 31 Mar 2022 at 15:25, Henry Baker wrote:
> An aside: if one is executing on a 64-bit machine, how hard is it to
> execute a 32-bit 'application'? Can one easily start up a 32-bit
> thread inside a 64-bit machine? This 32 inside 64 question is
> purely theoretical (for the moment)...

I use the 32 bit picolisp written in C. It runs fine on amd64 machines with 64 bit GNU Linux and also runs fine on 32 bit and 64 bit raspberry pi.

Alex wrote several picolisp implementations which have different dependencies, for example:

- minipicolisp: minimal dependencies, less functionality
- 32 bit picolisp in C (the one I am using)
- pil64: 64 bit picolisp in assembly
- picolisp in java
- pil21: picolisp using llvm

There are other projects like bare metal picolisp or picolisp in fpga but I haven't kept track of those.
Re: An assembly question from the past
Actually, 64-bits is also interesting. I only asked about 32-bits, since I had lots of old 32-bit machines around; bare metal on a Raspberry Pi (32 or 64 bits) would also be interesting. RPi4's are hard to get just now, but RPi3's run 64 bits (albeit more slowly).

I haven't played with Arm VM's but I presume that they are just as good as Intel VM's.

An aside: if one is executing on a 64-bit machine, how hard is it to execute a 32-bit 'application'? Can one easily start up a 32-bit thread inside a 64-bit machine?

This 32 inside 64 question is purely theoretical (for the moment)...

-Original Message-
From:
Sent: Mar 31, 2022 2:59 AM
To:
Subject: Re: An assembly question from the past

On Wed, Mar 30, 2022 at 05:13:20PM -0700, C K Kashyap wrote:
> This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS

Right, but it seems Henry looks for a 32-bit machine, but PilOS needs 64 bits.

☺/ A!ex
Re: An assembly question from the past
On Wed, Mar 30, 2022 at 05:13:20PM -0700, C K Kashyap wrote:
> This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS

Right, but it seems Henry looks for a 32-bit machine, but PilOS needs 64 bits.

☺/ A!ex
Re: An assembly question from the past
This may be of interest to you Henry - https://picolisp.com/wiki/?PilOS

On Wed, Mar 30, 2022 at 4:42 PM Henry Baker wrote:

> I haven't been following this thread terribly closely, so I hope this
> question isn't off-base.
>
> Is there a version of picolisp that runs on 80386/80486/80586 'bare metal'
> (or at least 'bare VM') -- talking directly to a HW serial port and reading
> from a FAT file system?
Re: An assembly question from the past
I haven't been following this thread terribly closely, so I hope this question isn't off-base.

Is there a version of picolisp that runs on 80386/80486/80586 'bare metal' (or at least 'bare VM') -- talking directly to a HW serial port and reading from a FAT file system?

-Original Message-
From:
Sent: Mar 30, 2022 10:38 AM
To:
Subject: Re: An assembly question from the past

On Wed, Mar 30, 2022 at 08:13:00AM -0700, C K Kashyap wrote:
> Just to give some background - I've been working on the attempt to port
> miniPicoLisp to windows (more like making vanilla C as the only
> dependency).

Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only stdio library functions.

> For the stack - I believe that Pil
> successfully existed without coroutines for decades right :)

Yes. Coroutines are very nice in some situations, but with more programming effort you can always implement a conventional solution instead.

> Somehow llvm - even though it's "industry standard" now - I feel that it
> imposes too much as a dependency - the very fact that it's written in c++
> is a turn off for me :)

I agree with both statements.

☺/ A!ex
Re: An assembly question from the past
> Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only
> stdio library functions.

Thanks Alex :) ... almost Vanilla C I think - with some gcc toppings (VLA particularly) ;)

I also moved away from pointer tagging in favor of an extra "part" in the cell. This takes away any alignment requirement as well.
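To illustrate the trade-off mentioned above (type bits packed into the pointer word versus an extra field in the cell), here is a minimal C sketch. All names (`TAG_MASK`, `tag_ptr`, `cell`, `cell_type`) and the choice of two tag bits are hypothetical and not taken from miniPicoLisp or from the port discussed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* (a) Pointer tagging: the type lives in the low bits of the word itself.
 * This only works if every object is aligned so that those bits are zero. */
#define TAG_MASK ((uintptr_t)0x3)   /* hypothetical: two tag bits */

static uintptr_t tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);  /* the alignment requirement */
    assert(tag <= TAG_MASK);
    return (uintptr_t)p | tag;
}

static unsigned tag_of(uintptr_t w) { return (unsigned)(w & TAG_MASK); }
static void *ptr_of(uintptr_t w)    { return (void *)(w & ~TAG_MASK); }

/* (b) Explicit type field: the pointers stay untouched, so no alignment
 * assumption is needed, at the cost of extra space in every cell. */
typedef enum { T_NUM, T_SYM, T_PAIR } cell_type;

typedef struct cell {
    struct cell *car;
    struct cell *cdr;
    cell_type type;     /* the extra "part" of the cell */
} cell;

static cell_type type_of(const cell *c) { return c->type; }
```

Variant (a) also depends on the pointer width (4 vs 8 bytes), which is the kind of bit-level assumption that needs attention when moving between 32-bit and 64-bit targets; variant (b) avoids both concerns but makes each cell larger.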
Re: An assembly question from the past
On Wed, Mar 30, 2022 at 08:13:00AM -0700, C K Kashyap wrote:
> Just to give some background - I've been working on the attempt to port
> miniPicoLisp to windows (more like making vanilla C as the only
> dependency).

Good, but isn't miniPicoLisp plain vanilla C anyway? I think it uses only stdio library functions.

> For the stack - I believe that Pil
> successfully existed without coroutines for decades right :)

Yes. Coroutines are very nice in some situations, but with more programming effort you can always implement a conventional solution instead.

> Somehow llvm - even though it's "industry standard" now - I feel that it
> imposes too much as a dependency - the very fact that it's written in c++
> is a turn off for me :)

I agree with both statements.

☺/ A!ex
Re: An assembly question from the past
Thanks for the clear explanation Alex,

Just to give some background - I've been working on the attempt to port miniPicoLisp to windows (more like making vanilla C as the only dependency). I wanted to make sure that I understood the cost of not going with assembly. Since I use https://github.com/libtom/libtommath for BigNum I think I am okay on the flags front.

For the stack - I believe that Pil successfully existed without coroutines for decades right :) and I can see how I could mimic coroutines in the "user space".

Somehow llvm - even though it's "industry standard" now - I feel that it imposes too much as a dependency - the very fact that it's written in c++ is a turn off for me :)

Regards,
Kashyap

On Wed, Mar 30, 2022 at 12:13 AM Alexander Burger wrote:

> Hi Kashyap,
>
> > I can see how you would have to end up writing the whole thing in assembly
> > - in the example you shared. Would it be right to say that it's only the
> > carry flag that you need or is it just an example and there are other flags
> > too?
>
> Pil64 used three flags (zero, sign and carry). CPUs usually have a lot more
> of them, e.g. overflow, but I decided to go without them.
>
> Some functions returned values in one or more registers, plus some flags.
> This is much more powerful than the single return value supported by C.
>
> > Can I say that the need is restricted to the use of BigNum?
>
> On the machine instruction level, the carry is used in a lot more
> situations, like comparisons or arithmetic shifts.
>
> > The ability to set/get the stack I presume needs to be compared with
> > setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not
> > efficient enough?
>
> No, setjmp/longjmp is fine. Pil21 uses it too. But in some situations you
> need to set the stack pointer explicitly (e.g. when allocating coroutine
> stack areas) or read it (stack overflow checks).
>
> ☺/ A!ex
Re: An assembly question from the past
Hi Kashyap,

> I can see how you would have to end up writing the whole thing in assembly
> - in the example you shared. Would it be right to say that it's only the
> carry flag that you need or is it just an example and there are other flags
> too?

Pil64 used three flags (zero, sign and carry). CPUs usually have a lot more of them, e.g. overflow, but I decided to go without them.

Some functions returned values in one or more registers, plus some flags. This is much more powerful than the single return value supported by C.

> Can I say that the need is restricted to the use of BigNum?

On the machine instruction level, the carry is used in a lot more situations, like comparisons or arithmetic shifts.

> The ability to set/get the stack I presume needs to be compared with
> setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not
> efficient enough?

No, setjmp/longjmp is fine. Pil21 uses it too. But in some situations you need to set the stack pointer explicitly (e.g. when allocating coroutine stack areas) or read it (stack overflow checks).

☺/ A!ex
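For readers unfamiliar with what setjmp/longjmp covers in plain C, the following is a minimal sketch of a catch/throw style non-local exit. It is not code from Pil21 or pil32; the names `handler`, `thrown_code`, `throw_error`, `deep` and `run_protected` are hypothetical. Note that it only unwinds the existing C stack; it cannot allocate or switch to a separate stack area, which is the part that needs explicit stack pointer access as described above.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

static jmp_buf *handler = NULL;  /* innermost active "catch" frame (hypothetical) */
static int thrown_code = 0;      /* value handed over by the throw */

/* Unwind the C stack back to the innermost run_protected(). */
static void throw_error(int code)
{
    assert(handler != NULL);     /* a throw needs an enclosing catch */
    thrown_code = code;
    longjmp(*handler, 1);
}

/* Recurse `depth` levels, then throw if `code` is non-zero. */
static int deep(int depth, int code)
{
    if (depth == 0) {
        if (code != 0)
            throw_error(code);
        return 0;
    }
    return deep(depth - 1, code) + 1;
}

/* Returns 0 if deep() returned normally, otherwise the thrown code. */
static int run_protected(int depth, int code)
{
    jmp_buf frame;
    jmp_buf *saved = handler;    /* assigned before setjmp, never modified after */

    handler = &frame;
    if (setjmp(frame) == 0) {
        (void)deep(depth, code); /* may longjmp back into the setjmp above */
        handler = saved;
        return 0;
    }
    handler = saved;             /* reached via longjmp: restore outer frame */
    return thrown_code;
}
```

A call such as `run_protected(5, 42)` descends five levels, throws, and comes back with 42, while `run_protected(5, 0)` returns 0 through the normal path.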
Re: An assembly question from the past
Hi Tomas,

> >carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) +
> > (unDig(dst) & ~1)));
> [...]
> > Concerning the stack, assembly code can handle the hardware stack pointer
> > just like any other register.
>
> interesting
>
> Did you consider GCC inline assembly?

Yes. But I found it much cleaner to write the whole system in a generic assembler.

☺/ A!ex
Re: An assembly question from the past
Thank you Alex,

I can see how you would have to end up writing the whole thing in assembly - in the example you shared. Would it be right to say that it's only the carry flag that you need or is it just an example and there are other flags too? Can I say that the need is restricted to the use of BigNum?

The ability to set/get the stack I presume needs to be compared with setjmp/longjmp - correct? Is setjmp/longjmp insufficient or is it not efficient enough?

Regards,
Kashyap

On Tue, Mar 29, 2022 at 12:31 PM Tomas Hlavaty wrote:

> On Tue 29 Mar 2022 at 18:49, Alexander Burger wrote:
> > As C does not allow access to the carry bit, you have to do ugly and
> > inefficient tricks, by looking at the most significant bit of the result
> > and trying to detect an overflow. For example, in bigAdd() in pil32's
> > src/big.c:
> >
> >    carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
> [...]
> > Concerning the stack, assembly code can handle the hardware stack
> > pointer just like any other register.
>
> interesting
>
> Did you consider GCC inline assembly?
> What were the reasons you did not use it?
Re: An assembly question from the past
On Tue 29 Mar 2022 at 18:49, Alexander Burger wrote:
> As C does not allow access to the carry bit, you have to do ugly and
> inefficient tricks, by looking at the most significant bit of the result
> and trying to detect an overflow. For example, in bigAdd() in pil32's
> src/big.c:
>
>    carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));
[...]
> Concerning the stack, assembly code can handle the hardware stack pointer
> just like any other register.

interesting

Did you consider GCC inline assembly?
What were the reasons you did not use it?
Re: An assembly question from the past
Hi Kashyap,

> >> Pil32 and miniPicoLisp are written in C, and C does not support calling
> >> other functions in a generic way. This is one of the reasons pil64 was
> >> written in assembly (in addition to stack control and CPU status bits).
>
> Could you please throw some more light on the "stack control" and "CPU
> status bits". I imagine that "CPU status bits" should be easy with
> inline assembly right? (although potentially introducing a bunch of
> #ifdef's for individual platform/compiler). I am not sure of the stack
> control though.

Yes, with inline assembly you gain access to the CPU status bits. But it does not help much, as these bits need to be handled by the code all over the function (and even across function calls, see below), so you probably end up writing *all* in inline assembly.

As an example, take the addition of multi-word numbers (bignums). At each step, you add two single words from each number, *plus* the carry bit (a CPU status register bit), and you get a new carry bit for the next step.

As C does not allow access to the carry bit, you have to do ugly and inefficient tricks, by looking at the most significant bit of the result and trying to detect an overflow. For example, in bigAdd() in pil32's src/big.c:

   carry = (unDig(src) & ~1) > num(setDig(dst, (unDig(src) & ~1) + (unDig(dst) & ~1)));

The assembly code in Pil64 does not only use the carry bit, but also the zero and sign bits. This makes it possible to pass and return these bits to/from functions, and have them tested by the caller directly:

   call fun  # fun returns the zero bit set or unset
   jz bar    # Conditional jump

C code does not support this. Instead, a number is returned in a register, and this number in turn needs to be *compared* to zero by the caller, to obtain the same zero bit as was directly returned in the assembly version.

Concerning the stack, assembly code can handle the hardware stack pointer just like any other register. You can read it

   ld A S  # Get stack pointer into A

or set it

   ld S A  # Store A in stack pointer

Such operations are needed for example to set and restore the stacks in coroutine switching, or to check for stack overflows.

LLVM has advantages here, as it at least supports a kind of carry bit (though not other CPU flags), and has operators to get and set the stack pointer.

Thus, Pil21, which compiles to LLVM-IR, is a kind of compromise between an implementation in C and a full-control assembly implementation.

☺/ A!ex
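The carry workaround described above can be sketched in portable C as follows. This is not the pil32 `bigAdd()` code (which works on tagged digits via `unDig`/`setDig`); it is a simplified illustration on plain 32-bit words, and the name `big_add` and the little-endian word order are assumptions made for the example. The carry is recovered by checking whether the wrapped sum is smaller than one of the operands.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* dst += src for two n-word numbers, least significant word first.
 * Returns the carry out of the most significant word (0 or 1). */
static unsigned big_add(uint32_t *dst, const uint32_t *src, size_t n)
{
    unsigned carry = 0;

    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++) {
        uint32_t sum = dst[i] + src[i];  /* wraps modulo 2^32 */
        unsigned c1 = sum < src[i];      /* wrapped iff smaller than an operand */
        uint32_t res = sum + carry;      /* add the carry from the previous word */
        unsigned c2 = res < sum;         /* the carry-in itself may wrap */

        dst[i] = res;
        carry = c1 | c2;                 /* at most one of the two can be set */
    }
    return carry;
}
```

Each word needs two extra comparisons to reconstruct what the hardware already produced as a status bit; an add-with-carry instruction does the same step in one operation, which is the inefficiency the message refers to.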
An assembly question from the past
Hey Alex,

I had reached out to you about the need for assembly in the past and you had mentioned the following -

> 'c' implementation of pil32?
>> Pil32 and miniPicoLisp are written in C, and C does not support calling
>> other functions in a generic way. This is one of the reasons pil64 was
>> written in assembly (in addition to stack control and CPU status bits).

Could you please throw some more light on the "stack control" and "CPU status bits". I imagine that "CPU status bits" should be easy with inline assembly right? (although potentially introducing a bunch of #ifdef's for individual platform/compiler). I am not sure of the stack control though.

Regards,
Kashyap