On Thu, Jun 05, 2025 at 10:26:14AM +0800, Qian Yun wrote:
> On 6/5/25 3:22 AM, Waldek Hebisch wrote:
> > Today we have 59872 lines of Lisp and Boot code in src/interp
> > (4100 lines of Lisp and 55772 lines of Boot). That is decrease
> > by 1522 compared to FriCAS 1.3.11. As I wrote eventually
> > I would like to limit Lisp to lowest level support code and
> > move other functionality to Spad code.
>
> 1. Do you think this development process could be done
> incrementally? (i.e. replace the compiler component by component)

Mostly yes. In the case of the compiler (and probably some other components) we will probably need a period of parallel development, that is, keeping the old component in place while being able to replace it by the new one, so that we can test compatibility.

More specifically, to compile Spad code we need a working Spad compiler. A replacement will almost surely use different data structures, which means that the part which we replace will be sizeable.

Currently the Spad compiler has several stages. The first 3 stages, that is reading files and include handling, the scanner, and pile handling, are shared with the interpreter. After that there is a Spad-specific part which changes symbols and affects pile handling (scwrap2.boot). There is a part which mostly helps in algebra bootstrap but also contains calls to compiler passes (ncomp.boot). There is the parser (s-pares.boot). There are 2 transformation passes (postpar.boot and parse.boot) run after the parser.

The biggest part of the Spad compiler takes the (transformed) parse tree and an environment as arguments and produces Lisp via calls to later stages. The entry point to the compiler is 'compTopLevel', the main "driver" is 'comp' ('compTopLevel' mostly sets things up to properly print error messages and calls 'comp'). 'comp' recursively handles various Spad constructs. Definitions (constructors and functions) are handled in 'define.boot'. 'functor.boot' generates code to initialize constructors (part of this is delegated to 'nruncomp.boot'). 'iterator.boot' handles Spad looping constructs. 'apply.boot' handles function calls.

The environment keeps information about visible declarations. At entry to the compiler it is empty, but in recursive calls it contains info from upper stages. Environment handling is partially shared with the interpreter. Part of environment handling is in 'i-intern.boot'. However, 'modemap.boot' is related as it puts a lot of information into the environment.

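To make the shape of such a driver concrete, here is a small Python sketch of a recursive 'comp'-style function that dispatches on the kind of construct and threads an environment through the recursive calls. This is only an illustration and not FriCAS code: the names comp and comp_top_level merely echo 'comp' and 'compTopLevel', while the tuple-based tree format, the two handlers and the output form are assumptions made up for the example.

```python
# Toy illustration only: none of this is FriCAS code. The names comp and
# comp_top_level echo 'comp' and 'compTopLevel' from the text; the tree
# shapes, handlers and output format are invented for the example.

def comp_call(tree, env):
    # ("call", fname, arg1, ...) -> compile the arguments, emit a call form.
    _, fname, *args = tree
    return [fname] + [comp(a, env) for a in args]

def comp_let(tree, env):
    # ("let", name, value, body) -> the body sees an extended environment.
    _, name, value, body = tree
    inner = dict(env)          # copy, so the outer environment is untouched
    inner[name] = "declared"
    return ["LET", [[name, comp(value, env)]], comp(body, inner)]

# One handler per construct, loosely like having separate files for
# definitions, iterators and function calls.
HANDLERS = {"call": comp_call, "let": comp_let}

def comp(tree, env):
    """Recursive driver: dispatch on the kind of construct."""
    if isinstance(tree, str):              # a variable reference
        if tree not in env:
            raise NameError(f"{tree} is not declared")
        return tree
    if isinstance(tree, int):              # a literal
        return tree
    return HANDLERS[tree[0]](tree, env)

def comp_top_level(tree):
    """Entry point: start from an empty environment and report errors."""
    try:
        return comp(tree, {})
    except NameError as err:
        return f"error: {err}"

result = comp_top_level(("let", "x", 1, ("call", "f", "x", 2)))
```

The environment is empty at the top-level call and grows only inside recursive calls, so a reference to an undeclared name is caught in 'comp' and reported by the top-level wrapper.
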
Global information is kept in databases; several places in the compiler query the databases and put slightly changed information in the environment.

The compiler uses the runtime system. In particular, categories in the compiler are sometimes represented by Lisp S-expressions, but frequently those S-expressions are evaluated to get the runtime representation of a category. In particular this is done to produce the operation list for a domain/package (the compiler effectively produces a fake category representing the type of the domain and extracts the operation list and some other info from that category). Also, the compiler needs to handle conditions. To do this the compiler tries various sources of information like databases, but ultimately evaluates categories to query runtime values of conditions (and especially, presence of operations). The compiler plays special tricks to avoid evaluating domains and packages during compilation, but sometimes cannot avoid this.

The compiler uses a special representation for several constructs in object code; this is changed by functions in 'g-boot.boot' to the Lisp that is output (or compiled in memory).

For constructors, interpreter functions and for internal use by the interpreter there is support for memoization (in 'clam.boot' and 'slam.boot').

As you can see from the above there is a lot of interaction between the various parties involved in compilation and they must keep consistent representations of the needed data.

Early stages of the compiler should be relatively easy to replace (for example I have a Spad parser in Spad), but that requires implementing bootstrap infrastructure and ATM I decided to handle bootstrap only later (mainly to avoid reworking bootstrap later, because what needs to be done for full bootstrap will be known only when other parts are done). Typechecking should be doable by a separate pass. But for some time we will have to live with the old compiler and a not fully working new compiler.

> 2. Is it a good idea to make "src/interp" more modular?
> For example, making the dependency between files more clear,
> mark some files as "core" and let other files depend on them.
> Current situation feels like spaghetti.

There are some clever abuses and undesirable sharing of code. Also, in the IBM era new parts were developed as "patches" on older parts, that is, they defined functions replacing older functions at runtime. As one of the first things in my work on the Axiom code I removed duplicate definitions, but I kept functions mostly in the same place. So effectively logically connected functions are in different files due to historic development.

But there is also separation into larger (multi-file) modules and some attempts at layering. If you look at older FriCAS sources you will notice that largish parts of the interpreter were dynamically loaded. In principle, if you did not need a specific part FriCAS could run without loading it. One of those parts, that is the Spad to Aldor translator, is completely removed. HyperDoc code is now always included, but if somebody really wanted to remove it, then removal would be relatively easy. The "interpreter" proper, that is the part which compiles input files and handles user expressions, is mostly separate from the other parts. Of course, handling user expressions is core functionality of FriCAS, so nobody tried to remove it, but it should not be hard to create a version of FriCAS that, say, only contains the Spad compiler and is unable to perform normal user-oriented tasks. A version of FriCAS containing only HyperDoc would be harder to do, as HyperDoc takes advantage of the interpreter proper.

Some functionality in FriCAS is independent of runtime support (that is currently 'buildom.boot', 'interop.boot', 'nruntime.boot', 'nrungo.boot', 'nrunfast.boot' + database info needed there), but normal Spad code needs it so in a sense it is the lowest layer. For bootstrap it should be possible to generate Spad code that does not need its own runtime support, so we should be able to write most of the runtime in Spad.

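As a side illustration of the "patch" style mentioned above, the following Python sketch loads definitions file by file into one global table, so that a later definition silently replaces an earlier one, and then reports which names were defined more than once. It is a toy model and not FriCAS code: the file names 'old.boot' and 'patch.boot', the function names and the load_files helper are all hypothetical.

```python
# Toy illustration, not FriCAS code: "files" are lists of (name, function)
# pairs loaded in order into one global table, as a stand-in for loading
# files that may redefine functions from earlier files. All names here
# (old.boot, patch.boot, simplify, show) are hypothetical.

def load_files(files):
    table = {}        # function name -> definition currently in effect
    defined_in = {}   # function name -> every file that defines it
    for fname, definitions in files:
        for name, fn in definitions:
            table[name] = fn                       # later definition wins
            defined_in.setdefault(name, []).append(fname)
    # Names defined in more than one place are candidates for cleanup.
    duplicates = {n: fs for n, fs in defined_in.items() if len(fs) > 1}
    return table, duplicates

files = [
    ("old.boot", [("simplify", lambda x: x), ("show", lambda x: str(x))]),
    ("patch.boot", [("simplify", lambda x: abs(x))]),
]
table, duplicates = load_files(files)
```

Here the definition of 'simplify' from the second file is the one in effect after loading, and the duplicates report shows both files that define it, which is the kind of information needed before removing a duplicate definition.
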
We could also try to write and compile the Spad compiler so that it does not need normal runtime support to run. However, details here are to be decided later; currently the Spad compiler re-uses runtime functions for type checking. This reduces the amount of code that we need and I would like to preserve this. So, a Spad compiler running without any runtime support would have weaker typechecking and probably would be unable to compile code needing runtime support. Without actually coding this I do not know whether the simpler bootstrap possible with a Spad compiler independent from the runtime is worth the effort needed to create such a version of the Spad compiler. Of course we want to use full Spad in algebra so we need the full compiler. So, if created, a Spad compiler without runtime would be a separate beast (hopefully subsetting the full compiler) needed only during bootstrap.

--
Waldek Hebisch