In examples/parser/no_cpu you will find a new project that can be used as a Pascal parser, or as a compiler template for a new CPU.
This project is a kind of plug-in for the FPC compiler, i.e. it uses the original FPC parser (fpcsrc/compiler/p*). More projects will follow, making the FPC parser usable inside other applications.

As the name "no_cpu" indicates, this project does not implement a specific CPU, because code generation is not required for a universal parser. The project name "ppcFromM68K" indicates that it is based on the M68K compiler, and it (currently) requires a $DEFINE M68K, because all acceptable CPUs are hard-coded in the FPC compiler code. In effect, every CPU resides in its own subdirectory, which is added to the path when the compiler is built, and which contains several units with predefined names and contents; see no_cpu/readme.txt for these details.

So far, the following problems make the parser unusable in other projects:

1) The compiler requires that all used units can be found, and that they are translated as well.

2) All units implicitly use System. So far I could not get the compiler to find this unit, so the test project was named system.pas, which bypasses the compiler's search for the System unit. A compiler switch (-s?) may have the same effect.

3) The compiler builds a parse tree for every procedure, but I have not yet found a way to make this tree accessible. There should be a method/procedure in the CPU-specific code that is called to create the binary code for a procedure, but I could not yet locate it.

4) It's not yet known how the rest of a unit (declarations...) is represented internally. Some tokens (comments...) are simply skipped, which can be cured by modifying the scanner.

5) Conditional compilation processes only one branch, and macros are expanded. While macro expansion may be suppressed somehow, compiling multiple conditional branches really doesn't make sense. We'll have to find a way to pass the exact set of defined symbols to the compiler, so that the intended branches become part of the parse tree.
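To illustrate item 5, here is a minimal sketch of what "passing the exact set of defined symbols" means for branch selection. It is written in Python so that it does not depend on any compiler unit; the function name select_branches and the symbol USE_FAST are made up for this example, and the sketch only handles directives that stand alone on a line. It does not show how the FPC scanner actually works.

```python
import re

# Hypothetical helper, not part of FPC: matches a directive that stands
# alone on its line, e.g. {$IFDEF NAME}, {$IFNDEF NAME}, {$ELSE}, {$ENDIF}.
DIRECTIVE = re.compile(
    r"^\s*\{\$(IFDEF|IFNDEF|ELSE|ENDIF)\b\s*(\w*)\s*\}\s*$", re.IGNORECASE
)


def select_branches(source, defined):
    """Return the source lines that are active for the given symbol set."""
    defined = {name.upper() for name in defined}
    stack = []   # one flag per open conditional: is its current branch active?
    active_lines = []
    for line in source.splitlines():
        match = DIRECTIVE.match(line)
        if match is None:
            # Ordinary line: keep it only if every enclosing branch is active.
            if all(stack):
                active_lines.append(line)
            continue
        kind, symbol = match.group(1).upper(), match.group(2).upper()
        if kind == "IFDEF":
            stack.append(symbol in defined)
        elif kind == "IFNDEF":
            stack.append(symbol not in defined)
        elif not stack:
            raise ValueError("directive without matching {$IFDEF}: " + line)
        elif kind == "ELSE":
            stack[-1] = not stack[-1]
        else:  # ENDIF
            stack.pop()
    if stack:
        raise ValueError("missing {$ENDIF}")
    return active_lines


if __name__ == "__main__":
    demo = "\n".join([
        "unit demo;",
        "{$IFDEF USE_FAST}",
        "const Mode = 'fast';",
        "{$ELSE}",
        "const Mode = 'safe';",
        "{$ENDIF}",
        "end.",
    ])
    print(select_branches(demo, {"USE_FAST"}))
```

With a different symbol set the other branch is selected; a tool that needs both branches would have to run the selection once per symbol set.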
For use of the parser in a syntax highlighter a different approach must be chosen, one that allows all tokens to be classified for the syntax highlighter itself, and that also allows sections, procedures and blocks to be identified for folding and for the determination of e.g. begin-end pairs.

In particular, the last item [5] suggests a more flexible parser, one that can do with the scanned tokens whatever is appropriate in the scope of a specific application. The general solution is a separation of the syntactical and semantical procedures in the parser. For fastest processing the semantical code can be made selectable, just as for the CPU, by placing this code into a dedicated directory. I hope that this solution is acceptable to the FPC maintainers, and I'm willing to refactor all the parser units accordingly.

Another solution would use a Semantics class, with the benefit that different trees can be built from the source code in one or more runs; one such class could provide the classified tokens to a syntax highlighter, another one could provide the block structure of the unit. This solution can be derived from the procedural solution, when the semantical procedures delegate all work to a supplied Semantics object, or to multiple objects in parallel. Nothing of this would affect the remaining compiler code at all.

One big advantage of the separation into syntactical and semantical parts is the chance to add further languages to the compiler as selectable front-ends, which use the already existing classes and procedures for code generation. Adding e.g. Oberon syntax should not require many changes to the existing code, while other languages (C/C++) would require adding and handling new node types during optimization and code generation. But such extensions are beyond the scope of the current parser examples.

Any comments and suggestions are welcome. If somebody wants to contribute to this or related projects, I'll add the contributions, or apply the corresponding patches, to the examples/parser tree.
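The Semantics-class idea described above can be sketched roughly as follows. This is only an illustration in Python, not FPC code: the class names Semantics, Highlighter and BlockStructure, the callback names, and the tiny token pattern are all assumptions made for this example. The syntactical part (parse) knows nothing about what is done with the tokens; it delegates every event to all supplied Semantics objects in parallel.

```python
import re

# Deliberately tiny token pattern; a real Pascal scanner handles far more.
TOKEN = re.compile(r"""
    (?P<comment>\{[^}]*\}|//[^\n]*)
  | (?P<string>'[^']*')
  | (?P<number>\d+)
  | (?P<word>[A-Za-z_]\w*)
  | (?P<symbol>:=|[^\s\w])
""", re.VERBOSE)

KEYWORDS = {"begin", "end", "procedure", "program", "var"}


class Semantics:
    """Hypothetical base class: the parser calls these hooks, subclasses act."""
    def token(self, kind, text, pos): pass
    def block_open(self, pos): pass
    def block_close(self, pos): pass


class Highlighter(Semantics):
    """Collects classified tokens, including comments, for a highlighter."""
    def __init__(self):
        self.tokens = []

    def token(self, kind, text, pos):
        self.tokens.append((kind, text))


class BlockStructure(Semantics):
    """Records matching begin/end positions, e.g. for folding."""
    def __init__(self):
        self.pairs = []
        self._open = []

    def block_open(self, pos):
        self._open.append(pos)

    def block_close(self, pos):
        if self._open:
            self.pairs.append((self._open.pop(), pos))


def parse(source, listeners):
    """Syntactical part only: scan and delegate all work to the listeners."""
    for match in TOKEN.finditer(source):
        kind, text, pos = match.lastgroup, match.group(), match.start()
        if kind == "word" and text.lower() in KEYWORDS:
            kind = "keyword"
        for listener in listeners:
            listener.token(kind, text, pos)
            if kind == "keyword" and text.lower() == "begin":
                listener.block_open(pos)
            elif kind == "keyword" and text.lower() == "end":
                listener.block_close(pos)


if __name__ == "__main__":
    highlighter, blocks = Highlighter(), BlockStructure()
    parse("begin { outer } x := 1; begin y := 2 end end", [highlighter, blocks])
    print(highlighter.tokens)
    print(blocks.pairs)
```

One run feeds both objects, so the classified tokens and the block structure come from the same pass. In real Pascal an "end" also closes case, record and try constructs; the sketch ignores that and pairs every "end" with the most recent "begin".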
Any assistance is welcome in finding the places where the existing compiler code can be modified in order to overcome the aforementioned problems. FPDoc documentation of the compiler will also be welcome...

DoDi

-- 
_______________________________________________
Lazarus mailing list
Lazarus@lists.lazarus.freepascal.org
http://lists.lazarus.freepascal.org/mailman/listinfo/lazarus