Hi all,

Sorry for the somewhat lengthy post, but I hope it will be helpful for some of you.
The Boost.Spirit based C++ preprocessor iterator (the project name is 'Wave') is functionally complete now. All pp operators and pp statements are in place, and the macro expansion engine works as expected. So I've released a first version: Wave V0.9.0 (please consider it a beta).

Conceptually, the Wave library is a preprocessing C++ lexer that conforms to the C++ Standard and exposes a (forward) iterator interface for iterating over the preprocessed C++ tokens.

The main goals for this project are:

- full conformance with the C++ standard (INCITS/ISO/IEC 14882/1998)
- usage of Spirit for the parsing parts of the game (certainly :-)
- maximal usage of STL and/or Boost libraries (for compactness and maintainability)
- straightforward extensibility for the implementation of additional features (such as variadics and placemarkers)
- building a flexible library for different C++ lexing and preprocessing needs.

In these first steps it is not planned to make a very high performance or very small C++ preprocessor. If those are your objectives, you will probably have to look elsewhere. Nevertheless, the C++ preprocessor should work as expected and will be usable as a reference implementation, for instance for testing other preprocessor oriented libraries such as Boost.Preprocessor et al., or for developing new pp functionality.

Tests done by Paul Mensonides showed that the Wave library conforms very closely to the C++ Standard: it compiles several strictly conformant modules written by him, which cannot even be compiled with EDG based preprocessors (such as Comeau or Intel).

The C++ preprocessor is not built as a monolithic application. It is rather a modular library, which exposes a context object and an iterator interface. The context object helps to configure the actual pp process (search paths, predefined macros, etc.). The exposed iterators are generated by this context object too.
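To make the "context object plus iterator pair" idea a bit more concrete, here is a minimal, self-contained sketch. Please note that this is NOT the Wave API: toy_context, add_macro_definition, preprocess_to_string and toy_context_example are names made up for this illustration, only object-like macros without rescanning are handled, and the tokens are computed eagerly instead of on the fly.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy context, NOT the Wave API. It holds the configuration
// (here only predefined object-like macros) and generates the iterators.
class toy_context {
public:
    typedef std::vector<std::string>::const_iterator iterator_type;

    explicit toy_context(const std::string& input)
        : input_(input), done_(false) {}

    // Configuration: register a predefined macro (name -> replacement).
    void add_macro_definition(const std::string& name,
                              const std::string& value) {
        macros_[name] = value;
        done_ = false;  // force a re-run with the new configuration
    }

    iterator_type begin() { run(); return tokens_.begin(); }
    iterator_type end()   { run(); return tokens_.end(); }

private:
    static bool is_word_char(char ch) {
        unsigned char c = static_cast<unsigned char>(ch);
        return std::isalnum(c) != 0 || c == '_';
    }

    // Simplification: tokens are produced eagerly and cached here.
    void run() {
        if (done_) return;
        tokens_.clear();
        std::size_t i = 0;
        while (i < input_.size()) {
            if (std::isspace(static_cast<unsigned char>(input_[i]))) {
                ++i;
                continue;
            }
            std::size_t start = i;
            if (is_word_char(input_[i])) {
                while (i < input_.size() && is_word_char(input_[i])) ++i;
            } else {
                ++i;  // single punctuation character
            }
            std::string tok = input_.substr(start, i - start);
            std::map<std::string, std::string>::const_iterator m =
                macros_.find(tok);
            tokens_.push_back(m != macros_.end() ? m->second : tok);
        }
        done_ = true;
    }

    std::string input_;
    std::map<std::string, std::string> macros_;
    std::vector<std::string> tokens_;
    bool done_;
};

// Iterate over [begin(), end()) and join the tokens with blanks.
std::string preprocess_to_string(toy_context& ctx) {
    std::string out;
    for (toy_context::iterator_type it = ctx.begin(); it != ctx.end(); ++it) {
        out += *it;
        out += ' ';
    }
    return out;
}

// Intended usage: configure first, then iterate.
void toy_context_example() {
    toy_context ctx("int x = VERSION ;");
    ctx.add_macro_definition("VERSION", "90");
    assert(preprocess_to_string(ctx) == "int x = 90 ; ");
}
```

The point of the design is that all configuration lives in one object, and client code only ever sees an ordinary iterator range.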
Iterating over the sequence defined by the two iterators returns the preprocessed tokens, which are generated on the fly from the underlying input stream.

The overall preprocessing is a two stage process:

      input stream (characters)
               |
               v
         +-----------+
         | C++ lexer |   (tokenizer)
         +-----------+
               |
               v
           pp tokens
               |
               v
         +-----------+
         |preprocess.|   (macro expansion etc.)
         +-----------+
               |
               v
     preprocessed C++ tokens

As you can see, the input stream feeds a full C++ lexer module (the generated C++ tokens here are exposed through an iteration interface too). This C++ lexer allows the preprocessing module to work on tokens, not directly on the character stream (performance!). Additionally, this helps to resolve language ambiguities such as 'some_class<include<some_term> >' or similar (see C++ standard 2.1.1.3), which is difficult to do in a one step process.

During token generation the C++ lexer splices physical source lines into logical source lines (removal of '\\' followed by a '\n'), recognizes trigraphs and alternative tokens, etc. The exposed C++ lexer iteration interface generates the preprocessing tokens consumed by the preprocessing module, which does the actual work, the preprocessing :-). After this, the resulting tokens are converted to C++ tokens exposed by the preprocessor iterator.

To make the C++ preprocessing library modular, the C++ lexer is held completely separate and independent from the preprocessor (it is actually a template parameter). To prove this concept I've implemented two different full-blown C++ lexers: one based on a re2c based C++ lexer written by Dan Nuffer some time ago [VERY fast], the other based on the Spirit based Slex dynamic lexing engine, a table driven DFA [quite compact]. Both lexers are pluggable into the preprocessor through a unified iterator interface and are completely interchangeable.

BTW the C++ lexers are usable standalone, without using the preprocessing part of the library.
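The two stages and the "lexer as a template parameter" idea can be sketched in a few lines. Again, this is only an illustration under stated assumptions and not Wave code: splice_lines, word_lexer, space_lexer, preprocess and two_stage_example are hypothetical names, the lexers return plain strings instead of real tokens, and the "preprocessor" only replaces object-like macros.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Line splicing: delete every backslash that is immediately followed by a
// newline, turning physical source lines into logical source lines.
std::string splice_lines(const std::string& src) {
    std::string out;
    for (std::size_t i = 0; i < src.size(); ++i) {
        if (src[i] == '\\' && i + 1 < src.size() && src[i + 1] == '\n') {
            ++i;  // skip the newline as well
            continue;
        }
        out += src[i];
    }
    return out;
}

inline bool is_word_char(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    return std::isalnum(c) != 0 || c == '_';
}

// Hypothetical lexer A: words and single punctuation characters.
struct word_lexer {
    static std::vector<std::string> tokenize(const std::string& text) {
        std::vector<std::string> tokens;
        std::size_t i = 0;
        while (i < text.size()) {
            if (std::isspace(static_cast<unsigned char>(text[i]))) {
                ++i;
                continue;
            }
            std::size_t start = i;
            if (is_word_char(text[i])) {
                while (i < text.size() && is_word_char(text[i])) ++i;
            } else {
                ++i;
            }
            tokens.push_back(text.substr(start, i - start));
        }
        return tokens;
    }
};

// Hypothetical lexer B: splits on whitespace only.
struct space_lexer {
    static std::vector<std::string> tokenize(const std::string& text) {
        std::vector<std::string> tokens;
        std::istringstream in(text);
        std::string tok;
        while (in >> tok) tokens.push_back(tok);
        return tokens;
    }
};

// The preprocessing stage works on tokens only and never touches the
// character stream; the lexer is supplied as a template parameter.
template <typename Lexer>
std::vector<std::string> preprocess(
        const std::string& source,
        const std::map<std::string, std::string>& macros) {
    std::vector<std::string> pp_tokens = Lexer::tokenize(splice_lines(source));
    std::vector<std::string> out;
    for (std::size_t i = 0; i < pp_tokens.size(); ++i) {
        std::map<std::string, std::string>::const_iterator m =
            macros.find(pp_tokens[i]);
        out.push_back(m != macros.end() ? m->second : pp_tokens[i]);
    }
    return out;
}

// Both lexers plug into the same preprocessor.
void two_stage_example() {
    std::map<std::string, std::string> macros;
    macros["N"] = "42";
    std::vector<std::string> a = preprocess<word_lexer>("x = \\\nN;", macros);
    assert(a.size() == 4 && a[2] == "42" && a[3] == ";");
    std::vector<std::string> b = preprocess<space_lexer>("x = N ;", macros);
    assert(b.size() == 4 && b[2] == "42");
}
```

Because preprocess() depends only on the tokenize() interface, swapping the lexer needs no change to the preprocessing code.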
It would be very interesting to see how the other existing and ongoing C++ lexers (see the Spirit examples) fit into the picture. The user of the final library will then be able to decide which C++ lexer best fits his/her needs.

There are a couple of things left to do:

- report the concatenation of unrelated tokens as an error
- write more complete documentation (for now please see the samples)
- test the Wave pp iterator more thoroughly

There is already some documentation in place, which you may use as a starting point. If this isn't enough, there is a sample driver program for the Wave library (source: cpp.cpp etc.), which fully utilizes the capabilities of the library, so you may look at the source for further information (for now).

You can find the Wave library in the Spirit CVS (cvs.spirit.sourceforge.net:/cvsroot/spirit): 'spirit/wave'. Additionally there is a zip file that can be downloaded here:

http://sourceforge.net/projects/spirit/

Eventually there will be separate releases of binary packages, built for different platforms.

Please note that to build the enclosed sample driver (essentially a full-blown text stream --> text stream preprocessor) you will need a correctly installed Boost distribution, because the driver uses several different Boost libraries (such as Boost.Filesystem, Boost.inreview.program_options etc.).

It is planned to bundle the Wave library later on with a strict version of the pp-lib from Paul Mensonides (Boost.Preprocessor) and put it into the Boost CVS.

The Wave library compiles and works so far with

- VC7.1 (final beta)
- gcc 3.2 (Cygwin and Linux)
- IntelV7/DinkumwareSTL (from VC6sp5)

(other compilers have not been tested yet).

Last but not least I want to thank Paul Mensonides for his invaluable comments, thorough testing and very helpful tips, which made it possible for me to write the Wave library in such a short amount of time.
Regards
Hartmut