Hi Paul,

Paul Mensonides:
>   TOKEN_SEQ_TO_SEQ(FIRST, REST, IS_LAST, B o o s t SPACE 1 3 1)
>      ==> (B)(o)(o)(s)(t)(SPACE)(1)(3)(1)

I already have something almost exactly like this in the high-precision
arithmetic that I haven't committed yet.  Of course, you'd have to have a
macro for every letter of the alphabet, every number, and every
"miscellaneous" thing like "SPACE".  Also, it cannot be made to work on
any operators, so you'd need special names for them too.

That would be true. In many interesting cases you would need more than one macro per token:

- TOKEN_TO_CHAR_LIT(token) for making character literals.
- TOKEN_TO_STR_LIT(token) for composing strings.
- TOKEN_TO_NUMBER(token) for comparing tokens (avoids having to duplicate functionality for tokens).
- etc...
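
As a rough sketch of what such per-token macros might look like, here is a tiny table covering only the tokens of the `B o o s t SPACE 1 3 1' example. The TOKEN_CHAR_, TOKEN_STR_ and TOKEN_NUM_ prefixes, the choice of ASCII codes as the numbers, and the names boost_131 and check_token_macros are all assumptions made for illustration; a real library would need an entry for every supported token.

```c
#include <assert.h>

/* Hypothetical per-token tables: one entry per supported token. */
#define TOKEN_CHAR_B 'B'
#define TOKEN_CHAR_o 'o'
#define TOKEN_CHAR_s 's'
#define TOKEN_CHAR_t 't'
#define TOKEN_CHAR_SPACE ' '
#define TOKEN_CHAR_1 '1'
#define TOKEN_CHAR_3 '3'

#define TOKEN_STR_B "B"
#define TOKEN_STR_o "o"
#define TOKEN_STR_s "s"
#define TOKEN_STR_t "t"
#define TOKEN_STR_SPACE " "
#define TOKEN_STR_1 "1"
#define TOKEN_STR_3 "3"

/* Assumption: ASCII codes are used as the comparison numbers. */
#define TOKEN_NUM_B 66
#define TOKEN_NUM_o 111
#define TOKEN_NUM_s 115
#define TOKEN_NUM_t 116
#define TOKEN_NUM_SPACE 32
#define TOKEN_NUM_1 49
#define TOKEN_NUM_3 51

/* Each macro pastes the token onto a table prefix and lets the
   resulting name expand to the table entry. */
#define TOKEN_TO_CHAR_LIT(token) TOKEN_CHAR_ ## token
#define TOKEN_TO_STR_LIT(token)  TOKEN_STR_ ## token
#define TOKEN_TO_NUMBER(token)   TOKEN_NUM_ ## token

/* Adjacent string literals are concatenated by the compiler. */
const char boost_131[] =
    TOKEN_TO_STR_LIT(B) TOKEN_TO_STR_LIT(o) TOKEN_TO_STR_LIT(o)
    TOKEN_TO_STR_LIT(s) TOKEN_TO_STR_LIT(t) TOKEN_TO_STR_LIT(SPACE)
    TOKEN_TO_STR_LIT(1) TOKEN_TO_STR_LIT(3) TOKEN_TO_STR_LIT(1);

/* The numbers are plain integers, so tokens can be compared in #if. */
#if TOKEN_TO_NUMBER(o) != TOKEN_TO_NUMBER(o) || TOKEN_TO_NUMBER(B) == TOKEN_TO_NUMBER(t)
#error "token number table is inconsistent"
#endif

/* Small self-check of the tables above. */
void check_token_macros(void)
{
    assert(TOKEN_TO_CHAR_LIT(SPACE) == ' ');
    assert(boost_131[5] == ' ');
    assert(TOKEN_TO_NUMBER(1) < TOKEN_TO_NUMBER(3));
}
```

Because the macros paste the token directly with `##', the token itself is never macro-expanded first, which is why a name like SPACE can be used even though it is not an ordinary character.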

I have been thinking about the possibility of using such a syntax to allow a kind of string manipulation using the preprocessor. The preprocessor simply does not have the means to destructure ordinary identifiers, numeric literals, strings, etc... So, the best thing that can be done is to pass them unstructured. It is not ideal, of course, but I think that it could be more than usable.

Consider the following grammar production representations:

In Yacc/Bison:

Expr
: Term Add Expr { return $1 + $3; }
| Term Sub Expr { return $1 - $3; }
;

Using the preprocessor, we could write:

((E x p r)
((T e r m) (A d d) (E x p r), { return _1 + _3; } )
((T e r m) (S u b) (E x p r), { return _1 - _3; } ))

Now, the reason why we can't just write `Expr' instead of `(E x p r)' is that the preprocessor has no way of comparing arbitrary tokens like `Expr'. However, it is not impossible to compare token sequences like `(E x p r)'. If one is willing to spend a few macros, it is possible to get an even nicer syntax:

// ... sweetener macros earlier in the same .cpp file ...
#define EXPR (E x p r)
#define TERM (T e r m)
#define ADD (A d d)
#define SUB (S u b)

// ... later in the grammar ...
(EXPR
(TERM ADD EXPR, { return _1 + _3; } )
(TERM SUB EXPR, { return _1 - _3; } ))

The same token sequence technique could be used for many interesting kinds of code generators that need to manipulate symbolic information.
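
To make the comparison argument a bit more concrete, here is a minimal sketch of comparing two single tokens drawn from a known alphabet. It uses a common concatenation-and-probe idiom with C99 variadic macros, which is one possible way to do it and not necessarily how a library implementation would. The names SECOND, IS_PROBE, PROBE, TOKEN_EQ, the TOKEN_EQ_ table and check_token_eq are all assumptions made for illustration, and only the letters of `E x p r' are covered. Comparing whole sequences like `(E x p r)' would apply this element by element, which is not shown here.

```c
#include <assert.h>

/* SECOND picks its second argument. IS_PROBE yields 1 only when its
   argument expands to PROBE(), i.e. to the two-element list "~, 1";
   any other single argument leaves 0 in the second position. */
#define SECOND(a, b, ...) b
#define IS_PROBE(...) SECOND(__VA_ARGS__, 0, ~)
#define PROBE() ~, 1

/* Hypothetical equality table: one entry per token of the alphabet,
   defined only for a token paired with itself. */
#define TOKEN_EQ_E_E PROBE()
#define TOKEN_EQ_x_x PROBE()
#define TOKEN_EQ_p_p PROBE()
#define TOKEN_EQ_r_r PROBE()

/* Paste both tokens into one name; the name expands to PROBE() only
   if a table entry exists, which happens only for equal tokens. */
#define TOKEN_EQ_I(a, b) TOKEN_EQ_ ## a ## _ ## b
#define TOKEN_EQ(a, b) IS_PROBE(TOKEN_EQ_I(a, b))

/* The result is a plain 0 or 1, so it also works in #if. */
#if !TOKEN_EQ(E, E) || TOKEN_EQ(E, x)
#error "token equality table is inconsistent"
#endif

/* Small self-check of the macros above. */
void check_token_eq(void)
{
    assert(TOKEN_EQ(E, E) == 1);
    assert(TOKEN_EQ(E, x) == 0);
    /* An arbitrary identifier has no table entry, so even comparing
       it with itself yields 0. */
    assert(TOKEN_EQ(Expr, Expr) == 0);
}
```

The last assertion shows the limitation discussed above: without an entry per token, the preprocessor cannot tell that `Expr' equals `Expr', whereas each letter of `(E x p r)' can be looked up.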

Hmm... I think that one of the next things one would need for lexer and parser generators is set and map data structures. Perhaps I'll implement a functional red-black tree or AVL tree using the preprocessor. Well, perhaps in a few weeks I'll have the time.

How about this instead:
[...]

That looks *very* nice, and is probably also very fast!

Too bad that the same technique does not work if the elements are not parenthesized.

-Vesa Karvonen

