Hi,

I've just read the blog post, "Visualizing a Python tokenizer" and it  
reminded me of this:

"OMeta: an object oriented language for pattern matching"
http://www.cs.ucla.edu/~awarth/papers/dls07.pdf

OMeta is an extension and generalisation of the idea of PEGs*. It  
provides a nice way to describe a language at every level: the  
characters (tokens), the grammar itself, and the productions into the  
AST. Finally, the grammars are extensible (possibly from within the  
language itself).
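To give a flavour of the idea (this is just my own minimal sketch of
PEG-style combinators in plain Python, not OMeta itself - all the names
here are made up for illustration): each parser takes (text, pos) and
returns (value, new_pos) on success or None on failure, and a small
wrapper turns a match into an AST node.

```python
def char(predicate):
    """Match a single character satisfying `predicate`."""
    def parse(text, pos):
        if pos < len(text) and predicate(text[pos]):
            return text[pos], pos + 1
        return None
    return parse

def many1(p):
    """Match `p` one or more times, collecting the results."""
    def parse(text, pos):
        results = []
        r = p(text, pos)
        while r is not None:
            value, pos = r
            results.append(value)
            r = p(text, pos)
        return (results, pos) if results else None
    return parse

def choice(*parsers):
    """Ordered choice: try the alternatives in turn (PEG's '/')."""
    def parse(text, pos):
        for p in parsers:
            r = p(text, pos)
            if r is not None:
                return r
        return None
    return parse

def to_ast(p, build):
    """Wrap a parser so a successful match builds an AST node."""
    def parse(text, pos):
        r = p(text, pos)
        if r is None:
            return None
        value, pos = r
        return build(value), pos
    return parse

# Character-level rule (a "token") and a production into an AST node:
digit = char(str.isdigit)
number = many1(digit)
number_node = to_ast(number, lambda ds: ('num', int(''.join(ds))))

print(number_node("42+1", 0))   # → (('num', 42), 2)
```

The point being that tokenising, the grammar, and the AST construction
all live in one uniform framework.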

The implementation is described in "Packrat Parsers Can Support Left  
Recursion", which also includes some discussion of performance. 
http://www.vpri.org/pdf/packrat_TR-2007-002.pdf
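The packrat trick itself is easy to demonstrate (again just a sketch of
mine, not the paper's algorithm - the toy grammar and the use of a
module-level TEXT so lru_cache can key on position alone are my own
simplifications): memoise each (rule, position) pair so a backtracking
PEG parser never re-parses the same rule at the same spot, giving
linear time.

```python
from functools import lru_cache

# Toy grammar:
#   expr <- term '+' expr / term
#   term <- [0-9]+
# Each rule takes a position and returns the position after the match,
# or None on failure.  lru_cache provides the packrat memo table.

TEXT = "1+2+3"

@lru_cache(maxsize=None)
def term(pos):
    end = pos
    while end < len(TEXT) and TEXT[end].isdigit():
        end += 1
    return end if end > pos else None

@lru_cache(maxsize=None)
def expr(pos):
    # First alternative: term '+' expr
    t = term(pos)
    if t is not None and t < len(TEXT) and TEXT[t] == '+':
        e = expr(t + 1)
        if e is not None:
            return e
    # Second alternative: bare term (PEG ordered choice)
    return t

print(expr(0) == len(TEXT))   # → True: the whole input parses
```

The paper's contribution is then making this memoisation cope with
left-recursive rules, which naive packrat parsers cannot handle.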

I wonder whether the same idea behind PyPy can be applied to the  
grammar: write a program in some language (a Python version of OMeta,  
for instance) which is then transformed by the translator, compiler,  
or JIT into something that runs fast.

What could be nice about this is bringing the tokenising and parsing  
closer in spirit to the heart of PyPy, writing 'nicer' code, and  
providing an (I think tantalising) way to try out new syntax going  
forward.

And there are things to play with on this page:
http://www.cs.ucla.edu/~awarth/ometa/ometa-js/

* Parsing Expression Grammar


With regard to railroad diagrams (I think that's what they're called):

There used to be a script that generated them - it's mentioned at the  
top of the Python grammar file, and here: 
http://www.python.org/search/hypermail/python-1994q3/0294.html

But I've seen discussion elsewhere that it has been lost :(

How about this? 
http://www.informatik.uni-freiburg.de/~thiemann/haskell/ebnf2ps/README


cheers,
Toby
_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev