Hi,

On Sun, May 08, 2005 at 11:26:59AM +0200, Armin Rigo wrote:
> Hi Ludovic, hi all,
>
> I had a look at recparser, and how to integrate it into PyPy.  Ideally,
> it can be exported as the 'parser' module by adding a line to
> interpreter/baseobjspace.py (see the commented-out line about the other
> 'parser').

Yes, we've been experimenting with that already.

> A few comments about the interface file pyparser.py (this
> should be put in some documentation...):
>
> * applevel() requires obscure tweaking about the 'import compiler'
>   statement, to prevent the whole compiler package from being dragged in
>   and compiled by PyPy (which may be what we want later, but for now it
>   just doesn't work, I expect).  I checked that in.

Maybe this will solve the problem we're seeing when trying to compile
parse trees generated from either parser.

> * the interpleveldef exports a class, 'STType'.  I added another hack in
>   lazymodule.py to make that work.  Basically, the interp-level exports
>   had to be wrapped objects, or functions -- which get wrapped
>   automatically.  Types now also get wrapped automatically.  Previously,
>   you'd have needed an interpleveldef like
>
>     'STType': 'space.gettypeobject(pyparser.STType.typedef)'
>
>   which fishes the typedef (i.e. the definition of the app-level type)
>   corresponding to the class STType, and asks the space to build a real
>   app-level type object for it.

OK.

> At the moment, with the above changes, it appears to work rather nicely
> (at least the few exported methods).  But we cannot feed the parse
> tuples to the pure Python compiler package because the latter expects
> tuples with line number information, and as far as I see you're always
> generating tuples without.  It seems that you're collecting the
> information already so it should not be difficult to fix.
>
> The next step would be to integrate it so that it is used by the
> built-ins, like compile().  There is a new abstraction, class Compiler,
> in pypy.interpreter.compiler.  Its purpose is to be subclassed by
> concrete compilers; currently there is only CPythonCompiler, which
> cheats and calls compile() at interpreter-level.
> I guess that it should be possible to create another subclass that uses
> recparser and the pure Python compiler package to do its job, or even a
> generic PythonCompiler that uses whatever built-in 'parser' module is
> available, and then the pure Python compiler package.
>
> All of PyPy ends up using the compiler instance that is stored in the
> current execution context whenever it needs to compile source code
> (including at the interactive prompt).
>
>
> Finally, a quick look over the recparser sources shows a few constructs
> that are clearly not "RPython", i.e. too dynamic.  We need to think a
> bit and see how to address the issue.  About RPython:
> http://codespeak.net/pypy/index.cgi?doc/coding-style.html#restricted-python
>
> Before we actually try to perform type inference on recparser, it's a
> bit hard to know if there are type problems or not.  It is often the
> case that even when we write code knowing that it should be RPython we
> overlook some subtle typing problem.  I'll give it a try, I guess (this
> is done by enabling the recparser module in baseobjspace as hinted
> above, running "dist/goal/translate_pypy.py targetpypy", and trying to
> make sense out of the obscure assertion errors and enormous flow graphs
> we get...)
>
> For now, a problematic feature that is obvious is the visitor pattern
> that you use extensively.  It's definitely a great pattern, but not one
> that immediately applies to C- or Java-like languages.  I'm not saying
> that you should rewrite all of recparser; more that we need to find a
> trick to implement visitor patterns without the getattr() with a
> computed attribute name.  Possibly something along these lines:
>
>     class MyVisitor:
>         def visit_name1(self, node):
>             ...
>         def visit_name2(self, node):
>             ...
>
>         # this can be computed by a for loop instead:
>         VISIT_MAP = {'name1': visit_name1,
>                      'name2': visit_name2,
>                      }
>
>     class Node:
>         def visit(self, visitor):
>             visit_meth = visitor.VISIT_MAP[self.name]
>             visit_meth(visitor, self)
>
> The difference with the getattr() case is that the operation that
> replaces it, a getitem on a constant dictionary, has a reasonable
> C-level equivalent, namely a (precomputed) hash table lookup.

Sure, I discussed that with Holger already.  The thing is, the visitor
isn't used for parsing but only by the EBNFParser, which parses the
Python grammar file and turns it into a tree of grammar objects.  This
should be called only at startup time.  I must say I am not sure whether
the following call in recparser/__init__.py:

    PYTHON_PARSER = pythonutil.python_grammar()

really is called at bootstrap time?

Anyway, at this time PYTHON_PARSER is a static tree of objects
representing the grammar, and for now the parsing is done by providing a
'builder' object to the match method of the tree (in fact there are
several subtrees, one for each grammar target).

> That's it for now.  Don't hesitate to ask if I'm not making sense, or
> for more help about integration issues.  I am aware that it is some kind
> of guesswork at the moment.  Just feel free to post to pypy-dev.
>
>
> A bientot,
>
> Armin.

-- 
Ludovic Aubry                                 LOGILAB, Paris (France).
http://www.logilab.com   http://www.logilab.fr  http://www.logilab.org
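
For reference, here is a minimal standalone sketch of the "computed by a for loop" idea from the quoted example. It assumes plain Python rather than anything recparser actually contains: build_visit_map is a hypothetical helper, the return values are added only so the dispatch can be observed, and whether the annotator accepts this exact form has not been checked.

```python
class Node:
    def __init__(self, name):
        self.name = name

    def visit(self, visitor):
        # Lookup in a precomputed dictionary instead of getattr() with a
        # computed attribute name.
        visit_meth = visitor.VISIT_MAP[self.name]
        return visit_meth(visitor, self)


class MyVisitor:
    def visit_name1(self, node):
        return "visited name1"

    def visit_name2(self, node):
        return "visited name2"


def build_visit_map(visitor_class, prefix="visit_"):
    """Hypothetical helper: collect the visit_* functions of a class
    into a {node name: function} dictionary."""
    visit_map = {}
    for attr_name, value in vars(visitor_class).items():
        if attr_name.startswith(prefix) and callable(value):
            visit_map[attr_name[len(prefix):]] = value
    return visit_map


# Built once at import time; afterwards dispatch is a plain dict lookup.
MyVisitor.VISIT_MAP = build_visit_map(MyVisitor)
```

The loop runs only while the module is being imported, so the dynamic part (iterating over the class namespace) stays out of the code path that dispatches on nodes.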
