On Feb 27, 2009, at 7:36 PM, Jim Idle wrote:

> Leonardo Santagada wrote:
>>
>> On Feb 27, 2009, at 3:19 PM, Jacob Hallén wrote:
>>
>>> On Friday, 27 February 2009, Frank Wierzbicki wrote:
>>>> On Thu, Feb 26, 2009 at 2:55 PM, Leonardo Santagada <[email protected]>
>>>
>>> Andrew Dalke, who is very thorough in his investigation of
>>> software, has written some interesting things about his experience
>>> with ANTLR as well as some other parsing projects. In short, he
>>> likes ANTLR as a tool, but in his application it is considerably
>>> slower than some other alternatives.
>>> He also has something called python4ply, which is a ready-made,
>>> MIT-licensed parser for Python.
>>>
>>> You can find his articles at
>>> http://www.dalkescientific.com/writings/diary/archive/
>>
>> The problem he might be having is with the Python backend for
>> ANTLR, which neither we (we are going to have to create an RPython
>> one) nor CPython (which would use a C89 one) would have. But this
>> is just a guess, as I have had no time to read his articles yet.
>
> All I can find is info about using the Python backend.
> Unfortunately, the Python backend is very slow. There was some
> discussion between the Python runtime author and Guido about why - I
> can't re-quote as it was private email, but basically the runtime
> and the generated code are method-call heavy, which isn't a good
> idea in Python. Also, as string handling and other things are not
> particularly quick, it is hard to get Python to perform when running
> a parser using a design that wasn't specifically tailored for
> Python. After all, Python wasn't really aimed at writing things like
> lexers and parsers and is much better at things in other domains.
> All that said, I think that the Python runtime will get better, as
> there will be more expansive choices for backend runtime authors in
> the future. Then again, is the speed of the parser going to be a
> factor?
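(Just to illustrate the "method-call heavy" point above: below is a toy sketch, nothing like ANTLR's real runtime - the LA()/consume() names are only ANTLR-style for flavour - comparing a lexer loop that makes a couple of Python method calls per input character with one that lets the C-level re machinery do the scanning. On CPython the second version is much faster, which is roughly the problem generated Python recognizers run into.)

import re
import time

SOURCE = "def f(x):\n    return x + 1\n" * 2000

class CharStream:
    """Toy character stream with ANTLR-style LA()/consume() names (illustrative only)."""
    def __init__(self, text):
        self.text = text
        self.pos = 0
    def LA(self, i):
        # Look ahead i characters; None at end of input.
        p = self.pos + i - 1
        return self.text[p] if p < len(self.text) else None
    def consume(self):
        self.pos += 1

def count_idents_method_heavy(text):
    # Several Python-level method calls per input character,
    # roughly the shape that generated recognizer code tends to have.
    stream = CharStream(text)
    count = 0
    while stream.LA(1) is not None:
        if stream.LA(1).isalpha():
            while stream.LA(1) is not None and (stream.LA(1).isalnum() or stream.LA(1) == "_"):
                stream.consume()
            count += 1
        else:
            stream.consume()
    return count

def count_idents_batched(text):
    # Same job, but the per-character work happens inside the C-level re engine.
    return len(re.findall(r"[A-Za-z_][A-Za-z0-9_]*", text))

if __name__ == "__main__":
    for fn in (count_idents_method_heavy, count_idents_batched):
        start = time.time()
        n = fn(SOURCE)
        print("%s: %d identifiers, %.3f seconds" % (fn.__name__, n, time.time() - start))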
Not much, I think.

Well, I did take a look at both the support code and the generated code (and a quick look at the runtime). The support code in Java is really crazy; at least for me, I didn't get most of it (I should read the docs later). The generated code seems to follow the Java backend closely enough that it is almost RPython. There is a problem in that RPython doesn't have sets, so it will take some time to make it work... but I think it is doable. Somehow my generated code has "pass" before every block of code, in both parsers and lexers; the reason for that is still a mystery to me (does anyone know why?). Maybe this weekend I will have more time to look at and work on this more seriously.

> The Java runtime is a lot quicker than the Python runtime, and unless
> there are Python translation units with 25,000 lines (there will be
> somewhere ;-), performance would not be a factor.
>
> When I wrote the C runtime, however, I did not make a blind copy of
> the Java runtime, hence the performance is akin to hand-written
> code. For instance, the GNU C parser written for ANTLR and running
> with the C runtime is almost the same speed as the GNU C parser
> itself.

This is great, but is the code C89, or do you use something from a newer standard? The only way to have a shot at using it with CPython would be if it were C89... though using the same parser in Jython and PyPy would be cool enough.

> None of this would help (in terms of ANTLR) if you want a Python
> parser that runs in Python, of course :-)

Well, knowing that the Java one is quick is a good indication that the RPython one can be quick too. Thanks for all the info and for the quick response,

--
Leonardo Santagada
santagada at gmail.com

_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
