This also sounds like a bug to me. As for the design assumptions on caching,
the general idea is that the parsetab.py file is recreated if any part of the
grammar specification that would affect the parsing table changes. I'll take
a look at this for the next PLY release.
As for supporting multiple languages in the same parser, there is a much easier
way to do what you want. Basically, you create a special "start" token for
each language. For example, FOOSTART, BARSTART, and SPAMSTART.
Then, you write a top-level grammar rule like this:
def p_start(p):
    '''start : FOOSTART foo_parse
             | BARSTART bar_parse
             | SPAMSTART spam_parse
    '''
    # Pass along the result of whichever sub-grammar matched.
    p[0] = p[2]
To parse the different things, you make sure that the lexer returns the
appropriate start token first. At some point, I thought I had implemented a
feature that allowed the starting token to be specified just for this purpose,
but I don't see it in PLY-3.3 (so my memory is probably wrong about it).
So, you're going to have to hack it with a trick involving the lexer.
Probably the easiest way to do it would be to just pick special characters for
FOOSTART, BARSTART, and SPAMSTART. They don't even have to be text characters.
For example:
t_FOOSTART = '\xf0'
t_BARSTART = '\xf1'
t_SPAMSTART = '\xf2'
Then, when you want to parse something, just prepend it with the appropriate
start symbol. For example,
parse(input=t_FOOSTART+text) # Parse foo stuff
parse(input=t_BARSTART+text) # Parse bar stuff
Admittedly it looks a little hacky, but it should work okay.
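
To make the trick concrete, here is a minimal sketch of the whole thing,
assuming PLY 3.x is installed. The NUMBER and NAME tokens, the one-line
foo_parse and bar_parse rules, and the parse_foo/parse_bar helpers are made up
purely for illustration; write_tables=False is only there so yacc does not
write a parsetab.py file.

```python
import ply.lex as lex
import ply.yacc as yacc

# FOOSTART/BARSTART are the special start tokens; NUMBER and NAME are
# hypothetical tokens standing in for two real sub-languages.
tokens = ('FOOSTART', 'BARSTART', 'NUMBER', 'NAME')

# Non-text marker characters that never occur in normal input.
t_FOOSTART = r'\xf0'
t_BARSTART = r'\xf1'
t_ignore = ' \t'

def t_NUMBER(t):
    r'\d+'
    t.value = int(t.value)
    return t

def t_NAME(t):
    r'[A-Za-z_]+'
    return t

def t_error(t):
    raise SyntaxError("illegal character %r" % t.value[0])

def p_start(p):
    '''start : FOOSTART foo_parse
             | BARSTART bar_parse
    '''
    # The start token only selects the branch; return the sub-parse result.
    p[0] = p[2]

def p_foo_parse(p):
    'foo_parse : NUMBER'
    p[0] = ('foo', p[1])

def p_bar_parse(p):
    'bar_parse : NAME'
    p[0] = ('bar', p[1])

def p_error(p):
    raise SyntaxError("syntax error at %r" % (p,))

lexer = lex.lex()
parser = yacc.yacc(start='start', write_tables=False, debug=False)

def parse_foo(text):
    # Prepend the marker so the lexer emits FOOSTART first.
    return parser.parse('\xf0' + text, lexer=lexer)

def parse_bar(text):
    # Same parser and tables, different entry point.
    return parser.parse('\xf1' + text, lexer=lexer)
```

With this, parse_foo("42") goes through the foo_parse branch and
parse_bar("hello") goes through bar_parse, all from one set of tables. Input
that does not match the selected branch ends up in p_error.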
As for dynamically picking an arbitrary start symbol at runtime, I really don't
know how that would be implemented in PLY. Changing the start symbol would
require all of the parsing tables to be regenerated. I'm not sure you would
want that.
Cheers,
Dave
On Oct 13, 2010, at 2:08 AM, A.T.Hofkamp wrote:
> adam wrote:
>> I was trying to use the yacc.yacc(start='foo') functionality to
>> specify the starting symbol, and that works ok, but only if i manually
>> delete parsetab.py.
>
> This sounds like a bug, but I am not sure.
> The design assumptions of rule caching are not very clear.
>
>> Is there a better way to do what i'm trying to do? Is there a way to
>> dynamically inject the current/start symbol instead of at grammar
>> build time?
>
> As I wrote in my previous mail, I don't think there is.
>
>> (Like a python language, but where a dict isnt valid standalone, but
>> is valid in assignment or function calls, etc, but in a different mode
>> i need to parse just a standalone dict and not anything else).
>
> You could parse 'dict | python_language', and decide afterwards which one you
> have.
> Another way could be to make classes for your grammars (a 'dict' parser
> class, and a 'python language' grammar class), and use inheritance to
> 'inherit' the dict grammar rules.
>
> Since you are having several parsers in a single program then, you'll have to
> do more work to prevent clashes, like your 'parsetab' problem.
>
> Albert
>
>
> --
> You received this message because you are subscribed to the Google Groups
> "ply-hack" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/ply-hack?hl=en.
>