On 09/06/11 00:20, Sam Denton wrote:
I'm wanting to parse some Wikipedia pages.
Wikipedia template data looks like this:
{{my template|arg one|arg two|keyword=value}}
In a template definition, you can use variable expansion, like this:
{{{1|default for arg one}}}
I defined my lexer to grab runs of '{' and '}' and return different tokens
depending on the length of the run.
My problem is, I'm hitting cases where a template's name is a variable
expansion, resulting in: {{{{{keyword}}}|arg one}}
If this is the only way they can be nested, you can use scanner states. That is, define a scanner
state 'outside template' which matches {{ only. When encountering {{, switch to an 'inside template'
scanner state which matches {{{ only. When encountering }}, switch back to the 'outside template'
scanner state.
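
To make the idea concrete, here is a rough sketch of the two scanner states in plain Python with the standard `re` module. It is not PLY code: the `RULES` table, the token names and the `tokenize` function are all made up for illustration, and it assumes the one level of nesting described above. In PLY the same idea would map onto its conditional lexing states (the `states` tuple and `lexer.begin()`).

```python
import re

# Hypothetical rule tables, one per scanner state (names are illustrative).
# Within a state, rules are tried in order, so longer brace runs come first.
RULES = {
    "outside": [
        ("TEMPLATE_OPEN", re.compile(r"\{\{")),
        ("TEXT", re.compile(r"[^{]+|\{")),
    ],
    "inside": [
        ("VAR_OPEN", re.compile(r"\{\{\{")),
        ("VAR_CLOSE", re.compile(r"\}\}\}")),
        ("TEMPLATE_CLOSE", re.compile(r"\}\}")),
        ("PIPE", re.compile(r"\|")),
        ("TEXT", re.compile(r"[^{}|]+|[{}]")),
    ],
}


def tokenize(text):
    """Return a list of (token_name, lexeme) pairs for `text`."""
    state = "outside"
    pos = 0
    tokens = []
    while pos < len(text):
        # Try the rules of the current state only.
        for name, pattern in RULES[state]:
            match = pattern.match(text, pos)
            if match:
                break
        else:
            raise ValueError("no rule matches at position %d" % pos)
        tokens.append((name, match.group()))
        pos = match.end()
        # Switch scanner state on template delimiters.
        if name == "TEMPLATE_OPEN":
            state = "inside"
        elif name == "TEMPLATE_CLOSE":
            state = "outside"
    return tokens


if __name__ == "__main__":
    for token in tokenize("{{{{{keyword}}}|arg one}}"):
        print(token)
```

Because the 'outside' state never sees a {{{ rule, the leading run of five braces is split as {{ followed by {{{ rather than being lexed as a single run. Templates nested inside templates would need a stack of states instead of a single flag.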
An alternative solution would be to use a scannerless parser. I am, however, not sure whether any
exist for Python.
Sincerely,
Albert