> I was wondering if you had gotten a chance to look at version 5 I
> uploaded a couple of weeks ago.
I found some time today to go over it. It's much better than the last
version. Others interested in Pygments lexers might find these notes
useful as well. For anyone following along, the file is at [1].
1. Rules within a state (that's what names like "root" are called) are
tried top-down, and the first pattern that matches wins (it's not
longest-match-wins). Especially watch out for things like this, where
only the first can ever match:
    (r'\d+', Number.Integer),
    (r'\d+L', Number.Integer.Long),
    (r'0[xX][a-fA-F0-9]+', Number.Hex),
'0x123' incorrectly matches as an integer 0 followed by x123
'0L' incorrectly matches as an integer 0 followed by L
The solution here is just to reorder them so the more specific patterns
are tried first (see the PythonLexer for an example). You also need to
split the float matching into two patterns, so that either the decimal
point or the exponent is required (again, from PythonLexer):
    (r'(\d+\.\d*|\d*\.\d+)([eE][+-]?[0-9]+)?', Number.Float),
    (r'\d+[eE][+-]?[0-9]+', Number.Float),
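To make the ordering issue in point 1 concrete, here's a rough sketch that mimics first-match-wins using only the standard `re` module rather than Pygments itself. The helper `first_match_tokens`, the two rule lists, and the plain-string token names are all made up for illustration, and the trailing `\w+` rule is only an assumed stand-in for the broad literals match.

```python
import re

def first_match_tokens(rules, text):
    """Tokenize text by trying rules in order; the first pattern that matches wins."""
    compiled = [(re.compile(pattern), name) for pattern, name in rules]
    pos, tokens = 0, []
    while pos < len(text):
        for regex, name in compiled:
            m = regex.match(text, pos)
            if m and m.end() > pos:
                tokens.append((name, m.group()))
                pos = m.end()
                break
        else:
            # No rule matched: emit one character as an error and move on.
            tokens.append(('Error', text[pos]))
            pos += 1
    return tokens

# Hypothetical rule lists; token names are plain strings, not Pygments token types.
GENERAL_FIRST = [
    (r'\d+', 'Integer'),
    (r'\d+L', 'Long'),              # unreachable: r'\d+' always wins first
    (r'0[xX][a-fA-F0-9]+', 'Hex'),  # unreachable for the same reason
    (r'\w+', 'Name'),               # assumed stand-in for the broad literals rule
]
SPECIFIC_FIRST = [
    (r'0[xX][a-fA-F0-9]+', 'Hex'),
    (r'\d+L', 'Long'),
    (r'\d+', 'Integer'),
    (r'\w+', 'Name'),
]

print(first_match_tokens(GENERAL_FIRST, '0x123'))   # [('Integer', '0'), ('Name', 'x123')]
print(first_match_tokens(GENERAL_FIRST, '0L'))      # [('Integer', '0'), ('Name', 'L')]
print(first_match_tokens(SPECIFIC_FIRST, '0x123'))  # [('Hex', '0x123')]
print(first_match_tokens(SPECIFIC_FIRST, '0L'))     # [('Long', '0L')]
```

A real RegexLexer also tracks a state stack, which this sketch leaves out; it only shows why rule order matters within a single state.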
2. In a similar vein, once it gets into the parameters state I don't
think it can get out. Parens are matched in the literals state, which
is tried first, so you probably want to swap the order of the push/pop
paren rules and the literals include. I get the same feeling about the
pop line inside the commands state: numbers won't match because the pop
pattern is too greedy, but I don't know whether that's legal syntax
worth worrying about.
3. I had to look up what r'[]...@$?[]+' would actually do. Is there a
particular reason to use that over the more obvious r'[...@$?\[\]]+'?
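For anyone else puzzled by point 3: in Python's `re`, a `]` placed immediately after the opening `[` is taken as a literal character, so the two spellings describe the same set. The sketch below checks that on a reduced class. The `...` in the pattern above is elided in this message and stays elided here; the reduced class uses only the characters visible on either side of it, so it is an illustration, not the lexer's actual pattern.

```python
import re

# Reduced, illustrative character classes (not the full pattern from the lexer).
terse = re.compile(r'[]@$?[]+')       # leading ']' is literal; '[' needs no escape inside a class
explicit = re.compile(r'[@$?\[\]]+')  # same set, with the brackets escaped

sample = 'a]@[b$?'
print(terse.findall(sample))     # [']@[', '$?']
print(explicit.findall(sample))  # [']@[', '$?']
```

Both forms behave the same; the escaped version just doesn't require the reader to know the leading-bracket rule.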
4. Always include numbers before literals, since literals includes a
broad match of \w+ that matches numbers. Maybe instead of \w+ use
[A-Za-z]\w* or something.
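A quick check of the claim in point 4, again with plain `re` and made-up variable names: `\w+` happily matches a run of digits, while the suggested `[A-Za-z]\w*` requires a leading letter. This is only a sketch of the two patterns in isolation, not of the lexer's literals state.

```python
import re

broad = re.compile(r'\w+')           # matches digits, letters and underscore
narrow = re.compile(r'[A-Za-z]\w*')  # must start with an ASCII letter

for text in ('123', 'abc1', '1abc'):
    # match() anchors at the start of the string, as a lexer rule would at the current position
    print(text, bool(broad.match(text)), bool(narrow.match(text)))
# 123 True False
# abc1 True True
# 1abc True False
```

With the narrower pattern, a number that reaches the literals rules is left for the number rules to handle instead of being swallowed as a word.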
5. There's a duplication of the string introduction in expressions/literals.
Can you provide an example file with string escapes (the ""|` bit) and
continuations using parens?
Tim
[1]:http://dev.pocoo.org/projects/pygments/attachment/ticket/417/autohotkeyV5.py