The analyse_text function is used both in total-guess mode
(./pygmentize -g < file) and when multiple lexers claim the same
filename (e.g. ANTLR .g files, since Pygments 1.0). I'm thinking
about how to set up analyse_text for these lexers so that the ANTLR
Java lexer trumps all other ANTLR lexers, but doesn't steal
highlighting when the extension doesn't match.
The easiest way I've come up with is a small API change:
    def analyse_text(text, filepattern_matched=None)
filepattern_matched would be None when reading from stdin, but
otherwise could be '*.g', for example. I originally had a boolean
here, but that wouldn't work well for the Prolog lexer (it should be
less confident on .pl and fully confident on .pro).
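
To make the Prolog case concrete, here is a rough sketch of what an
analyse_text could look like under the proposed signature. The function
name, the ':-' heuristic and the score values are made up for
illustration; only the filepattern_matched parameter comes from the
proposal itself.

```python
def prolog_analyse_text(text, filepattern_matched=None):
    """Hypothetical analyse_text for a Prolog lexer under the proposed API."""
    # Illustrative content heuristic: ':-' shows up in most Prolog source.
    score = 0.3 if ':-' in text else 0.0

    if filepattern_matched == '*.pro':
        # Per the reasoning above, .pro should be a fully confident match.
        return 1.0
    if filepattern_matched == '*.pl':
        # '*.pl' is shared with other languages, so only add a modest bonus.
        return min(1.0, score + 0.3)
    # stdin / total-guess mode: nothing but the content to go on.
    return score
```

A plain boolean could not tell the '*.pl' and '*.pro' branches apart,
which is why the pattern string is passed instead.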
Another option is to put some sort of match bonus in the list of
filenames, since the per-lexer code for this would otherwise be very
redundant, e.g.
    filenames=[("*.pl", 0.5), ("*.pro", 0.8)]
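
A minimal sketch of how such a list could be consumed, assuming the
bonus is simply added to the content-based score and the result is
clamped to [0, 1]. The helper names and the additive rule are
assumptions for illustration, not existing Pygments code.

```python
from fnmatch import fnmatch

# (pattern, bonus) pairs in the shape suggested above.
PROLOG_FILENAMES = [("*.pl", 0.5), ("*.pro", 0.8)]


def filename_bonus(filename, patterns):
    """Return the largest bonus among the patterns matching filename, else 0.0."""
    return max(
        (bonus for pattern, bonus in patterns if fnmatch(filename, pattern)),
        default=0.0,
    )


def combined_score(filename, text_score, patterns):
    """Add the filename bonus to the content score and clamp to [0, 1]."""
    total = text_score + filename_bonus(filename, patterns)
    return min(1.0, max(0.0, total))
```

With this approach the individual analyse_text functions would not need
to know about filenames at all; the lookup code would do the weighting
in one place.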
Ideas?
Tim
P.S. Here's the wrapper that would be required so that the API change
doesn't break plugin lexers:
    def make_analysator(f):
        """
        Return a static text analysation function that
        returns float values.
        """
        def text_analyse(text, filepattern_matched=None):
            try:
                rv = f(text, filepattern_matched)
            except TypeError:
                # Continue supporting the one-argument version, which may be
                # used in plugin lexers for a while to come.
                rv = f(text)
            if not rv:
                return 0.0
            return min(1.0, max(0.0, float(rv)))
        text_analyse.__doc__ = f.__doc__
        return staticmethod(text_analyse)
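
One caveat with catching TypeError as above: a TypeError raised inside
a two-argument f would also be caught, and f would then be called again
with one argument. A possible variant, sketched here on the assumption
that Python 3's inspect.signature is available, checks the arity once
up front instead. The accepts_filepattern helper is hypothetical.

```python
import inspect


def accepts_filepattern(f):
    """Return True if f can be called with two positional arguments."""
    params = list(inspect.signature(f).parameters.values())
    if any(p.kind is inspect.Parameter.VAR_POSITIONAL for p in params):
        return True
    positional = [
        p for p in params
        if p.kind in (inspect.Parameter.POSITIONAL_ONLY,
                      inspect.Parameter.POSITIONAL_OR_KEYWORD)
    ]
    return len(positional) >= 2


def make_analysator(f):
    """Variant of the wrapper that inspects f instead of catching TypeError."""
    two_args = accepts_filepattern(f)

    def text_analyse(text, filepattern_matched=None):
        # Old one-argument plugin lexers are called without the pattern.
        rv = f(text, filepattern_matched) if two_args else f(text)
        if not rv:
            return 0.0
        return min(1.0, max(0.0, float(rv)))

    text_analyse.__doc__ = f.__doc__
    return staticmethod(text_analyse)
```

The trade-off is an extra introspection step per lexer at wrap time, in
exchange for letting genuine errors inside f propagate.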