Why should django.template be refactored? Filter and Variable parsing is inconsistent. Many ad hoc parsers in defaulttags are fragile. Whitespace handling is ungraceful.
The patch provided in #7806[1] splits __init__.py into separate modules[2] and introduces a TokenStream class that allows parsing of literals, vars (called lookups) and filter expressions.

Here's how it would work:

    @register.tag
    @uses_token_stream
    def mytag(parser, bits):
        expr = bits.parse_expression(required=True)
        return MyNode(expr)

`uses_token_stream` replaces the Token argument to the parser function with a TokenStream object. If the token is not fully parsed, a TemplateSyntaxError is raised. For better examples, have a look at the patched version of `django.template.defaulttags`.

TokenStream API (first stab)
============================

``def __init__(self, parser, source)``
    parser: a django.template.compiler.Parser instance
    source: a string or a django.template.compiler.Token instance

TokenStream tokenizes its source into "string_literal", "numeric_literal", "char", and "name" tokens, stored in self.tokens as (type, lexeme) pairs. Whitespace is discarded. You can "read" tokens via the methods below; the current position in self.tokens is maintained in self.offset.

A "char" token is a single character in ':|=,;<>!?%&@"\'/()[]{}`*+-'. A name is any sequence of characters that contains neither "char" characters nor whitespace and is not a string or numeric literal.

    >>> TokenStream(parser, r'"quoted\" string"|filter:3.14 as name').tokens
    [('string_literal', '"quoted\\" string"'), ('char', '|'), ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'), ('name', 'as'), ('name', 'name')]
    >>> TokenStream(parser, r' "quoted\" string" | filter : 3.14 as name').tokens
    [('string_literal', '"quoted\\" string"'), ('char', '|'), ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'), ('name', 'as'), ('name', 'name')]

Low level API
-------------

``def pop(self)``
    Returns a pair (tokentype, lexeme) and offset+=1; tokentype is one of "string_literal", "numeric_literal", "char", "name".

``def pop_lexem(self, match)``
    Returns True and offset+=1 if the next token's lexeme equals `match`. Returns False otherwise.

``def pop_name(self)``
    Returns the next token's lexeme and offset+=1 if its tokentype is "name". Returns None otherwise.

``def pushback(self)``
    offset-=1

High level API
--------------

These methods raise TokenSyntaxError and leave offset untouched if the expected result cannot be parsed. Each accepts a boolean required=False kwarg; if it is True, TokenSyntaxErrors are turned into TemplateSyntaxErrors.

``def parse_string(self, bare=False)``
    Returns the value of the following string literal. If bare=True, unquoted strings are accepted as well.

``def parse_int(self)``
    Returns the value of the following numeric literal, if it is an int.

``def parse_value(self)``
    Returns an Expression that evaluates the following literal, variable or translated value.

``def parse_expression(self)``
    Returns an Expression that evaluates the following value or filter expression.

``def parse_expression_list(self, minimum=0, maximum=None, count=None)``
    Returns a list `e` of expressions; minimum <= len(e) <= maximum. count=n is a shortcut for minimum=maximum=n.

I'm unhappy with the naming of TokenStream and uses_token_stream (using_token_stream?). And I just noticed that the English spelling of lexeme has a trailing "e".

[1] http://code.djangoproject.com/ticket/7806
[2] See [1] for details. Yes, this is orthogonal to the concern of the ticket. I gave up splitting the patch when I stumbled upon the first circular dependency issue - a half-assed effort - and decided it's not worth it.
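To make the tokenization rules and the low level API described in this post a bit more concrete, here is a minimal standalone sketch. It is not the code from the patch: `tokenize`, `MiniTokenStream`, both regular expressions and the behaviour at the end of the token list are assumptions made for illustration, and the `parser` argument is left out entirely.

```python
import re

# The "char" characters exactly as listed in the post.
CHARS = ':|=,;<>!?%&@"\'/()[]{}`*+-'
_CLS = re.escape(CHARS)

# Assumption: string literals are single- or double-quoted and may contain
# backslash escapes. Anything that is neither whitespace nor a "char"
# character is grouped into a "word" and classified afterwards.
_TOKEN_RE = re.compile(
    r'(?P<string>"(?:[^"\\]|\\.)*"|\'(?:[^\'\\]|\\.)*\')'
    r'|(?P<char>[' + _CLS + r'])'
    r'|(?P<word>[^\s' + _CLS + r']+)'
)

# Assumption: numeric literals are unsigned integers or simple decimals.
_NUMBER_RE = re.compile(r'\d+(?:\.\d+)?\Z')


def tokenize(source):
    """Split source into (type, lexeme) pairs; whitespace is discarded."""
    tokens = []
    for match in _TOKEN_RE.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == 'string':
            kind = 'string_literal'
        elif kind == 'word':
            kind = 'numeric_literal' if _NUMBER_RE.match(lexeme) else 'name'
        tokens.append((kind, lexeme))
    return tokens


class MiniTokenStream:
    """Hypothetical stand-in for TokenStream, low level API only."""

    def __init__(self, source):
        self.tokens = tokenize(source)
        self.offset = 0

    def pop(self):
        # Assumption: an exhausted stream raises IndexError.
        token = self.tokens[self.offset]
        self.offset += 1
        return token

    def pop_lexem(self, match):
        if self.offset < len(self.tokens) and self.tokens[self.offset][1] == match:
            self.offset += 1
            return True
        return False

    def pop_name(self):
        if self.offset < len(self.tokens):
            kind, lexeme = self.tokens[self.offset]
            if kind == 'name':
                self.offset += 1
                return lexeme
        return None

    def pushback(self):
        self.offset -= 1


if __name__ == '__main__':
    stream = MiniTokenStream(r'"quoted\" string"|filter:3.14 as name')
    print(stream.tokens)
```

A tag parser built on this would typically call `pop_lexem('as')` followed by `pop_name()` to read an optional `as name` suffix, and `pushback()` when a lookahead turns out to be wrong.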