RFC: django.template refactoring (#7806)

Johannes Dollinger Tue, 16 Sep 2008 11:37:01 -0700

Why should django.template be refactored? Filter and variable parsing  
are inconsistent, many of the ad hoc parsers in defaulttags are  
fragile, and whitespace is handled ungracefully.


The patch provided in #7806[1] splits __init__.py into separate  
modules[2] and introduces a TokenStream class that allows parsing of  
literals, vars (called lookups) and filter expressions.

Here's how it would work:

@register.tag
@uses_token_stream
def mytag(parser, bits):
    expr = bits.parse_expression(required=True)
    return MyNode(expr)

`uses_token_stream` replaces the Token argument to the parser  
function with a TokenStream object.
If the token is not fully parsed, a TemplateSyntaxError is raised.
For better examples, have a look at the patched version of  
`django.template.defaulttags`.
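
To make the decorator's contract concrete, here is a rough sketch of how
such a wrapper could behave. It is not the code from the patch:
`FakeTokenStream`, its whitespace-only tokenizer, and the local
`TemplateSyntaxError` class are hypothetical stand-ins so the snippet runs
without Django, and it assumes the stream receives only the tag's
arguments (not the tag name).

```python
import functools


class TemplateSyntaxError(Exception):
    """Stand-in for django.template.TemplateSyntaxError (assumption)."""


class FakeTokenStream:
    """Hypothetical stand-in for TokenStream; splits on whitespace only."""

    def __init__(self, parser, source):
        self.parser = parser
        self.tokens = source.split()
        self.offset = 0

    def pop(self):
        token = self.tokens[self.offset]
        self.offset += 1
        return token


def uses_token_stream(func):
    """Hand the tag function a stream instead of the raw token."""

    @functools.wraps(func)
    def wrapper(parser, token):
        bits = FakeTokenStream(parser, token)
        node = func(parser, bits)
        # Leftover tokens mean the tag did not consume all of its input.
        if bits.offset != len(bits.tokens):
            rest = " ".join(bits.tokens[bits.offset:])
            raise TemplateSyntaxError("unparsed input: %r" % rest)
        return node

    return wrapper


@uses_token_stream
def mytag(parser, bits):
    # A real tag would return a Node; a tuple keeps the sketch small.
    return ("MyNode", bits.pop())
```

With this sketch, `mytag(None, "value")` returns the node, while
`mytag(None, "value extra")` raises because "extra" was never read.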


TokenStream API (first stab)
============================
``def __init__(self, parser, source)``
parser: a django.template.compiler.Parser instance
source: a string or a django.template.compiler.Token instance

TokenStream tokenizes its source into "string_literal",  
"numeric_literal", "char", and "name" tokens, stored in self.tokens  
as (type, lexeme) pairs. Whitespace will be discarded.
You can "read" tokens via the following methods; the current position  
in self.tokens is maintained in self.offset.

A "char" token is a single character in ':|=,;<>!?%&@"\'/()[]{}`*+-'.  
A name is any sequence of characters that contains neither "char"  
characters nor whitespace and is not a string or numeric literal.

>>> TokenStream(parser, r'"quoted\" string"|filter:3.14 as name').tokens
[('string_literal', '"quoted\\" string"'), ('char', '|'),
 ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'),
 ('name', 'as'), ('name', 'name')]
>>> TokenStream(parser, r' "quoted\" string" | filter : 3.14  as name').tokens
[('string_literal', '"quoted\\" string"'), ('char', '|'),
 ('name', 'filter'), ('char', ':'), ('numeric_literal', '3.14'),
 ('name', 'as'), ('name', 'name')]
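
For readers who want to experiment with the tokenization rules, the
following is a minimal regex-based sketch that reproduces the two examples
above. It is an illustration, not the tokenizer from the patch: the
function name `tokenize`, the exact literal patterns (unsigned decimal
numbers, backslash escapes in strings), and the decision to split "3abc"
into a number followed by a name are all assumptions.

```python
import re

# Single-character "char" tokens, as listed in the proposal.
CHARS = ':|=,;<>!?%&@"\'/()[]{}`*+-'
_CLASS = re.escape(CHARS)

# Alternatives are tried in order: string, number, char, name.
# Whitespace matches none of them, so finditer() skips it.
TOKEN_RE = re.compile(
    r'(?P<string_literal>"(?:[^"\\]|\\.)*"'
    r"|'(?:[^'\\]|\\.)*')"
    r'|(?P<numeric_literal>\d+(?:\.\d+)?)'
    r'|(?P<char>[' + _CLASS + r'])'
    r'|(?P<name>[^\s' + _CLASS + r']+)'
)


def tokenize(source):
    """Return a list of (type, lexeme) pairs, discarding whitespace."""
    return [(m.lastgroup, m.group()) for m in TOKEN_RE.finditer(source)]


tokens = tokenize(r'"quoted\" string"|filter:3.14 as name')
```

An unterminated quote falls through to the "char" alternative here; how
the patch treats that case is not stated in this proposal.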

Low level API
-------------
``def pop(self)``
Returns the next token as a (tokentype, lexeme) pair and advances  
offset by one; tokentype is one of "string_literal",  
"numeric_literal", "char", "name".

``def pop_lexem(self, match)``
Returns True and advances offset by one if the next token's lexeme  
equals `match`. Returns False otherwise.

``def pop_name(self)``
Returns the next token's lexeme and advances offset by one if its  
tokentype is "name". Returns None otherwise.

``def pushback(self)``
Moves offset back by one, so the most recently read token is read  
again.
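
The low level methods can be sketched on top of a plain list of
(type, lexeme) pairs. This is only an illustration of the semantics
described above, not the patch itself: the class name `MiniTokenStream`
is hypothetical, it takes pre-built tokens instead of a parser and source,
and raising IndexError from pop() at the end of input is an assumption.

```python
class MiniTokenStream:
    """Hypothetical stand-in covering only the low level API."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.offset = 0

    def pop(self):
        # Assumption: reading past the end raises IndexError.
        token = self.tokens[self.offset]
        self.offset += 1
        return token

    def pop_lexem(self, match):
        if self.offset < len(self.tokens) and self.tokens[self.offset][1] == match:
            self.offset += 1
            return True
        return False

    def pop_name(self):
        if self.offset < len(self.tokens):
            tokentype, lexeme = self.tokens[self.offset]
            if tokentype == "name":
                self.offset += 1
                return lexeme
        return None

    def pushback(self):
        self.offset -= 1


# Parsing an optional "as <name>" suffix with the low level calls.
stream = MiniTokenStream([("name", "as"), ("name", "result")])
alias = stream.pop_name() if stream.pop_lexem("as") else None
```

A failed pop_lexem or pop_name leaves offset where it was, so a tag can
probe for optional keywords without calling pushback.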


High level API
--------------
These methods raise TokenSyntaxError and leave offset untouched if  
the expected result cannot be parsed.
Each accepts a boolean required=False kwarg which turns  
TokenSyntaxErrors into TemplateSyntaxErrors if True.

``def parse_string(self, bare=False)``
Returns the value of the following string literal. If bare=True,  
unquoted strings will be accepted.

``def parse_int(self)``
Returns the value of the following numeric literal, if it is an int.

``def parse_value(self)``
Returns an Expression that evaluates the following literal, variable  
or translated value.

``def parse_expression(self)``
Returns an Expression that evaluates the following value or  
filter expression.

``def parse_expression_list(self, minimum=0, maximum=None, count=None)``
Returns a list `e` of expressions; minimum <= len(e) <= maximum.  
count=n is a shortcut for minimum=maximum=n.
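
The error contract shared by the high level methods (raise
TokenSyntaxError, leave offset untouched, escalate to TemplateSyntaxError
when required=True) could be factored out roughly as below. This is a
sketch under assumptions, not the patch: the decorator name `high_level`,
the class `SketchStream`, and both exception classes are stand-ins defined
here so the snippet runs without Django, and only parse_int is shown.

```python
import functools


class TokenSyntaxError(Exception):
    """Stand-in for the proposed TokenSyntaxError."""


class TemplateSyntaxError(Exception):
    """Stand-in for django.template.TemplateSyntaxError."""


def high_level(method):
    """Restore offset on failure; honour the required=False kwarg."""

    @functools.wraps(method)
    def wrapper(self, *args, required=False, **kwargs):
        start = self.offset
        try:
            return method(self, *args, **kwargs)
        except TokenSyntaxError as exc:
            self.offset = start  # leave the stream untouched
            if required:
                raise TemplateSyntaxError(str(exc)) from exc
            raise

    return wrapper


class SketchStream:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.offset = 0

    @high_level
    def parse_int(self):
        if self.offset >= len(self.tokens):
            raise TokenSyntaxError("expected an int, found end of input")
        tokentype, lexeme = self.tokens[self.offset]
        if tokentype != "numeric_literal" or not lexeme.isdigit():
            raise TokenSyntaxError("expected an int, found %r" % lexeme)
        self.offset += 1
        return int(lexeme)
```

Because a failed optional parse restores offset, a tag can try one
alternative after another and only pass required=True on the last one.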


I'm unhappy with the naming of TokenStream and uses_token_stream  
(using_token_stream?). And I just noticed that the English spelling  
of lexeme has a trailing "e".


[1] http://code.djangoproject.com/ticket/7806
[2] See [1] for details. Yes, this is orthogonal to the concern of  
the ticket. I gave up splitting the patch when I stumbled upon the  
first circular dependency issue - a half-assed effort - and decided  
it's not worth it.

