Hi, I've made up my mind and I think I have the solution (for the is_safe terminology, Django world domination, and all the rest :-) )
* finalized *

So it's:

FinalizedString (replaces SafeString)
mark_finalized() (replaces mark_safe)
preserves_finalized (replaces is_safe as a function attribute)

It seems there's no need for the opposite.

Rationale for this choice
--------------------------------

* 'finalized' is a more neutral term than 'safe'.
* it does not make any wrong promises.
* it does not suggest that it has actually been escaped or processed in any way, like 'escaped' did.
* 'finalized' means that its form is considered final, and that exactly defines the concept.

Using these terms, I've edited Malcolm's introduction of his patch. Some comments are in double brackets [[ ]].

******** Snip *****

A summary of the points I'd like opinions on is at the end of the email.

What does this add?
-------------------

(1) An "autoescape" template tag that turns automatic escaping on or off throughout its scope.

(2) A "noescape" filter that marks its result as finalized for use without further escaping (see the description of "finalized strings" below).

(3) Safe Context ... [[ remark: I think there is now consensus that SafeContext won't make it in, so this paragraph would be removed anyway ]]

(4) A "mark_finalized()" method to mark strings as not requiring further escaping.

How does it work?
-----------------

When a variable is evaluated in a context in a template, it is considered to be either "finalized" or not. By default, strings are not marked as finalized.

When automatic escaping is enabled, because {% autoescape on %} is in effect, all strings that are not marked as finalized are escaped at rendering time (here, "escaped" means "conservative HTML escaping": &, <, >, " and ' are always converted to entities). Any string marked as finalized (or passed through the "noescape" filter) is not automatically escaped.
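To make the rendering rule above concrete, here is a minimal Python sketch. It is not the code from the patch: the bodies of FinalizedString and mark_finalized() are guesses at the simplest thing that could work, render_variable() is a hypothetical helper invented for the illustration, and the standard library's html.escape stands in for Django's own escaping.

```python
from html import escape as html_escape


class FinalizedString(str):
    """A str subclass whose form is considered final (assumed implementation)."""


def mark_finalized(value):
    """Return value as a FinalizedString; finalized input is returned unchanged."""
    if isinstance(value, FinalizedString):
        return value
    return FinalizedString(value)


def render_variable(value, autoescape):
    """Hypothetical helper: produce the output for one variable result."""
    text = str(value)
    if autoescape and not isinstance(value, FinalizedString):
        # Conservative HTML escaping: &, <, >, " and ' all become entities.
        return html_escape(text, quote=True)
    # Finalized strings, and everything when auto-escaping is off, pass through.
    return text
```

With auto-escaping on, render_variable("<b>", True) is escaped, while render_variable(mark_finalized("<b>"), True) is output untouched. With auto-escaping off, both come out as given.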
When automatic escaping is disabled in the template, all variable results are output without further escaping, unless the "escape" filter is applied to them (this is the same behaviour as currently in Django).

Because some filters are designed to return raw markup, the mark_finalized() function exists so that the returned strings can be designated as finalized. For example, {{ var|markdown }} returns raw HTML and the result is not subject to any further auto-escaping.

Some filters (e.g. unordered_list) wrap HTML around raw content. If auto-escaping is enabled, these filters will escape the content before wrapping it in the HTML tags (the returned result is a finalized string in all cases). So such filters are auto-escaping-aware.

Filters that accept text strings as input and return text strings are marked (with the "preserves_finalized" attribute) as to whether or not they return a finalized string whenever they are passed a finalized string as input. This has the effect of preserving "finalizedness" for those filters. Note that this attribute is a *guarantee* of preserving finalizedness, so a filter like "cut" has preserves_finalized = False: {{ var|cut:"&" }} could turn an escaped string into a monster.

If a finalized string is passed into a filter that is not marked as preserving finalizedness and auto-escaping is enabled, the resulting string will be escaped. If the preserves_finalized attribute is not attached to a function, it is assumed to be not preserving.

Because of the preserves_finalized attribute, it will be possible to change the automatically generated documentation in the admin interface to annotate each filter with whether it guarantees to preserve finalizedness. [[ I took the liberty of editing the preceding sentence a bit more freely; 'with its guarantee of preserving finalizedness' has too many nouns for my ears ]]

The "noescape" filter acts as a way to annotate the result of a filter chain as finalized.
So, although "cut" is not a filter preserving finalizedness, we know that cut:"x" is preserving (it can't harm our HTML-escaped strings) and thus {{ var|cut:"x"|noescape }} will prevent further escaping of the result, even in auto-escaping-enabled situations. The "noescape" filter does nothing except mark the result as finalized -- the string output is identical to the input.

Is it backwards compatible?
---------------------------

Mostly. Auto-escaping is not turned on by default (Adrian made a statement in [2] and I'm going with that preference at the moment). If you do not use the autoescape tag, it will be very close to what happens today.

[2] http://groups.google.com/group/django-developers/msg/5a57f37667e1e941?

Four filters had their behaviour slightly changed. Three of these are: linebreaks, linebreaksbr and linenumbers. All three now respect the current auto-escaping setting on their input content (before applying breaks or numbering). Previously, linenumbers would always escape and linebreaks and linebreaksbr would never escape. So, the main change for somebody not enabling auto-escaping here is that the linenumbers filter will not escape the output any longer.

The fourth filter is "escape". To make forward porting easier and so that template designers do not have to feel restricted in their use of the escape filter, I implemented it so that applying "escape" to a finalized string has no effect. Since "escape" itself makes the result finalized, applying escape multiple times in a chain has the same effect as applying it exactly once.

Previously (var = "&"):
    {{ var|escape|escape }} => &amp;amp;

Now:
    {{ var|escape|escape }} => &amp;

This particular case of chaining "escape" is obviously not common (I would hope).

All of the default filters (template/defaultfilters.py) and the markup filters have been ported in this patch. I have not done anything else under contrib/ or the i18n filters.
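The interplay of preserves_finalized, "noescape" and the new "escape" behaviour can be sketched in a few lines of Python. Again, this is an illustration under assumptions rather than the patch itself: apply_filter() is a hypothetical helper standing in for whatever the template engine does when it runs a filter, the filter bodies are simplified, and html.escape replaces Django's escaping.

```python
from html import escape as html_escape


class FinalizedString(str):
    """Stand-in for the proposed FinalizedString (assumed implementation)."""


def mark_finalized(value):
    return value if isinstance(value, FinalizedString) else FinalizedString(value)


def lower(value):
    return value.lower()
lower.preserves_finalized = True   # lowercasing cannot introduce &, <, >, " or '


def cut(value, arg):
    return value.replace(arg, "")
cut.preserves_finalized = False    # cut:"&" would mangle an entity such as &amp;


def noescape(value):
    # Only marks the value; the characters are left exactly as they were.
    return mark_finalized(value)


def escape(value):
    # No effect on finalized input, so chaining escape equals applying it once.
    if isinstance(value, FinalizedString):
        return value
    return mark_finalized(html_escape(value, quote=True))


def apply_filter(func, value, *args):
    """Hypothetical helper: run a filter, keeping the mark only when guaranteed."""
    result = func(value, *args)
    if isinstance(result, FinalizedString):
        return result              # the filter finalized the result itself
    if isinstance(value, FinalizedString) and getattr(func, "preserves_finalized", False):
        return mark_finalized(result)
    return result                  # unmarked: auto-escaping will escape it later
```

So apply_filter(cut, s, "x") drops the mark from a finalized s, wrapping that in apply_filter(noescape, ...) puts it back, and running "escape" twice over "&" yields a single level of escaping. A missing preserves_finalized attribute is treated as False via getattr, matching the "assumed to be not preserving" rule.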
Points to note
--------------

(1) Because "preserves_finalized" has to be valid for all arguments, the "pluralize" filter is not preserving at the moment. The bizarre {{ var|pluralize:"&" }} is an example of the problem case. I'm thinking of fixing this so that we check for unescaped characters in the argument(s) and then return finalized strings when the input is finalized and the args contain no unescaped characters.

(2) Because of the way "noescape" works, it was not really possible to make {% filter noescape %} work in an auto-escaping block (the contents were escaped long before the filter tag was applied). Fortunately, this particular filter tag construct is equivalent to {% autoescape off %}, so there's no functionality loss. A TemplateSyntaxError is raised if the illegal construct is used.

(3) Filters that take non-string arguments (e.g. "join") or return something other than a string (e.g. "length") have preserves_finalized = False. This is convention more than requirement, but it makes things explicit.

Performance impact
-------------------

Obviously we are doing a little bit more work here, even in the non-auto-escaping paths (just testing what the auto-escape setting is, for example). As far as I can work out, the performance impact is very minor, but I do not have a really good performance test suite for this at the moment.

One simple test: running the tests/othertests/template.py file 100 times takes 21.0436 seconds before these changes and 21.1674 seconds afterwards -- averaged over five runs on my desktop machine. This tests the "no escaping" path. That's a slowdown of 0.6% (and that was almost within the "noise" of the various runs). These tests aren't particularly comprehensive, but they do test the templating code a reasonable amount (although not the filters very much at all).

What are the issues at the moment?
----------------------------------

Now we get to the things I want to sort out before going much further.
(1) Any violent (or even just passionate) objections to using terms like "finalized" and mark_finalized()?

    - Should we use Simon's original proposal of escaped and mark_escaped()? I feel "safe" is a bit more consistent with the behaviour (an opposite-but-similar term to Perl's "tainted").

(2) Is the new behaviour of "escape" reasonable (i.e. it does nothing on finalized strings)?

    - The only drawback of this is that there is no way to give an escaped version of a finalized string in the templates. That is, there is no opposite to the "noescape" filter.

    - If we make "escape" apply to finalized strings as well, then views must be very consistent about variables always having the same "finalizedness" state. Otherwise, the template would have to escape sometimes and not escape other times and it has no way of knowing when. The current implementation lets you whack an escape filter on there and it will always work.

    - Current behaviour also makes forward porting easier (you don't have to run around removing all the escape filters in your code immediately).

(3) Auto-escaping inherits down through template inclusions. That is, if you extend a template that has auto-escaping enabled, you get auto-escaping enabled (obviously the autoescape template tag can control this). Anybody have a strong reason not to do this?

    - Personally, I think this is a no-brainer, but I've been wrong plenty of times before.

(4) Should generic views use SafeContext by default? ... [[ Should probably be removed ]]

(5) Adrian, Jacob: do you guys still want "off by default"?

    - I *really* don't care what the answer is here, but I would rather not have to change things after porting everything under contrib/ .

    - For people thinking it's auto-escaping or nothing, {% autoescape on %} at the beginning of a template (and {% endautoescape %} at the end) is not a huge imposition.

Feedback obviously welcome and appreciated.
********* end **************