Django Template Compilation rev.2
=================================

About Me
~~~~~~~~

I'm a final-year student at the Technical University of Lodz, Poland, in the
faculty of electronic engineering and computer science. In parallel, I'm
working on my second diploma in electronic engineering at Polytech de Nantes
in France.

I've been using Python for 8 years, and after getting totally frustrated with
PHP and its frameworks, I decided to choose something else for my web
development gigs. I've been with Django since version 0.96.

I was always a reader, never a committer; maybe I was too scared to develop
something and then have to convince people that my idea was right. I decided
to change that and take part in Django, the framework that made me like web
development ;)

I know that it's a bit late to prove my qualities, but I believe I can succeed
in this project. I hope the research I've done for this proposal will convince
you ;)

Background
~~~~~~~~~~

It has been a year since Alex Gaynor published the first version of this
proposal. I've tried to track the code that Alex reviewed, and after a deep
analysis of the existing Django code itself, I found that the potential
improvement in template generation is huge, and not only from simply compiling
templates.

Plan
~~~~

Compile Django templates into Python functions and cache the produced code to
speed up template generation. Optionally, compile templates to machine code.

Rationale
~~~~~~~~~

The Django template engine still sits a level above the Python interpreter and
interprets Django templates itself. As has been agreed, this makes template
generation slow. We could optimise this process in two ways: by preprocessing
the template source as much as possible (in our case, by compiling it), and by
reducing the time required to access this preprocessed data (my idea is to
allow the compiled code to be cached).

Method
~~~~~~

Last year's proposal was not rejected, and nothing has been implemented in
this area since then, so I will quote Alex Gaynor's method, which I fully
support and want to implement.

Alex Gaynor wrote:

Templates will be compiled by turning each template into a series of
functions, one per block (note that the base template from which other
templates extend is a single function, not one per block).

This is accomplished by taking the tree of ``Node`` objects which currently
exist for a template and translating it into an alternate representation that
more closely mirrors the structure of Python code, but which still has the
semantics of the template language. For example, the new tree for a loop using
the ``{% for %}`` tag would become a for loop in Python, plus assignments to
set up the ``{{ forloop }}`` variable that the ``{% for %}`` tag provides. The
semantics of Python code is that variables assigned in a for loop exist beyond
the loop itself, including the looping variable.
Django templates, however, pop the top layer from the context stack at the end
of a for loop. This intermediate representation uses the scoping of Django
templates.

After an intermediate representation is created a compiler is invoked which
translates the IR into Python code. This handles the details of Django
template scoping, spilling variables in the event of conflicts, and calling
template tag functions.

An important feature of Django templates is that users can write template tags
which have access to the full context, including the ability to modify the
context. In order to maintain backwards compatibility with existing template
tags, we must create a template context object whenever an uncompilable
template tag is used, and mirror any changes made to the context in the
function's locals. This presents a complication, as we forfeit the speed
benefits of a compiled template (lookups of a Python local are a single index
in a C array) and must perform a dictionary lookup for every variable.

Unfortunately, mirroring a context dictionary back into local variables
requires maintaining a dictionary of arbitrary names to values, which can't be
efficiently implemented with Python's locals (use of ``exec`` causes locals to
degrade to dictionary lookups). Furthermore, constructing a dictionary of the
full context requires additional effort.

To provide an optimal solution we must know which variables a given template
tag needs, and which variables it can mutate. This can be accomplished by
attaching a new class attribute to ``Nodes`` and passing only those values to
the class (instead of the full context dictionary). Subsequently, we would
only need to mirror a few given values into the locals, and since these are
known names, we avoid degrading the local lookups into dictionaries.

Old-style ``Nodes`` will continue to work, but in a less efficient manner.

As we all know, everything needs to stay backward compatible, as was mentioned
in the original proposal.
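
To make the declared-variables idea from the quoted method concrete, here is a
minimal sketch in plain Python. It does not use Django at all: the ``reads``
and ``writes`` class attributes, both node classes, and the ``call_node``
helper are hypothetical names chosen for illustration, not an existing or
proposed API. It only shows how a compiled function could hand a tag the few
names it declares and mirror back the few names it may change, while an
undeclared (old-style) tag falls back to receiving everything.

```python
class UpperNameNode:
    """Hypothetical new-style tag: declares the names it reads and writes."""

    reads = ("name",)
    writes = ("shout",)

    def render(self, context):
        context["shout"] = context["name"].upper()
        return ""


class LegacyNode:
    """Hypothetical old-style tag: no declarations, so it may touch anything."""

    def render(self, context):
        context["seen"] = sorted(context)
        return ""


def call_node(node, scope):
    """Render ``node`` against ``scope`` and return (output, changes).

    ``scope`` stands in for the compiled function's variables. ``changes``
    holds the names that would have to be mirrored back into it.
    """
    reads = getattr(node, "reads", None)
    writes = getattr(node, "writes", None)
    if reads is None or writes is None:
        # Slow path: the tag is undeclared, so build a full copy of the
        # scope and mirror every name back afterwards.
        context = dict(scope)
        output = node.render(context)
        return output, context
    # Fast path: pass only the declared inputs, mirror only declared outputs.
    context = {name: scope[name] for name in reads if name in scope}
    output = node.render(context)
    changes = {name: context[name] for name in writes if name in context}
    return output, changes


if __name__ == "__main__":
    scope = {"name": "django", "unrelated": 1}
    print(call_node(UpperNameNode(), scope))
    print(call_node(LegacyNode(), scope))
```

In the declared case only ``shout`` comes back, so a real compiler could
assign it to a known local name; in the undeclared case the whole dictionary
comes back, which is the less efficient path the quoted text describes.
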

The compiler should be developed as a new custom loader that takes full
advantage of the performance benefits of compilation. For a transition period,
everyone could choose between the old backend and the new backend, the latter
requiring updated code for custom nodes, as was done with the newforms/oldforms
change. We can also imagine a compatibility mode in which encountering an
old-style node triggers the old template rendering path.

I believe that templates should be processed in the following steps:

1. Parsing the string representation.

2. Creating the AST.

   In fact, the first two steps are already performed by the existing template
   engine; a NodeList is a kind of AST.

3. Creating the IR (Intermediate Representation) from the AST.

4. Generating Python code and inline compilation.

   Creating the IR from the AST should allow further optimizations, such as
   eliminating dead code.

5. Optional: Cython or Psyco compilation.

   Since we are generating Python code from scratch, we could generate it with
   the Cython language extensions to declare some variables as simpler `C`
   types. This would allow us to compile it to machine code. Another option
   would be to incorporate Psyco. As we also need to take care of users in
   restricted environments such as `GoogleAppEngine`, this feature would be
   optional.

6. Caching the resulting code.

   Currently the default Django behavior is not to cache a template's
   NodeList; these objects are created each time a view is called. For a
   greater speed improvement we should take into account that templates don't
   change while the server is running. Of course, some kind of API to reload
   or auto-reload the cache should be implemented.

7. Fetching the code from the cache and, given the context variables,
   generating the page.

   There is not much to comment on here, but note that every page reload then
   only requires executing step 7.

Regarding Armin Ronacher's proposal, I believe that for now Django should
continue to develop its own well-known template system.
Building a compilation mechanism dedicated to Django gives the Django
community greater control over the architecture and the way it works. Django
has always been (as long as I remember) a 'batteries included' framework. In
my opinion, building the template engine on top of an external library in an
external repository could start the process of dividing Django into small
independent blocks which, at some point, will stop playing nicely with each
other. Also important here is the matter of tracking and fixing bugs,
providing patches, and taking responsibility for any problems. Building a
pluggable application infrastructure is important, but the template module is
still very much a core component and should be developed inside the community.

Alex Gaynor's example
~~~~~~~~~~~~~~~~~~~~~

The following are some examples of what I'd expect a compiled template to look
like:

.. sourcecode:: html+django

    {% for i in my_list %}
        {% if i|divisibleby:2 == 0 %}
            {{ i }}
        {% endif %}
    {% endfor %}

.. sourcecode:: python

    # Imports assume the Django 1.x module layout current at the time.
    from django.template.defaultfilters import divisibleby
    from django.utils.encoding import force_unicode

    def templ(context, divisibleby=divisibleby):
        my_list = context.get("my_list")
        _loop_len = len(my_list)
        result = []
        for _index, i in enumerate(my_list):
            # Build {{ forloop }} from the loop index, not from the item
            # (the two only coincide for range(n) input).
            forloop = {
                "counter0": _index,
                "counter": _index + 1,
                "revcounter": _loop_len - _index,
                "revcounter0": _loop_len - _index - 1,
                "first": _index == 0,
                "last": _index == _loop_len - 1,
            }
            if divisibleby(i, 2) == 0:
                result.append(force_unicode(i))
        return "".join(result)

For comparison, here is the performance of these two::

    >>> %timeit t.render(Context({"my_list": range(1000)}))
    10 loops, best of 3: 38.2 ms per loop
    >>> %timeit templ(Context({"my_list": range(1000)}))
    100 loops, best of 3: 3.63 ms per loop

That's roughly a 10-fold improvement!

Timeline
~~~~~~~~

* 1 week -- develop a benchmark suite of templates and tests for comparing
  compatibility.
* 3 weeks -- develop the frontend portion of this: the code which translates
  Django's included template tags into the IR.

  * 1 week -- develop the internal IR generation API.
  * 2 weeks -- hook up all of Django's template tags to actually use it.

* 4 weeks -- develop the backend code generator. This takes the IR and
  translates it into Python, including handling the semantic changes.

  * 2 weeks -- basic code generation support. This does nothing but generate
    code that looks exactly like what is already executed, which means
    variable lookups are still lookups in a ``Context`` dictionary.
  * 2 weeks -- optimize known names into local variables at the Python level.

* 2 weeks -- time set aside for dealing with bugs, corner cases, and anything
  else.
* 1 week -- explore the possibility of additional optimizations: eliminating
  unused values (for example, removing unused ``{{ forloop }}`` variables),
  and allowing an external app to provide "type" data to IR nodes so that
  variable lookups could be resolved as indexing vs. attribute lookup at
  compile time.
* 1 week -- explore the speed gain from machine code compilation
  (Psyco/Cython) and from different backends for storing and caching compiled
  code objects.

Goals
~~~~~

As with any good project, we need some criteria by which to measure success:

* Successfully compile complete (real world) templates.
* Speed up templates. For reference purposes, Jinja2 is about 10-20x faster
  than Django templates. My goal is to come within a factor of 2-3 of this.
* Complete backwards compatibility.
* Develop a complete porting guide for old-style template tags to minimize any
  pain in the transition.

As I wrote in the title, this is revision 2 of Alex Gaynor's proposal. It is
not a completely new idea; after analyzing everything, I just wanted to add a
few things of my own. As I read on the django-developers group yesterday, this
project is quite popular, which worries me somewhat, because I'm determined to
work on it this summer. So now, how should I convince you to hand me this job
;)?

Please post any questions or comments; I'll be glad to reply.

You can contact me by the usual means:

* email: jan.rzepecki (at) gmail (dot) com
* jabber/gtalk: same as above
* irc: I've just started idling every day on #django and #django-dev under
  the nick `xtrqt` (it is also my nick on the Django tracker).

PS I'm sorry for starting a third thread on this issue, but I thought that one
thread per proposal is fair to everyone applying.