This is a heads-up post about some planned changes to the ORM and 
specifically to the expressions API. This affects how the following 
features work inside the ORM:
  - F-expressions (and other ExpressionNode subclasses)
  - aggregates
  - anything using SQLEvaluator (django.db.models.sql.expressions)

While the changes target private APIs, these APIs have remained stable for 
a long time. I expect there to be a significant number of users of the above 
APIs. The main concern is that the planned changes will break existing 
code. I am looking for feedback from 3rd party library developers - do 
the planned changes break existing code for you, and if yes, why? If we 
find some common cases we might be able to add backwards compatibility code 
paths for those cases.

There are two main reasons for doing the changes. First, the change allows 
for a lot of nice new features - doing conditional aggregates, aggregates 
using expressions and writing custom expressions, all this using public 
APIs. The second reason is that the current coding is somewhat complex, and 
that complexity makes it hard to write custom aggregates or expressions.

Currently the expressions and aggregates are built up from two classes. The 
first one is the public facing API (for example Sum in 
django.db.models.aggregates), the second is how the public facing API is 
executed in the ORM (Sum in django.db.models.sql.aggregates). The idea is 
that we have one public facing component users should use for different 
queries. Then different Query implementations can have different 
implementation classes. Thus the same public facing class can be executed 
in different ways depending on the used Query class. Unfortunately this 
leads to cases where it is hard to extend expressions or aggregates - while 
it is easy to add a new public facing API class, it isn't easy to add an 
implementation for that class - the implementation belongs to the used 
Query class, but that class isn't under user control.
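To make the extension problem concrete, here is a toy model of the two-class layout. It is only a sketch and does not reproduce the real Django code: SQLSum, Median, Query.implementations and add_aggregate are names I made up for illustration.

```python
# Toy model of the two-class layout. All names here are illustrative
# assumptions, not the actual Django internals.


class Sum:
    """Public facing class: only records what the user asked for."""
    name = "Sum"

    def __init__(self, lookup):
        self.lookup = lookup


class SQLSum:
    """Implementation class: knows how to render itself as SQL."""

    def __init__(self, column):
        self.column = column

    def as_sql(self):
        return "SUM(%s)" % self.column


class Query:
    # The Query class owns the mapping from public name to implementation,
    # so a third party cannot extend it without patching Query itself.
    implementations = {"Sum": SQLSum}

    def add_aggregate(self, aggregate):
        try:
            impl_class = self.implementations[aggregate.name]
        except KeyError:
            raise NotImplementedError(
                "%s has no implementation for %s"
                % (type(self).__name__, aggregate.name)
            )
        return impl_class(aggregate.lookup)


class Median:
    """A third-party public class: easy to write, but Query cannot run it."""
    name = "Median"

    def __init__(self, lookup):
        self.lookup = lookup
```

Writing Median takes a few lines, but Query().add_aggregate(Median('price')) fails because the implementation lookup lives in a class the third party doesn't control.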

In addition to the extensibility problem the current implementation is 
somewhat complex to follow. The aggregation implementation also doesn't 
share code with expressions, even though aggregates are just a special kind 
of expression.

The new way is simplified - there are just public facing classes. The 
classes know how to implement themselves. The new expressions know how to 
add themselves to the query, and they know how to generate a query against 
different database backends. Different database backends are handled with 
as_vendorname() syntax. Aggregates are a subclass of a certain kind of 
expression (Func class), so aggregates use the same code paths as other 
expressions. The end result is simplified code, the ability to use Sum('foo') 
+ Sum('bar') style aggregations, and the ability to write new expressions 
and aggregates using a public stable API.
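A rough, framework-free sketch of the single-class idea follows. It is not the code from the patch: compile_expression, Col, Combined and Length are hypothetical stand-ins, and "mssql" is an assumed vendor name used only to show the as_vendorname() dispatch.

```python
# Minimal sketch of self-rendering expressions. Every name here is an
# illustrative assumption, not the API of the actual patch.


def compile_expression(node, vendor):
    """Render a node, preferring an as_<vendor>() method when it exists."""
    vendor_impl = getattr(node, "as_" + vendor, None)
    if vendor_impl is not None:
        return vendor_impl(vendor)
    return node.as_sql(vendor)


class Expression:
    def as_sql(self, vendor):
        raise NotImplementedError

    def __add__(self, other):
        # Expressions combine into a new expression instead of evaluating.
        return Combined(self, "+", other)


class Col(Expression):
    def __init__(self, name):
        self.name = name

    def as_sql(self, vendor):
        return self.name


class Combined(Expression):
    def __init__(self, lhs, op, rhs):
        self.lhs, self.op, self.rhs = lhs, op, rhs

    def as_sql(self, vendor):
        return "%s %s %s" % (
            compile_expression(self.lhs, vendor),
            self.op,
            compile_expression(self.rhs, vendor),
        )


class Func(Expression):
    function = None

    def __init__(self, *args):
        # Plain strings are treated as column references.
        self.args = [Col(a) if isinstance(a, str) else a for a in args]

    def as_sql(self, vendor, function=None):
        rendered = ", ".join(compile_expression(a, vendor) for a in self.args)
        return "%s(%s)" % (function or self.function, rendered)


class Sum(Func):
    # An aggregate is just a Func, so it shares the expression code path.
    function = "SUM"


class Length(Func):
    function = "LENGTH"

    def as_mssql(self, vendor):
        # Vendor-specific override: this backend spells the function LEN.
        return self.as_sql(vendor, function="LEN")
```

With these definitions, compile_expression(Sum('foo') + Sum('bar'), 'sqlite') goes through the generic path, while Length('name') compiled for the assumed 'mssql' vendor picks up the override. A third party can add a new class like Length without touching any Query class.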

A patch exists that implements the new way. It is written by Josh Smeaton. 
The patch also implements a way to annotate non-aggregates to queries 
(.annotate(Coalesce(F('foo'), F('bar')))). The patch can be used as a basis 
for other improvements to the ORM, for example the ability to write queries 
like .order_by(Lower('somecol').desc()) has been discussed on this list 
recently.

The only big problem with introducing the new way is backwards 
compatibility. The current coding is implemented the way it is because the 
aim was to allow writing different kinds of backends (NoSQL). The 
NoSQLQuery would just need to contain a different implementation class than 
the normal Query class had, and then you could do whatever you want. I 
claim that it is possible to do the exact same thing with the addition of a 
rewriter to the NoSQLQuery class - it inspects the new-style classes, and 
creates different implementation classes on the fly.
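As a sketch of what such a rewriter could look like (nothing like this exists in the patch; NoSQLQuery.rewrite, the rewrite_<ClassName> convention and the dict output format are all assumptions for illustration):

```python
# Hypothetical rewriter: walks new-style expression objects and builds a
# backend-specific representation. The output format is loosely modelled
# on a document-store aggregation syntax and is only an example.


class Col:
    def __init__(self, name):
        self.name = name


class Sum:
    def __init__(self, source):
        self.source = source


class Add:
    def __init__(self, lhs, rhs):
        self.lhs, self.rhs = lhs, rhs


class NoSQLQuery:
    def rewrite(self, node):
        # Dispatch on the public class name, e.g. Sum -> rewrite_Sum.
        handler = getattr(self, "rewrite_" + type(node).__name__, None)
        if handler is None:
            raise NotImplementedError(
                "no rewrite rule for %s" % type(node).__name__
            )
        return handler(node)

    def rewrite_Col(self, node):
        return "$" + node.name

    def rewrite_Sum(self, node):
        return {"$sum": self.rewrite(node.source)}

    def rewrite_Add(self, node):
        return {"$add": [self.rewrite(node.lhs), self.rewrite(node.rhs)]}
```

The point is that the backend only needs to inspect the public classes; the public classes themselves don't need to know the backend exists.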

The bigger problem seems to be existing 3rd party aggregates and 
expressions - while technically we are changing only private APIs, I don't 
see it as a good idea to break existing code if we can avoid that.

I have also written a bit about this in DEP format, but lost interest in 
finishing the DEP as the DEP process doesn't seem to be doing that well. 
This seems like a good candidate for a DEP, but before I go back to the 
DEP there must be some guarantees that we have a working DEP process to 
handle it. I want to avoid the situation where this feature is stalled 
because of the DEP process. You can see the half-written DEP at 
https://github.com/akaariai/deps/blob/master/drafts/expressions.rst. The 
most interesting part is about the current implementation.

The most important thing now is to find backwards incompatibilities caused 
by the planned change. So, if you depend on the current implementation of 
expressions, aggregates or SQLEvaluator, please check if the new way breaks 
your code. If so, report that in ticket #14030, and let's see if there is 
something we can do to help ease the transition. Of course, other feedback 
is also welcome.

 - Anssi
