Re: ORM refactoring

Luke Plant Mon, 10 Oct 2011 06:13:48 -0700

Hi Anssi,

> I would like to work on refactoring the ORM. This would be slow work,
> one piece at a time, not a total rewrite. I think there is much
> potential in the ORM. It needs to be cleaned first, and then we can
> get hacking new and awesome features. I know there are other people
> interested in this, too.
> 
> For example I have been thinking for some time about something like
> F() objects, but which could contain arbitrary SQL and references to
> fields in the query. relabel_aliases and cloning would be handled for
> you. These could be used in .order_by(), values_list, annotate() and
> so on. SQLSnippet could be its name. This would allow us to
> deprecate .extra, and would allow for some really fancy extensions.
> 
> Using these you could implement an external SQL snippets library, the
> PostgreSQL window functions in one library for example. Or custom
> aggregate ModelAggregate (fetch related models using array_agg,
> something like prefetch_related, but runs in single SQL query). I
> really do believe this is possible.
> 
> But we need to get the work started on refactoring the ORM first. I do
> currently have some extra time, and I am willing to work on the ORM.
> But we need some people reviewing the patches and most of all we need
> core developers who have enough time to review and commit the ready
> patches. Unfortunately the ORM-knowing core developers seem to be
> really busy.

One of the problems is that it can be very hard to review refactorings.
For example, I recently checked in rev 16929 [1] from ivan_virabyan's
patch, and reviewing it was hard, despite the fact that it was a very
good quality patch. The difficulty is that there appear to be a large
number of changes, when in reality the code has just moved around. But
confirming that it has indeed stayed the same is very time-consuming if
you do it properly - in fact it seems almost faster to do the same
changes yourself, and verify you get the same result.

This is only thinking out loud, but I'm wondering if that approach might
not be too bad. If instead of a large patch, we have a series of smaller
steps (in a github branch or whatever), perhaps that would be easier to
review? I'm not saying rework any already created patches to match this,
BTW.

We do have pretty good test coverage, which should be a good help, but
one of the problems here is the mutability thing - it is very difficult
to write a test that shows that a new QuerySet is not sharing
data structures with the old in a way that is incorrect and could lead
to the previous QuerySet being modified by the new one (or subsequent ones).
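
To illustrate the kind of sharing meant here, a minimal sketch using toy
classes (ToyQuery and its clone methods are hypothetical stand-ins, not
Django's actual Query code). A shallow clone leaves the `where` list
shared, so filtering the new object silently changes the old one, while
the new object on its own still looks perfectly correct:

```python
import copy


class ToyQuery:
    """Hypothetical stand-in for a query object; not Django's Query class."""

    def __init__(self):
        self.where = []  # mutable state that clones must not share

    def shallow_clone(self):
        # Buggy: copy.copy() duplicates the object but reuses the same list.
        return copy.copy(self)

    def deep_clone(self):
        # Safe: the clone gets its own independent list.
        return copy.deepcopy(self)

    def add_filter(self, condition, clone):
        # `clone` is the cloning strategy to use, passed in for comparison.
        new = clone(self)
        new.where.append(condition)
        return new


base = ToyQuery()
leaky = base.add_filter("a = 1", ToyQuery.shallow_clone)
# The new query looks right on its own, so a test of `leaky` alone passes...
leaky_where = list(leaky.where)
# ...but the original has been modified as a side effect.
base_after_leak = list(base.where)

base2 = ToyQuery()
safe = base2.add_filter("a = 1", ToyQuery.deep_clone)
base2_after = list(base2.where)
```

A test only catches this if it goes back and re-checks the earlier object
after every operation on the later one, which is why ordinary coverage
tends to miss it.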

> I really do not know how to move forward. I guess I am asking all you
> what do you think should be done with the ORM. We could take a
> completely opposite direction and make it a wrapper around SQLAlchemy
> (I saw this mentioned somewhere some time ago). Or decide that the ORM
> is just fine now, or that there are higher importance items. Before
> continuing my hacking, it would be nice to know if there is support
> for ORM refactoring.

It was me who suggested the idea of basing the ORM on SQLAlchemy in
future, on my blog [2]. It would be a very large change, especially
things related to the unit-of-work stuff, and I have no idea whether it
is even feasible really.

My concern is that the ORM currently leaves you high and dry when you
come to the end of the queries it can generate. Some people say the
answer is raw SQL at that point, but if you have built up your query
dynamically in any way, or perhaps in a completely Model agnostic way
(as I was doing in django-easyfilters), you are completely stuck,
because you are left with a dynamically created query that you really
cannot usefully manipulate in any way, but which you must manipulate in
order to do the queries you need. I think the best you could do is
generate the SQL from the query as a string (which is actually hard to
do currently due to the way that SQL generation interacts with the
different backends), parse it again with python-sqlparse, manipulate and
pass to QuerySet.raw(). And then somehow cope with the backend specific
stuff. Ugh.

SQLA, I believe, would be very different, because 'core' provides a
complete wrapper around SQL, and the ORM allows you to drop down to core
to do advanced things. And it has a much nicer way of dealing with the
different SQL dialects.
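
As a rough sketch of that layering idea (toy classes only; Select,
ToyQuerySet and the "mssql" dialect switch are made up for illustration
and are not SQLAlchemy's or Django's API): the lower layer holds the
statement as data, the higher layer builds on it, and a caller can drop
down and keep manipulating the structure instead of a SQL string.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Select:
    """Hypothetical low-level layer: a SQL statement as data, not a string."""
    table: str
    columns: Tuple[str, ...] = ("*",)
    where: Tuple[str, ...] = ()
    limit: int = 0  # 0 means no limit

    def compile(self, dialect="generic"):
        # Dialect differences are handled in one place, at compile time.
        cols = ", ".join(self.columns)
        top = f"TOP {self.limit} " if self.limit and dialect == "mssql" else ""
        sql = f"SELECT {top}{cols} FROM {self.table}"
        if self.where:
            sql += " WHERE " + " AND ".join(self.where)
        if self.limit and dialect != "mssql":
            sql += f" LIMIT {self.limit}"
        return sql


class ToyQuerySet:
    """Hypothetical high-level layer built on top of Select."""

    def __init__(self, table, select=None):
        self.select = select or Select(table)

    def filter(self, condition):
        # Return a new object; the frozen Select is never mutated.
        new_where = self.select.where + (condition,)
        return ToyQuerySet(self.select.table,
                           replace(self.select, where=new_where))


# Build a query dynamically at the high level...
qs = ToyQuerySet("book").filter("price > 10")
# ...then drop down one layer and keep manipulating it as a structure.
stmt = replace(qs.select, columns=("title", "price"), limit=5)
generic_sql = stmt.compile()
mssql_sql = stmt.compile("mssql")
```

Here generic_sql is "SELECT title, price FROM book WHERE price > 10
LIMIT 5" and mssql_sql is "SELECT TOP 5 title, price FROM book WHERE
price > 10"; no string parsing is needed at any point.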

But as I said, I don't know if any kind of rewrite to use SQLA is
feasible, and I don't know whether anyone else in the community or in
the core team has similar leanings - I don't know if any of the core
devs happened to read that blog post.

I do think that these refactorings you are doing are valuable. However,
if you work on this, I think you need to be prepared for the possibility
that the work may be a dead end. It could be an experiment that shows
that a different approach will be needed. If it does that, it will still
have been useful work, but you will need a certain kind of attitude to
still consider it useful work if we end up switching to SQLA :-)

I'm also unconvinced of the SQLSnippet approach at the moment. I think
it could end up becoming like '.extra()' - an extension point that
becomes troublesome because it drops you through the layers of
abstraction too rapidly. I suspect we need a more convincing and
comprehensive lower level before we can add low level extensions. And it
is that thinking that makes me want to move to SQLA with its core/ORM
distinction.

Regards,

Luke

[1] https://code.djangoproject.com/changeset/16929
[2] http://lukeplant.me.uk/blog/posts/announcing-django-easyfilters-with-some-heresy-on-the-side/

--
"My capacity for happiness you could fit into a matchbox without
taking out the matches first." (Marvin the paranoid android)

Luke Plant || http://lukeplant.me.uk/
