On Wed, Apr 7, 2010 at 4:43 AM, Waldemar Kornewald <wkornew...@gmail.com> wrote:
> Hey Alex,
>
> On Apr 7, 2:11 am, Alex Gaynor <alex.gay...@gmail.com> wrote:
>> Non-relational database support for the Django ORM
>> ==================================================
>>
>> Note: I am withdrawing my proposal on template compilation. Another
>> student has expressed some interest in working on it, and in any event
>> I am now more interested in working on this project.
>
> It's great that you want to work on this project. Since I want to see
> this feature in Django, I'm offering mentoring help with the NoSQL
> part. You know Django's ORM better than me, so I probably can't really
> help you there, but I can help to make sure that your modifications
> will work well on NoSQL DBs. Just in case this is necessary, I'll
> apply as a GSoC mentor before it's too late (if I remember correctly,
> in 2007 we could still allow new mentors even at this late stage)?
>
>> Method
>> ~~~~~~
>>
>> The ORM architecture currently has a ``QuerySet`` which is backend
>> agnostic, a ``Query`` which is SQL specific, and a ``SQLCompiler``
>> which is backend specific (i.e. Oracle vs. MySQL vs. generic). The
>> plan is to change ``Query`` to be backend agnostic by delaying the
>> creation of structures that are SQL specific, specifically join/alias
>> data. Instead of structures like ``self.where``,
>> ``self.join_aliases``, or ``self.select`` all working in terms of
>> joins and table aliases the composition of a query would be stored in
>> terms of a tree containing the "raw" filters, as passed to the filter
>> calls, with things like ``Field.get_prep_value`` called appropriately.
>> The ``SQLCompiler`` will be responsible for computing the joins for
>> all of these data-structures.
>
> Could you please elaborate on the data structures? In the end,
> non-relational backends shouldn't have to reproduce large parts of the
> SQLQuery code just to emulate a JOIN.
> When we tried to do a similar
> refactoring we quickly faced the problem that we needed something
> similar to setup_joins() and other SQLQuery features. We'd also have
> to create code for grouping filters into individual queries on tables.
> The Query class should take care of as much of the common stuff as
> possible, so nonrel backends can potentially emulate every single SQL
> feature (e.g., via MapReduce or whatever) with the least effort.
> Otherwise this refactoring would actually have more disadvantages than
> our current SQLCompiler-based approach in Django-nonrel (as ridiculous
> as that sounds).
>
> However, it's important that all of the emulated features are handled
> not by the backend, but by a reusable code layer which sits on top of
> the nonrel backends. It would be wasteful to let every backend
> developer write his own JOIN emulation and denormalization and
> aggregate code, etc. The refactored ORM should at least still allow
> for writing some kind of "proxy" backend that sits on top of the
> actual nonrel backend and takes care of SQL features emulation. I'm
> not sure if it's a good idea to integrate the emulation into Django
> itself because then progress will be slowed down.
>
> Ideally, we should provide a simplified API for nonrel backends,
> similar to the one that we recently published for Django-nonrel, so a
> backend could be written in two days instead of two weeks. We can port
> our work over to the refactored ORM, so you don't have to deal
> with this (except if it should be officially integrated into Django).

No. I am vehemently opposed to attempting to extensively emulate the
features of a relational database in a non-relational one. People talk
about the "object relational" impedance mismatch, much less the
"object-relational non-relational" one. I have no interest in
attempting to support any attempts at emulating features that just
don't exist on the databases they're being emulated on.

> In addition to these changes you'll also need to take care of a few
> other things:
>
> Many NoSQL DBs provide a simple "upsert"-like behavior where on save()
> they either create a new entity if none exists with that primary key
> or update the existing entity if one exists. However, on save() Django
> first checks if an entity exists. This would be inefficient and
> unnecessary, so the backend should be able to turn that behavior off.
>
> On delete() Django also deletes related objects. This can be a costly
> operation, especially if you have a large number of entities. Also,
> the queries that collect the related entities can conflict with
> transaction support at least on App Engine and it might also be very
> inefficient on HBase. IOW, it's not sufficient to let the user handle
> the deletion for large datasets. So, non-relational (and maybe also
> relational) DBs should be able to defer and split up the deletion
> process into background tasks - which would also simplify the
> developer's job because he doesn't have to take care of manually
> writing background tasks for large datasets, so it's a good addition
> in general.

There is separate work on another ticket to provide a way to declare
ON_DELETE behavior. Though this is a bit of a relational concept, it
seems to me that making these easy to customize provides a good way
for different backends to specify their behavior here.

> I'm not sure how to handle multi-table inheritance. It could be done
> with JOIN emulation, but this would be very inefficient.
> Denormalization is IMHO not the answer to this problem, either.
> Should Django simply fail to execute such a query on those backends or
> should the user make sure that he doesn't use multi-table inheritance
> unnecessarily in his code?

There's nothing about MTI that's inherently hard on a non-relational
database, besides not being able to "select_related" the parent.

> Bye,
> Waldemar Kornewald
>
> --
> You received this message because you are subscribed to the Google Groups
> "Django developers" group.
> To post to this group, send email to django-develop...@googlegroups.com.
> To unsubscribe from this group, send email to
> django-developers+unsubscr...@googlegroups.com.
> For more options, visit this group at
> http://groups.google.com/group/django-developers?hl=en.

Alex

--
"I disapprove of what you say, but I will defend to the death your
right to say it." -- Voltaire
"The people's good is the highest law." -- Cicero
"Code can always be simpler than you think, but never as simple as you
want" -- Me
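
For illustration only, here is a rough sketch of the idea in the Method
section quoted at the top of this message: the query stores the "raw"
filters as passed to filter(), and each compiler decides what to do with
them. The names ``FilterNode``, ``RawQuery``, and ``DictCompiler`` are
hypothetical and are not part of Django. The sketch also simplifies the
proposed tree to a flat, AND-ed list, and it assumes a backend that
simply refuses lookups it cannot run natively instead of emulating them.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FilterNode:
    """One raw filter, stored exactly as it was passed to filter()."""
    lookup: str  # e.g. "age__gte"
    value: Any


@dataclass
class RawQuery:
    """Hypothetical backend-agnostic query: raw filters only, no joins or aliases."""
    filters: list = field(default_factory=list)

    def filter(self, **kwargs):
        # Return a copy so that chained calls never mutate an earlier query.
        new = RawQuery(list(self.filters))
        new.filters.extend(FilterNode(k, v) for k, v in kwargs.items())
        return new


class DictCompiler:
    """Hypothetical non-relational compiler that evaluates filters on dict rows."""

    OPS = {
        "exact": lambda a, b: a == b,
        "gte": lambda a, b: a >= b,
        "lt": lambda a, b: a < b,
    }

    def execute(self, query, rows):
        def matches(row):
            for node in query.filters:
                # Split "age__gte" into field name and operator; a bare
                # field name means an exact match.
                name, _, op = node.lookup.partition("__")
                op = op or "exact"
                if op not in self.OPS:
                    # Anything else (including relation-spanning lookups such
                    # as "author__name") is rejected rather than emulated.
                    raise NotImplementedError(f"unsupported lookup: {op}")
                if not self.OPS[op](row[name], node.value):
                    return False
            return True

        return [row for row in rows if matches(row)]
```

A SQL compiler would take the same ``RawQuery`` and compute joins and
table aliases from it at compile time; this one never needs to know
that joins exist.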