More inline stuff. On Mar 18, 2013, at 4:14 AM, Taras D wrote:
> Hi Peter > > On 18 March 2013 16:35, Peter of the Norse <[email protected]> wrote: > You’re over thinking this by a large margin. getTendersForContractor should > return a QuerySet. That way you can add things like .select_related as > necessary in the view. > > Also, I’m a bit worried about the fact you have a function called > getTendersForContractor. It really should be Tender.objects.forContractor. Or > if the function is just a single filter call, it can be eliminated entirely. > > To confirm, this would require a new Manager that implements the > 'forContractor' method? Yes. >> * Prefetch_related selects related obects via the criteria: 'in (list of >> ids'), which arguably can be slower > > What database are you using where this is true? > > It was the result of a discussion with a colleague. Also slides 67 & 68 at > http://www.slideshare.net/OReillyOSCON/unbreaking-your-django-application > imply that large IN's may cause performance problems - > "68: Large INs are a mess. Very expensive for the database to parse. Very > expensive for the database to execute. If there are potentially more than > 10-15 items in the list, rework the IN as a JOIN against whatever the source > of the keys is." That sounds like a good candidate for a bug ticket. It’s not trivial though. The limit of IDs in the query set would depend on the database, the complexity of the query, and the data being pulled. I would have to think about this for a long time before I felt comfortable making any kind of suggestion. >> * Seems brittle - if you change the data model, you have to change all of >> the queries in the data retrieval layer. Denomralising the data would reduce >> this, but we'd still have the problem of retrieving more data than the >> client may possibly want > Yes, if you change code in one place, you might have to change it someplace > else to match it. Most of us just use the ORM as the data retrieval layer. > Again, I’m worried you’re over thinking this. > > I hadn't really considered passing around a QuerySet in place of a list of > Tenders. One thing I'm finding is that the cost in retrieving related objects > is making me rethink the API presented to the client. For example, consider > the following API, which I think is quite reasonable: > > class Tender: > def getTotalCost(self): > return self.project.getTotalCost() > > class Project: > > def getTotalCost(self): > cost = 0 > for contract in self.contract_set.all(): > cost += constract.getTotalCost() > return cost > > similarly implementations for Project & Contract > > class Contract: > def getTotalCost(self): > cost = 0 > for job in self.job_set.all(): > cost += job.getTotalCost() > return cost https://docs.djangoproject.com/en/dev/topics/db/aggregation/ > Having a Project know how much its total cost is, and presenting this to a > client, is in line with OO principles of abstraction and encapsulation. > Consider a page that outputs the total cost of each contract tendered for a > particular contractor. If we re-use the 'getTendersForContractor' function > (that returns a QuerySet), and then call 'getTotalCost' on each Project that > has a tender, then we still have a problem in that we'll get a multitude of > calls to the database. > > We might put the onus on the client to call: > qs = getTendersForContractor(contractor) > for tender in list(qs.prefetch_related('project', 'project__contract', > 'project__contract')): > print tender, tender.getTotalCost() If this is really what you are doing, then I strongly recommend https://docs.djangoproject.com/en/dev/topics/db/aggregation/. This case is a single SQL statement. > Not only is this complicated, but this requires the client know how the > Tender is implemented (the prefetched would become uneccessary if we changed > the implementation to store the total cost on the actual Tender itself at the > cost of denormalised data). Note that you wouldn't have these problems if the > data existed in memory. > > These points are making me rethink my data model and API design, and that the > OO paradigm doesn't always fit well with web apps. In particular the slides > at http://www.slideshare.net/OReillyOSCON/unbreaking-your-django-application: > > 23: "Django doesn’t have a good primitive for returning an object graph in > one call. - multiple db trips required" Implies that object graphs like what > I've designed is difficult to implement effeciently..? > > 24: "So, what do to? If you almost always return the parent along with the > related object, consider object inheritance instead. That does do a nice > join operation. Don’t create an insane class graph, though. If you > frequently use the related object without reference to the parent object, use > a foreign key." Not sure what 'object inheritance' is but might be useful, > and concerned that I might be heading towards an 'insane class graph' (the > real data model has 6 or 7 more related objects). Am I heading down the wrong > path of embracing an OO style API and data model? > > Taras I don’t know. But I think you have to acknowledge that there is such a thing as too OOP. I’m not sure that creating an API on top of Django is a good idea. Django is already super high level. I would have to ask myself how I expect my users to use what I give them. On some level, you’ll have to trust your user to ask for the data they need. Peter of the Norse [email protected] -- You received this message because you are subscribed to the Google Groups "Django users" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To post to this group, send email to [email protected]. Visit this group at http://groups.google.com/group/django-users?hl=en. For more options, visit https://groups.google.com/groups/opt_out.

