#23557: annotate gives different results on postgresql and mysql -------------------------------------+------------------------------------- Reporter: brian | Owner: nobody Type: Bug | Status: new Component: Database layer | Version: 1.7 (models, ORM) | Resolution: Severity: Normal | Triage Stage: Accepted Keywords: | Needs documentation: 0 Has patch: 0 | Patch needs improvement: 0 Needs tests: 0 | UI/UX: 0 Easy pickings: 0 | -------------------------------------+-------------------------------------
Comment (by akaariai): I am with brian here - I think Django's way to calculate the GROUP BY is complex enough without the GROUP BY clause being altered by .order_by() calls. In my opinion Django's group by logic should be as follows: - Group by the primary key of the table in the query by default. The group by will need to be appended by functionally dependent columns as many databases need this. This means that we can add any column from the same table to the group by, or any column from any table pointed by direct foreign key or reverse one to one field. - Group by the user's defined .values() call. Possibly allow extending it by functionally dependent columns. That is, if some table's primary key is in the values() list, then allow extension from that primary key along foreign keys and o2ofields. - If any other column is needed in the group by, then error out. This means that: `CPUJob.objects.values('project').annotate(usage=Sum('cpu_usage'), jobs=Count('id')).order_by('date')` is an error. The functionally dependent columns appending means that we need to do a group by author.id, author.name on some databases even though technically group by author.id gives the same result. Similarly, we need to group by author.id, author.name, book.id, book.name when .select_related('favorite_book') is applied, where favorite_book is a foreign key from author to book. The main point here is that we really shouldn't alter the results because .order_by() was added to the query. Just error out in that case. Resist the temptation to guess, explicit is better than implicit and so on... I think we can deprecate the current behavior of silently altering the group by. Checking if a column is functionally dependent on some subset of columns in the group by isn't exactly easy, but should be doable at least for primary key case. Multi-column unique indexes are going to be a bit harder to support... -- Ticket URL: <https://code.djangoproject.com/ticket/23557#comment:7> Django <https://code.djangoproject.com/> The Web framework for perfectionists with deadlines. -- You received this message because you are subscribed to the Google Groups "Django updates" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-updates+unsubscr...@googlegroups.com. To post to this group, send email to django-updates@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-updates/063.f4db9e06a4283793285a5e869bb09e75%40djangoproject.com. For more options, visit https://groups.google.com/d/optout.