Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Saturday, July 5, 2014 1:47:30 AM UTC+3, Jon Dufresne wrote: > Sorry, but I felt like I was reporting information interesting to > developers. > > 1.) Wrong JOIN type (OUTER vs INNER) producing inefficient queries > 2.) FieldError in 1.7 where there wasn't one in 1.6 > > If these are of no interest, I will not continue the discussion. > This is the right list, especially concerning 2). Even if 2) turns out to be a false alarm, I much rather get a couple of false alarms than a regression in final release of 1.7. - Anssi -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/37ea28e0-20da-4bb3-bf31-210472d35fd9%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Saturday, July 5, 2014 12:42:39 AM UTC+3, Jon Dufresne wrote: > > Suppose I have the following models: > > --- > class Container(models.Model): > pass > > class Item(models.Model): > container = models.ForeignKey(Container) > previous = models.ForeignKey('self', null=True) > current = models.BooleanField() > flag = models.BooleanField() > --- > > Item represents a chain of items, all grouped by a container (like a > linked list). The field "current" represents the most recent item > (front of the list). The field "flag" is simply something to query on. > > Suppose I perform the following query: > > --- > Item.objects.filter(current=True, > container__item__previous__isnull=True, container__item__flag=True) > --- > > That is, I'm looking for all current items such that the first item in > the chain has flag = true. Django 1.6 produces the following SQL for > this query: > > --- > SELECT ... > FROM "myapp_item" > INNER JOIN "myapp_container" > ON ( "myapp_item"."container_id" = "myapp_container"."id" ) > LEFT OUTER JOIN "myapp_item" T3 > ON ( "myapp_container"."id" = T3."container_id" ) > WHERE ("myapp_item"."current" = True AND T3."previous_id" IS NULL AND > T3."flag" = True ) > --- > > The OUTER JOIN is the problem. Why is this not a more efficient INNER > JOIN? This query is very inefficient as the the database gets larger. > This causes slow downs on pages. > This have been fixed in 1.7. Pre-1.7 there was no join "demotion". That is, when the ORM generated a LEFT JOIN expression it was never changed back to INNER JOIN. Here the container__item__previous__isnull=True generates LEFT JOIN for the item join. In 1.7 there exists some new code to do join demotion. In this case it works because container__item__flag=True can produce results only when the item join produces results => the query never produces results with LEFT JOIN of item => the existing LEFT JOIN for alias T3 can be "demoted" to INNER JOIN. Interestingly, if I try this on 1.7, this query is apparently not even > supported. I get the following error: > I wasn't able to reproduce this - are you sure the query is exactly the same both on 1.6 and 1.7? Are models are set up correctly before generating the query? If after double-checking you still get the error, could you post a sample project somewhere? - Anssi -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/d9253e4a-df9c-4e75-bc5a-3f883a62db0f%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
> ... and this is more appropriate on django-users Sorry, but I felt like I was reporting information interesting to developers. 1.) Wrong JOIN type (OUTER vs INNER) producing inefficient queries 2.) FieldError in 1.7 where there wasn't one in 1.6 If these are of no interest, I will not continue the discussion. Cheers, Jon -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CADhq2b5-QZffBC1eVbfKZuKfjJdSUBXP4e9Ad5HGK8-U%2B4o8AA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Fri, Jul 4, 2014 at 5:23 PM, Stephen J. Butler wrote: > * Build a list of relevant containers first: > Container.objects.filter(item__previous__isnull=True, > item__flag=True).values_list('pk', flat=True). Then > Items.objects.filter(current=True, container__in=flagged_containers). Not a > problem as long as your query doesn't grow too large for your DB (that is, > too many containers that might be considered) i guess Items.objects.filter(current=True, container__in=Container.objects.filter(item__previous__isnull=True, item__flag=True)) would be better, it gives the the opportunity to not get the container list in Python. ... and this is more appropriate on django-users -- Javier -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAFkDaoROfS14GrKPb-7XbSTGxyb8g1hNBvZSFDQhSzwTEq7DOg%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Fri, Jul 4, 2014 at 4:42 PM, Jon Dufresne wrote: > Suppose I have the following models: > > --- > class Container(models.Model): > pass > > class Item(models.Model): > container = models.ForeignKey(Container) > previous = models.ForeignKey('self', null=True) > current = models.BooleanField() > flag = models.BooleanField() > --- > [snip] > A slow query is better than no query at all. Any particular reason > this was removed? How would one model and query what I'm trying to > achieve moving forward to avoid inefficient queries and exceptions? > Two ways come to mind: * If it's only "flag" on the first item in a sequence then move "flag" to be on present on Container also. A "first_flag" field. Could manage in a signal or override Item.save(). That would let you not nest the query, and could be more efficient. * Build a list of relevant containers first: Container.objects.filter(item__previous__isnull=True, item__flag=True).values_list('pk', flat=True). Then Items.objects.filter(current=True, container__in=flagged_containers). Not a problem as long as your query doesn't grow too large for your DB (that is, too many containers that might be considered) -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAD4ANxVC4SJjWM9_bw4uN3--HgcVDSj8s%3DAd%3D1vimTcxJ0Vh5A%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Fri, Jul 4, 2014 at 2:54 PM, Javier Guerra Giraldez wrote: > what purpose does the "container__item__previous__isnull=True" > argument serve here? To filter on the initial item in the list. I'm looking for flag=True *only* on the first item in the list. flag=True could be set on other items in the same chain, but I don't care about those. The goal of the query is: I'm looking for all *current* items such that the *first item* in the chain has *flag = true*. In raw SQL I want the query to be: SELECT myapp_item.* FROM "myapp_item" INNER JOIN "myapp_container" ON ( "myapp_item"."container_id" = "myapp_container"."id" ) INNER JOIN "myapp_item" T3 ON ( "myapp_container"."id" = T3."container_id" ) WHERE ("myapp_item"."current" = True AND T3."previous_id" IS NULL AND T3."flag" = True ) So the WHERE condition flag=True, only applies to the initial item. >i think it means "an item that belongs to a container that has an item with no >'previous'" Correct, but not just any container, but the same container as the item being filtered. > which if it's a linked list, then any non-empty container would comply Right, except then the "container__item__flag=True" should influence it further. > and since you start the query from the item, then the container is non-empty > by definition. Yes, correct. > btw, a linked list in a database? can you elaborate on that idea? It is just one way to think about it. It is not really a linked list. I simply meant the items are linked together by the "previous" point. In my actual application item represent a step with history (previous being that history). Container represents the process of the step (has actual fields). So each step has a previous step in the process. Sometimes I want to query for the current step in the process based on the initial step. Cheers, Jon -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CADhq2b7Uy6nY0Vo0__%2B3Ka1%2B_Hv7_smnJrDb4rtL%2B6sF%2B3QE5A%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Re: Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
On Fri, Jul 4, 2014 at 4:42 PM, Jon Dufresne wrote: > Item.objects.filter(current=True, > container__item__previous__isnull=True, container__item__flag=True) > --- > > That is, I'm looking for all current items such that the first item in > the chain has flag = true. Django 1.6 produces the following SQL for > this query: what purpose does the "container__item__previous__isnull=True" argument serve here? i think it means "an item that belongs to a container that has an item with no 'previous'", which if it's a linked list, then any non-empty container would comply, and since you start the query from the item, then the container is non-empty by definition. btw, a linked list in a database? can you elaborate on that idea? -- Javier -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAFkDaoR6GwBq2r68V7HbHhCnjednpL7GBGA1rpOOeru15Wr_%3DQ%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
Querying across FK produces too many OUTER JOINs (1.6) or FieldError (1.7)
Suppose I have the following models: --- class Container(models.Model): pass class Item(models.Model): container = models.ForeignKey(Container) previous = models.ForeignKey('self', null=True) current = models.BooleanField() flag = models.BooleanField() --- Item represents a chain of items, all grouped by a container (like a linked list). The field "current" represents the most recent item (front of the list). The field "flag" is simply something to query on. Suppose I perform the following query: --- Item.objects.filter(current=True, container__item__previous__isnull=True, container__item__flag=True) --- That is, I'm looking for all current items such that the first item in the chain has flag = true. Django 1.6 produces the following SQL for this query: --- SELECT ... FROM "myapp_item" INNER JOIN "myapp_container" ON ( "myapp_item"."container_id" = "myapp_container"."id" ) LEFT OUTER JOIN "myapp_item" T3 ON ( "myapp_container"."id" = T3."container_id" ) WHERE ("myapp_item"."current" = True AND T3."previous_id" IS NULL AND T3."flag" = True ) --- The OUTER JOIN is the problem. Why is this not a more efficient INNER JOIN? This query is very inefficient as the the database gets larger. This causes slow downs on pages. Interestingly, if I try this on 1.7, this query is apparently not even supported. I get the following error: --- Traceback (most recent call last): File "", line 1, in File "/home/jon/djtest/djtest/test.py", line 3, in qs = Item.objects.filter(current=True, container__item__previous__isnull=True, container__item__flag=True) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/manager.py", line 92, in manager_method return getattr(self.get_queryset(), name)(*args, **kwargs) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/query.py", line 689, in filter return self._filter_or_exclude(False, *args, **kwargs) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/query.py", line 707, in _filter_or_exclude clone.query.add_q(Q(*args, **kwargs)) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1287, in add_q clause, require_inner = self._add_q(where_part, self.used_aliases) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1314, in _add_q current_negated=current_negated, connector=connector) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/sql/query.py", line 1181, in build_filter lookups, value) File "/home/jon/djtest/venv/lib/python2.7/site-packages/django/db/models/fields/related.py", line 1506, in get_lookup_constraint raise exceptions.FieldError('Relation fields do not support nested lookups') FieldError: Relation fields do not support nested lookups --- A slow query is better than no query at all. Any particular reason this was removed? How would one model and query what I'm trying to achieve moving forward to avoid inefficient queries and exceptions? Should I file this as a bug? Cheers, Jon -- You received this message because you are subscribed to the Google Groups "Django developers" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To post to this group, send email to django-developers@googlegroups.com. Visit this group at http://groups.google.com/group/django-developers. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CADhq2b7%3DcsYSOU2gSCL%3DwTD8Mztc9On%2ByRLWmUef%2B-y-fWryuA%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.