On Mon, 2009-05-11 at 16:27 +0200, Michael Radziej wrote:
> On Mon, May 11, Gábor Farkas wrote:
> 
> > 
> > but that's not the case here. the nullable field i have here is a 
> > DateTimeField.
> > 
> > we are talking about this sql:
> > 
> > SELECT "x_group"."id", "x_group"."name"
> > FROM "x_group" LEFT OUTER JOIN "x_person"
> > ON ("x_group"."id" = "x_person"."group_id")
> > WHERE "x_person"."stamp" IS NULL
> > 
> > as you see, it's joining on a normal foreign-key,
> > and then does a "WHERE" on the mentioned nullable field.
> > 
> > compare it to the sql produced for the "person__stamp='2008-12-12'" case:
> > 
> > SELECT "x_group"."id", "x_group"."name"
> > FROM "x_group" INNER JOIN "x_person"
> > ON ("x_group"."id" = "x_person"."group_id")
> > WHERE "x_person"."stamp" = E\'2008-12-12 00:00:00\'
> > 
> > why is an OUTER JOIN used in the first case,
> > and an INNER JOIN in the second?
> > 
> > i'm not saying that django should never use an outer join.
> > i'm only saying that in this case, there is no reason to switch
> > to a different join-type, just because i'm doing an IS_NULL
> > test on a completely unrelated field.

There's a very good reason to switch. It was explained in the comment in
that ticket. The "stamp" field is not a "completely unrelated field".
It's a field in a model that is related to the first model by a nullable
relation (a reverse one-to-many, and all reverse relations are
nullable), so the field and value you're filtering against is definitely
relevant in that it can be part of a related instance that exists (and
have a NULL value) or there can be no related instance (in which case,
we have to decide how to handle filtering attempts).

The choice to be made in an API implementation of this kind is whether
NULL checks on fields in related models implies a related instance
exists or not. Django follows SQL there and doesn't insist that a
related row exists: absence of a row leads to NULL values for all
columns in the potential related object. Since that doesn't make it
impossible to filter on NULL values where the related instance must
exist (as I indicated in the comment), it's not a restriction.

> > 
> > or, i overlooked something here. it wouldn't be the first time :-)
> 
> Django basically always uses an outer join unless it can optimize the query
> to an inner join that produces the same result.
> 
> In the second query, it can use an inner join because "WHERE x_person.stamp
> ='2008-12-12'" can only match when there's also a person.
> 
> In the first query, it cannot do this optimization, because "WHERE
> x_person.stamp is NULL" is even true when there's no person that can be
> joined.

That is all completely correct. It's exactly why the query works this
way. We avoid outer joins wherever possible, since they're much slower.
However, in cases where they are required in order to get the correct
result, we use them. The *only* possible point requiring thought is the
one I mentioned above: which way to treat the entire absence of a
related instance. Django's choice is deliberate and not particularly
unusual.

>  If you want to limit the query to existing persons, use
> 
>   filter(person__stamp__isnull=True, person__id__isnull=False)
> 
> (I haven't actually tried this, though, but this should result in an inner
> join.)
> 
> > > Why do you care for the join type at all?
> > because it produces different result :)
> No, it doesn't <8-)

Well, they do give different results: switching from an outer to an
inner for the nullable query would give incorrect results with respect
to what we intend, which is different from the correct results. :-)

Switching from inner to outer joins when filtering against a non-NULL
"stamp" value won't change the results, of course, although it will run
less efficiently on some backends that don't do outer->inner join
promotion in their optimisers.

Regards,
Malcolm



--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django developers" group.
To post to this group, send email to django-developers@googlegroups.com
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to