>
> However, I don't think this is subject to the kind of problem you describe
> as the inner queryset should be turned into a subquery of the main queryset.
Yes, I can confirm that I used this solution and it returns a QuerySet. It
does not perform query at module load time in my case.
It's true that the generated queries might be inefficient (I'm not sure all
> DBMS can optimize something like "*SELECT ... FROM foo WHERE id IN
> (SELECT id FROM foo ORDER BY bar LIMIT 10)*"). I would not use such a
> construction on a big production dataset and if performance really matters
> I would do something else.
My table is not very large (5-10k rows) so the maintenance cost of a
solution that would require me to periodically check if the query is not
returning too little or to many rows is much bigger concern than
performance. But I guess it does matter if it were to be implemented like
this internally in Django as I proposed. You're right, the version with a
subqery is much slower than the plain one, at least in PostgreSQL. i tried
it on one of my tables (unrelated to my example) that contains 1906 rows:
*Plain query*
EXPLAIN ANALYZE SELECT
"baseapp_award"."id",
"baseapp_award"."profile_awards_id",
"baseapp_award"."name",
"baseapp_award"."date_achieved",
"baseapp_award"."recognition_level"
FROM "baseapp_award"
ORDER BY "baseapp_award"."id" DESC
LIMIT 100;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.28..4.51 rows=100 width=38) (actual time=0.024..0.118 rows=
100 loops=1)
-> Index Scan Backward using baseapp_award_pkey on baseapp_award (cost=
0.28..80.94 rows=1906 width=38) (actual time=0.022..0.097 rows=100 loops=1)
Total runtime: 0.171 ms
*Subquery*
EXPLAIN ANALYZE SELECT
"baseapp_award"."id",
"baseapp_award"."profile_awards_id",
"baseapp_award"."name",
"baseapp_award"."date_achieved",
"baseapp_award"."recognition_level"
FROM "baseapp_award"
WHERE
"baseapp_award"."id" IN (
SELECT U0."id"
FROM "baseapp_award" U0
ORDER BY U0."id" DESC
LIMIT 100
);
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=6.76..48.94 rows=100 width=38) (actual time=0.763..
0.841 rows=100 loops=1)
Hash Cond: (baseapp_award.id = u0.id)
-> Seq Scan on baseapp_award (cost=0.00..36.06 rows=1906 width=38) (actual
time=0.007..0.324 rows=1908 loops=1)
-> Hash (cost=5.51..5.51 rows=100 width=4) (actual time=0.112..0.112
rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 4kB
-> Limit (cost=0.28..4.51 rows=100 width=4) (actual time=0.009..
0.083 rows=100 loops=1)
-> Index Only Scan Backward using baseapp_award_pkey on
baseapp_award u0 (cost=0.28..80.94 rows=1906 width=4) (actual time=0.009..
0.063 rows=100 loops=1)
Heap Fetches: 100
Total runtime: 0.904 ms
The plain version runs in about 0.2 sec on my development machine while the
one with subquery requires more than 4x times as much. Mostly because of
the extra sequential scan and a semi-join.
--
You received this message because you are subscribed to the Google Groups
"Django developers (Contributions to Django itself)" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To post to this group, send email to [email protected].
Visit this group at http://groups.google.com/group/django-developers.
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-developers/0dfb4964-44d5-4648-b7dd-ac06445a164a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.