As I'm a relentless DBA after all, and I enjoy facts over estimates, and I
live for these kind of things....I concocted a script that benches
different methods.
pyDAL interaction has been left out of the picture, as it'll probably
affect estimates. Raw T-SQL under the hood, psycopg2 driver.
Every test case is based on 1000 iterations: fresh db, vacuumed and
defragmented indexes.
speed_executesql1() does a simple join, then distincts the results (would
have been the easiest answer)
select distinct news_categories.id, news_categories.name
from news
, news_categories
where news.category = news_categories.id
speed_executesql2() does a subselect - as advised here -
select news_categories.id, news_categories.name
from news_categories
where news_categories.id in (
select distinct category from news
)
speed_executesql3() does the "two queries thingy"
select distinct category from news
then parses the results, then sends
select news_categories.id, news_categories.name
from news_categories
where id IN (%s)
The resultsets are always identical..................
I estimated a set that covers from 25% to 75% the "distribution" of
categories (for the less accustomed to "database tuners", it means sets
that have only 25% of the categories used by at least one news, to sets
that use 75% of the categories): this lets us test if a method is more or
less convenient if the number of categories returned is small or high, in
relation to the total set of available categories
AND
sets involving a cardinality ratio from 40 to 1000 (i.e. from news =
categories * 100 to news = categories * 1000): this helps to identify if a
method is more or less convenient if the number of news is more or less
high than the number of categories.
Results indicate that the mean of the results seems to be in favour of
doing the subselect.
speed_executesql1() total=117.208569 avg=0.117209
speed_executesql2() total=33.434958 avg=0.033435
speed_executesql3() total=52.830114 avg=0.052830
--
Resources:
- http://web2py.com
- http://web2py.com/book (Documentation)
- http://github.com/web2py/web2py (Source code)
- https://code.google.com/p/web2py/issues/list (Report Issues)
---
You received this message because you are subscribed to the Google Groups
"web2py-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
For more options, visit https://groups.google.com/d/optout.