Hello Marty, Nick and Robert,
NB> Depending on what version of PG you are running, IN might take a while
NB> to complete. If so try an EXISTS instead
RT> A question and two experiments... what version of postgresql is this?
I am using the newer 7.5dev native Windows port. For this reason I
don't think that IN will cause any trouble (I read that this issue was
resolved in 7.4).
MS> At any rate, a query with an IN clause should help quite a bit
MS> SELECT bv_books. *
MS> FROM bv_books
MS> WHERE bv_books.book_id IN (
MS> SELECT book_id
MS> FROM bv_genres
MS> WHERE bv_bookgenres.genre_id = 5830
MS> )
MS> ORDER BY vote_avg DESC LIMIT 10 OFFSET 0;
It looks like it helps a bit (though you meant "FROM bv_bookgenres",
right?). I can't tell you how MUCH it helped though, because of two
reasons:
1) As soon as I run a query, it is cached in the memory and I can't
really find a good way to flush it out of there to test again except a
full computer reset (shutting postmaster down doesn't help). If you
have a better idea on this, do tell me =\ (Reminding again, I am on
Windows).
2) I *think* I resolved this issue, at least for most of the genre_ids
(didn't go through them all, but tried a few with different book count
and the results looked quite good). The fault was partly mine, a few
weeks ago I increase the statistics for the genre_id column a bit too
much (from 10 to 70), I was unsure how exactly it works (and still am)
but it helped for a few genre_ids that had a high book count, yet it
also hurt the performence for the genres without as much ids. I now
halved the stastics (to 58) and almost everything looks good now.
Because of that I'll stop working on that query for a while (unless
you have some more performance tips on the subject). Big thanks to
everyone who helped.. And I might bring this issue later again, it it
still will cause too much troubles.
RT> Try reindexing i_bookgenres_genre_id and capture the explain
RT> analyze for that.
Is that's what you meant "REINDEX INDEX i_bookgenres_genre_id"? But it
returns no messages what-so-ever =\. I can EXPLAIN it either.
RT> If it doesn't help try doing set enable_indexscan = false and
RT> capture the explain analyze for that.
Here it is:
------------------------------------------------------------------------------------------------------------------------------------------
QUERY PLAN
Limit (cost=41099.93..41099.96 rows=10 width=76) (actual time=6734.000..6734.000
rows=10 loops=1)
-> Sort (cost=41099.93..41100.45 rows=208 width=76) (actual
time=6734.000..6734.000 rows=10 loops=1)
Sort Key: bv_books.vote_count
-> Merge Join (cost=40229.21..41091.92 rows=208 width=76) (actual
time=6078.000..6593.000 rows=1993 loops=1)
Merge Cond: ("outer".book_id = "inner".book_id)
-> Sort (cost=16817.97..16818.49 rows=208 width=4) (actual
time=1062.000..1062.000 rows=1993 loops=1)
Sort Key: bv_bookgenres.book_id
-> Seq Scan on bv_bookgenres (cost=0.00..16809.96 rows=208
width=4) (actual time=0.000..1047.000 rows=1993 loops=1)
Filter: (genre_id = 5830)
-> Sort (cost=23411.24..23841.04 rows=171918 width=76) (actual
time=5016.000..5189.000 rows=171801 loops=1)
Sort Key: bv_books.book_id
-> Seq Scan on bv_books (cost=0.00..4048.18 rows=171918
width=76) (actual time=0.000..359.000 rows=171918 loops=1)
Total runtime: 6734.000 ms
------------------------------------------------------------------------------------------------------------------------------------------
Regards,
Vitaly Belman
ICQ: 1912453
AIM: VitalyB1984
MSN: [EMAIL PROTECTED]
Yahoo!: VitalyBe
Wednesday, May 26, 2004, 1:24:18 AM, you wrote:
MS> Vitaly,
MS> This looks like there might be some room for performance improvement...
>> MS> I didn't see the table structure, but I assume
>> MS> that the vote_avg and
>> MS> vote_count fields are in bv_bookgenres.
>>
>> I didn't understand you. vote_avg is stored in bv_books.
MS> Ok. That helps. The confusion (on my end) came from the SELECT clause
MS> of the query you provided:
>> SELECT bv_books. * ,
>> vote_avg,
>> vote_count
MS> All fields from bv_books were selected (bv_books.*) along with vote_agv
MS> and vote_count. My assumption was that vote_avg and vote_count were
MS> therefore not in bv_books.
MS> At any rate, a query with an IN clause should help quite a bit:
MS> SELECT bv_books. *
MS> FROM bv_books
MS> WHERE bv_books.book_id IN (
MS> SELECT book_id
MS> FROM bv_genres
MS> WHERE bv_bookgenres.genre_id = 5830
MS> )
MS> ORDER BY vote_avg DESC LIMIT 10 OFFSET 0;
MS> Give it a whirl.
MS> Marty
MS> ---------------------------(end of
MS> broadcast)---------------------------
MS> TIP 6: Have you searched our list archives?
MS> http://archives.postgresql.org
---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match