Re: [PERFORM] Slow query with a lot of data

2008-08-18 Thread Matthew Wakeling
On Mon, 18 Aug 2008, Moritz Onken wrote:
> Running the query for more than one user is indeed not much slower. That's
> what I need. I'm clustering the results table on domain right now. But why
> is this better than clustering it on "user"?

The reason is the way that the merge join algorithm works.
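A minimal sketch of the clustering step being discussed (table and column names are from the thread; the index name is illustrative):

    -- Any btree index on domain_categories("domain") will do; the name is made up.
    CREATE INDEX domain_categories_domain_idx ON domain_categories ("domain");
    -- Physically reorder the table by that index, so a merge join can read
    -- matching "domain" values sequentially. 8.3-era syntax:
    CLUSTER domain_categories_domain_idx ON domain_categories;
    -- (Newer releases spell it: CLUSTER domain_categories USING domain_categories_domain_idx;)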

Re: [PERFORM] Slow query with a lot of data

2008-08-18 Thread Moritz Onken
Well, you're getting the database to read the entire contents of the domain_categories table in order. That's 12 million rows - a fair amount of work. You may find that removing the "user = 1337" constraint doesn't make the query much slower - that's where you get a big win by clustering
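A hedged sketch of the comparison being suggested here (query shape taken from the thread, column list simplified; actual timings will of course differ):

    -- Per-user version, with the constraint under discussion:
    EXPLAIN ANALYZE
    SELECT b.category, count(*)
      FROM result a, domain_categories b
     WHERE a."domain" = b."domain"
       AND a."user" = 1337
     GROUP BY b.category;

    -- Same query without the constraint; the full in-order read of
    -- domain_categories dominates either way, so it may not be much slower.
    EXPLAIN ANALYZE
    SELECT b.category, count(*)
      FROM result a, domain_categories b
     WHERE a."domain" = b."domain"
     GROUP BY b.category;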

Re: [PERFORM] file system and raid performance

2008-08-18 Thread Mark Wong
On Fri, Aug 15, 2008 at 12:22 PM, Bruce Momjian <[EMAIL PROTECTED]> wrote:
> Mark Wong wrote:
>> On Mon, Aug 4, 2008 at 10:04 PM, <[EMAIL PROTECTED]> wrote:
>>> On Mon, 4 Aug 2008, Mark Wong wrote:
>>>> Hi all,
>>>> We've thrown together some results from simple i/o tests on Linux

Re: [PERFORM] Cross Join Problem

2008-08-18 Thread Tom Lane
[ please keep the list cc'd for the archives' sake ]

"Gauri Kanekar" <[EMAIL PROTECTED]> writes:
> On Mon, Aug 18, 2008 at 7:32 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
>> No PG release since 7.3 would have voluntarily planned that query that
>> way. Maybe you were using join_collapse_limit = 1 to
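For reference, the planner setting Tom mentions can be inspected and changed per session; a generic illustration, not commands from the thread:

    SHOW join_collapse_limit;      -- default is 8
    SET join_collapse_limit = 1;   -- planner keeps joins in the order written
    -- ... re-run the query under test, then:
    RESET join_collapse_limit;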

Re: [PERFORM] Slow query with a lot of data

2008-08-18 Thread Matthew Wakeling
On Mon, 18 Aug 2008, Moritz Onken wrote:
"HashAggregate (cost=817397.78..817428.92 rows=2491 width=8) (actual time=42874.339..42878.419 rows=3361 loops=1)"
"  -> Merge Join (cost=748.47..674365.50 rows=19070970 width=8) (actual time=15702.449..42829.388 rows=36308 loops=1)"
"Merge

Re: [PERFORM] Slow query with a lot of data

2008-08-18 Thread Moritz Onken
On 18.08.2008 at 16:30, Matthew Wakeling wrote:
> On Mon, 18 Aug 2008, Moritz Onken wrote:
>> I have indexes on result.domain, domain_categories.domain, result.user,
>> domain_categories.category. Clustered result on user and
>> domain_categories on domain.
>> "-> Materialize (cost=2118752.2

Re: [PERFORM] Slow query with a lot of data

2008-08-18 Thread Matthew Wakeling
On Mon, 18 Aug 2008, Moritz Onken wrote:
> I have indexes on result.domain, domain_categories.domain, result.user,
> domain_categories.category. Clustered result on user and
> domain_categories on domain.
> "-> Materialize (cost=2118752.28..2270064.64 rows=12104989 width=8) (actual time=464

[PERFORM] Slow query with a lot of data

2008-08-18 Thread Moritz Onken
Hi,

I run this query:

    select max(a."user"), b.category, count(1)
      from result a, domain_categories b
     where a."domain" = b."domain"
     group by b.category;

The table result contains all websites a user visited, and the table domain_categories contains all categories a domain is in. result has 2
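The follow-ups in this thread describe the indexing and clustering that was tried; a sketch of those statements (index names are illustrative):

    CREATE INDEX result_domain_idx ON result ("domain");
    CREATE INDEX result_user_idx ON result ("user");
    CREATE INDEX domain_categories_domain_idx ON domain_categories ("domain");
    CREATE INDEX domain_categories_category_idx ON domain_categories (category);
    -- "Clustered result on user and domain_categories on domain" (8.3 syntax):
    CLUSTER result_user_idx ON result;
    CLUSTER domain_categories_domain_idx ON domain_categories;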

Re: [PERFORM] Cross Join Problem

2008-08-18 Thread Tom Lane
"Gauri Kanekar" <[EMAIL PROTECTED]> writes: > Following is the Query : > SELECT sum(id), sum(cd), sum(ad) >FROM table1 a , table2 b cross join table3 c >WHERE a.nkey = b.key > AND a.dkey = c.key > AND c.date = '2008-02-01' > AND b.id = 999 ;

[PERFORM] Cross Join Problem

2008-08-18 Thread Gauri Kanekar
Hi,

Following is the Query :

    SELECT sum(id), sum(cd), sum(ad)
      FROM table1 a, table2 b cross join table3 c
     WHERE a.nkey = b.key
       AND a.dkey = c.key
       AND c.date = '2008-02-01'
       AND b.id = 999;

We have fired this on our production system which is
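Since the WHERE clause constrains all three tables, the cross join here is logically just an inner join; a hedged rewrite with explicit join syntax (same semantics, assuming the column names above):

    SELECT sum(id), sum(cd), sum(ad)
      FROM table1 a
      JOIN table2 b ON a.nkey = b.key
      JOIN table3 c ON a.dkey = c.key
     WHERE c.date = '2008-02-01'
       AND b.id = 999;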

Re: [PERFORM] Optimizing a VIEW

2008-08-18 Thread Matthew Wakeling
On Fri, 15 Aug 2008, Madison Kelly wrote:
> Below I will post the VIEW and a sample of the query's EXPLAIN ANALYZE.
> Thanks for any tips/help/clue-stick-beating you may be able to share!

This query looks incredibly expensive:

SELECT ... FROM customer a, history.customer_da
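The usual diagnostic for a slow VIEW, as posted in this thread, is to run the query through EXPLAIN ANALYZE so the cost of each part of the view definition shows up inline; a generic sketch with hypothetical names:

    EXPLAIN ANALYZE
    SELECT *
      FROM some_view            -- hypothetical view name
     WHERE customer_id = 42;    -- hypothetical filter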