Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Michael Guerin
Stephan Szabo wrote:

On Thu, 28 Aug 2003, Michael Guerin wrote:

 

Stephan Szabo wrote:

   

On Thu, 28 Aug 2003, Michael Guerin wrote:



 

Stephan Szabo wrote:



   

On Wed, 27 Aug 2003, Michael Guerin wrote:





 

I'm running into some performance problems trying to execute simple
queries.
postgresql version 7.3.3
.conf params changed from defaults.
shared_buffers = 64000
sort_mem = 64000
fsync = false
effective_cache_size = 40
ex. query: select * from x where id in (select id from y);

There's an index on each table for id.  SQL Server takes <1s to return,
postgresql doesn't return at all, neither does explain analyze.


   

IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
generally much better (for reasonably sized subqueries) but in earlier
versions you'll probably want to convert into an EXISTS or join form.




 

Something else seems to be going on, even switching to an exists clause
gives much better but poor performance.
count(*) where exists clause: Postgresql 19s, SQL Server <1s
count(*) where not exists:  23.3s SQL Server 1.5s
   

What does explain analyze show for the two queries?



 

explain analyze  select count(*) from tbltimeseries where exists(select
uniqid  from tblobjectname where timeseriesid = uniqid);
Aggregate  (cost=5681552.18..5681552.18 rows=1 width=0) (actual
time=22756.64..22756.64 rows=1 loops=1)
  ->  Seq Scan on tbltimeseries  (cost=0.00..5680051.34 rows=600336
width=0) (actual time=22.06..21686.78 rows=1200113 loops=1)
Filter: (NOT (subplan))
SubPlan
  ->  Index Scan using idx_objectname on tblobjectname
(cost=0.00..4.70 rows=1 width=4) (actual time=0.01..0.01 rows=0
loops=1200673)
Index Cond: ($0 = uniqid)
Total runtime: 22756.83 msec
(7 rows)
   

Hmm... I'd thought that it had options for a better plan than that.

What do things like:

explain analyze select count(distinct timeseriesid) from tbltimeseries,
tblobjectname where timeseriesid=uniquid;
and

explain analyze select count(distinct timeseriesid) from
tbltimeseries left outer join tblobjectname on (timeseriesid=uniqid)
where uniqid is null;
give you?

 

much better performance:

explain analyze select count(distinct timeseriesid) from tbltimeseries,
tblobjectname where timeseriesid=uniquid;
Aggregate  (cost=7384.03..7384.03 rows=1 width=8) (actual time=668.15..668.15 rows=1 
loops=1)
  ->  Nested Loop  (cost=0.00..7380.83 rows=1282 width=8) (actual time=333.31..666.13 
rows=561 loops=1)
->  Seq Scan on tblobjectname  (cost=0.00..33.82 rows=1282 width=4) (actual 
time=0.05..4.98 rows=1282 loops=1)
->  Index Scan using xx on tbltimeseries  (cost=0.00..5.72 rows=1 width=4) 
(actual time=0.51..0.51 rows=0 loops=1282)
  Index Cond: (tbltimeseries.timeseriesid = "outer".uniqid)
Total runtime: 669.61 msec
(6 rows)
explain analyze select count(distinct timeseriesid) from
tbltimeseries left outer join tblobjectname on (timeseriesid=uniqid)
where uniqid is null;
Aggregate  (cost=59144.19..59144.19 rows=1 width=8) (actual time=12699.47..12699.47 
rows=1 loops=1)
  ->  Hash Join  (cost=37.02..56142.51 rows=1200673 width=8) (actual 
time=7.41..6376.12 rows=1200113 loops=1)
Hash Cond: ("outer".timeseriesid = "inner".uniqid)
Filter: ("inner".uniqid IS NULL)
->  Seq Scan on tbltimeseries  (cost=0.00..44082.73 rows=1200673 width=4) 
(actual time=0.01..3561.61 rows=1200673 loops=1)
->  Hash  (cost=33.82..33.82 rows=1282 width=4) (actual time=4.84..4.84 rows=0 
loops=1)
  ->  Seq Scan on tblobjectname  (cost=0.00..33.82 rows=1282 width=4) 
(actual time=0.04..2.84 rows=1282 loops=1)
Total runtime: 12699.76 msec
(8 rows)






---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Stephan Szabo
On Thu, 28 Aug 2003, Michael Guerin wrote:

> Stephan Szabo wrote:
>
> >On Thu, 28 Aug 2003, Michael Guerin wrote:
> >
> >
> >
> >>Stephan Szabo wrote:
> >>
> >>
> >>
> >>>On Wed, 27 Aug 2003, Michael Guerin wrote:
> >>>
> >>>
> >>>
> >>>
> >>>
> I'm running into some performance problems trying to execute simple
> queries.
> 
> postgresql version 7.3.3
> .conf params changed from defaults.
> shared_buffers = 64000
> sort_mem = 64000
> fsync = false
> effective_cache_size = 40
> 
> ex. query: select * from x where id in (select id from y);
> 
> There's an index on each table for id.  SQL Server takes <1s to return,
> postgresql doesn't return at all, neither does explain analyze.
> 
> 
> 
> 
> >>>IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
> >>>generally much better (for reasonably sized subqueries) but in earlier
> >>>versions you'll probably want to convert into an EXISTS or join form.
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>
> >>Something else seems to be going on, even switching to an exists clause
> >>gives much better but poor performance.
> >>count(*) where exists clause: Postgresql 19s, SQL Server <1s
> >>count(*) where not exists:  23.3s SQL Server 1.5s
> >>
> >>
> >
> >What does explain analyze show for the two queries?
> >
> >
> >
> >
> explain analyze  select count(*) from tbltimeseries where exists(select
> uniqid  from tblobjectname where timeseriesid = uniqid);
> Aggregate  (cost=5681552.18..5681552.18 rows=1 width=0) (actual
> time=22756.64..22756.64 rows=1 loops=1)
>->  Seq Scan on tbltimeseries  (cost=0.00..5680051.34 rows=600336
> width=0) (actual time=22.06..21686.78 rows=1200113 loops=1)
>  Filter: (NOT (subplan))
>  SubPlan
>->  Index Scan using idx_objectname on tblobjectname
> (cost=0.00..4.70 rows=1 width=4) (actual time=0.01..0.01 rows=0
> loops=1200673)
>  Index Cond: ($0 = uniqid)
>  Total runtime: 22756.83 msec
> (7 rows)

Hmm... I'd thought that it had options for a better plan than that.

What do things like:

explain analyze select count(distinct timeseriesid) from tbltimeseries,
 tblobjectname where timeseriesid=uniquid;

and

explain analyze select count(distinct timeseriesid) from
 tbltimeseries left outer join tblobjectname on (timeseriesid=uniqid)
 where uniqid is null;

give you?


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Michael Guerin
Stephan Szabo wrote:

On Thu, 28 Aug 2003, Michael Guerin wrote:

 

Stephan Szabo wrote:

   

On Wed, 27 Aug 2003, Michael Guerin wrote:



 

I'm running into some performance problems trying to execute simple
queries.
postgresql version 7.3.3
.conf params changed from defaults.
shared_buffers = 64000
sort_mem = 64000
fsync = false
effective_cache_size = 40
ex. query: select * from x where id in (select id from y);

There's an index on each table for id.  SQL Server takes <1s to return,
postgresql doesn't return at all, neither does explain analyze.
   

IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
generally much better (for reasonably sized subqueries) but in earlier
versions you'll probably want to convert into an EXISTS or join form.


 

Something else seems to be going on, even switching to an exists clause
gives much better but poor performance.
count(*) where exists clause: Postgresql 19s, SQL Server <1s
count(*) where not exists:  23.3s SQL Server 1.5s
   

What does explain analyze show for the two queries?

 

explain analyze  select count(*) from tbltimeseries where exists(select 
uniqid  from tblobjectname where timeseriesid = uniqid);
Aggregate  (cost=5681552.18..5681552.18 rows=1 width=0) (actual 
time=22756.64..22756.64 rows=1 loops=1)
  ->  Seq Scan on tbltimeseries  (cost=0.00..5680051.34 rows=600336 
width=0) (actual time=22.06..21686.78 rows=1200113 loops=1)
Filter: (NOT (subplan))
SubPlan
  ->  Index Scan using idx_objectname on tblobjectname  
(cost=0.00..4.70 rows=1 width=4) (actual time=0.01..0.01 rows=0 
loops=1200673)
Index Cond: ($0 = uniqid)
Total runtime: 22756.83 msec
(7 rows)

fiasco=# explain analyze  select count(*) from tbltimeseries where 
exists(select uniqid  from tblobjectname where timeseriesid = uniqid);
  QUERY 
PLAN

explain analyze  select count(*) from tbltimeseries where exists(select 
uniqid  from tblobjectname where timeseriesid = uniqid);
Aggregate  (cost=5681552.18..5681552.18 rows=1 width=0) (actual 
time=19558.77..19558.77 rows=1 loops=1)
  ->  Seq Scan on tbltimeseries  (cost=0.00..5680051.34 rows=600336 
width=0) (actual time=0.21..19557.73 rows=560 loops=1)
Filter: (subplan)
SubPlan
  ->  Index Scan using idx_objectname on tblobjectname  
(cost=0.00..4.70 rows=1 width=4) (actual time=0.01..0.01 rows=0 
loops=1200673)
Index Cond: ($0 = uniqid)
Total runtime: 19559.04 msec
(7 rows)





---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Stephan Szabo
On Thu, 28 Aug 2003, Michael Guerin wrote:

> Stephan Szabo wrote:
>
> >On Wed, 27 Aug 2003, Michael Guerin wrote:
> >
> >
> >
> >>I'm running into some performance problems trying to execute simple
> >>queries.
> >>
> >>postgresql version 7.3.3
> >>.conf params changed from defaults.
> >>shared_buffers = 64000
> >>sort_mem = 64000
> >>fsync = false
> >>effective_cache_size = 40
> >>
> >>ex. query: select * from x where id in (select id from y);
> >>
> >>There's an index on each table for id.  SQL Server takes <1s to return,
> >>postgresql doesn't return at all, neither does explain analyze.
> >>
> >>
> >
> >IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
> >generally much better (for reasonably sized subqueries) but in earlier
> >versions you'll probably want to convert into an EXISTS or join form.
> >
> >
> >
> >
> Something else seems to be going on, even switching to an exists clause
> gives much better but poor performance.
> count(*) where exists clause: Postgresql 19s, SQL Server <1s
> count(*) where not exists:  23.3s SQL Server 1.5s

What does explain analyze show for the two queries?


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Shridhar Daithankar
On 28 Aug 2003 at 10:38, Michael Guerin wrote:
> >IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
> >generally much better (for reasonably sized subqueries) but in earlier
> >versions you'll probably want to convert into an EXISTS or join form.
> Something else seems to be going on, even switching to an exists clause 
> gives much better but poor performance.
> count(*) where exists clause: Postgresql 19s, SQL Server <1s
> count(*) where not exists:  23.3s SQL Server 1.5s

This was with 7.4? Can you try downloading 7.4CVS and try?

> 
> SQL Server runs on a dual 1.4 with 4gigs, win2k
> Postgresql runs on a quad 900 with 8 gigs, sunos 5.8

SunOS...Not the impala out there but anyways I would refrain from slipping in 
that..

Parden me if this is a repeatation,  have you set your effective cache size?

Bye
 Shridhar

--
Nouvelle cuisine, n.:   French for "not enough food".Continental breakfast, n.: 
English for "not enough food".Tapas, n.:Spanish for "not enough food".Dim Sum, 
n.: Chinese for more food than you've ever seen in your entire life.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Michael Guerin
Stephan Szabo wrote:

On Wed, 27 Aug 2003, Michael Guerin wrote:

 

I'm running into some performance problems trying to execute simple
queries.
postgresql version 7.3.3
.conf params changed from defaults.
shared_buffers = 64000
sort_mem = 64000
fsync = false
effective_cache_size = 40
ex. query: select * from x where id in (select id from y);

There's an index on each table for id.  SQL Server takes <1s to return,
postgresql doesn't return at all, neither does explain analyze.
   

IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
generally much better (for reasonably sized subqueries) but in earlier
versions you'll probably want to convert into an EXISTS or join form.
 

Something else seems to be going on, even switching to an exists clause 
gives much better but poor performance.
count(*) where exists clause: Postgresql 19s, SQL Server <1s
count(*) where not exists:  23.3s SQL Server 1.5s

SQL Server runs on a dual 1.4 with 4gigs, win2k
Postgresql runs on a quad 900 with 8 gigs, sunos 5.8


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Christopher Kings-Lynne
> postgresql version 7.3.3
> .conf params changed from defaults.
> shared_buffers = 64000
> sort_mem = 64000
> fsync = false
> effective_cache_size = 40
>
> ex. query: select * from x where id in (select id from y);
>
> There's an index on each table for id.  SQL Server takes <1s to return,
> postgresql doesn't return at all, neither does explain analyze.
> x has 1200673 rows
> y has 1282 rows
>
> It seems like its ignoring the index and not using enough memory.. any
> ideas?

This is a known problem in 7.3, it is much faster in 7.4b1.  This should be
very, very fast though, and do exactly the same thing:

select * from x where exists (select id from y where y.id=x.id);

Chris


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Neil Conway
On Wed, Aug 27, 2003 at 05:40:05PM -0400, Michael Guerin wrote:
> ex. query: select * from x where id in (select id from y);
> 
> There's an index on each table for id.  SQL Server takes <1s to return, 
> postgresql doesn't return at all, neither does explain analyze.

This particular form of query is a known performance problem for PostgreSQL
7.3 and earlier -- the problem should hopefully be fixed in 7.4 (currently
in beta). Check the archives for more discussion on this topic.

-Neil


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Simple queries take forever to run

2003-08-28 Thread Stephan Szabo
On Wed, 27 Aug 2003, Michael Guerin wrote:

> I'm running into some performance problems trying to execute simple
> queries.
>
> postgresql version 7.3.3
> .conf params changed from defaults.
> shared_buffers = 64000
> sort_mem = 64000
> fsync = false
> effective_cache_size = 40
>
> ex. query: select * from x where id in (select id from y);
>
> There's an index on each table for id.  SQL Server takes <1s to return,
> postgresql doesn't return at all, neither does explain analyze.

IN(subquery) is known to run poorly in 7.3.x and earlier.  7.4 is
generally much better (for reasonably sized subqueries) but in earlier
versions you'll probably want to convert into an EXISTS or join form.


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly