Re: [PERFORM] faster search

2005-06-11 Thread Tom Lane
John A Meinel <[EMAIL PROTECTED]> writes:
> I am a little surprised that it is taking 40s to scan only 400k rows,
> though.

Yeah, that seemed high to me too.  Table bloat maybe?  It would be
interesting to look at the output of "vacuum verbose test" to see
how much dead space there is.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] faster search

2005-06-11 Thread Cosimo Streppone

Steve Atkins wrote:


On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?



> [...]
>

Create an index on (productlistid, typeid, partnumber) then

  select * from test where productlistid=3 and typeid=9
   order by productlistid, typeid, partnumber LIMIT 15;



Clark, try also adding (just for testing) partnumber to your
where clause, like this:

   select * from test where productlistid=3 and typeid=9
   and partnumber='foo' order by productlistid,
   typeid, partnumber;

and check output of explain analyze.

I had experiences of planner "bad" use of indexes when attribute
types were integer and cardinality was low (a single attribute
value, like "typeid=9" selects one or few rows).
However, this was on 7.1.3, and probably is not relevant to your case.

--
Cosimo


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] faster search

2005-06-10 Thread John A Meinel
Clark Slater wrote:
> hmm, i'm baffled.  i simplified the query
> and it is still taking forever...
> 
> 
>   test
> -
>  id| integer
>  partnumber| character varying(32)
>  productlistid | integer
>  typeid| integer
> 
> 
> Indexes:
> "test_productlistid" btree (productlistid)
> "test_typeid" btree (typeid)
> "test_productlistid_typeid" btree (productlistid, typeid)
> 
> 
> explain analyze select * from test where (productlistid=3 and typeid=9);
> 
>   QUERY PLAN
> ---
>  Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
> time=516.459..41930.250 rows=132528 loops=1)
>Filter: ((productlistid = 3) AND (typeid = 9))
>  Total runtime: 41975.154 ms
> (3 rows)
> 
> 

This query is still going to take a long time, because you have to scan
the whole table. Your WHERE clause is not very specific (it takes 25% of
the table). Convention says that any time you want > 5-10% of a table, a
sequential scan is better, because it does it in order.

Now if you did:

explain analyze select * from test where (productlistid=3 and typeid=9)
limit 15;

I think that would be very fast.

I am a little surprised that it is taking 40s to scan only 400k rows,
though. On an older machine of mine (with only 256M ram and dual 450MHz
Celerons), I have a table with 74k rows which takes about .5 sec. At
those numbers it should take more like 4s not 40.

John
=:->


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] faster search

2005-06-10 Thread Joshua D. Drake

Clark Slater wrote:

Query should return 132,528 rows.


O.k. then the planner is doing fine it looks like. The problem is you 
are pulling 132,528 rows. I would suggest moving to a cursor which will

allow you to fetch in smaller chunks much quicker.

Sincerely,

Joshua D. Drake




vbp=# set enable_seqscan = false;
SET
vbp=# explain analyze select * from test where (productlistid=3 and 
typeid=9);


   QUERY PLAN

 Index Scan using test_typeid on test  (cost=0.00..137223.89 rows=156194 
width=725) (actual time=25.999..25708.478 rows=132528

 loops=1)
   Index Cond: (typeid = 9)
   Filter: (productlistid = 3)
 Total runtime: 25757.679 ms
(4 rows)


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:


thanks for your suggestion.
a small improvement.  still pretty slow...

vbp=# alter table test alter column productlistid set statistics 150;
ALTER TABLE
vbp=# alter table test alter column typeid set statistics 150;
ALTER TABLE
vbp=# explain analyze select * from test where (productlistid=3 and 



Hello,

Also what happens if you:

set enable_seqscan = false;
explain analyze query

Sincerely,

Joshua D. Drake




typeid=9);
QUERY PLAN

-- 
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) 
(actual time=525.617..36802.556 rows=132528 loops=1)

   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 36847.754 ms
(3 rows)

Time: 36850.719 ms


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:


hmm, i'm baffled.  i simplified the query
and it is still taking forever...




What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and 
typeid=9);


Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and 
typeid=9);


  QUERY PLAN
--- 


 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.





Well, if you are matching 130k out of 400k rows, then a sequential 
scan
is certainly prefered to an index scan. And then you have to sort 
those
130k rows by partnumber. This *might* be spilling to disk 
depending on

what your workmem/sortmem is set to.

I would also say that what you would really want is some way to 
get the

whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the 
index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have 
to do

filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark





Good luck,
John
=:->



---(end of 
broadcast)---

TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to 
[EMAIL PROTECTED])





--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]




--
Your PostgreSQL solutions provider, Com

Re: [PERFORM] faster search

2005-06-10 Thread Clark Slater

Query should return 132,528 rows.

vbp=# set enable_seqscan = false;
SET
vbp=# explain analyze select * from test where (productlistid=3 and typeid=9);

   QUERY PLAN

 Index Scan using test_typeid on test  (cost=0.00..137223.89 rows=156194 
width=725) (actual time=25.999..25708.478 rows=132528

 loops=1)
   Index Cond: (typeid = 9)
   Filter: (productlistid = 3)
 Total runtime: 25757.679 ms
(4 rows)


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:

thanks for your suggestion.
a small improvement.  still pretty slow...

vbp=# alter table test alter column productlistid set statistics 150;
ALTER TABLE
vbp=# alter table test alter column typeid set statistics 150;
ALTER TABLE
vbp=# explain analyze select * from test where (productlistid=3 and 


Hello,

Also what happens if you:

set enable_seqscan = false;
explain analyze query

Sincerely,

Joshua D. Drake




typeid=9);
QUERY PLAN

-- 
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual 
time=525.617..36802.556 rows=132528 loops=1)

   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 36847.754 ms
(3 rows)

Time: 36850.719 ms


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:


hmm, i'm baffled.  i simplified the query
and it is still taking forever...



What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and typeid=9);

Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.




Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have to do
filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark




Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])




--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 

Re: [PERFORM] faster search

2005-06-10 Thread Joshua D. Drake

Clark Slater wrote:

thanks for your suggestion.
a small improvement.  still pretty slow...

vbp=# alter table test alter column productlistid set statistics 150;
ALTER TABLE
vbp=# alter table test alter column typeid set statistics 150;
ALTER TABLE
vbp=# explain analyze select * from test where (productlistid=3 and 
typeid=9);

QUERY PLAN
-- 

 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual 
time=525.617..36802.556 rows=132528 loops=1)

   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 36847.754 ms
(3 rows)

Time: 36850.719 ms


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:


hmm, i'm baffled.  i simplified the query
and it is still taking forever...



What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and typeid=9);


How many rows should it return?



Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.




Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the 
index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have 
to do

filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark




Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])




--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] faster search

2005-06-10 Thread Joshua D. Drake

Clark Slater wrote:

thanks for your suggestion.
a small improvement.  still pretty slow...

vbp=# alter table test alter column productlistid set statistics 150;
ALTER TABLE
vbp=# alter table test alter column typeid set statistics 150;
ALTER TABLE
vbp=# explain analyze select * from test where (productlistid=3 and 


Hello,

Also what happens if you:

set enable_seqscan = false;
explain analyze query

Sincerely,

Joshua D. Drake




typeid=9);
QUERY PLAN
-- 

 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual 
time=525.617..36802.556 rows=132528 loops=1)

   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 36847.754 ms
(3 rows)

Time: 36850.719 ms


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:


hmm, i'm baffled.  i simplified the query
and it is still taking forever...



What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and typeid=9);

Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.




Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the 
index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have 
to do

filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark




Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])




--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]



--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] faster search

2005-06-10 Thread Clark Slater

thanks for your suggestion.
a small improvement.  still pretty slow...

vbp=# alter table test alter column productlistid set statistics 150;
ALTER TABLE
vbp=# alter table test alter column typeid set statistics 150;
ALTER TABLE
vbp=# explain analyze select * from test where (productlistid=3 and typeid=9);
QUERY PLAN
--
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual 
time=525.617..36802.556 rows=132528 loops=1)

   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 36847.754 ms
(3 rows)

Time: 36850.719 ms


On Fri, 10 Jun 2005, Joshua D. Drake wrote:


Clark Slater wrote:

hmm, i'm baffled.  i simplified the query
and it is still taking forever...


What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and typeid=9);

Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.



Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have to do
filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark



Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] faster search

2005-06-10 Thread Joshua D. Drake

Clark Slater wrote:

hmm, i'm baffled.  i simplified the query
and it is still taking forever...


What happens if you:

alter table test alter column productlistid set statistics 150;
alter table test alter column typeid set statistics 150;
explain analyze select * from test where (productlistid=3 and typeid=9);

Sincerely,

Joshua D. Drake





  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:


Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.



Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the index,
so the planner realizes it can do an indexed lookup to get the 
information.


You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have to do
filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark



Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])



--
Your PostgreSQL solutions provider, Command Prompt, Inc.
24x7 support - 1.800.492.2240, programming, and consulting
Home of PostgreSQL Replicator, plPHP, plPerlNG and pgPHPToolkit
http://www.commandprompt.com / http://www.postgresql.org

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [PERFORM] faster search

2005-06-10 Thread Clark Slater

hmm, i'm baffled.  i simplified the query
and it is still taking forever...


  test
-
 id| integer
 partnumber| character varying(32)
 productlistid | integer
 typeid| integer


Indexes:
"test_productlistid" btree (productlistid)
"test_typeid" btree (typeid)
"test_productlistid_typeid" btree (productlistid, typeid)


explain analyze select * from test where (productlistid=3 and typeid=9);

  QUERY PLAN
---
 Seq Scan on test  (cost=0.00..96458.27 rows=156194 width=725) (actual
time=516.459..41930.250 rows=132528 loops=1)
   Filter: ((productlistid = 3) AND (typeid = 9))
 Total runtime: 41975.154 ms
(3 rows)


System specs:
PostgreSQL 7.4.2 on RedHat 9
dual AMD Athlon 2GHz processors
1 gig memory
mirrored 7200 RPM IDE disks


On Fri, 10 Jun 2005, John A Meinel wrote:


Clark Slater wrote:

Hi-

Would someone please enlighten me as
to why I'm not seeing a faster execution
time on the simple scenario below?

there are 412,485 rows in the table and the
query matches on 132,528 rows, taking
almost a minute to execute.  vaccuum
analyze was just run.


Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the index,
so the planner realizes it can do an indexed lookup to get the information.

You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have to do
filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.



Thanks!
Clark


Good luck,
John
=:->



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


Re: [PERFORM] faster search

2005-06-10 Thread Steve Atkins
On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote:
> Hi-
> 
> Would someone please enlighten me as
> to why I'm not seeing a faster execution
> time on the simple scenario below?

Because you need to extract a huge number of rows via a seqscan, sort
them and then throw them away, I think.

> explain analyze select * from test where productlistid=3 and typeid=9 
> order by partnumber limit 15;

Create an index on (productlistid, typeid, partnumber) then

  select * from test where productlistid=3 and typeid=9
   order by productlistid, typeid, partnumber LIMIT 15;

?

Cheers,
  Steve

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PERFORM] faster search

2005-06-10 Thread John A Meinel
Clark Slater wrote:
> Hi-
> 
> Would someone please enlighten me as
> to why I'm not seeing a faster execution
> time on the simple scenario below?
> 
> there are 412,485 rows in the table and the
> query matches on 132,528 rows, taking
> almost a minute to execute.  vaccuum
> analyze was just run.

Well, if you are matching 130k out of 400k rows, then a sequential scan
is certainly prefered to an index scan. And then you have to sort those
130k rows by partnumber. This *might* be spilling to disk depending on
what your workmem/sortmem is set to.

I would also say that what you would really want is some way to get the
whole thing from an index. And I think the way to do that is:

CREATE INDEX test_partnum_listid_typeid_idx ON
test(partnumber, productlistid, typeid);

VACUUM ANALYZE test;

EXPLAIN ANALYZE SELECT * FROM test
WHERE productlistid=3 AND typeid=9
ORDER BY partnumber, productlistid, typeid
LIMIT 15
;

The trick is that you have to match the order by exactly with the index,
so the planner realizes it can do an indexed lookup to get the information.

You could also just create an index on partnumber, and see how that
affects your original query. I think the planner could use an index
lookup on partnumber to get the ordering correct. But it will have to do
filtering after the fact based on productlistid and typeid.
With my extended index, I think the planner can be smarter and lookup
all 3 by the index.

> 
> Thanks!
> Clark

Good luck,
John
=:->


signature.asc
Description: OpenPGP digital signature


Re: [PERFORM] faster search

2005-06-10 Thread Tobias Brox
[Clark Slater - Fri at 01:45:05PM -0400]
> Would someone please enlighten me as
> to why I'm not seeing a faster execution
> time on the simple scenario below?

Just some thoughts from a novice PG-DBA .. :-)

My general experience is that PG usually prefers sequal scans to indices if
a large portion of the table is to be selected, because it is faster to do a
seqscan than to follow an index and constantly seek between different
positions on the hard disk.

However, most of the time is spent sorting on partnumber, and you only want
15 rows, so of course you should have an index on partnumber!  Picking up 15
rows will be ligtning fast with that index.

If you may want to select significantly more than 15 rows, you can also try
to make a partial index:

create index test_pli3_ti9_by_part on test (partnumber) where
productlistid=3 and typeid=9;

If 3 and 9 are not constants in the query, try to make a three-key index
(it's important with partnumber because a lot of time is spent sorting):

create index test_pli_type_part on test (productslistid,typeid,partnumber);

To get pg to recognize the index, you will probably have to help it a bit:

select * from test where productlistid=3 and typeid=9 order by
productlistid,typeid,partnumber limit 15;

-- 
Tobias Brox, +47-91700050


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] faster search

2005-06-10 Thread Steinar H. Gunderson
On Fri, Jun 10, 2005 at 01:45:05PM -0400, Clark Slater wrote:
> Indexes:
> "test_id" btree (id)
> "test_plid" btree (productlistid)
> "test_typeid" btree (typeid)
> "test_plidtypeid" btree (productlistid, typeid)
> 
> 
> explain analyze select * from test where productlistid=3 and typeid=9 
> order by partnumber limit 15;

You do not have an index on partnumber. Try adding one.

/* Steinar */
-- 
Homepage: http://www.sesse.net/

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]