[PERFORM] AND OR combination: index not being used

2005-05-12 Thread David Teran
Hi,
postgres 8.0.1, mac os x 10.3.9
i have a select with multiple OR's combined with one AND:
explain analyze SELECT t0.ATTRIBUTE_TYPE FROM ATTRIBUTE_VALUE t0 WHERE 
(((t0.ATTRIBUTE_TYPE = 'pb'::varchar(10) OR t0.ATTRIBUTE_TYPE = 
'po'::varchar(10) OR t0.ATTRIBUTE_TYPE = 'pn'::varchar(10) OR 
t0.ATTRIBUTE_TYPE = 'ps'::varchar(10))) AND t0.ID_ATTRIBUTE = 
17::int8);

The result is the following. It shows that postgres does not use an 
index which makes the select pretty slow.

Seq Scan on attribute_value t0  (cost=0.00..529.13 rows=208 width=5) 
(actual time=66.591..66.591 rows=0 loops=1)
   Filter: attribute_type)::text = 'pb'::text) OR 
((attribute_type)::text = 'po'::text) OR ((attribute_type)::text = 
'pn'::text) OR ((attribute_type)::text = 'ps'::text)) AND (id_attribute 
= 17::bigint))
 Total runtime: 66.664 ms
(3 rows)

When i remove one OR qualifier one can see that now an index is used.
explain analyze SELECT t0.ATTRIBUTE_TYPE FROM ATTRIBUTE_VALUE t0 WHERE 
(((t0.ATTRIBUTE_TYPE = 'pb'::varchar(10) OR t0.ATTRIBUTE_TYPE = 
'po'::varchar(10) OR t0.ATTRIBUTE_TYPE = 'pn'::varchar(10))) AND 
t0.ID_ATTRIBUTE = 17::int8);

Index Scan using attribute_value__attribute_type__id_attribute, 
attribute_value__attribute_type__id_attribute, 
attribute_value__attribute_type__id_attribute on attribute_value t0  
(cost=0.00..451.82 rows=137 width=5) (actual time=0.301..0.301 rows=0 
loops=1)
   Index Cond: attribute_type)::text = 'pb'::text) AND 
(id_attribute = 17::bigint)) OR (((attribute_type)::text = 'po'::text) 
AND (id_attribute = 17::bigint)) OR (((attribute_type)::text = 
'pn'::text) AND (id_attribute = 17::bigint)))
   Filter: attribute_type)::text = 'pb'::text) OR 
((attribute_type)::text = 'po'::text) OR ((attribute_type)::text = 
'pn'::text)) AND (id_attribute = 17::bigint))
 Total runtime: 0.414 ms
(4 rows)

When i do 'set enable_seqscan=no' the index is used of course. 
Unfortunately the sql is generated on the fly and its not easy, more or 
less impossible to selectively enable / disable seqscan. Any hint how 
to force postgres to use the index even with more OR parts?

regards, David
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] AND OR combination: index not being used

2005-05-12 Thread David Teran
On 12.05.2005, at 16:15, Tom Lane wrote:
David Teran [EMAIL PROTECTED] writes:
Any hint how
to force postgres to use the index even with more OR parts?
More up-to-date statistics would evidently help; the thing is 
estimating
hundreds of rows returned and actually finding none.

I always do a 'vacuum analyze' if something does not work as expected. 
But this did not help. Any other tip?

regards, David
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


[PERFORM] select max(id) from aTable is very slow

2004-06-03 Thread David Teran
Hi,
we have a table with about 6.000.000 rows. There is an index on a  
column with the name id which is an integer and serves as primary key.

When we execute select max(id) from theTable; it takes about 10  
seconds. Explain analyze returns:

 

 Aggregate  (cost=153635.15..153635.15 rows=1 width=4) (actual  
time=9738.263..9738.264 rows=1 loops=1)
   -  Seq Scan on job_property  (cost=0.00..137667.32 rows=6387132  
width=4) (actual time=0.102..7303.649 rows=6387132 loops=1)
 Total runtime: 9738.362 ms
(3 rows)


I recreated the index on column id and ran vacuum analyze job_property  
but this did not help. I tried to force index usage with  SET  
ENABLE_SEQSCAN TO OFF; but the explain analyze still looks like the  
query is done using a seqscan.

Is the speed more or less normal for a 'dual G5 with 2 GHZ and 4 GB of  
Ram and a SATA hd' or do i miss something?

regards David
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PERFORM] speeding up a select with C function?

2004-03-09 Thread David Teran
Hi,

On 08.03.2004, at 02:29, Christopher Kings-Lynne wrote:

explain analyze select
((t0.int_value-t1.int_value)*(t0.int_value-t1.int_value))
from job_property t0, job_property t1
where t0.id_job_profile = 5
and t1.id_job_profile = 6
and t1.id_job_attribute = t0.id_job_attribute
and t1.int_value  t0.int_value;
Don't bother with C function, use SQL function instead.  You could get 
a 50% speedup.

Is this always the case when using SQL instead of the C API to get 
values or only the function 'call' itself? We are thinking to use C 
functions which are optimized for the G5 altivec unit.

regards David

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[PERFORM] speeding up a select with C function?

2004-03-07 Thread David Teran
Hi,

 we need to optimize / speed up a simple select:

explain analyze select
((t0.int_value-t1.int_value)*(t0.int_value-t1.int_value))
from job_property t0, job_property t1
where t0.id_job_profile = 5
and t1.id_job_profile = 6
and t1.id_job_attribute = t0.id_job_attribute
and t1.int_value  t0.int_value;
the result from explain analyze is:

first run:
 
 
--
 Merge Join  (cost=8314.36..8336.21 rows=258 width=8) (actual  
time=226.544..226.890 rows=43 loops=1)
   Merge Cond: (outer.id_job_attribute = inner.id_job_attribute)
   Join Filter: (inner.int_value  outer.int_value)
   -  Sort  (cost=4157.18..4159.75 rows=1026 width=8) (actual  
time=113.781..113.826 rows=232 loops=1)
 Sort Key: t0.id_job_attribute
 -  Index Scan using job_property__id_job_profile__fk_index on  
job_property t0  (cost=0.00..4105.87 rows=1026 width=8) (actual  
time=0.045..113.244 rows=232 loops=1)
   Index Cond: (id_job_profile = 5)
   -  Sort  (cost=4157.18..4159.75 rows=1026 width=8) (actual  
time=112.504..112.544 rows=254 loops=1)
 Sort Key: t1.id_job_attribute
 -  Index Scan using job_property__id_job_profile__fk_index on  
job_property t1  (cost=0.00..4105.87 rows=1026 width=8) (actual  
time=0.067..112.090 rows=254 loops=1)
   Index Cond: (id_job_profile = 6)
 Total runtime: 227.120 ms
(12 rows)

second run:
 
 

 Merge Join  (cost=8314.36..8336.21 rows=258 width=8) (actual  
time=4.323..4.686 rows=43 loops=1)
   Merge Cond: (outer.id_job_attribute = inner.id_job_attribute)
   Join Filter: (inner.int_value  outer.int_value)
   -  Sort  (cost=4157.18..4159.75 rows=1026 width=8) (actual  
time=2.666..2.700 rows=232 loops=1)
 Sort Key: t0.id_job_attribute
 -  Index Scan using job_property__id_job_profile__fk_index on  
job_property t0  (cost=0.00..4105.87 rows=1026 width=8) (actual  
time=0.279..2.354 rows=232 loops=1)
   Index Cond: (id_job_profile = 5)
   -  Sort  (cost=4157.18..4159.75 rows=1026 width=8) (actual  
time=1.440..1.477 rows=254 loops=1)
 Sort Key: t1.id_job_attribute
 -  Index Scan using job_property__id_job_profile__fk_index on  
job_property t1  (cost=0.00..4105.87 rows=1026 width=8) (actual  
time=0.040..1.133 rows=254 loops=1)
   Index Cond: (id_job_profile = 6)
 Total runtime: 4.892 ms
(12 rows)

I have run vacuum analyze before executing the statements. I wonder now  
if there is any chance to speed this up. Could we use a C function to  
access the indexes faster or is there any other chance to speed this  
up?

The Server is a dual G5/2GHZ with 8 GB of RAM and a 3.5 TB fiberchannel  
RAID. The job_property table is about 1 GB large (checked with dbsize)  
and has about 6.800.000 rows.

regards David

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] select max(id) from aTable is very slow

2004-02-16 Thread David Teran
Hi Nick,

Try using:

SELECT id FROM theTable ORDER BY is DESC LIMIT 1;

Using COUNT, MAX, MIN and any aggregate function on the table of that 
size will always result in a sequential scan. There is currently no 
way around it although there are a few work arounds. See the 
following for more information.

http://archives.postgresql.org/pgsql-performance/2004-01/msg00045.php
http://archives.postgresql.org/pgsql-performance/2004-01/msg00054.php
http://archives.postgresql.org/pgsql-performance/2004-01/msg00059.php


thanks, that works fine! I will read the mail archive before asking 
such things again ;-)

cheers David

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[PERFORM] select count(*) from anIntColumn where int_value = 0; is very slow

2004-02-11 Thread David Teran
Hi

we have a table with about 4 million rows. One column has an int value, 
there is a btree index on it. We tried to execute the following 
statement and it is very slow on a dual G5 2GHZ with 4 GB of RAM.

explain analyze select count(*) from job_property where int_value = 0;

Aggregate  (cost=144348.80..144348.80 rows=1 width=0) (actual 
time=13536.852..13536.852 rows=1 loops=1)
  -  Seq Scan on job_property  (cost=0.00..144255.15 rows=37459 
width=0) (actual time=19.422..13511.653 rows=42115 loops=1)
Filter: (int_value = 0)
Total runtime: 13560.862 ms



Is this more or less normal or can we optimize this a little bit? 
FrontBase (which we compare currently) takes 2 seconds first time and 
about 0.2 seconds on second+ queries.

regards David

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] select count(*) from anIntColumn where int_value = 0; is very slow

2004-02-11 Thread David Teran
Hi,

Is your int_value data type int4? If not then use ... from 
job_property
where int_value = '0'
Indexes are used only if datatypes matches.

tried those variations already. Strange enough, after dropping and 
recreating the index everything worked fine.

regards David

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[PERFORM] cache whole data in RAM

2004-02-03 Thread David Teran
Hi,

we are trying to speed up a database which has about 3 GB of data. The 
server has 8 GB RAM and we wonder how we can ensure that the whole DB 
is read into RAM. We hope that this will speed up some queries.

regards David

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] another query optimization question

2004-01-31 Thread David Teran
Hi Tim,

you are right:


Interesting.  I have recollected where we saw this before:
http://archives.postgresql.org/pgsql-hackers/2003-11/msg01528.php
Apparently gettimeofday() has a glitch on some BSD releases.  OS X is
a BSD derivative and it's not so surprising if it has it too.
May I suggest that you try the test program given here:
http://archives.postgresql.org/pgsql-hackers/2003-11/msg01546.php
and file a bug report with Apple if it shows any out-of-order results?
I am fairly certain that I tried that test program when I devised it
on my own OS X machine, and couldn't get a failure.  Maybe it depends
on your hardware (hm, could dual CPUs be the key factor)?


p:~ david$ ./a.out
bogus tv_usec: 1075544305 -615731632, prev 1075544304 349672
out of order tv_sec: 1075544304 349759, prev 1075544305 -615731632
out of order tv_usec: 1075544305 339904, prev 1075544305 339905
bogus tv_usec: 1075544307 -615731811, prev 1075544306 349493
out of order tv_sec: 1075544306 349498, prev 1075544307 -615731811
out of order tv_usec: 1075544307 339442, prev 1075544307 339443
out of order tv_usec: 1075544308 339351, prev 1075544308 339352
This is a part of the output. Whats -very- interesting:

Apple provides a little tool that can enable / disable the l2 cache ... 
one CPU of a dual CPU system on the fly. When i start the testapp with 
two CPU's enabled i get this output here, when i turn off one CPU while 
the app is still running the messages disappear as long as one CPU is 
turned off. Reactivating the CPU again produces new error messages. I 
checked the app on a single G4, no errors and i checked the app on a 
dual G4, -not- G5 and also no error messages.

Do you remember where one can find a patch? Maybe its something one can 
fix because parts of the OS from Apple are 'open source'.

Do you know if this bug makes a system unusable with PostgresSQL?

Regards David

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] another query optimization question

2004-01-31 Thread David Teran
Hi,

I'm not sure ... I thought I ran it on my P4 here in the office and 
saw it
too, albeit not near as frequently ... but, in FreeBSD's case, it is a
design issue ... there are two different functions, once that is 
kinda
fuzzy (but fast), and the other that is designed to be exact, but at a
performance loss ... or was it the same function, but a 'sysctl' 
variable
that changes the state?

Can't remember which, but it is by design on FreeBSD ... and, if we're
talking about Apple, the same most likely applies, as its based on the
same kernel ...
Back of my mind, I *think* it was these sysctl variables:

kern.timecounter.method: 0
kern.timecounter.hardware: i8254
I will try to check this on my system.

But here another hint, maybe more interesting for Apple though: The bug 
does -not- occur if another process uses a lot of CPU time. We encoded 
a quicktime movie into mpeg2 and while this was using about 90% and 
while encoding the vcd i wanted to show the bug to a friend and it did 
not work.

But besides this, is there any chance that we can optimize our initial 
performance problem ;-)

regards David

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


[PERFORM] another query optimization question

2004-01-30 Thread David Teran
Hi,

its me again. As far as we tested postgresql ist fast, very fast  
compared to the other db system we test and are using currently.

 We are now testing some test databases on Postgres. We use one  
function which simply calculates a difference between two values and  
checks if on value is 0, so something like this:

declare
diff integer;
begin
if $1  $2
then
diff := $1 -$2;
return diff * diff;
else
return 0;
end if;
end;
Language for this function is plpgsql

executing a select like this:

select
sum(job_property_difference(t0.int_value, t1.int_value)) as rank
  from
  job_property t0,
  job_property t1
  where
  t0.id_job_profile = 911
  and t0.id_job_attribute = t1.id_job_attribute
  and t1.id_job_profile in (select id_job_profile from unemployed)
  and t1.id_job_profile  911;
results in a query plan result:

QUERY PLAN
 
---
 Aggregate  (cost=70521.28..70521.28 rows=1 width=8)
   -  Merge Join  (cost=66272.11..70158.29 rows=145194 width=8)
 Merge Cond: (outer.id_job_attribute =  
inner.id_job_attribute)
 -  Sort  (cost=31.53..32.44 rows=366 width=8)
   Sort Key: t0.id_job_attribute
   -  Index Scan using  
job_property__id_job_profile__fk_index on job_property t0   
(cost=0.00..15.95 rows=366 width=8)
 Index Cond: (id_job_profile = 911)
 -  Sort  (cost=66240.58..67456.79 rows=486483 width=8)
   Sort Key: t1.id_job_attribute
   -  Hash IN Join  (cost=34.08..20287.32 rows=486483  
width=8)
 Hash Cond: (outer.id_job_profile =  
inner.id_job_profile)
 -  Seq Scan on job_property t1   
(cost=0.00..12597.89 rows=558106 width=12)
   Filter: (id_job_profile  911)
 -  Hash  (cost=31.46..31.46 rows=1046 width=4)
   -  Seq Scan on unemployed   
(cost=0.00..31.46 rows=1046 width=4)
(21 rows)



This takes about 1minute, 45 seconds on a test database with about  
31.882 job_profile and 8.483.005 job_property records. The final  
solution will have about 1.000.000 job_profile records and, well ...  
about 266.074.901  so we wonder what options we have in order to  
improve this select. Should we rewrite the function (and others) in C?  
Turning off seqscans makes it slower which might be because psql is  
hopping between the index and the row values back and forth as a lot of  
rows are involved.

Any hint how to speed up this would be great.

regards David

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PERFORM] another query optimization question

2004-01-30 Thread David Teran
Hi,

On 30.01.2004, at 19:10, Stephan Szabo wrote:

On Fri, 30 Jan 2004, David Teran wrote:

select
sum(job_property_difference(t0.int_value, t1.int_value)) as rank
   from
   job_property t0,
   job_property t1
   where
   t0.id_job_profile = 911
   and t0.id_job_attribute = t1.id_job_attribute
   and t1.id_job_profile in (select id_job_profile from unemployed)
   and t1.id_job_profile  911;
results in a query plan result:
Can we see explain analyze output for the query, it'll give more
information about actual time and row counts than plain explain.


sure, here it is comes. What we need to achieve is: we have different  
job_profiles, each profile has multiple values. For one given profile  
we need the ' sum of the distance of every value in the given profile  
and every other profile'. The result is usually grouped by the profile  
id but to make the query easier i removed this, it does not cost too  
much time and it turned out that this query here uses most of the time.

thanks, David



 
   QUERY PLAN
 
 

 Aggregate  (cost=2689349.81..2689349.81 rows=1 width=8) (actual  
time=100487.423..100487.423 rows=1 loops=1)
   -  Merge Join  (cost=2451266.53..2655338.83 rows=13604393 width=8)  
(actual time=82899.466..-2371037.726 rows=2091599 loops=1)
 Merge Cond: (outer.id_job_attribute =  
inner.id_job_attribute)
 -  Sort  (cost=97.43..100.63 rows=1281 width=8) (actual  
time=3.937..4.031 rows=163 loops=1)
   Sort Key: t0.id_job_attribute
   -  Index Scan using  
job_property__id_job_profile__fk_index on job_property t0   
(cost=0.00..31.31 rows=1281 width=8) (actual time=1.343..3.766 rows=163  
loops=1)
 Index Cond: (id_job_profile = 911)
 -  Sort  (cost=2451169.10..2483246.47 rows=12830947 width=8)  
(actual time=82891.076..-529619.213 rows=4187378 loops=1)
   Sort Key: t1.id_job_attribute
   -  Hash IN Join  (cost=507.32..439065.37 rows=12830947  
width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
 Hash Cond: (outer.id_job_profile =  
inner.id_job_profile)
 -  Seq Scan on job_property t1   
(cost=0.00..246093.84 rows=12830947 width=12) (actual  
time=0.136..19101.796 rows=8482533 loops=1)
   Filter: (id_job_profile  911)
 -  Hash  (cost=467.46..467.46 rows=15946 width=4)  
(actual time=61.313..61.313 rows=0 loops=1)
   -  Seq Scan on unemployed   
(cost=0.00..467.46 rows=15946 width=4) (actual time=0.157..50.842  
rows=15960 loops=1)
 Total runtime: 103769.592 ms



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] another query optimization question

2004-01-30 Thread David Teran
HI Tom.

I got a little distracted by the bizarre actual-time values shown for
some of the query steps:
-  Merge Join  (cost=2451266.53..2655338.83 rows=13604393 
width=8)
(actual time=82899.466..-2371037.726 rows=2091599 loops=1)

  -  Sort  (cost=2451169.10..2483246.47 rows=12830947 
width=8)
(actual time=82891.076..-529619.213 rows=4187378 loops=1)

-  Hash IN Join  (cost=507.32..439065.37 
rows=12830947
width=8) (actual time=61.943..1874640.807 rows=4187378 loops=1)
The hash-join total time is obviously wrong seeing that the total
runtime is only about 10 msec, and the negative values for the 
other
two are even more obviously wrong.

I recall that we saw similar symptoms once before, and I thought we'd
fixed it, but I didn't find any relevant mentions in the CVS logs.
What version are you running, exactly?  Could you repeat the EXPLAIN
with client_min_messages set to 'debug2', and see if you see any
messages about InstrStartTimer or InstrStopNode?
7.4.1, build from sourcecode. Running on MacOS X Server 10.3.2, dual G5 
with 3.5 GB RAM

I have set client_min_messages in postgresql.conf to debug2 but i see 
nothing. Is this going to the normal logfile? Must i activate anything 
else?

regards David

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] optimizing Postgres queries

2004-01-08 Thread David Teran
... wow:

executing a batch file with about 4250 selects, including lots of joins 
other things PostgreSQL 7.4 is about 2 times faster than FrontBase 
3.6.27. OK, we will start to make larger tests but this is quite 
interesting already: we did not optimize a lot, just invoked VACUUM 
ANALYZE and then the selects ;-)

Thanks to all who answered to this thread.

cheers David

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] optimizing Postgres queries

2004-01-05 Thread David Teran
Hi Shridhar,

Are you sure you are using correct data types on indexes?

Did not know about this...

e.g. if field1 is an int2 field, then following query would not use an 
index.

our fk have the type bigint, when i try one simple select like this:

explain analyze SELECT --columns-- FROM KEY_VALUE_META_DATA t0 WHERE 
t0.ID_FOREIGN_TABLE = 21110;

i see that no index is being used whereas when i use

explain analyze SELECT --columns-- FROM KEY_VALUE_META_DATA t0 WHERE 
t0.ID_FOREIGN_TABLE = 21110::bigint;

an index is used. Very fine, the performance is about 10 to 100 times 
faster for the single select.

I am using WebObjects with JDBC. I will now create a DB with integer 
instead of bigint and see how this performs.

regards David

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] optimizing Postgres queries

2004-01-05 Thread David Teran
Hi,

The performance will likely to be the same. Its just that integer 
happens to
be default integer type and hence it does not need an explicit 
typecast. ( I
don't remember exactly which integer is default but it is either of 
int2,int4
and int8...:-))

The docs say int4 is much faster than int8, but i will check this.

The performance diffference is likely due to use of index, which is in 
turn
due to typecasting. If you need bigint, you should use them. Just 
remember to
typecast whenever required.
This is my bigger problem: i am using EOF (OR mapping tool) which frees 
me more or less form writing a lot of SQL. If i need to typecast to use 
an index then i have to see how to do this with this framework.

Regards David

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] optimizing Postgres queries

2004-01-05 Thread David Teran
Hi Tom,

It's worth pointing out that this problem is fixed (at long last) in
CVS tip.  Ypu probably shouldn't expend large amounts of effort on
working around a problem that will go away in 7.5.
We have now changed the definition to integer, this will work for some 
time. We are currently evaluating and have several production database 
we might switch in some time.

What we found out now is that a query with a single 'where' works fine, 
the query planer uses the index but when we have 'two' where clauses it 
does not use the index anymore:

EXPLAIN ANALYZE SELECT columns...  FROM KEY_VALUE_META_DATA t0 WHERE 
(t0.ID_VALUE = 14542); performs fine, less than one millisecond.

EXPLAIN ANALYZE SELECT columns...  FROM KEY_VALUE_META_DATA t0 WHERE 
(t0.ID_VALUE = 14542 OR t0.ID_VALUE = 14550); performs bad: about 
235 milliseconds.

I tried to change the second one to use IN but this did not help at 
all. Am i doing something wrong? I have an index defined like this:

CREATE INDEX key_value_meta_data__id_value__fk_index ON 
KEY_VALUE_META_DATA USING btree (ID_VALUE);

Regards David

---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] optimizing Postgres queries

2004-01-05 Thread David Teran
Hi Tom,


David Teran [EMAIL PROTECTED] writes:
What we found out now is that a query with a single 'where' works  
fine,
the query planer uses the index but when we have 'two' where clauses  
it
does not use the index anymore:

EXPLAIN ANALYZE SELECT columns...  FROM KEY_VALUE_META_DATA t0 WHERE
(t0.ID_VALUE = 14542); performs fine, less than one millisecond.

EXPLAIN ANALYZE SELECT columns...  FROM KEY_VALUE_META_DATA t0 WHERE
(t0.ID_VALUE = 14542 OR t0.ID_VALUE = 14550); performs bad: about
235 milliseconds.
Please, when you ask this sort of question, show the EXPLAIN ANALYZE
output.  It is not a virtue to provide minimal information and see if
anyone can guess what's happening.
Sorry for that, i thought this is such a trivial question that the  
answer is easy.

explain result from first query:

Index Scan using key_value_meta_data__id_value__fk_index on KEY_VALUE_M 
ETA_DATA t0  (cost=0.00..1585.52 rows=467 width=1068) (actual time=0.42 
4..0.493 rows=13 loops=1)

  Index Cond: (ID_VALUE = 21094)

Total runtime: 0.608 ms



explain result from second query:

Seq Scan on KEY_VALUE_META_DATA t0  (cost=0.00..2671.16 rows=931 width 
=1068) (actual time=122.669..172.179 rows=25 loops=1)

  Filter: ((ID_VALUE = 21094) OR (ID_VALUE = 21103))

Total runtime: 172.354 ms



I found out that its possible to disable seq scans with set  
enable_seqscan to off; then the second query result looks like this:

Index Scan using key_value_meta_data__id_value__fk_index, key_value_meta 
_data__id_value__fk_index on KEY_VALUE_META_DATA t0  (cost=0.00..3173. 
35 rows=931 width=1068) (actual time=0.116..0.578 rows=25 loops=1)

  Index Cond: ((ID_VALUE = 21094) OR (ID_VALUE = 21103))

Total runtime: 0.716 ms

But i read in the docs that its not OK to turn this off by default. I  
really wonder if this is my fault or not, from my point of view this is  
such a simple select that the query plan should not result in a table  
scan.

Regards David

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly