Re: [PERFORM] Checking = with timestamp field is slow
On Nov 5, 2004, at 4:16 PM, Antony Paul wrote:

> where today::date = '2004-11-05';
>
> This is the only condition in the query. There is a btree index on the
> column today. Is there any way to optimise it.

I'm sure others out there have better ideas, but you might want to try

    where current_date = date '2004-11-05'

Might not make a difference at all, but perhaps PostgreSQL is coercing both values to timestamp or some other type, as you're only providing a string to compare to a date.

My 1 cent.

Michael Glaesemann
grzm myrealbox com

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html
Re: [PERFORM] Checking = with timestamp field is slow
On Nov 5, 2004, at 5:14 PM, Michael Glaesemann wrote:

> On Nov 5, 2004, at 4:16 PM, Antony Paul wrote:
>> where today::date = '2004-11-05';
>>
>> This is the only condition in the query. There is a btree index on the
>> column today. Is there any way to optimise it.
>
> I'm sure others out there have better ideas, but you might want to try
>
> where current_date = date '2004-11-05'

Ach! Just re-read that. today is one of your columns! Try

    where today::date = date '2004-11-05'
Re: [PERFORM] Checking = with timestamp field is slow
On Fri, Nov 05, 2004 at 12:46:20PM +0530, Antony Paul wrote:
> I have a table which has more than 200,000 records. I need to get
> the records which match like this:
>
> where today::date = '2004-11-05';
>
> This is the only condition in the query. There is a btree index on the
> column today. Is there any way to optimise it.

Is the today column a TIMESTAMP, as the subject implies? If so, your queries probably aren't using the index because the cast changes the type to something that's not indexed. Your queries should speed up if you create an index on DATE(today):

    CREATE INDEX foo_date_today_idx ON foo (DATE(today));

After creating the new index, use WHERE DATE(today) = '2004-11-05' in your queries. EXPLAIN ANALYZE should show that the index is being used.

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Re: [PERFORM] Restricting Postgres
To what extent would your problems be solved by having a 2nd server, a replication system (such as Slony-I, but there are others), and some sort of load-balancer in front of it? The load-balancing could be as simple as a round-robin DNS server, perhaps.

Then when you need to do maintenance, such as a VACUUM FULL, you can temporarily take one server out of the load-balancer (I hope), do the maintenance, and then do the same with the other. I don't know what that does to replication, but I would venture that replication systems should be designed to handle a node going offline. Load balancing could also help to protect against server overload and one server toppling over.

Of course, I don't know to what extent having another piece of hardware is an option for you.

cheers,
--Tim

-----Original Message-----
From: Martin Foster
Sent: Friday, November 05, 2004 3:50 AM
Subject: Re: [PERFORM] Restricting Postgres

[...] Now is there an administrative command in PostgreSQL that will cause it to move into some sort of maintenance mode? For me that could be exceedingly useful, as it would still allow for an admin connection to be made to run a VACUUM FULL and such.

Martin Foster
Creator/Designer Ethereal Realms
[EMAIL PROTECTED]
Re: [PERFORM] Checking = with timestamp field is slow
On Fri, 2004-11-05 at 12:46 +0530, Antony Paul wrote:
> Hi all,
>    I have a table which has more than 200,000 records. I need to get
> the records which match like this:
>
> where today::date = '2004-11-05';
>
> This is the only condition in the query. There is a btree index on the
> column today.
> Is there any way to optimise it.

Hi Antony,

I take it your field is called "today" (seems dodgy, but these things happen...). Anyway, have you tried indexing on the truncated value?

    create index xyz_date on xyz( (today::date) );
    analyze xyz;

That's one way. It depends on how many of those 200,000 rows are on each date too, as to whether it will get used by your larger query.

Regards,
Andrew.

-
Andrew @ Catalyst .Net .NZ Ltd, PO Box 11-053, Manners St, Wellington
WEB: http://catalyst.net.nz/  PHYS: Level 2, 150-154 Willis St
DDI: +64(4)803-2201  MOB: +64(272)DEBIAN  OFFICE: +64(4)499-2267
When in doubt, tell the truth. -- Mark Twain
Re: [PERFORM] Checking = with timestamp field is slow
On Fri, Nov 05, 2004 at 05:32:49PM +0900, Michael Glaesemann wrote:
> Ach! just re-read that. today is one of your columns! Try
>
> where today::date = date '2004-11-05'

Casting '2004-11-05' to DATE shouldn't be necessary, at least not in 7.4.5:

    test=> EXPLAIN ANALYZE SELECT * FROM foo WHERE today::DATE = '2004-11-05';
                                          QUERY PLAN
    ----------------------------------------------------------------------------------------------
     Seq Scan on foo  (cost=0.00..268.00 rows=50 width=16) (actual time=0.592..50.854 rows=1 loops=1)
       Filter: ((today)::date = '2004-11-05'::date)

As you can see, '2004-11-05' is already cast to DATE. The sequential scan is happening because there's no index on today::DATE.

    test=> CREATE INDEX foo_date_idx ON foo (DATE(today));
    CREATE INDEX
    test=> EXPLAIN ANALYZE SELECT * FROM foo WHERE DATE(today) = '2004-11-05';
                                          QUERY PLAN
    ----------------------------------------------------------------------------------------------
     Index Scan using foo_date_idx on foo  (cost=0.00..167.83 rows=50 width=16) (actual time=0.051..0.061 rows=1 loops=1)
       Index Cond: (date(today) = '2004-11-05'::date)

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Re: [PERFORM] Checking = with timestamp field is slow
After a long battle with technology, [EMAIL PROTECTED] (Antony Paul), an earthling, wrote:
> Hi all,
>    I have a table which has more than 200,000 records. I need to get
> the records which match like this:
>
> where today::date = '2004-11-05';
>
> This is the only condition in the query. There is a btree index on the
> column today.
> Is there any way to optimise it.

How about changing the criterion to:

    where today between '2004-11-05' and '2004-11-06';

That ought to make use of the index on "today".

-- 
"cbbrowne","@","ntlug.org"
http://www.ntlug.org/~cbbrowne/sgml.html
"People need to quit pretending they can invent THE interface and walk away from it, like some Deist fantasy." -- Michael Peck
[PERFORM] Strange (?) Index behavior?
Hey people, long while since I posted here, but I'm having an index issue that looks on the surface to be a little strange.

I have a text field that I'm trying to query on in a table with millions of rows. Stupid, I know, but a fairly common stupid thing to try to do. For some reason it's a requirement that partial wildcard searches are done on this field, such as "SELECT ... WHERE field LIKE 'A%'".

I thought an interesting way to do this would be to simply create partial indexes for each letter on that field, and it works when the query matches the WHERE clause in the index exactly like above. The problem is thus. Say I have an index:

    CREATE INDEX column_idx_a ON table (column) WHERE column LIKE 'A%'

It seems to me that a query saying "SELECT column FROM table WHERE column LIKE 'AA%';" should be just as fast, or very close to, the first case up above. However, explain tells me that this query is not using the index above, which is what's not making sense to me.

Does the planner not realize that 'AA%' will always fall between 'A%' and 'B', and thus that using the index would be the best way to go, or am I missing something else that's preventing this from working?
Re: [PERFORM] Strange (?) Index behavior?
> For some reason it's a requirement that partial wildcard
> searches are done on this field, such as "SELECT ... WHERE
> field LIKE 'A%'"
>
> I thought an interesting way to do this would be to simply
> create partial indexes for each letter on that field, and it
> works when the query matches the WHERE clause in the index
> exactly like above. The problem is thus:

I thought PG could use an ordinary index for 'like' conditions with just a terminating '%'?

My other thought is that like 'A%' should grab about 1/26th of the table anyway (if the initial character distribution is random), and so a sequential scan might be the best plan anyway...

M
Re: [PERFORM] Strange (?) Index behavior?
> It seems to me that a query saying "SELECT column FROM table WHERE
> column LIKE 'AA%';" should be just as fast or very close to the first
> case up above. However, explain tells me that this query is not using
> the index above, which is what's not making sense to me.

It looks for an exact expression match, and doesn't know about values which are equal.

You can provide both clauses:

    WHERE column LIKE 'A%' AND column LIKE 'AA%';
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 05 Nov 2004 10:07:38 -0500, Rod Taylor <[EMAIL PROTECTED]> wrote:
> It looks for an exact expression match, and doesn't know about values
> which are equal.
>
> You can provide both clauses.
>
> WHERE column LIKE 'A%' and column LIKE 'AA%';

I see. That's not really optimal either, however, as you can probably see already: adding AB, AC, AD...AZ is likely to be pretty bogus and at the least is time consuming.

Matt Clark was right that it will use a standard index, which is in fact what it's doing right now in the "SELECT column WHERE column LIKE 'AA%';" case. However, as I said, the table has millions of rows -- currently about 76 million -- so even a full index scan is fairly slow. The machine isn't all that hot performance-wise either: a simple dual 800 P3 with a single 47GB Seagate SCSI. The only redeeming factor is that it has 2GB of memory, which I'm trying to make the most of with these indexes.

So assuming this partial index situation isn't going to change (it seems like it would be a fairly simple fix for someone that knows the pg code, however), I'm wondering if a subselect may speed things up any, so I'm going to investigate that next. Perhaps:

    SELECT column FROM
      (SELECT column FROM table WHERE column LIKE 'A%') AS sq
    WHERE column LIKE 'AA%';

The query planner thinks this will be pretty fast indeed, and does use the index I am after. OS is, of course, FreeBSD.
[PERFORM] postgresql amd-64
Does anybody have any experiences with postgresql 7.4+ running on amd-64 in 64 bit mode? Specifically, does it run quicker and if so do the performance benefits justify the extra headaches running 64 bit linux?

Right now I'm building a dual Opteron 246 with 4 gig ddr400.

Merlin
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 5 Nov 2004 10:32:43 -0500, Allen Landsidel <[EMAIL PROTECTED]> wrote:
> On Fri, 05 Nov 2004 10:07:38 -0500, Rod Taylor <[EMAIL PROTECTED]> wrote:
>> You can provide both clauses.
>>
>> WHERE column LIKE 'A%' and column LIKE 'AA%';
>
> I see. That's not really optimal either however as you can probably
> see already.. adding AB, AC, AD...AZ is likely to be pretty bogus and
> at the least is time consuming.

I see now that you mean to add that to the SELECT clause and not the index; my mistake.

> Perhaps.. SELECT column FROM (SELECT column FROM table WHERE column
> LIKE 'A%') AS sq WHERE column LIKE 'AA%';
>
> The query planner thinks this will be pretty fast indeed, and does use
> the index I am after.

This was indeed pretty fast: about 7 seconds, as was modifying the WHERE as suggested above.

-Allen
Re: [PERFORM] Strange (?) Index behavior?
On Fri, Nov 05, 2004 at 09:39:16 -0500, Allen Landsidel <[EMAIL PROTECTED]> wrote:
> For some reason it's a requirement that partial wildcard searches are
> done on this field, such as "SELECT ... WHERE field LIKE 'A%'"
>
> I thought an interesting way to do this would be to simply create
> partial indexes for each letter on that field, and it works when the
> query matches the WHERE clause in the index exactly like above. The
> problem is thus:

That may not help much except for prefixes that have a below-average number of occurrences. If you are going to be selecting 1/26 of the records, you are probably going to do about as well with a sequential scan as with an index scan.

Just having a normal index on the column will work if the database locale is C. In 7.4 you can create an index usable by LIKE even if the database locale isn't C, but I don't remember the exact syntax. You will be better off having just one index rather than 26 partial indexes.
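[The operator-class syntax Bruno can't recall is the pattern-ops form added in 7.4. A minimal sketch, using a made-up table `foo` with a text column `field`:

    -- An index built with text_pattern_ops can serve left-anchored LIKE
    -- queries even when the database locale is not C (7.4+; table and
    -- column names here are hypothetical):
    CREATE INDEX foo_field_pattern_idx ON foo (field text_pattern_ops);

    -- Left-anchored patterns such as this can then use the index:
    SELECT * FROM foo WHERE field LIKE 'AB%';

With a C locale a plain btree index already handles this, so the special operator class is only needed for other locales.]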
Re: [PERFORM] Checking = with timestamp field is slow
On Fri, Nov 05, 2004 at 07:47:54AM -0500, Christopher Browne wrote:
> How about changing the criterion to:
>
> where today between '2004-11-05' and '2004-11-06';
>
> That ought to make use of the index on "today".

Yes it should, but it'll also return records that have a "today" value of '2004-11-06 00:00:00', since "x BETWEEN y AND z" is equivalent to "x >= y AND x <= z". Try this instead:

    WHERE today >= '2004-11-05' AND today < '2004-11-06'

In another post I suggested creating an index on DATE(today). The above query should make that unnecessary, although in 8.0 such an index would be used in queries like this:

    WHERE today IN ('2004-09-01', '2004-10-01', '2004-11-01');

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 5 Nov 2004 11:51:59 -0600, Bruno Wolff III <[EMAIL PROTECTED]> wrote:
> That may not help much except for prefixes that have a below average
> number of occurences. If you are going to be select 1/26 of the records,
> you are probably going to do about as well with a sequential scan as an
> index scan.

The thing isn't that I want 1/26th of the records, since the distribution is not exactly equal among different letters. More importantly, there are about 76 million rows currently, and for some reason I am being told by the people with the pointy hair that a query like "select foo,bar from table where foo like 'abc%';" is not an uncommon type of query to run. I don't know why it's common, and to be honest, I'm afraid to ask. ;)

With that many rows and a normal index on the field, postgres figures the best option for say "I%" is not an index scan, but a sequential scan on the table with a filter -- quite obviously this is slow as heck, and yes, I've run analyze several times and in fact have the vacuum analyze automated. With the partial index, the index scan is used and the cost drops from 0..2million to 0..9000 -- a vast improvement. So I'm going to go with the partial indexes, and have a total of 36 of them -- A-Z and 0-9.

> Just having a normal index on the column will work if the database locale
> is C. In 7.4 you can create an index usable by LIKE even in the database
> locale isn't C, but I don't remember the exact syntax. You will be better
> off having just one index rather than 26 partial indexes.

I haven't written a line of C in years, and it was never my strong suit, so despite all my years doing development and sysadminning, the locale stuff is still something of a mystery to me. The locale though is C, the default, and we will for the time being at least be storing only ascii strings -- no unicode, other character sets, or anything funky like that.

-Allen
[PERFORM] What is the difference between these?
To me, these three queries seem identical... why doesn't the first one (simplest to understand and write) go the same speed as the third one?

All I'm trying to do is get statistics for one day (in this case, today), summarized. The table has ~25M rows. I'm using postgres 7.3.? on rh linux 7.3. (Note that I think the difference between the first two might just be related to the data being in memory for the second query.)

    EXPLAIN ANALYZE select count(distinct sessionid) from usage_access
    where atime > date_trunc('day', now());
                                 QUERY PLAN
    ----------------------------------------------------------------------
     Aggregate  (cost=933439.69..933439.69 rows=1 width=4) (actual time=580350.43..580350.43 rows=1 loops=1)
       ->  Seq Scan on usage_access  (cost=0.00..912400.11 rows=8415831 width=4) (actual time=580164.48..580342.21 rows=2964 loops=1)
             Filter: (atime > date_trunc('day'::text, now()))
     Total runtime: 580350.65 msec
    (4 rows)

    EXPLAIN ANALYZE select count(distinct sessionid) from
    (select * from usage_access where atime > date_trunc('day', now())) as temp;
                                 QUERY PLAN
    ----------------------------------------------------------------------
     Aggregate  (cost=933439.69..933439.69 rows=1 width=4) (actual time=348012.85..348012.85 rows=1 loops=1)
       ->  Seq Scan on usage_access  (cost=0.00..912400.11 rows=8415831 width=4) (actual time=347960.53..348004.68 rows=2964 loops=1)
             Filter: (atime > date_trunc('day'::text, now()))
     Total runtime: 348013.10 msec
    (4 rows)

    EXPLAIN ANALYZE select count(distinct sessionid) from usage_access
    where atime between date_trunc('day', now())
                    and date_trunc('day', now()) + '1 day'::interval;
                                 QUERY PLAN
    ----------------------------------------------------------------------
     Aggregate  (cost=89324.98..89324.98 rows=1 width=4) (actual time=27.84..27.84 rows=1 loops=1)
       ->  Index Scan using usage_access_atime on usage_access  (cost=0.00..89009.39 rows=126237 width=4) (actual time=0.51..20.37 rows=2964 loops=1)
             Index Cond: ((atime >= date_trunc('day'::text, now())) AND (atime <= (date_trunc('day'::text, now()) + '1 day'::interval)))
     Total runtime: 28.11 msec
    (4 rows)

-- 
Matthew Nuzum      | Makers of "Elite Content Management System"
www.followers.net  | View samples of Elite CMS in action
[EMAIL PROTECTED]  | http://www.followers.net/portfolio/
Re: [PERFORM] postgresql amd-64
I have two dual Opteron 248's with 4G of RAM each and 6x36G 15k rpm Ultra 320 SCSI disks in hardware RAID 5, and they are by far the fastest machines I've ever used. As far as this "headache" of using 64 bit Linux goes, I've experienced no such thing. I'm using gentoo on both machines, which are dedicated to postgres 7.4 and replicated with slony. They're both quite fast and reliable. One machine even runs a secondary instance of pg -- pg 8 beta4 in this case -- for development, which also runs quite well.

Daniel

Merlin Moncure wrote:
> Does anybody have any experiences with postgresql 7.4+ running on amd-64
> in 64 bit mode? Specifically, does it run quicker and if so do the
> performance benefits justify the extra headaches running 64 bit linux?
>
> Right now I'm building a dual Opteron 246 with 4 gig ddr400.
>
> Merlin

-- 
Daniel Ceregatti - Programmer
Omnis Network, LLC

The forest is safe because a lion lives therein and the lion is safe because it lives in a forest. Likewise the friendship of persons rests on mutual help. -- Laukikanyay.
Re: [PERFORM] vacuum analyze slows sql query
Just wanted to know if there were any insights after looking at the requested 'explain analyze select ...'?

Thanks,
--patrick
Re: [PERFORM] Strange (?) Index behavior?
> With that many rows, and a normal index on the field,
> postgres figures the best option for say "I%" is not an index
> scan, but a sequential scan on the table, with a filter --
> quite obviously this is slow as heck, and yes, I've run
> analyze several times and in fact have the vacuum analyze automated.

Ah, so "like 'I%'" uses a very slow seq scan, but "like 'ABC%'" uses an ordinary index OK? If so then...

The planner would usually assume (from what Tom usually says) that 1/26 selectivity isn't worth doing an index scan for, but in your case it's wrong (maybe because the rows are very big?).

You may be able to get the planner to go for an index scan on "like 'I%'" by tweaking the foo_cost variables in postgresql.conf.

Or you could have the app rewrite "like 'I%'" to "like 'IA%' or like 'IB%' ...", or do that as a stored proc.

> With the partial index the index scan is used and the cost
> drops from 0..2million to 0..9000 -- a vast improvement.

So there are really only 9000 rows out of 76 million starting with 'I'? How about combining some techniques -- you could create an index on the first two chars of the field (should be selective enough to give an index scan), select from that, and select the actual data with the like clause:

    CREATE INDEX idx_firstletters ON table (substr(field, 1, 2));
    CREATE INDEX idx_all ON table (field);

    SELECT field FROM
      (SELECT field FROM table WHERE substr(field, 1, 2) = 'DE') AS approx
    WHERE field LIKE 'DE%';

Any good?
Re: [PERFORM] What is the difference between these?
Matt Nuzum wrote:
> To me, these three queries seem identical... why doesn't the first one
> (simplest to understand and write) go the same speed as the third one?
>
> All I'm trying to do is get statistics for one day (in this case,
> today), summarized. Table has ~25M rows. I'm using postgres 7.3.? on
> rh linux 7.3 (note that I think the difference between the first two
> might just be related to the data being in memory for the second
> query).

If you look at the explain output, you will notice that only the 3rd query is using an Index Scan, whereas the 1st and 2nd are doing a sequential scan over the entire table of 25M rows.

My guess is that the problem is related to outdated statistics on the atime column. If you notice, the 1st and 2nd queries estimate 8.4M rows returned, at which point a seq scan is the right choice, but the 3rd query using the between statement only estimates 127k rows, which makes the Index a better option. All of these queries only return 2964 rows, so it looks like your stats are out of date. Try running an analyze command right before doing any of these queries and see what happens.
Re: [PERFORM] Strange (?) Index behavior?
Allen Landsidel <[EMAIL PROTECTED]> writes:
> With that many rows, and a normal index on the field, postgres figures
> the best option for say "I%" is not an index scan, but a sequential
> scan on the table, with a filter -- quite obviously this is slow as
> heck, and yes, I've run analyze several times and in fact have the
> vacuum analyze automated.
> With the partial index the index scan is used and the cost drops from
> 0..2million to 0..9000 -- a vast improvement.

Hmm. This suggests to me that you're using a non-C locale and so a plain index *can't* be used for a LIKE query. Can you force it to use an indexscan by setting enable_seqscan = false? If not then you've got a locale problem. As someone else pointed out, this can be worked around by creating an index with the right operator class.

            regards, tom lane
Re: [PERFORM] What is the difference between these?
Matt Nuzum <[EMAIL PROTECTED]> writes:
> To me, these three queries seem identical... why doesn't the first one
> (simplest to understand and write) go the same speed as the third one?

This is the standard problem that the planner has to guess about the selectivity of inequalities involving non-constants (like now()). The guesses are set up so that a one-sided inequality will use a seqscan while a range constraint will use an indexscan. See the pgsql-performance archives for other ways of persuading it that an indexscan is a good idea.

            regards, tom lane
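[One way to hand the planner the range constraint Tom describes, sketched against the `usage_access` table from the question (untested against the poster's schema):

    -- A half-open range: the planner estimates bounded ranges far better
    -- than a one-sided "atime > <non-constant>", and this form also avoids
    -- picking up rows stamped exactly at midnight of the following day:
    SELECT count(DISTINCT sessionid)
    FROM usage_access
    WHERE atime >= date_trunc('day', now())
      AND atime <  date_trunc('day', now()) + '1 day'::interval;
]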
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 05 Nov 2004 14:57:40 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> Hmm. This suggests to me that you're using a non-C locale and so a
> plain index *can't* be used for a LIKE query. Can you force it to use
> an indexscan by setting enable_seqscan = false? If not then you've got
> a locale problem. As someone else pointed out, this can be worked
> around by creating an index with the right operator class.

Tom, disabling seqscan does cause it to use the index. With seqscan enabled however, "AB%" will use the index, but "A%" will not.

The estimated cost for the query is much higher without the partial indexes than it is with them, and the actual runtime of the query is definitely longer without the partial indexes.

The locale is set in the postgresql.conf file as per default, with:

    # These settings are initialized by initdb -- they may be changed
    lc_messages = 'C'   # locale for system error message strings
    lc_monetary = 'C'   # locale for monetary formatting
    lc_numeric = 'C'    # locale for number formatting
    lc_time = 'C'       # locale for time formatting

-Allen
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 5 Nov 2004 18:34:23 -0000, Matt Clark <[EMAIL PROTECTED]> wrote:
> Ah, so "like 'I%'" uses a very slow seq scan, but "like 'ABC%'" uses an
> ordinary index OK? If so then...

That is correct.

> The planner would usually assume (from what Tom usually says) that 1/26
> selectivity isn't worth doing an index scan for, but in your case it's wrong
> (maybe because the rows are very big?)

The rows aren't big; it's a text field, a few ints, and a few timestamps. That's all. The text field is the one we're querying on here, and lengthwise it's typically not over 32 chars.

> You may be able to get the planner to go for an index scan on "like 'I%'" by
> tweaking the foo_cost variables in postgresql.conf

That's true, but I'd rather not; there are times when the seqscan will have a faster net result (for other queries) and I'd rather not have them suffer.

> Or you could have the app rewrite "like 'I%'" to "like 'IA%' or like 'IB%'
> ... ", or do that as a stored proc.

Holy cow. Yeah, that seems a little outrageous. It would be cleaner looking in "\d table" than having all these indexes, at the cost of having one very ugly query.

> So there are really only 9000 rows out of 76 million starting with 'I'? How
> about combining some techniques - you could create an index on the first two
> chars of the field (should be selective enough to give an index scan),
> select from that, and select the actual data with the like clause.

I was talking about the cost, not the number of rows. About 74,000 rows are returned, but the query only takes about 8 seconds to run -- with the partial index in place.

> CREATE INDEX idx_firstletters ON table (substr(field, 1, 2));
> CREATE INDEX idx_all ON table (field);
> SELECT field FROM (SELECT field FROM table WHERE substr(field, 1, 2) = 'DE')
> AS approx WHERE field LIKE 'DE%';

That looks like a pretty slick way to create an index; I didn't know there was such a way to do it. But it appears that this will not work with queries where the WHERE clause wants to find substrings longer than 2 characters.

I will give it a try and see how it goes, though I think I'm fairly "settled" on creating all the other indexes, unless there is some specific reason I shouldn't -- they are used in all cases where the substring is >= 1 character, so long as I make sure the first where clause (or inner select in a subquery) is the most ambiguous from an index standpoint.

Going back to the initial problem: having only one large, complete index on the table (no partial indexes), the query "SELECT field FROM table WHERE field LIKE 'A%';" does not use the index. The query "SELECT field FROM table WHERE field LIKE 'AB%';" however does use the single large index if it exists.

Adding the partial index "CREATE INDEX idx_table_substrfield_A ON table (field) WHERE field LIKE 'A%';" causes all queries with substrings of any length to do index scans, provided I issue the query as:

    SELECT field FROM table WHERE field LIKE 'A%' AND field LIKE 'AB%';
    -- or even --
    SELECT field FROM table WHERE field LIKE 'A%';

The latter query, without the partial index described, does a sequential scan on the table itself instead of an index scan.

-Allen
[PERFORM] Question regarding the file system
Hi guys,

I have been given a dual PIII with 768MB RAM and I am going to install PostgreSQL on it, for data warehousing reasons. I have also been given four Ultra160 SCSI disks (36GB each) with a RAID controller (Adaptec 2100). I am going to use a RAID5 architecture (this gives me approximately 103 GB of data) and install Debian Linux on it: this machine will be dedicated exclusively to PostgreSQL.

I was wondering which file system you would suggest: ext3 or reiserfs? Also, I was thinking of using the 2.6.x kernel, which offers faster thread support: will PostgreSQL gain anything from it, or should I stick with 2.4.x?

Thank you very much,
-Gabriele

--
Gabriele Bartolini: Web Programmer, ht://Dig & IWA/HWG Member, ht://Check maintainer
Current Location: Prato, Toscana, Italia
[EMAIL PROTECTED] | http://www.prato.linux.it/~gbartolini | ICQ#129221447
> "Leave every hope, ye who enter!", Dante Alighieri, Divine Comedy, The Inferno
Re: [PERFORM] Strange (?) Index behavior?
Allen Landsidel <[EMAIL PROTECTED]> writes:
> With seqscan enabled however, "AB%" will use the index, but "A%" will not.

> The estimated cost for the query is much higher without the partial
> indexes than it is with them, and the actual runtime of the query is
> definitely longer without the partial indexes.

OK. This suggests that the planner is drastically misestimating the selectivity of the 'A%' clause, which seems odd to me since in principle it could get that fairly well from the ANALYZE histogram. But it could well be that you need to increase the resolution of the histogram --- see ALTER TABLE SET STATISTICS.

Did you ever show us EXPLAIN ANALYZE results for this query?

regards, tom lane
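Tom's suggestion, written out against the `test`/`test_name` names that appear later in the thread. The target value here is only illustrative (the 8.0-era default was 10); how high it needs to go depends on the data distribution:

```sql
-- Raise the statistics target (histogram resolution) for the column
-- the planner is misestimating, then re-gather statistics.
ALTER TABLE test ALTER COLUMN test_name SET STATISTICS 1000;
ANALYZE test;
```

A larger target makes ANALYZE sample more rows and keep more histogram buckets and most-common values, at the cost of slower ANALYZE and planning.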
Re: [PERFORM] postgresql amd-64
> I have two dual opteron 248's with 4g of ram each, 6x36G 15k rpm ultra
> 320 scsi disks in hardware raid 5, and they are by far the fastest
> machines I've used. As far as this "headache" of using 64 bit Linux,
> I've experienced no such thing. I'm using gentoo on both machines,
> which are dedicated for postgres 7.4 and replicated with slony. They're
> both quite fast and reliable. One machine even runs a secondary
> instance of pg, pg 8 beta4 in this case, for development, which also
> runs quite well.

Good, I'll give it a shot and see what I come up with... thx.

Merlin
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 05 Nov 2004 16:08:56 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> Allen Landsidel <[EMAIL PROTECTED]> writes:
> > With seqscan enabled however, "AB%" will use the index, but "A%" will not.
>
> > The estimated cost for the query is much higher without the partial
> > indexes than it is with them, and the actual runtime of the query is
> > definitely longer without the partial indexes.
>
> OK. This suggests that the planner is drastically misestimating
> the selectivity of the 'A%' clause, which seems odd to me since in
> principle it could get that fairly well from the ANALYZE histogram.
> But it could well be that you need to increase the resolution of the
> histogram --- see ALTER TABLE SET STATISTICS.

I will look into this.

> Did you ever show us EXPLAIN ANALYZE results for this query?

No, I didn't. I am running it now without the partial index to give you the results, but it (the 'A%' problem query) has been running pretty much since I got this message (an hour ago) and is still not finished. The EXPLAIN results without the ANALYZE will have to suffice until it's done; then I can re-add the index and run it again, so you have both to compare.
First, two queries run where both the main index and the 'A%' partial index exist:

-- QUERY 1
search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan using test_name_idx_a on "test"  (cost=0.00..8605.88 rows=391208 width=20)
   Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text))
   Filter: (test_name ~~ 'A%'::text)
(3 rows)

Time: 16.507 ms

-- QUERY 2
search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%' AND test_name LIKE 'AB%';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan using test_name_idx_a on "test"  (cost=0.00..113.79 rows=28 width=20)
   Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text) AND (test_name >= 'AB'::text) AND (test_name < 'AC'::text))
   Filter: ((test_name ~~ 'A%'::text) AND (test_name ~~ 'AB%'::text))
(3 rows)

Time: 3.197 ms

Ok, now the same two queries after a DROP INDEX test_name_idx_a;

search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'A%';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan using test_name_unique on "test"  (cost=0.00..1568918.66 rows=391208 width=20)
   Index Cond: ((test_name >= 'A'::text) AND (test_name < 'B'::text))
   Filter: (test_name ~~ 'A%'::text)
(3 rows)

Time: 2.470 ms

search=# explain
search-# SELECT test_name FROM test WHERE test_name LIKE 'AB%';
                                     QUERY PLAN
------------------------------------------------------------------------------------
 Index Scan using test_name_unique on "test"  (cost=0.00..20379.49 rows=5081 width=20)
   Index Cond: ((test_name >= 'AB'::text) AND (test_name < 'AC'::text))
   Filter: (test_name ~~ 'AB%'::text)
(3 rows)

Time: 2.489 ms

Copying just the costs, you can see the vast difference:

 Index Scan using test_name_unique on "test"  (cost=0.00..1568918.66 rows=391208 width=20)
 Index Scan using test_name_unique on "test"  (cost=0.00..20379.49 rows=5081 width=20)

vs

 Index Scan using test_name_idx_a on "test"  (cost=0.00..8605.88 rows=391208 width=20)
 Index Scan using test_name_idx_a on "test"  (cost=0.00..113.79 rows=28 width=20)

Lastly, no, neither of these row guesstimates is correct..
I'll get back and tell you how much they're off by, if it's important, once this query is done. The odd thing is it used the index scan here each time -- that has not always been the case with the main unique index; it's trying to make a liar out of me, heh.

I'm used to the estimates and plan changing from one vacuum analyze to the next, even without any inserts or updates between. The index scan is always used, however, when I have the partial indexes in place, and something like:

CREATE TEMP TABLE t1 AS
  SELECT field FROM table WHERE field LIKE 'A%' AND field LIKE 'AA%';

runs in 6-8 seconds as well, with a bit under 100k records.

-Allen
[PERFORM] ia64 results with dbt2 and 8.0beta4
Hi everyone,

Some more data I've collected, trying to best tune dbt-2 with 8.0beta4. Was hoping for some suggestions, explanations for what I'm seeing, etc.

A review of the hardware I've got:
  4 x 1.5GHz Itanium 2
  16GB memory
  84 15K RPM disks (6 controllers, 12 channels)

Physical database table layout (using LVM2 for tables using more than 1 disk):
  warehouse    2 disks
  district     2 disks
  order_line   2 disks
  customer     4 disks
  stock       12 disks
  log         12 disks
  orders       2 disks
  new_order    2 disks
  history      1 disk
  item         1 disk
  index1       1 disk
  index2       1 disk

All these tests are using a 500-warehouse database.

Test 1: http://www.osdl.org/projects/dbt2dev/results/dev4-010/188/
Metric: 3316
DB parameter changes from default:
  bgwriter_percent          | 10
  checkpoint_timeout        | 300
  checkpoint_segments       | 800
  checkpoint_timeout        | 1800
  default_statistics_target | 1000
  max_connections           | 140
  stats_block_level         | on
  stats_command_string      | on
  stats_row_level           | on
  wal_buffers               | 128
  wal_sync_method           | fsync
  work_mem                  | 2048

Test 2: http://www.osdl.org/projects/dbt2dev/results/dev4-010/189/
Metric: 3261 (-1.7% decrease from Test 1)
DB parameter changes from Test 1:
  shared_buffers | 60000
Noted changes: The block reads for the customer table decrease significantly according to the database.

Test 3: http://www.osdl.org/projects/dbt2dev/results/dev4-010/190/
Metric: 3261 (0% change from Test 2)
DB parameter changes from Test 2:
  effective_cache_size | 22
Noted changes: No apparent changes according to the charts.

Test 4: http://www.osdl.org/projects/dbt2dev/results/dev4-010/191/
Metric: 3323 (+1.9% increase from Test 3)
DB parameter changes from Test 3:
  checkpoint_segments  | 1024
  effective_cache_size | 1000
Noted changes: The increased checkpoint_segments smoothed out the throughput and other i/o related stats.

Test 5: http://www.osdl.org/projects/dbt2dev/results/dev4-010/192/
Metric: 3149 (-5% decrease from Test 4)
DB parameter changes from Test 4:
  shared_buffers | 80000
Noted changes: The graphs are starting to jump around a bit.
I figure 80,000 shared_buffers is too much.

Test 6: http://www.osdl.org/projects/dbt2dev/results/dev4-010/193/
Metric: 3277 (+4% increase from Test 5)
DB parameter changes from Test 5:
  random_page_cost | 2
  shared_buffers   | 60000
Noted changes: Reducing shared_buffers should have brought back the smoother performance found in Test 4, but that seems to have been disrupted by decreasing random_page_cost to 2.
Re: [PERFORM] Strange (?) Index behavior?
> > So there are really only 9000 rows out of 76 million starting with 'I'?
> > How about combining some techniques - you could create an index on the
> > first two chars of the field (should be selective enough to give an
> > index scan), select from that, and select the actual data with the
> > like clause.
>
> I was talking about the cost, not the number of rows. About 74,000 rows
> are returned but the query only takes about 8 seconds to run -- with the
> partial index in place.

Well, 74,000/76,000,000 ~= 0.1%, way less than 1/26, so no surprise that an indexscan is better, and also no surprise that the planner can't know that 'I' is such an uncommon initial char.

> > CREATE INDEX idx_firstletters ON table (substr(field, 1, 2));
> > CREATE INDEX idx_all ON table (field);
> > SELECT field FROM (SELECT field FROM table WHERE substr(field, 1, 2) =
> > 'DE') AS approx WHERE field LIKE 'DE%';
>
> That looks like a pretty slick way to create an index, I didn't know
> there was such a way to do it.. but it appears that this will not work
> with queries where the WHERE clause wants to find substrings longer
> than 2 characters.

I don't see why not; it just uses the functional index to grab the 1/(ascii_chars^2) of the rows that are of obvious interest, and then uses the standard index to filter that set. Where it won't work is where you just want one initial char! Which is why I suggested the silly query rewrite...

> Going back to the initial problem -- having only one large, complete
> index on the table (no partial indexes), the query "SELECT field FROM
> table WHERE field LIKE 'A%';" does not use the index. The query "SELECT
> field FROM table WHERE field LIKE 'AB%';" however, does use the single
> large index if it exists.

If you were planning the query, what would you do? Assuming we're talking about A-Z as possible first chars, and assuming we don't know the distribution of those chars, then we have to assume 1/26 probability of each char, so a seq scan makes sense. Whereas like 'JK%' should only pull 1/500 of the rows.
> Adding the partial index "CREATE INDEX idx_table_substrfield_A ON table
> (field) WHERE field LIKE 'A%';" causes all queries with substrings of
> any length to do index scans, provided I issue the query as:
>
> SELECT field FROM table WHERE field LIKE 'A%' AND field LIKE 'AB%';
> -- or even --
> SELECT field FROM table WHERE field LIKE 'A%';
>
> The latter query, without the partial index described, does a sequential
> scan on the table itself instead of an index scan.

Yes, because (I assume, Tom will no doubt clarify/correct) by creating the partial indices you create a lot more information about the distribution of the first char -- either that, or the planner simply always uses an exactly matching partial index if available. I _think_ that creating 26 partial indexes on '?%' is essentially the same thing as creating one functional index on substr(field, 1, 1), just messier, unless the partial indexes cause the planner to do something special...

M
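Matt's back-of-envelope selectivity figures can be checked directly in psql with the numbers from the thread (~74,000 matching rows out of 76 million):

```sql
-- Quick check of the selectivity arithmetic from the discussion above.
SELECT 74000.0 / 76000000                AS actual_selectivity,  -- ~0.00097, i.e. ~0.1%
       1.0 / 26                          AS uniform_assumption,  -- ~0.03846, i.e. ~3.8%
       (1.0 / 26) / (74000.0 / 76000000) AS overestimate_factor; -- ~39.5x
```

So under the uniform-alphabet assumption the planner would overestimate the number of 'I%' rows by roughly a factor of 40, which is more than enough to tip it from an index scan to a seq scan.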
Re: [PERFORM] vacuum analyze slows sql query
Looking around at the pg_ tables and some PostgreSQL online docs, prompted by another post/reply on this list regarding ALTER TABLE SET STATISTICS, I found out that prior to a VACUUM the following select (taken from the online docs) shows:

pkk=# select relname, relkind, reltuples, relpages from pg_class where relname like 'pkk_%';
      relname      | relkind | reltuples | relpages
-------------------+---------+-----------+----------
 pkk_billing       | r       |      1000 |       10
 pkk_offer         | r       |      1000 |       10
 pkk_offer_pkey    | i       |      1000 |        1
 pkk_purchase      | r       |      1000 |       10
 pkk_purchase_pkey | i       |      1000 |        1
(5 rows)

Time: 1097.263 ms

and after a VACUUM:

pkk=# vacuum analyze ;
VACUUM
Time: 100543.359 ms

it shows:

pkk=# select relname, relkind, reltuples, relpages from pg_class where relname like 'pkk_%';
      relname      | relkind |  reltuples  | relpages
-------------------+---------+-------------+----------
 pkk_billing       | r       |      714830 |     4930
 pkk_offer         | r       |         618 |        6
 pkk_offer_pkey    | i       |         618 |        4
 pkk_purchase      | r       | 1.14863e+06 |     8510
 pkk_purchase_pkey | i       | 1.14863e+06 |     8214
(5 rows)

Time: 3.868 ms

Further, I notice that if I were to delete rows from the pg_statistic table I get the db in a state where the query is fast again:

pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual time=2415.739..1065709.092 rows=618 loops=1)
 Total runtime: 1065711.651 ms
(2 rows)

Time: 1065713.446 ms

pkk=# delete from pg_statistic where pg_statistic.starelid = pg_class.oid and pg_class.relname like 'pkk_%';
DELETE 11
Time: 3.368 ms

pkk=# select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
(618 rows)
Time: 876.377 ms

pkk=# explain analyze select offer_id, pkk_offer_has_pending_purch( offer_id ) from pkk_offer ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Seq Scan on pkk_offer  (cost=0.00..13.72 rows=618 width=4) (actual time=1.329..846.786 rows=618 loops=1)
 Total runtime: 848.170 ms
(2 rows)

Time: 849.958 ms

Now, I'm sure someone (a PostgreSQL developer most likely) is about to shoot me for doing such a thing :-)  But, however *ugly, wrong, sacrilege* this may be, if this is the only solution... err, workaround I have that will help me, I must resort to it.

The only two questions I have about this are:

1. Is this really the only solution left for me?
2. Am I in any way screwing up the db by doing this?

Best regards,
--patrick

__________________________________
Do you Yahoo!?
Check out the new Yahoo! Front Page. www.yahoo.com
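Before resorting to deleting rows from pg_statistic, the statistics that ANALYZE gathered can be inspected through the supported pg_stats view. A sketch (the column selection is illustrative, not from the thread):

```sql
-- Look at what ANALYZE actually stored for the pkk_ tables; pg_stats is
-- the documented, human-readable view over the raw pg_statistic rows.
SELECT tablename, attname, n_distinct, most_common_vals
FROM pg_stats
WHERE tablename LIKE 'pkk_%';
```

Comparing these values against reality often shows which estimate is leading the planner astray, which is safer than blanking the statistics wholesale.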
Re: [PERFORM] postgresql amd-64
Merlin,

> Good, I'll give it a shot and see what I come up with... thx.

Do share your experience with us.

--
With Best Regards,
Vishal Kashyap.
Did you know SaiPACS is one and only PACS Management tool.
http://saihertz.com
Re: [PERFORM] Strange (?) Index behavior?
Matt Clark <[EMAIL PROTECTED]> writes:
> Well, 74,000/76,000,000 ~= 0.1%, way less than 1/26, so no surprise that
> an indexscan is better, and also no surprise that the planner can't know
> that 'I' is such an uncommon initial char.

But it *can* know that, at least given adequate ANALYZE statistics. I'm pretty convinced that the basic answer to Allen's problem is to increase the histogram size. How large he needs to make it is not clear --- obviously his data distribution is not uniform, but I don't have a fix on how badly non-uniform.

regards, tom lane
Re: [PERFORM] vacuum analyze slows sql query
patrick ~ <[EMAIL PROTECTED]> writes:
> 1. Is this really the only solution left for me?

You still haven't followed the suggestions that were given to you (ie, find out what is happening with the plan for the query inside the problematic function).

regards, tom lane
Re: [PERFORM] vacuum analyze slows sql query
Hi Tom, -performance@,

I apologize if I didn't follow through with the PREPARE and EXECUTE. I assume that is what you are referring to. After reading the PostgreSQL docs on the PREPARE statement I realized two things: a) a PREPARE only lasts for the session, and b) that I can not (at least I haven't figured out how) PREPARE a statement which would mimic my original select statement, which I could EXECUTE over all rows of the pkk_offer table.

Best I could do is either:

PREPARE pkk_01 (integer) AS
  SELECT $1, pkk_offer_has_pending_purch( $1 ) FROM pkk_offer ;

or

PREPARE pkk_00 ( integer )

In the former case the EXPLAIN ANALYZE doesn't give enough data (it is the same as without the PREPARE statement). In the latter case, I can only execute it with one offer_id at a time. Is this sufficient? If so, here are the results before and after VACUUM ANALYZE:

pkk=# explain analyze execute pkk_00( 795 ) ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=0.095..0.096 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=0.083..0.084 rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0  (cost=0.00..17.13 rows=2 width=4) (actual time=0.079..0.079 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 0.238 ms
(7 rows)

pkk=# VACUUM ANALYZE ;
VACUUM
Time: 97105.589 ms

pkk=# explain analyze execute pkk_00( 795 ) ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=0.329..0.330 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=0.311..0.312 rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0  (cost=0.00..17.13 rows=2 width=4) (actual time=0.307..0.307 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 0.969 ms
(7 rows)

Time: 16.252 ms

In both, before and after, an "Index Scan" is used on pur_offer_id_idx. So, unless I'm missing something obvious here, I am at a loss.

I went as far as doing the EXPLAIN ANALYZE EXECUTE pkk_00( offer_id ) for each offer_id in the pkk_offer table, one at a time (not manually, but by scripting it). All instances use "Index Scan". I only noticed a couple that had quite large "actual times", like the following:

pkk=# explain analyze execute pkk_00( 2312 ) ;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Result  (cost=8.57..8.58 rows=1 width=0) (actual time=21.279..21.282 rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..8.57 rows=1 width=4) (actual time=21.256..21.258 rows=1 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0  (cost=0.00..17.13 rows=2 width=4) (actual time=21.249..21.249 rows=1 loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 21.435 ms
(7 rows)

Time: 22.541 ms

Which makes sense when you look at the number of entries this offer_id has in the pkk_purchase table vs. offer_id = 795:

pkk=# select offer_id, count(*) from pkk_purchase where offer_id in ( 795, 2312 ) group by offer_id ;
 offer_id | count
----------+-------
      795 |     4
     2312 |  1015
(2 rows)

Time: 21.118 ms

--patrick

--- Tom Lane <[EMAIL PROTECTED]> wrote:
> patrick ~ <[EMAIL PROTECTED]> writes:
> > 1. Is this really the only solution left for me?
>
> You still haven't followed the suggestions that were given to you
> (ie, find out what is happening with the plan for the query inside
> the problematic function).
>
> regards, tom lane
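For reference, the general shape of the prepared-statement experiment patrick describes, using names from the thread (PostgreSQL 8.0-era syntax; the `AS` keyword is required):

```sql
-- Prepare the per-offer probe; the statement is planned once with $1
-- unknown, which is exactly why EXPLAIN ANALYZE EXECUTE can show a
-- different (generic) plan than the same query with a literal value.
PREPARE pkk_01 (integer) AS
  SELECT $1, pkk_offer_has_pending_purch($1) FROM pkk_offer;

-- Inspect the plan the prepared statement actually uses:
EXPLAIN ANALYZE EXECUTE pkk_01(795);

-- Prepared statements are session-local; drop one explicitly if needed:
DEALLOCATE pkk_01;
```

Comparing this output against a plain `EXPLAIN ANALYZE` of the query with the literal substituted is the point of the exercise: a mismatch indicates the cached parameterized plan is the problem.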
Re: [PERFORM] vacuum analyze slows sql query
patrick ~ <[EMAIL PROTECTED]> writes: > PREPARE pkk_00 ( integer )
Re: [PERFORM] Strange (?) Index behavior?
On Fri, 05 Nov 2004 23:04:23 -0500, Tom Lane <[EMAIL PROTECTED]> wrote:
> Matt Clark <[EMAIL PROTECTED]> writes:
> > Well, 74,000/76,000,000 ~= 0.1%, way less than 1/26, so no surprise
> > that an indexscan is better, and also no surprise that the planner
> > can't know that 'I' is such an uncommon initial char.
>
> But it *can* know that, at least given adequate ANALYZE statistics.
> I'm pretty convinced that the basic answer to Allen's problem is to
> increase the histogram size. How large he needs to make it is not
> clear --- obviously his data distribution is not uniform, but I don't
> have a fix on how badly non-uniform.

Tom, just an update: it's now 2am... several hours since I started that EXPLAIN ANALYZE, and it still hasn't finished, so I've aborted it. I will do the example with the more precise substring instead to illustrate the performance differences, both with and without the substring index, and report back here.

I'm also interested in something someone else posted, namely that the 36 indexes I have, "A%" through "Z%" and "0%" through "9%", could be replaced with a single index like:

CREATE INDEX idx_table_field_substr ON table (substr(field, 1, 1));

I'm wondering, histogram and other information aside, will this function as well as (or better than) creating all the individual indexes?
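A sketch of that single-functional-index replacement, mirroring the placeholder names used in the thread's examples. Note the planner only considers a functional index when the query's WHERE clause contains the same expression, so queries have to be written to repeat it:

```sql
-- Hypothetical sketch: one functional index in place of the 36 partial
-- indexes ('A%'..'Z%' and '0%'..'9%'). Names are placeholders.
CREATE INDEX idx_mytable_field_substr ON mytable (substr(field, 1, 1));

-- A LIKE query is rewritten to filter on the indexed expression first,
-- then narrow within that subset:
SELECT field
FROM mytable
WHERE substr(field, 1, 1) = 'I'   -- matches the functional index
  AND field LIKE 'IBM%';          -- the actual condition
```

The trade-off versus the partial indexes is the same one discussed above: one index to maintain instead of 36, at the cost of slightly uglier queries.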