Re: [PERFORM] Scaling further up
Can you describe the vendors/components of a "cheap SAN setup?"

Thanks,
Anjan

-Original Message-
From: Rod Taylor [mailto:[EMAIL PROTECTED]
Sent: Tuesday, March 02, 2004 5:57 PM
To: Scott Marlowe
Cc: Anjan Dave; Chris Ruprecht; [EMAIL PROTECTED]; William Yu; Postgresql Performance
Subject: Re: [PERFORM] Scaling further up

> For speed, the X86 32 and 64 bit architectures seem to be noticeably
> faster than Sparc. However, running Linux or BSD on Sparc makes them
> pretty fast too, but you lose the fault tolerant support for things like
> hot swappable CPUs or memory.

Agreed. You can get a Quad Opteron with 16GB memory for around 20K.

Grab 3, a cheap SAN, and set up a little master/slave replication with
failover (how is Slony coming?), and you're all set.
Re: [PERFORM] Database Server Tuning
Vivek,

> I did a bunch of testing with different RAID levels on a 14 disk
> array. I finally settled on this: RAID5 across 14 disks for the
> data, the OS (including syslog directory) and WAL on a RAID1 pair on
> the other channel of the same controller (I didn't want to spring for
> dual RAID controllers). The biggest bumps in performance came from
> increasing the checkpoint_buffers since my DB is heavily written to,
> and increasing sort_mem.

With large RAID, have you found that having WAL on a separate array
actually boosts performance? The empirical tests we've seen so far don't
seem to support this.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco
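For readers following along, a minimal sketch of the write-heavy tuning
described above, with purely hypothetical values. Note that 7.4 has no
"checkpoint_buffers" parameter; the knobs usually adjusted for heavy write
loads are checkpoint_segments, wal_buffers and sort_mem, so that is
presumably what is meant:

    -- sort_mem is KB per sort, per backend, and can be tried per session
    SET sort_mem = 16384;

    -- The WAL/checkpoint settings live in postgresql.conf (hypothetical values):
    --   checkpoint_segments = 16     -- fewer, larger checkpoints
    --   wal_buffers = 64             -- WAL buffer, in 8 KB pages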
Re: [PERFORM] Scaling further up
Hi all,

If you have a DB of 'only' 13 GB and you do not expect it to grow much, it
might be advisable to have enough memory (RAM) to hold the entire DB in
shared memory (everything is cached). If you have a server with say 24 GB
of memory and can allocate 20 GB for cache, you don't care about the speed
of disks any more - all you worry about is the speed of your memory and
your network connection.

I believe this is not possible using 32-bit technology; you would have to
go to some 64-bit platform, but if it's speed you want ...

You can also try solid state hard disk drives. These are actually just
memory; there are no moving parts, but they look and behave like very,
very fast disk drives. I have seen them at capacities of 73 GB - but they
didn't mention the price (I'd probably have a heart attack when I look at
the price tag).

Best regards,
Chris

On Tuesday 02 March 2004 14:41, Anjan Dave wrote:
> "By lots I mean dozen(s) in a raid 10 array with a good controller."
>
> I believe, for RAID-10, I will need an even number of drives. Currently,
> the size of the database is about 13GB, and is not expected to grow
> exponentially with thousands of concurrent users, so total space is not
> of paramount importance compared to performance.
>
> Does this sound like a reasonable setup?
> 10x36GB FC drives on RAID-10
> 4x36GB FC drives for the logs on RAID-10 (not sure if this is the
> correct ratio)?
> 1 hotspare
> Total=15 Drives per enclosure.
>
> Tentatively, I am looking at an entry-level EMC CX300 product with 2GB
> RAID cache, etc.
>
> Question - Are 73GB drives supposed to give better performance because
> of higher number of platters?
>
> Thanks,
> Anjan
>
> -Original Message-
> From: Fred Moyer [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, March 02, 2004 5:57 AM
> To: William Yu; Anjan Dave
> Cc: [EMAIL PROTECTED]
> Subject: Re: [PERFORM] Scaling further up
>
> On Tue, 2004-03-02 at 17:42, William Yu wrote:
> > Anjan Dave wrote:
> > > We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running
> > > RH9, PG 7.4.0. There's an internal U320, 10K RPM RAID-10 setup on 4
> > > drives.
> > >
> > > We are expecting a pretty high load, a few thousands of 'concurrent'
> > > users executing either select, insert, update statements.
> >
> > The quick and dirty method would be to upgrade to the recently
> > announced 3GHz Xeon MPs with 4MB of L3. My semi-educated guess is that
> > you'd get another +60% there due to the huge L3 hiding the Xeon's
> > shared bus penalty.
>
> If you are going to have thousands of 'concurrent' users you should
> seriously consider the 2.6 kernel if you are running Linux or as an
> alternative going with FreeBSD. You will need to load test your system
> and become an expert on tuning Postgres to get the absolute maximum
> performance from each and every query you have.
>
> And you will need lots of hard drives. By lots I mean dozen(s) in a
> raid 10 array with a good controller. Thousands of concurrent users
> means hundreds or thousands of transactions per second. I've personally
> seen it scale that far but in my opinion you will need a lot more hard
> drives and ram than cpu.
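For illustration only, a rough sketch of "telling PostgreSQL the data is
cached" on a hypothetical 24 GB box running 7.4. The numbers are
assumptions, not recommendations; both planner settings are in 8 KB pages
and can be tried per session, while shared_buffers must be set in
postgresql.conf:

    -- Tell the planner roughly how much the OS cache can hold:
    -- 20 GB / 8 KB = 2621440 pages (hypothetical value)
    SET effective_cache_size = 2621440;

    -- With nearly everything cached, random reads are much cheaper
    SET random_page_cost = 2;

    -- In postgresql.conf (7.4-era practice keeps this far below total RAM):
    --   shared_buffers = 65536       -- 512 MB worth of 8 KB pages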
Re: [PERFORM] Index oddity
ken <[EMAIL PROTECTED]> writes:
> ... and here is the plan with statistics set to 1000 ...
>
> Seq Scan on nrgfeature f  (cost=0.00..31675.57 rows=18608 width=218)
> (actual time=63.544..1002.701 rows=225 loops=1)
>    Filter: ((upperrightx > 321264.236977215::double precision) AND
> (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> 123286.261898636::double precision) AND (lowerlefty <
> 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> precision))
>
> ... so yeah, it's obviously finding way, way less rows than it thinks it
> will.

Yup.  I think your problem here is that the conditions on the different
columns are highly correlated, but the planner doesn't know anything about
that, because it has no cross-column statistics.

You could check that the individual conditions are accurately estimated by
looking at the estimated and actual row counts in

explain analyze
SELECT * FROM nrgfeature f WHERE upperRightX > 321264.23697721504;

explain analyze
SELECT * FROM nrgfeature f WHERE lowerLeftX < 324046.79981208267;

etc --- but I'll bet lunch that they are right on the money with the
higher statistics targets, and probably not too far off even at the
default.

The trouble is that the planner is simply multiplying these probabilities
together to get its estimate for the combined query, and when the columns
are not independent that leads to a very bad estimate.  In particular it
sounds like you have a *whole lot* of rows that have diagonalSize just
under 50, and so changing the diagonalSize condition to include those
makes for a big change in the predicted number of rows, even though for
the specific values of upperRightX and friends in your test query there
isn't any change in the actual number of matching rows.

I don't have any advice to magically solve this problem.  I would suggest
experimenting with alternative data representations -- for example, maybe
it would help to store "leftX" and "width" in place of "leftX" and
"rightX", etc.  What you want to do is try to decorrelate the column
values.  leftX and rightX are likely to have a strong correlation, but
maybe leftX and width don't.

			regards, tom lane
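As a concrete, purely illustrative sketch of the decorrelation idea: the
"width"/"height" columns below are hypothetical additions, not something
from the original schema, and the constants are just the values from the
test query earlier in the thread:

    -- Store one corner plus extents instead of two correlated corners
    ALTER TABLE nrgfeature ADD COLUMN width  double precision;
    ALTER TABLE nrgfeature ADD COLUMN height double precision;
    UPDATE nrgfeature SET width  = upperrightx - lowerleftx,
                          height = upperrighty - lowerlefty;
    ANALYZE nrgfeature;

    -- The bounding-box test then becomes (same logic, less correlated columns):
    SELECT * FROM nrgfeature
    WHERE lowerleftx < 324046.79981208267
      AND lowerleftx + width  > 321264.23697721504
      AND lowerlefty < 124985.92745047594
      AND lowerlefty + height > 123286.26189863647
      AND diagonalsize > 49.999;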
Re: [PERFORM] Index oddity
> ANALYZE might be producing poor stats due to :
>
> i) many dead tuples or
> ii) high proportion of dead tuples in the first few pages of the table
>
> Does a VACUUM FULL followed by ANALYZE change the estimates (or have you
> tried this already)?
>
> (p.s. - I probably don't qualify for the 'more knowledge' bit either...)

You can also increase your statistics_target which will make ANALYZE take
longer but can help a great deal with larger data sets.

Sincerely,

Joshua D. Drake

> regards
>
> Mark

--
Command Prompt, Inc., home of Mammoth PostgreSQL - S/ODBC and S/JDBC
Postgresql support, programming shared hosting and dedicated hosting.
+1-503-667-4564 - [EMAIL PROTECTED] - http://www.commandprompt.com
PostgreSQL Replicator -- production quality replication for PostgreSQL
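A minimal sketch of the two suggestions above, using the table name from
this thread (the statistics value of 100 is just an illustrative
assumption; the 7.4 default is 10):

    -- Rebuild the table to clear dead tuples, then refresh the statistics
    VACUUM FULL nrgfeature;
    ANALYZE nrgfeature;

    -- Raise the sample size ANALYZE uses for columns without an explicit target
    SET default_statistics_target = 100;
    ANALYZE nrgfeature;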
Re: [PERFORM] Index oddity
Rod Taylor wrote:
> The big problem is a very poor estimate (off by a couple orders of
> magnitude). I was hoping someone with more knowledge in fixing those
> would jump in.

ANALYZE might be producing poor stats due to :

i) many dead tuples or
ii) high proportion of dead tuples in the first few pages of the table

Does a VACUUM FULL followed by ANALYZE change the estimates (or have you
tried this already)?

(p.s. - I probably don't qualify for the 'more knowledge' bit either...)

regards

Mark
Re: [PERFORM] Scaling further up
For the disks part - I am looking at a SAN implementation, and I will be
planning a separate RAID group for the WALs. The controller is a PERC,
with 128MB cache, and I think it is writeback.

Other than the disks, I am curious what other people are using in terms of
the horsepower needed. The Quad server has been keeping up, but we are
expecting quite high loads in the near future, and I am not sure if just
having the disks on high-end storage will do it.

Thanks,
Anjan

-Original Message-
From: Magnus Hagander [mailto:[EMAIL PROTECTED]
Sent: Monday, March 01, 2004 3:54 PM
To: Anjan Dave; [EMAIL PROTECTED]
Subject: RE: [PERFORM] Scaling further up

> All:
>
> We have a Quad-Intel XEON 2.0GHz (1MB cache), 12GB memory, running RH9,
> PG 7.4.0. There's an internal U320, 10K RPM RAID-10 setup on 4 drives.
>
> We are expecting a pretty high load, a few thousands of 'concurrent'
> users executing either select, insert, update statements.
>
> What is the next step up in terms of handling very heavy loads?
> Clustering?

I'd look at adding more disks first. Depending on what type of query load
you get, that box sounds like it will be very much I/O bound. More
spindles = more parallel operations = faster under load. Consider adding
15KRPM disks as well, they're not all that much more expensive, and should
give you better performance than 10KRPM.

Also, make sure you put your WAL disks on a separate RAID set if possible
(not just a separate partition on an existing RAID set).

Finally, if you don't already have it, look for a battery-backed RAID
controller that can do write-back caching, and enable that. (Don't even
think about enabling it unless it's battery backed!) And add as much RAM
as you can to that controller.

//Magnus
Re: [PERFORM] Index oddity
On Wed, 2004-06-09 at 21:45, Christopher Kings-Lynne wrote:
> > If I take away the diagonalSize condition in my query I find that there
> > are 225 rows that satisfy the other conditions. 155 of these have a
>
> Maybe you should drop your random_page_cost to something less than 4,
> eg. 3 or even 2...

The big problem is a very poor estimate (off by a couple orders of
magnitude). I was hoping someone with more knowledge in fixing those would
jump in.
Re: [PERFORM] Index oddity
> If I take away the diagonalSize condition in my query I find that there
> are 225 rows that satisfy the other conditions. 155 of these have a
> diagonalSize value of exactly 50.000, while the remaining 70 rows all
> have values larger than 50. Thus there is a big discrete jump in the
> number of rows at a diagonalSize of 50. However, the statistics are off
> by two orders of magnitude in guessing how many rows there are going to
> be in this case and thus is not using my index. How can I fix that?

Maybe you should drop your random_page_cost to something less than 4,
e.g. 3 or even 2...

Chris
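A minimal sketch of trying that suggestion, using the query from earlier
in the thread (the value 2 is just one of the guesses floated above, and
the SET only affects the current session):

    -- Lower the planner's estimate of random I/O cost, then re-check the plan
    SET random_page_cost = 2;

    EXPLAIN ANALYZE
    SELECT * FROM nrgfeature f
    WHERE upperRightX > 321264.23697721504
      AND lowerLeftX  < 324046.79981208267
      AND upperRightY > 123286.26189863647
      AND lowerLeftY  < 124985.92745047594
      AND diagonalSize > 49.999;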
Re: [PERFORM] Slow query
On Tue, 24 Feb 2004, Ivan Voras wrote:

> ->  Nested Loop  (cost=1.04..788.76 rows=3 width=500)
>     (actual time=4078.85..20185.89 rows=38999 loops=1)
>   ->  Nested Loop  (cost=1.04..771.27 rows=3 width=485)
>       (actual time=4078.71..14673.27 rows=38999 loops=1)
>     ->  Nested Loop  (cost=0.00..752.16 rows=195 width=288)
>         (actual time=4078.20..6702.17 rows=38999 loops=1)
>       ->  Nested Loop  (cost=0.00..748.72 rows=195 width=184)
>           (actual time=0.21..3197.16 rows=38999 loops=1)

Note those nested loops up there. They think that you are going to be
operating on 3, 3, 195, and 195 rows respectively, when they are actually
operating on 38999, 38999, 38999, and 38999 rows in reality.

set enable_nestloop = off and see if that helps. If so, see whether
raising the default stats on the responsible columns (100 is a good start)
and re-analyzing gets you a better plan. As long as those estimates are
that far off, you're gonna get a poorly performing query when the planner
is allowed to use nested loops.
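A minimal sketch of that sequence; the table and column names below are
placeholders, since the schema behind this query isn't shown in the thread:

    -- 1. See whether the plan improves without nested loops (session only)
    SET enable_nestloop = off;
    -- re-run the problem query under EXPLAIN ANALYZE here

    -- 2. If it does, raise the statistics target on the columns whose row
    --    estimates were off, re-analyze, and retry with default settings
    ALTER TABLE some_table ALTER COLUMN some_column SET STATISTICS 100;  -- hypothetical names
    ANALYZE some_table;
    SET enable_nestloop = on;
    -- re-run the query under EXPLAIN ANALYZE again and compare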
Re: [PERFORM] Index oddity
I had already tried setting the statistics to 1000 for all five of these
double precision fields, with effectively no improvement. Should have
mentioned that.

Also, the BETWEEN makes all values for diagonalSize bad, since it is
effectively doing diagonalSize > X and diagonalSize < Y. If I do a query
with the same values for the four x,y values and diagonalSize < Y, then
for Y=49.999 the query is fast, but for anything 50.000 and greater the
query is slow. The exact opposite of the greater-than queries, not
surprisingly.

I also think I originally reported that the two queries gave the same
number of rows. That is not true. It was true when I had other joins in,
but when I stripped the query down to this core problem I should have
noticed that the number of results now differs between the two, which I
didn't at first.

If I take away the diagonalSize condition in my query I find that there
are 225 rows that satisfy the other conditions. 155 of these have a
diagonalSize value of exactly 50.000, while the remaining 70 rows all have
values larger than 50. Thus there is a big discrete jump in the number of
rows at a diagonalSize of 50. However, the statistics are off by two
orders of magnitude in guessing how many rows there are going to be in
this case and thus is not using my index. How can I fix that?

Ken

On Wed, 2004-06-09 at 14:29, Rod Taylor wrote:
> > ... and here is the plan with statistics set to 1000 ...
> >
> > Seq Scan on nrgfeature f  (cost=0.00..31675.57 rows=18608 width=218)
> > (actual time=63.544..1002.701 rows=225 loops=1)
> >    Filter: ((upperrightx > 321264.236977215::double precision) AND
> > (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> > 123286.261898636::double precision) AND (lowerlefty <
> > 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> > precision))
>
> It's better like this, but still way off the mark. Even your good query
> which uses the index was out by more than an order of magnitude.
>
> Try raising the statistics levels for upperrightx, lowerleftx,
> upperrighty and lowerlefty.
>
> Failing that, you might be able to push it back down again by giving
> diagonalsize an upper limit. Perhaps 500 is a value that would never
> occur.
>
> AND (diagonalsize BETWEEN 49.999::double precision AND 500)
Re: [PERFORM] Index oddity
> ... and here is the plan with statistics set to 1000 ...
>
> Seq Scan on nrgfeature f  (cost=0.00..31675.57 rows=18608 width=218)
> (actual time=63.544..1002.701 rows=225 loops=1)
>    Filter: ((upperrightx > 321264.236977215::double precision) AND
> (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> 123286.261898636::double precision) AND (lowerlefty <
> 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> precision))

It's better like this, but still way off the mark. Even your good query
which uses the index was out by more than an order of magnitude.

Try raising the statistics levels for upperrightx, lowerleftx, upperrighty
and lowerlefty.

Failing that, you might be able to push it back down again by giving
diagonalsize an upper limit. Perhaps 500 is a value that would never
occur.

AND (diagonalsize BETWEEN 49.999::double precision AND 500)
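A minimal sketch of those two suggestions taken together; the target of
1000 and the cap of 500 are just the values floated above, and the
constants come from the test query earlier in the thread:

    -- Raise the per-column statistics targets, then re-sample
    ALTER TABLE nrgfeature ALTER COLUMN upperrightx SET STATISTICS 1000;
    ALTER TABLE nrgfeature ALTER COLUMN lowerleftx  SET STATISTICS 1000;
    ALTER TABLE nrgfeature ALTER COLUMN upperrighty SET STATISTICS 1000;
    ALTER TABLE nrgfeature ALTER COLUMN lowerlefty  SET STATISTICS 1000;
    ANALYZE nrgfeature;

    -- Bounded version of the size condition, assuming 500 never occurs
    SELECT * FROM nrgfeature f
    WHERE upperRightX > 321264.23697721504
      AND lowerLeftX  < 324046.79981208267
      AND upperRightY > 123286.26189863647
      AND lowerLeftY  < 124985.92745047594
      AND diagonalSize BETWEEN 49.999 AND 500;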
Re: [PERFORM] Index oddity
On Wed, 2004-06-09 at 13:56, Rod Taylor wrote:
> On Wed, 2004-06-09 at 16:50, ken wrote:
> > Thanks Rod,
> >
> > This setting has no effect however. If I set statistics to 1000, or
>
> Okay.. but you never did send EXPLAIN ANALYZE output. I want to know
> what it is really finding.

Ah, sorry, missed the ANALYZE portion of your request (thought you only
wanted the result of explain if it changed due to the setting).

Here is the query plan with statistics on diagonalsize set to the default
(-1) ...

Seq Scan on nrgfeature f  (cost=0.00..32176.98 rows=19134 width=218)
(actual time=61.640..1009.414 rows=225 loops=1)
   Filter: ((upperrightx > 321264.236977215::double precision) AND
(lowerleftx < 324046.799812083::double precision) AND (upperrighty >
123286.261898636::double precision) AND (lowerlefty <
124985.927450476::double precision) AND (diagonalsize > 49.999::double
precision))

... and here is the plan with statistics set to 1000 ...

Seq Scan on nrgfeature f  (cost=0.00..31675.57 rows=18608 width=218)
(actual time=63.544..1002.701 rows=225 loops=1)
   Filter: ((upperrightx > 321264.236977215::double precision) AND
(lowerleftx < 324046.799812083::double precision) AND (upperrighty >
123286.261898636::double precision) AND (lowerlefty <
124985.927450476::double precision) AND (diagonalsize > 49.999::double
precision))

... so yeah, it's obviously finding way, way less rows than it thinks it
will.

thanks,

ken

> > On Wed, 2004-06-09 at 13:12, Rod Taylor wrote:
> > > It seems to believe that the number of rows returned for the >49.999
> > > case will be 4 times the number for the >50 case. If that was true,
> > > then the sequential scan would be correct.
> > >
> > > ALTER TABLE nrgfeature ALTER COLUMN diagonalsize SET STATISTICS 1000;
> > > ANALYZE nrgfeature;
> > >
> > > Send back EXPLAIN ANALYZE output for the >49.999 case.
Re: [PERFORM] Index oddity
On Wed, 2004-06-09 at 16:50, ken wrote:
> Thanks Rod,
>
> This setting has no effect however. If I set statistics to 1000, or

Okay.. but you never did send EXPLAIN ANALYZE output. I want to know what
it is really finding.

> On Wed, 2004-06-09 at 13:12, Rod Taylor wrote:
> > It seems to believe that the number of rows returned for the >49.999
> > case will be 4 times the number for the >50 case. If that was true,
> > then the sequential scan would be correct.
> >
> > ALTER TABLE nrgfeature ALTER COLUMN diagonalsize SET STATISTICS 1000;
> > ANALYZE nrgfeature;
> >
> > Send back EXPLAIN ANALYZE output for the >49.999 case.
> >
> > > The query plan for diagonalSize > 50.000 is ...
> > >
> > > Index Scan using nrgfeature_xys_index on nrgfeature f
> > > (cost=0.00..17395.79 rows=4618 width=220)
> > >    Index Cond: ((upperrightx > 321264.236977215::double precision) AND
> > > (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> > > 123286.261898636::double precision) AND (lowerlefty <
> > > 124985.927450476::double precision) AND (diagonalsize > 50::double
> > > precision))
> > >
> > > ... while for diagonalSize > 49.999 is ...
> > >
> > > Seq Scan on nrgfeature f  (cost=0.00..31954.70 rows=18732 width=220)
> > >    Filter: ((upperrightx > 321264.236977215::double precision) AND
> > > (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> > > 123286.261898636::double precision) AND (lowerlefty <
> > > 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> > > precision))
Re: [PERFORM] Index oddity
Thanks Rod,

This setting has no effect however. If I set statistics to 1000, or even 0
(and then reanalyze the table), I see no change in the behaviour of the
query plans, i.e. there is still the odd transition in the plans at
diagonalSize = 50.

Ken

On Wed, 2004-06-09 at 13:12, Rod Taylor wrote:
> It seems to believe that the number of rows returned for the >49.999
> case will be 4 times the number for the >50 case. If that was true, then
> the sequential scan would be correct.
>
> ALTER TABLE nrgfeature ALTER COLUMN diagonalsize SET STATISTICS 1000;
> ANALYZE nrgfeature;
>
> Send back EXPLAIN ANALYZE output for the >49.999 case.
>
> > The query plan for diagonalSize > 50.000 is ...
> >
> > Index Scan using nrgfeature_xys_index on nrgfeature f
> > (cost=0.00..17395.79 rows=4618 width=220)
> >    Index Cond: ((upperrightx > 321264.236977215::double precision) AND
> > (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> > 123286.261898636::double precision) AND (lowerlefty <
> > 124985.927450476::double precision) AND (diagonalsize > 50::double
> > precision))
> >
> > ... while for diagonalSize > 49.999 is ...
> >
> > Seq Scan on nrgfeature f  (cost=0.00..31954.70 rows=18732 width=220)
> >    Filter: ((upperrightx > 321264.236977215::double precision) AND
> > (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> > 123286.261898636::double precision) AND (lowerlefty <
> > 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> > precision))
Re: [PERFORM] Index oddity
It seems to believe that the number of rows returned for the >49.999 case
will be 4 times the number for the >50 case. If that was true, then the
sequential scan would be correct.

ALTER TABLE nrgfeature ALTER COLUMN diagonalsize SET STATISTICS 1000;
ANALYZE nrgfeature;

Send back EXPLAIN ANALYZE output for the >49.999 case.

> The query plan for diagonalSize > 50.000 is ...
>
> Index Scan using nrgfeature_xys_index on nrgfeature f
> (cost=0.00..17395.79 rows=4618 width=220)
>    Index Cond: ((upperrightx > 321264.236977215::double precision) AND
> (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> 123286.261898636::double precision) AND (lowerlefty <
> 124985.927450476::double precision) AND (diagonalsize > 50::double
> precision))
>
> ... while for diagonalSize > 49.999 is ...
>
> Seq Scan on nrgfeature f  (cost=0.00..31954.70 rows=18732 width=220)
>    Filter: ((upperrightx > 321264.236977215::double precision) AND
> (lowerleftx < 324046.799812083::double precision) AND (upperrighty >
> 123286.261898636::double precision) AND (lowerlefty <
> 124985.927450476::double precision) AND (diagonalsize > 49.999::double
> precision))
[PERFORM] Index oddity
I'm having a performance issue that I just can't resolve and it's very,
very curious. Thought someone here might be able to shed some light on the
subject.

I'm using Postgres 7.4.2 on Red Hat 9. I have a table with 763,809 rows in
it defined as follows ...

ksedb=# \d nrgfeature
                      Table "public.nrgfeature"
     Column     |            Type             | Modifiers
----------------+-----------------------------+-----------
 fid1           | numeric(64,0)               | not null
 fid2           | numeric(64,0)               | not null
 created        | timestamp without time zone | not null
 createdby      | character varying(30)       | not null
 modified       | timestamp without time zone |
 modifiedby     | character varying(30)       |
 geommodified   | timestamp without time zone |
 geommodifiedby | character varying(30)       |
 deleted        | timestamp without time zone |
 deletedby      | character varying(30)       |
 featuretypeid  | smallint                    | not null
 description    | text                        |
 datasourceid   | smallint                    | not null
 lowerleftx     | double precision            | not null
 lowerlefty     | double precision            | not null
 upperrightx    | double precision            | not null
 upperrighty    | double precision            | not null
 diagonalsize   | double precision            |
 login          | character varying(25)       |
Indexes:
    "nrgfeature_pkey" primary key, btree (fid1, fid2)
    "nrgfeature_ft_index" btree (featuretypeid)
    "nrgfeature_xys_index" btree (upperrightx, lowerleftx, upperrighty,
        lowerlefty, diagonalsize)
Inherits: commonfidattrs, commonrevisionattrs

... If I write a query as follows ...

SELECT *
FROM nrgfeature f
WHERE
    upperRightX > 321264.23697721504
    AND lowerLeftX < 324046.79981208267
    AND upperRightY > 123286.26189863647
    AND lowerLeftY < 124985.92745047594
    AND diagonalSize > 50.000
;

... (or any value for diagonalSize over 50) then my query runs in 50-100
milliseconds. However, if the diagonalSize value is changed to 49.999 or
any value below 50, then the query takes over a second, for a factor of 10
degradation in speed, even though the exact same number of rows is
returned.

The query plan for diagonalSize > 50.000 is ...

Index Scan using nrgfeature_xys_index on nrgfeature f
(cost=0.00..17395.79 rows=4618 width=220)
   Index Cond: ((upperrightx > 321264.236977215::double precision) AND
(lowerleftx < 324046.799812083::double precision) AND (upperrighty >
123286.261898636::double precision) AND (lowerlefty <
124985.927450476::double precision) AND (diagonalsize > 50::double
precision))

... while for diagonalSize > 49.999 is ...

Seq Scan on nrgfeature f  (cost=0.00..31954.70 rows=18732 width=220)
   Filter: ((upperrightx > 321264.236977215::double precision) AND
(lowerleftx < 324046.799812083::double precision) AND (upperrighty >
123286.261898636::double precision) AND (lowerlefty <
124985.927450476::double precision) AND (diagonalsize > 49.999::double
precision))

... and yes, if I set enable_seqscan=false then the index is forced to be
used. However, despite this being an undesirable solution for this simple
case, it doesn't solve the problem for the general case. As soon as I add
in joins with a couple of tables to perform the actual query I want to
perform, the seqscan setting doesn't force the index to be used anymore.
Instead, the primary key index is used at this same diagonalSize cutoff,
the 5-part double precision clause is applied as a filter to that index
scan, and the result is again a very slow query. I can provide those
queries and results, but that would only complicate this already lengthy
email, and the above seems to be the crux of the problem anyway.

Any help or thoughts would be greatly appreciated of course.
Thanks,

Ken Southerland

--
--samsixedd--

Ken Southerland
Senior Consultant
Sam Six EDD
http://www.samsixedd.com

503-236-4288 (office)
503-358-6542 (cell)
Re: [PERFORM] seq scan woes
On Mon, 2004-06-07 at 16:12, Dan Langille wrote:
> On 7 Jun 2004 at 16:00, Rod Taylor wrote:
>
> > On Mon, 2004-06-07 at 15:45, Dan Langille wrote:
> > > A production system has had a query recently degrade in performance.
> > > What once took < 1s now takes over 1s. I have tracked down the
> > > problem to a working example.
> >
> > What changes have you made to postgresql.conf?
>
> Nothing recently (ie. past few months). Nothing at all really.
> Perhaps I need to start tuning that.
>
> > Could you send explain analyse again with SEQ_SCAN enabled but with
> > nested loops disabled?
>
> See http://rafb.net/paste/results/zpJEvb28.html

This doesn't appear to be the same query as we were shown earlier.

> > Off the cuff? I might hazard a guess that effective_cache is too low or
> > random_page_cost is a touch too high. Probably the former.
>
> I grep'd postgresql.conf:
>
> #effective_cache_size = 1000    # typically 8KB each
> #random_page_cost = 4           # units are one sequential page fetch cost

This would be the issue. You haven't told PostgreSQL anything about your
hardware. The defaults are somewhat modest.

http://www.postgresql.org/docs/7.4/static/runtime-config.html

Skim through the run-time configuration parameters that can be set in
postgresql.conf.

Pay particular attention to:
  * shared_buffers (you may be best with 2000 or 4000)
  * effective_cache_size (set to 50% of ram size if dedicated db machine)
  * random_page_cost (good disks will bring this down to a 2 from a 4)

--
Rod Taylor 

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc
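To illustrate, a sketch of what those three settings might look like for a
hypothetical dedicated box with 2 GB of RAM. The numbers simply follow the
rules of thumb above and are assumptions, not measured recommendations:

    -- In postgresql.conf (shared_buffers is not changeable at run time):
    --   shared_buffers = 4000            -- 8 KB pages, roughly 32 MB
    --   effective_cache_size = 131072    -- ~1 GB, about 50% of RAM
    --   random_page_cost = 2             -- decent disks
    -- The last two can also be tried per session before editing the file:
    SET effective_cache_size = 131072;
    SET random_page_cost = 2;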
Re: [PERFORM] seq scan woes
On Mon, 2004-06-07 at 15:45, Dan Langille wrote:
> A production system has had a query recently degrade in performance.
> What once took < 1s now takes over 1s. I have tracked down the
> problem to a working example.

What changes have you made to postgresql.conf?

Could you send explain analyse again with SEQ_SCAN enabled but with nested
loops disabled?

Off the cuff? I might hazard a guess that effective_cache is too low or
random_page_cost is a touch too high. Probably the former.

--
Rod Taylor 

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc
Re: [PERFORM] Use of Functional Indexs and Planner estimates
"Scott Marlowe" <[EMAIL PROTECTED]> writes: > >-> Seq Scan on rules > > (cost=0.00..22296.32 rows=11294 width=12) > > (actual time=540.149..2047.308 rows=1 loops=1) > Simple, the planner is choosing a sequential scan when it should be > choosing an index scan. This is usually because random_page_cost is set > too high, at the default of 4. Try settings between 1.2 and 2.x or so > to see how that helps. Be sure and test with various queries of your > own to be sure you've got about the right setting. Unless you make random_page_cost about .0004 (4/11294) it isn't going to be costing this query right (That's a joke, don't do it:). It's thinking there are 11,000 records matching the where clause when in fact there is only 1. If you know how an upper bound on how many records the query should be finding you might try a kludge involving putting a LIMIT inside the group by. ie, something like select rulename,redirect from (select rulename,redirect from ... where ... limit 100) as kludge group by rulename,redirect This would at least tell the planner not to expect more than 100 rows and to take the plan likely to produce the first 100 rows fastest. But this has the disadvantage of uglifying your code and introducing an arbitrary limit. When 7.5 comes out it you'll want to rip this out. -- greg ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])