Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:22:35PM +, Frits Jalvingh wrote: > Hi Babu, > > That was all already done, as it is common practice for JDBC. Your > parameter was added to the code that already did all that - and worked > brilliantly there ;) > Hi Frits, What was the parameter? I did not see an Email in the thread from Babu. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 02:39:37PM +, Frits Jalvingh wrote: > Hi all, > > Thanks a lot for the many responses! > > About preparing statements: this is done properly in Java, and pgsql does > it by itself. So that cannot be done better ;) > > I tried the copy command, and that indeed works quite brilliantly: > Inserted 24000000 rows in 22004 milliseconds, 1090710.7798582076 rows per > second > > That's faster than Oracle. But with a very bad interface I have to say for > normal database work.. I will try to make this work in the tooling, but it > needs some very special code to format all possible values properly, and to > manage the end of the copy, so it is not usable in general which is a pity, > I think. > > So, I am still very interested in getting normal inserts faster, because > that will gain speed for all work.. If Oracle can do it, and Postgres is > able to insert fast with copy- where lies the bottleneck with the insert > command? There seems to be quite a performance hit with the JDBC driver > itself (as the stored procedure is a lot faster), so I can look into that. > But even after that there is quite a gap.. > > Regards, > > Frits Hi Frits, Have you looked at UNLOGGED tables and also having more than one insert stream running at a time? Sometimes multiple parallel inserts can be faster. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
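As a rough sketch of those two suggestions combined (an UNLOGGED staging table fed by more than one COPY stream), with made-up table and file names:

    -- An unlogged table skips WAL, so bulk loads are faster; its contents are
    -- lost after a crash, so treat it purely as a staging area.
    CREATE UNLOGGED TABLE datatable_stage (LIKE datatable INCLUDING DEFAULTS);

    -- Run one COPY per connection, each loading a different chunk of the input:
    COPY datatable_stage FROM '/data/chunk1.csv' WITH (FORMAT csv);   -- session 1
    COPY datatable_stage FROM '/data/chunk2.csv' WITH (FORMAT csv);   -- session 2

    -- Finally move the rows into the real, logged table in one statement:
    INSERT INTO datatable SELECT * FROM datatable_stage;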
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 01:56:58PM +, Frits Jalvingh wrote: > Hi Kenneth, Andreas, > > Thanks for your tips! > > I increased shared_buffers to 8GB but it has no measurable effect at all. I > think that is logical: shared buffers are important for querying but not > for inserting; for that the speed to write to disk seems most important- no > big reason to cache the data if the commit requires a full write anyway. > I also changed the code to do only one commit; this also has no effect I > can see. > > It is true that Oracle had more memory assigned to it (1.5G), but unlike > Postgres (which is completely on a fast SSD) Oracle runs on slower disk > (ZFS).. > > I will try copy, but I first need to investigate how to use it- its > interface seems odd to say the least ;) I'll report back on that once done. > > Any other tips would be welcome! > > Regards, > > Frits Hi Frits, Here is an article that is still valid: https://www.depesz.com/2007/07/05/how-to-insert-data-to-database-as-fast-as-possible/ Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
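For what it is worth, the COPY interface Frits is about to investigate boils down to something like the following (table, column, and file names are only placeholders):

    -- Server-side load; the file must be readable by the server process:
    COPY mytable (id, name, created_at)
        FROM '/path/to/data.csv' WITH (FORMAT csv, HEADER true);

    -- Or stream the data over the client connection, e.g. from psql:
    \copy mytable (id, name, created_at) from 'data.csv' with (format csv, header true)

From JDBC, the driver exposes the same protocol through its CopyManager API, which avoids the server-side file requirement.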
Re: [PERFORM] Improving PostgreSQL insert performance
On Fri, Jun 09, 2017 at 03:24:15PM +0200, Andreas Kretschmer wrote: > > > Am 09.06.2017 um 15:04 schrieb Frits Jalvingh: > >Hi all, > > > >I am trying to improve the runtime of a big data warehouse > >application. One significant bottleneck found was insert > >performance, so I am investigating ways of getting Postgresql to > >insert data faster. > > * use COPY instead of Insert, it is much faster > * bundle all Insert into one transaction > * use a separate disk/spindel for the transaction log > > > > > > >I already changed the following config parameters: > >work_mem 512MB > >synchronous_commit off > >shared_buffers 512mb > >commit_delay 10 > >autovacuum_naptime 10min > > > >Postgres version is 9.6.3 on Ubuntu 17.04 64 bit, on a i7-4790K > >with 16GB memory and an Intel 750 SSD. JDBC driver is > >postgresql-42.1.1. > > > > increase shared_buffers, with 16gb ram i would suggest 8gb +1 Without even checking, I think Oracle is configured to use a LOT more memory than 512mb. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
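For a 9.6 server the shared_buffers change can also be made from SQL instead of editing postgresql.conf (a sketch; the value follows the 8GB suggestion above and still requires a restart to take effect):

    ALTER SYSTEM SET shared_buffers = '8GB';
    -- restart the server, then verify:
    SHOW shared_buffers;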
Re: [PERFORM] REINDEX takes half a day (and still not complete!)
On Sat, Apr 30, 2011 at 05:26:36PM +0800, Phoenix Kiula wrote: On Sat, Apr 30, 2011 at 4:07 PM, Greg Smith g...@2ndquadrant.com wrote: On 04/23/2011 03:44 PM, Robert Haas wrote: On Apr 17, 2011, at 11:30 AM, Phoenix Kiulaphoenix.ki...@gmail.com ?wrote: Postgres is 8.2.9. An upgrade would probably help you a lot, and as others have said it sounds like your hardware is failing, so you probably want to deal with that first. I am a bit surprised, however, that no one seems to have mentioned using CLUSTER rather than VACUUM or REINDEX. Sometimes that's worth a try... Don't know if it was for this reason or not for not mentioning it by others, but CLUSTER isn't so great in 8.2. ?The whole not MVCC-safe bit does not inspire confidence on a production server. To everyone. Thanks so much for everything, truly. We have managed to salvage the data by exporting it in bits and pieces. 1. First the schema only 2. Then pg_dump of specific small tables 3. Then pg_dump of timed bits of the big mammoth table Not to jinx it, but the newer hardware seems to be doing well. I am on 9.0.4 now and it's pretty fast. Also, as has been mentioned in this thread and other discussions on the list, just doing a dump and then fresh reload has compacted the DB to nearly 1/3rd of its previously reported size! I suppose that's what I am going to do on a periodic basis from now on. There is a lot of DELETE/UPDATE activity. But I wonder if the vacuum stuff really should do something that's similar in function? What do the high-end enterprise folks do -- surely they can't be dumping/restoring every quarter or soor are they? Anyway, many many thanks to the lovely folks on this list. Much appreciated! The autovacuum and space management in 9.0 is dramatically more effective and efficient then that of 8.2. Unless you have an odd corner-case there really should be no reason for a periodic dump/restore. This is not your grandmother's Oldsmobile... :) Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
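If individual tables with heavy DELETE/UPDATE traffic still bloat faster than the default autovacuum settings can keep up with, autovacuum can be tuned per table rather than falling back to periodic dump/restore; a sketch with a placeholder table name and illustrative values:

    -- Vacuum/analyze after ~5% of the rows change instead of the 20%/10% defaults:
    ALTER TABLE big_mammoth_table SET (autovacuum_vacuum_scale_factor = 0.05,
                                       autovacuum_analyze_scale_factor = 0.05);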
Re: [PERFORM] Time to put theory to the test?
On Tue, Apr 26, 2011 at 09:58:49AM -0500, Kevin Grittner wrote: J Sisson sisso...@gmail.com wrote: Rob Wultsch wult...@gmail.com wrote: Tip from someone that manages thousands of MySQL servers: Use InnoDB when using MySQL. Granted, my knowledge of PostgreSQL (and even MSSQL) far surpasses my knowledge of MySQL, but if InnoDB has such amazing benefits as being crash safe, and even speed increases in some instances, why isn't InnoDB default? Because it's not as fast as the unsafe ISAM implementation for most benchmarks. There is one minor gotcha in InnoDB (unless it's been fixed since 2008): the release of locks is not atomic with the persistence of the data in the write-ahead log (which makes it S2PL but not SS2PL). So it is possible for another connection to see data that won't be there after crash recovery. This is justified as an optimization. Personally, I would prefer not to see data from other transactions until it has actually been successfully committed. -Kevin In addition, their fulltext indexing only works with MyISAM tables. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad Query Plan with Range Query
On Fri, Apr 15, 2011 at 10:17:32AM -0700, Mark Williams wrote: We are experiencing a problem with our query plans when using a range query in Postgresql 8.3. The query we are executing attempts to select the minimum primary key id after a certain date. Our date columns are bigint's holding a unix epoch representation of the date. We have an index on the primary key and the date column. For the following query just specified the predicate modificationDate = ? explain SELECT min(messageID) FROM Message WHERE modificationDate = 1302627793988; QUERY PLAN - Result (cost=2640.96..2640.97 rows=1 width=0) InitPlan - Limit (cost=0.00..2640.96 rows=1 width=8) - Index Scan using message_pk on message (cost=0.00..3298561.09 rows=1249 width=8) Filter: ((messageid IS NOT NULL) AND (modificationdate = 1302627793988::bigint)) (5 rows) For some reason it is deciding to scan the primary key column of the table. This results in scanning the entire table which is huge (10 million records). However, if we specify a fake upper bound then the planner will correctly use the date column index: explain SELECT min(messageID) FROM Message WHERE modificationDate = 1302627793988 and modificationDate ; QUERY PLAN - Aggregate (cost=9.64..9.65 rows=1 width=8) - Index Scan using jvmssg_mdate_idx on message (cost=0.00..9.64 rows=1 width=8) Index Cond: ((modificationdate = 1302627793988::bigint) AND (modificationdate ::bigint)) (3 rows) We have carried out all the usual maintenance tasks. We have increase the statistics_target on both indexes to the maximum (1000) and performed a vacuum analyze on the table. Our resource configurations are very good since this is our production server. Interestingly this does not appear to happen with exactly the same database when using 8.4. Instead we get the correct plan without having to add the upper bound. Here is the full description of the the table. It contains upwards of 10 million rows. Table public.message Column | Type | Modifiers --++--- messageid| bigint | not null parentmessageid | bigint | threadid | bigint | not null containertype| integer| not null containerid | bigint | not null userid | bigint | subject | character varying(255) | body | text | modvalue | integer| not null rewardpoints | integer| not null creationdate | bigint | not null modificationdate | bigint | not null status | integer| not null Indexes: message_pk PRIMARY KEY, btree (messageid) jvmssg_cdate_idx btree (creationdate) jvmssg_cidctmd_idx btree (containerid, containertype, modificationdate) jvmssg_mdate_idx btree (modificationdate) jvmssg_mdvle_idx btree (modvalue) jvmssg_prntid_idx btree (parentmessageid) jvmssg_thrd_idx btree (threadid) jvmssg_usrid_idx btree (userid) Referenced by: TABLE answer CONSTRAINT answer_mid_fk FOREIGN KEY (messageid) REFERENCES message(messageid) TABLE messageprop CONSTRAINT jmp_msgid_fk FOREIGN KEY (messageid) REFERENCES message(messageid) Any insight into this would be greatly appreciated. We are not able to upgrade our databases to 8.4. We are reluctant to re-write all our range queries if possible. -m Here is the fix that was added to 8.4+: http://archives.postgresql.org/pgsql-committers/2010-01/msg00021.php I think you are stuck with one of those options so if upgrading is not available, then re-writing the range queries wins by a landslide. :) Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
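A sketch of the rewrite being weighed above, following the workaround the original poster already found: give the 8.3 planner an explicit upper bound so it can use the modificationdate index instead of walking the primary key (the upper bound shown here is simply the largest bigint):

    SELECT min(messageID)
      FROM Message
     WHERE modificationDate >= 1302627793988
       AND modificationDate <= 9223372036854775807;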
Re: [PERFORM] Updating histogram_bounds after a delete
On Thu, Mar 17, 2011 at 09:49:45AM -0500, Kevin Grittner wrote: Derrick Rice derrick.r...@gmail.com wrote: Kevin Grittner kevin.gritt...@wicourts.gov wrote: there is a feature to probe the end of an index's range in situations where data skew was often causing less than optimal plans to be chosen. Was this introduced in 9.0 or was it earlier? I don't remember when it was added. I took a stab at searching for it, but didn't get it figured out; if nobody who knows off-hand jumps in, I'll try again when I have more time. I think this is it: http://archives.postgresql.org/pgsql-committers/2010-01/msg00021.php Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Help with Query Tuning
On Wed, Mar 16, 2011 at 02:43:38PM +0530, Adarsh Sharma wrote: Dear all, I am facing a problem while creating the index to make the below query run faster. My table size is near about 1065 MB and 428467 rows. explain analyze select count(*) from page_content where publishing_date like '%2010%' and content_language='en' and content is not null and isprocessable = 1 and (content like '%Militant%' OR content like '%jihad%' OR content like '%Mujahid%' OR content like '%fedayeen%' OR content like '%insurgent%' OR content like '%terrorist%' OR content like '%cadre%' OR content like '%civilians%' OR content like '%police%' OR content like '%defence%' OR content like '%cops%' OR content like '%crpf%' OR content like '%dsf%' OR content like '%ssb%') AND (content like '%kill%' or content like '%injure%'); *Output: * Aggregate (cost=107557.78..107557.79 rows=1 width=0) (actual time=18564.631..18564.631 rows=1 loops=1) - Seq Scan on page_content (cost=0.00..107466.82 rows=36381 width=0) (actual time=0.146..18529.371 rows=59918 loops=1) Filter: ((content IS NOT NULL) AND (publishing_date ~~ '%2010%'::text) AND (content_language = 'en'::bpchar) AND (isprocessable = 1) AND (((content) ::text ~~ '%kill%'::text) OR ((content)::text ~~ '%injure%'::text)) AND (((content)::text ~~ '%Militant%'::text) OR ((content)::text ~~ '%jihad%'::text) OR ( (content)::text ~~ '%Mujahid%'::text) OR ((content)::text ~~ '%fedayeen%'::text) OR ((content)::text ~~ '%insurgent%'::text) OR ((content)::text ~~ '%terrori st%'::text) OR ((content)::text ~~ '%cadre%'::text) OR ((content)::text ~~ '%civilians%'::text) OR ((content)::text ~~ '%police%'::text) OR ((content)::text ~~ '%defence%'::text) OR ((content)::text ~~ '%cops%'::text) OR ((content)::text ~~ '%crpf%'::text) OR ((content)::text ~~ '%dsf%'::text) OR ((content)::text ~~ '%ssb%'::text))) Total runtime: 18564.673 ms *Index on that Table : *CREATE INDEX idx_page_id ON page_content USING btree (crawled_page_id); *Index I create :* CREATE INDEX idx_page_id_content ON page_content USING btree (crawled_page_id,content_language,publishing_date,isprocessable); *Index that fail to create: *CREATE INDEX idx_page_id_content1 ON page_content USING btree (crawled_page_id,content); Error :-ERROR: index row requires 13240 bytes, maximum size is 8191 ** Error ** ERROR: index row requires 13240 bytes, maximum size is 8191 SQL state: 54000 How to resolve this error Please give any suggestion to tune the query. Thanks best Regards, Adarsh Sharma You should probably be looking at using full-text indexing: http://www.postgresql.org/docs/9.0/static/textsearch.html or limit the size of content for the index. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
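A sketch of the full-text route, assuming the default english configuration fits the data; the index and query mirror a few of the LIKE terms above:

    CREATE INDEX idx_page_content_fts ON page_content
        USING gin (to_tsvector('english', content));

    SELECT count(*)
      FROM page_content
     WHERE to_tsvector('english', content)
           @@ to_tsquery('english', '(militant | jihad | insurgent) & (kill | injure)');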
Re: [PERFORM] ANTI-JOIN needs table, index scan not possible?
On Fri, Mar 11, 2011 at 06:54:39PM +0100, hans wulf wrote: Thanks for the answer. so there's no way around this problem? A nice index bitmap merge thing would be super fast. Big table ANTI JOIN queries with only a few results expected, are totally broken, if this is true. This way the query breaks my neck. This is a massive downside of postgres which makes this kind of query impossible. Mysql gives you the answer in a few seconds :-( Super! I am glad that MySQL can meet your needs. No software is perfect and you should definitely chose based on your use-case. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Performance issues
On Mon, Mar 07, 2011 at 03:17:05PM +0100, Andreas For? Tollefsen wrote: Thanks, Ken. It seems like the tip to turn off synchronous_commit did the trick: /usr/lib/postgresql/8.4/bin/pgbench -T 60 test1 starting vacuum...end. transaction type: TPC-B (sort of) scaling factor: 1 query mode: simple number of clients: 1 duration: 60 s number of transactions actually processed: 86048 tps = 1434.123199 (including connections establishing) tps = 1434.183362 (excluding connections establishing) Is this acceptable compared to others when considering my setup? Cheers, Andreas These are typical results for synchronous_commit off. The caveat is you must be able to handle loosing transactions if you have a database crash, but your database is still intact. This differs from turning fsync off in which a crash means you would need to restore from a backup. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
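Note that synchronous_commit does not have to be disabled cluster-wide; it can be limited to the sessions or transactions that can afford to lose their most recent commits after a crash, for example:

    -- per session:
    SET synchronous_commit = off;

    -- or per transaction:
    BEGIN;
    SET LOCAL synchronous_commit = off;
    -- ... work ...
    COMMIT;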
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 04:39:12PM -0800, da...@lang.hm wrote: On Thu, 3 Feb 2011, Robert Haas wrote: On Thu, Feb 3, 2011 at 3:54 PM, da...@lang.hm wrote: with the current code, this is a completely separate process that knows nothing about the load, so if you kick it off when you start the load, it makes a pass over the table (competing for I/O), finishes, you continue to update the table, so it makes another pass, etc. As you say, this is a bad thing to do. I am saying to have an option that ties the two togeather, essentially making the data feed into the Analyze run be a fork of the data comeing out of the insert run going to disk. So the Analyze run doesn't do any I/O and isn't going to complete until the insert is complete. At which time it will have seen one copy of the entire table. Yeah, but you'll be passing the entire table through this separate process that may only need to see 1% of it or less on a large table. If you want to write the code and prove it's better than what we have now, or some other approach that someone else may implement in the meantime, hey, this is an open source project, and I like improvements as much as the next guy. But my prediction for what it's worth is that the results will suck. :-) I will point out that 1% of a very large table can still be a lot of disk I/O that is avoided (especially if it's random I/O that's avoided) David Lang In addition, the streaming ANALYZE can provide better statistics at any time during the load and it will be complete immediately. As far as passing the entire table through the ANALYZE process, a simple counter can be used to only send the required samples based on the statistics target. Where this would seem to help the most is in temporary tables which currently do not work with autovacuum but it would streamline their use for more complicated queries that need an analyze to perform well. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 09:05:52PM -0500, Robert Haas wrote: On Thu, Feb 3, 2011 at 8:37 PM, da...@lang.hm wrote: On Thu, 3 Feb 2011, Robert Haas wrote: On Thu, Feb 3, 2011 at 7:39 PM, ?da...@lang.hm wrote: Yeah, but you'll be passing the entire table through this separate process that may only need to see 1% of it or less on a large table. If you want to write the code and prove it's better than what we have now, or some other approach that someone else may implement in the meantime, hey, this is an open source project, and I like improvements as much as the next guy. ?But my prediction for what it's worth is that the results will suck. ?:-) I will point out that 1% of a very large table can still be a lot of disk I/O that is avoided (especially if it's random I/O that's avoided) Sure, but I think that trying to avoid it will be costly in other ways - you'll be streaming a huge volume of data through some auxiliary process, which will have to apply some algorithm that's very different from the one we use today. ?The reality is that I think there's little evidence that the way we do ANALYZE now is too expensive. ?It's typically very cheap and works very well. ?It's a bit annoying when it fires off in the middle of a giant data load, so we might need to change the time of it a little, but if there's a problem with the operation itself being too costly, this is the first I'm hearing of it. ?We've actually worked *really* hard to make it cheap. I could be misunderstanding things here, but my understanding is that it's 'cheap' in that it has little impact on the database while it is running. I mean that it's cheap in that it usually takes very little time to complete. the issue here is that the workflow is load data analyze start work so the cost of analyze in this workflow is not 1% impact on query speed for the next X time, it's the database can't be used for the next X time while we wait for analyze to finish running OK. I don't understand why the algorithm would have to be so different than what's done today, surely the analyze thread could easily be tweaked to ignore the rest of the data (assuming we don't have the thread sending the data to analyze do the filtering) If you want to randomly pick 10,000 rows out of all the rows that are going to be inserted in the table without knowing in advance how many there will be, how do you do that? Maybe there's an algorithm, but it's not obvious to me. But mostly, I question how expensive it is to have a second process looking at the entire table contents vs. going back and rereading a sample of rows at the end. I can't remember anyone ever complaining ANALYZE took too long to run. I only remember complaints of the form I had to remember to manually run it and I wish it had just happened by itself. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company Probably doomed to be shot down, but since you are effectively inline, you could sample assuming a range of table row counts. Start at the size of a table where random (index) lookups are faster than a sequential scan and then at appropriate multiples, 100x, 100*100X,... then you should be able to generate appropriate statistics. I have not actually looked at how that would happen, but it would certainly allow you to process far, far fewer rows than the entire table. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Really really slow select count(*)
On Fri, Feb 04, 2011 at 05:20:27PM +0100, felix wrote: reply was meant for the list -- Forwarded message -- From: felix crucialfe...@gmail.com Date: Fri, Feb 4, 2011 at 4:39 PM Subject: Re: [PERFORM] Really really slow select count(*) To: Greg Smith g...@2ndquadrant.com On Fri, Feb 4, 2011 at 3:56 PM, Greg Smith g...@2ndquadrant.com wrote: PostgreSQL version? If you're running on 8.3 or earlier, I would be suspicous that your Free Space Map has been overrun. 8.3 What you are seeing is that the table itself is much larger on disk than it's supposed to be. which part of the explain told you that ? shaun thomas SELECT relpages*8/1024 FROM pg_class WHERE relname='fastadder_fastadderstatus'; 458MB way too big. build_cache is text between 500-1k chars As has been suggested, you really need to CLUSTER the table to remove dead rows. VACUUM will not do that, VACUUM FULL will but will take a full table lock and then you would need to REINDEX to fix index bloat. CLUSTER will do this in one shot. You almost certainly have your free space map way too small, which is how you bloated in the first place. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
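A sketch of that sequence on 8.3, reusing the size query from the thread; the index name is a guess and the free space map values are only illustrative:

    -- before/after size check:
    SELECT relpages*8/1024 AS size_mb FROM pg_class
     WHERE relname = 'fastadder_fastadderstatus';

    -- one-shot dead-row removal and index rebuild (takes an exclusive lock):
    CLUSTER fastadder_fastadderstatus USING fastadder_fastadderstatus_pkey;
    ANALYZE fastadder_fastadderstatus;

    -- 8.3 only: enlarge the free space map in postgresql.conf so freed space
    -- is remembered between vacuums (requires a restart):
    --   max_fsm_pages = 1000000
    --   max_fsm_relations = 2000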
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Thu, Feb 03, 2011 at 02:11:58AM -0800, da...@lang.hm wrote: On Thu, 3 Feb 2011, Vitalii Tymchyshyn wrote: 02.02.11 20:32, Robert Haas wrote: Yeah. Any kind of bulk load into an empty table can be a problem, even if it's not temporary. When you load a bunch of data and then immediately plan a query against it, autoanalyze hasn't had a chance to do its thing yet, so sometimes you get a lousy plan. Maybe introducing something like an 'AutoAnalyze' threshold will help? I mean that any insert/update/delete statement that changes more than x% of a table (and no less than y records) must do analyze right after it was finished. Defaults like x=50 y=1 should be quite good as for me. If I am understanding things correctly, a full Analyze is going over all the data in the table to figure out patterns. If this is the case, wouldn't it make sense in the situation where you are loading an entire table from scratch to run the Analyze as you are processing the data? If you don't want to slow down the main thread that's inserting the data, you could copy the data to a second thread and do the analysis while it's still in RAM rather than having to read it off of disk afterwards. this doesn't make sense for updates to existing databases, but the use case of loading a bunch of data and then querying it right away isn't _that_ uncommon. David Lang +1 for in-flight ANALYZE. This would be great for bulk loads of real tables as well as temp tables. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Server Configuration
On Wed, Feb 02, 2011 at 03:15:22PM -0300, Cesar Arrieta wrote: Hi, I have a Server with Fedora Core 11, Tomcat and Postgresql 8.3. With Hardware: * 8GB RAM * 8 processors Intel Xeon E5520 @2.27GHz * 250GB SATA DISK Actually, it serves at most 250 connections. The problem happends when it serves many many connections at a time, tables and queries began to get blocked, then I have to kill some processes in order to allow other people continue working. Wich recommendations could you give me for to configure postgresql.conf, and could it be eficcient to buy another server with almost same hardware in order to use pgPool2 with replication, load balance and parallel query?. It sounds like you may just need a connection pooler (pgpool, pgbouncer) and it might work just fine. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] Slow count(*) again...
On Wed, Feb 02, 2011 at 03:54:26PM -0500, Mladen Gogala wrote: Greg Smith wrote: Given that even Oracle kicked out the RBO a long time ago, I'm not so sure longing for those good old days will go very far. I regularly see queries that were tweaked to always use an index run at 1/10 or less the speed of a sequential scan against the same data. The same people complaining all over the place about this topic are also the sort who write them. There are two main fallacies at play here that make this happen: Oracle just gives an impression that RBO is gone. It's actually still there, even in 11.2: Connected to: Oracle Database 11g Enterprise Edition Release 11.2.0.2.0 - Production With the Partitioning, OLAP, Data Mining and Real Application Testing options SQL alter session set optimizer_mode=rule; Session altered. Oracle people were just as puritanical as Postgres people, if not more so. However, the huge backlash made them reconsider the decision. RBO is officially de-supported, obsolete and despised but it is also widely used, even in the Oracle's own SYS schema. Oracle is having huge problems with trying to get people to the cost based optimizer, but they are not yet quite done. This problem in getting people to migrate to the cost-based optimizer seems to stem from the original use of the rule based optimizer and the ability to (mis)hint every option in the DB. If I were running a shop with 100k-1m lines of SQL code with embedded hints, I would run screaming at the QA required to move to the cost-based system. In many ways, the RBO itself + hints is hindering the adoption of the CBO. Are there any stats on the adoption/use of the CBO on new Oracle users/shops? 1) Even if you use an index, PostgreSQL must still retrieve the associated table data to execute the query in order to execute its version of MVCC Of course. Nobody contests that. However, index scans for OLTP are indispensable. Sequential scans just don't do the trick in some situations. 2) The sort of random I/O done by index lookups can be as much as 50X as expensive on standard hard drives as sequential, if every block goes to physical hardware. Greg, how many questions about queries not using an index have you seen? There is a reason why people keep asking that. The sheer number of questions like that on this group should tell you that there is a problem there. There must be a relatively simple way of influencing optimizer decisions. With all due respect, I consider myself smarter than the optimizer. I'm 6'4, 235LBS so telling me that you disagree and that I am more stupid than a computer program, would not be a smart thing to do. Please, do not misunderestimate me. I see them come up regularly. However, there really are not all that many when you consider how many people are using PostgreSQL. Its optimizer works quite well. Knowing how hints can be misused, I would rather have the developers use their resource to improve the optimizer than spend time on a hint system that would be mis-used over and over by beginners, with the attendent posts to HACKERS/PERFORM/NOVICE/... groups. I certainly have had a fun time or two in my limited Oracle experience tracking down a hint-based performance problem, so it works both ways. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] postgres 9 query performance
On Fri, Jan 28, 2011 at 09:30:19AM -0800, yazan suleiman wrote: I am evaluating postgres 9 to migrate away from Oracle. The following query runs too slow, also please find the explain plan: explain analyze select DISTINCT EVENT.ID, ORIGIN.ID AS ORIGINID,EVENT.PREFERRED_ORIGIN_ID AS PREFERRED_ORIGIN, EVENT.CONTRIBUTOR, ORIGIN.TIME, ORIGIN.LATITUDE, ORIGIN.LONGITUDE, ORIGIN.DEPTH,ORIGIN.EVTYPE, ORIGIN.CATALOG, ORIGIN.AUTHOR OAUTHOR, ORIGIN.CONTRIBUTOR OCONTRIBUTOR, MAGNITUDE.ID AS MAGID, MAGNITUDE.MAGNITUDE,MAGNITUDE.TYPE AS MAGTYPE from event.event left join event.origin on event.id=origin.eventid left join event.magnitude on origin.id=event.magnitude.origin_id WHERE EXISTS(select origin_id from event.magnitude where magnitude.magnitude=7.2 and origin.id=origin_id) order by ORIGIN.TIME desc,MAGNITUDE.MAGNITUDE desc,EVENT.ID ,EVENT.PREFERRED_ORIGIN_ID,ORIGIN.ID Unique (cost=740549.86..741151.42 rows=15039 width=80) (actual time=17791.557..17799.092 rows=5517 loops=1) - Sort (cost=740549.86..740587.45 rows=15039 width=80) (actual time=17791.556..17792.220 rows=5517 loops=1) Sort Key: origin.time, event.magnitude.magnitude, event.id, event.preferred_origin_id, origin.id, event.contributor, origin.latitude, origin.longitude, origin.depth, origin.evtype, origin.catalog, origin.author, origin.contributor, event.magnitude.id, event.magnitude.type Sort Method: quicksort Memory: 968kB - Nested Loop Left Join (cost=34642.50..739506.42 rows=15039 width=80) (actual time=6.927..17769.788 rows=5517 loops=1) - Hash Semi Join (cost=34642.50..723750.23 rows=14382 width=62) (actual time=6.912..17744.858 rows=2246 loops=1) Hash Cond: (origin.id = event.magnitude.origin_id) - Merge Left Join (cost=0.00..641544.72 rows=6133105 width=62) (actual time=0.036..16221.008 rows=6133105 loops=1) Merge Cond: (event.id = origin.eventid) - Index Scan using event_key_index on event (cost=0.00..163046.53 rows=3272228 width=12) (actual time=0.017..1243.616 rows=3276192 loops=1) - Index Scan using origin_fk_index on origin (cost=0.00..393653.81 rows=6133105 width=54) (actual time=0.013..3033.657 rows=6133105 loops=1) - Hash (cost=34462.73..34462.73 rows=14382 width=4) (actual time=6.668..6.668 rows=3198 loops=1) Buckets: 2048 Batches: 1 Memory Usage: 113kB - Bitmap Heap Scan on magnitude (cost=324.65..34462.73 rows=14382 width=4) (actual time=1.682..5.414 rows=3198 loops=1) Recheck Cond: (magnitude = 7.2) - Bitmap Index Scan on mag_index (cost=0.00..321.05 rows=14382 width=0) (actual time=1.331..1.331 rows=3198 loops=1) Index Cond: (magnitude = 7.2) - Index Scan using mag_fkey_index on magnitude (cost=0.00..1.06 rows=3 width=22) (actual time=0.007..0.009 rows=2 loops=2246) Index Cond: (origin.id = event.magnitude.origin_id) Total runtime: 17799.669 ms This query runs in Oracle in 1 second while takes 16 seconds in postgres, The difference tells me that I am doing something wrong somewhere. This is a new installation on a local Mac machine with 12G of RAM. I have: effective_cache_size=4096MB shared_buffer=2048MB work_mem=100MB It sounds like the queries are not doing the same thing. What is the schema/index definition for Oracle versus PostgreSQL? Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Postgres 9.0 has a bias against indexes
On Thu, Jan 27, 2011 at 10:41:08AM -0500, Mladen Gogala wrote: I have a table EMP, with 14 rows and a description like this: scott= \d+ emp Table public.emp Column |Type | Modifiers | Storage | Description --+-+---+--+- empno| smallint| not null | plain| ename| character varying(10) | | extended | job | character varying(9)| | extended | mgr | smallint| | plain| hiredate | timestamp without time zone | | plain| sal | double precision| | plain| comm | double precision| | plain| deptno | smallint| | plain| Indexes: emp_pkey PRIMARY KEY, btree (empno) emp_mgr_i btree (mgr) Foreign-key constraints: fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno) Has OIDs: no scott= A recursive query doesn't use existing index on mgr: scott= explain analyze with recursive e(empno,ename,mgr,bossname,level) as ( select empno,ename,mgr,NULL::varchar,0 from emp where empno=7839 union select emp.empno,emp.ename,emp.mgr,e.ename,e.level+1 from emp,e where emp.mgr=e.empno) select * from e; QUERY PLAN - CTE Scan on e (cost=20.59..23.21 rows=131 width=78) (actual time=0.020..0.143 rows=14 loops=1) CTE e - Recursive Union (cost=0.00..20.59 rows=131 width=52) (actual time=0.018..0.128 rows=14 loops=1) - Seq Scan on emp (cost=0.00..1.18 rows=1 width=10) (actual time=0.013..0.015 rows=1 loops=1) Filter: (empno = 7839) - Hash Join (cost=0.33..1.68 rows=13 width=52) (actual time=0.016..0.021 rows=3 loops=4) Hash Cond: (public.emp.mgr = e.empno) - Seq Scan on emp (cost=0.00..1.14 rows=14 width=10) (actual time=0.001..0.004 rows=14 loops=4) - Hash (cost=0.20..0.20 rows=10 width=44) (actual time=0.004..0.004 rows=4 loops=4) Buckets: 1024 Batches: 1 Memory Usage: 1kB - WorkTable Scan on e (cost=0.00..0.20 rows=10 width=44) (actual time=0.001..0.002 rows=4 loops=4) Total runtime: 0.218 ms (12 rows) scott= The optimizer will not use index, not even when I turn off both hash and merge joins. This is not particularly important for a table with 14 rows, but for a larger table, this is a problem. The only way to actually force the use of index is by disabling seqscan, but that chooses a wrong path again, because it reads the outer table by primary key, which will be very slow. Full table scan, done by the primary key is probably the slowest thing around. I know about the PostgreSQL philosophy which says hints are bad, and I deeply disagree with it, but would it be possible to have at least one parameter that would change calculations in such a way that indexes are favored, where they exist? -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com Hi Mladen, PostgreSQL will only use an index if the planner thinks that it will be faster than the alternative, a sequential scan in this case. For 14 rows, a sequential scan is 1 read and should actually be faster than the index. Did you try the query using EXPLAIN ANALYZE once with index and once without? What were the timings? If they do not match reality, adjusting cost parameters would be in order. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
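One way to run that comparison, using the mgr index from the table definition above; the settings only affect the current session and the cost value is illustrative:

    SET enable_seqscan = off;            -- for testing only, never in production
    EXPLAIN ANALYZE SELECT * FROM emp WHERE mgr = 7839;
    RESET enable_seqscan;

    -- If the index plan really is faster on realistically sized data, lowering
    -- random_page_cost (default 4) is the supported way to favor index scans:
    SET random_page_cost = 2;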
Re: [PERFORM] Slow query + why bitmap index scan??
On Wed, Jan 12, 2011 at 03:21:45PM +0100, Laszlo Nagy wrote: On 2011-01-12 14:42, Florian Weimer wrote: * Laszlo Nagy: This query: select hid from product_price_history where id=35547581 Returns 759 rows in 8837 msec! How can this be that slow??? If most records are on different heap pages, processing this query requires many seeks. 11ms per seek is not too bad if most of them are cache misses. How about this: select id,hdate from product_price_history where id=35547581 -- 759 rows, 8837 ms Query time average: 3 sec. Query plan: Bitmap Heap Scan on product_price_history (cost=13.91..1871.34 rows=474 width=16) Recheck Cond: (id = 35547582) - Bitmap Index Scan on idx_product_price_history_id_hdate (cost=0.00..13.79 rows=474 width=0) Index Cond: (id = 35547582) Why still the heap scan here? All fields in the query are in the index... Wouldn't a simple index scan be faster? (This is only a theoretical question, just I'm curious.) Because of PostgreSQL's MVCC design, it must visit each heap tuple to check its visibility as well as look it up in the index. My first idea to speed things up is to cluster this table regularly. That would convert (most of the) rows into a few pages. Few page reads - faster query. Is it a good idea? Yes, clustering this table would greatly speed up this type of query. Another question. Do you think that increasing shared_mem would make it faster? I doubt it. Currently we have: shared_mem = 6GB work_mem = 512MB total system memory=24GB Total database size about 30GB, but there are other programs running on the system, and many other tables. Thanks, Laszlo Clustering is your best option until we get indexes with visibility information. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
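A sketch of the clustering suggestion, using the index from the plan above; CLUSTER takes an exclusive lock for the duration and has to be repeated periodically as new rows arrive out of order:

    CLUSTER product_price_history USING idx_product_price_history_id_hdate;
    ANALYZE product_price_history;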
Re: [PERFORM] PostgreSQL 9.0 x64 bit pgbench TPC very low question?
On Thu, Dec 23, 2010 at 09:20:59PM +0700, tuanhoanganh wrote: Could you show me what parameter of pgbouncer.ini can do that. I read pgbouncer and can not make pgbouncer open and keep 200 connect to postgres (Sorry for my English) Thanks you very much. Tuan Hoang ANh You need to use session pooling for that to work. From the man page: In order not to compromise transaction semantics for connection pooling, pgbouncer supports several types of pooling when rotating connections: Session pooling Most polite method. When client connects, a server connection will be assigned to it for the whole duration the client stays connected. When the client disconnects, the server connection will be put back into the pool. This is the default method. Transaction pooling A server connection is assigned to client only during a transaction. When PgBouncer notices that transaction is over, the server connection will be put back into the pool. Statement pooling Most aggressive method. The server connection will be put back into pool immediately after a query completes. Multi-statement transactions are disallowed in this mode as they would break. The fact that pgbouncer will not keep 200 connections open to the database means that you do not have enough work to actually keep 200 permanent connections busy. It is much more efficient to use transaction pooling. You typically want the number of persistent database connections to be a small multiple of the number of CPUs (cores) on your system. Then set pgbouncer to allow as many client connections as you need. This will give you the best throughput and pgbouncer can setup and tear down the connections to your clients much, much faster than making a full connection to the PostgreSQL database. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
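For reference, the relevant pgbouncer.ini knobs look roughly like this (database name, host, and sizes are placeholders):

    [databases]
    test = host=127.0.0.1 port=5432 dbname=test

    [pgbouncer]
    pool_mode = transaction    ; session | transaction | statement
    default_pool_size = 20     ; server connections kept per database/user pair
    max_client_conn = 200      ; clients can still open 200 connections at a time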
Re: [PERFORM] CPU bound
On Mon, Dec 20, 2010 at 10:33:26AM -0500, James Cloos wrote: MG == Mladen Gogala mladen.gog...@vmsinfo.com writes: MG Good time accounting is the most compelling reason for having a wait MG event interface, like Oracle. Without the wait event interface, one MG cannot really tell where the time is spent, at least not without MG profiling the database code, which is not an option for a production MG database. And how exactly, given that the kernel does not know whether the CPU is active or waiting on ram, could an application do so? Exactly. I have only seen this data from hardware emulators. It would be nice to have... :) Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Compared MS SQL 2000 to Postgresql 9.0 on Windows
On Tue, Dec 07, 2010 at 11:56:51AM -0800, Richard Broersma wrote: On Tue, Dec 7, 2010 at 11:43 AM, Andy Colson a...@squeakycode.net wrote: In PG the first statement you fire off (like an insert into for example) will start a transaction. ?If you dont commit before you disconnect that transaction will be rolled back. ?Even worse, if your program does not commit, but keeps the connection to the db open, the transaction will stay open too. Huh - is this new? I always thought that every statement was wrapped in its own transaction unless you explicitly start your own. So you shouldn't need to commit before closing a connection if you never opened a transaction to begin with. -- Regards, Richard Broersma Jr. The default of autocommit unless explicitly starting a transaction with BEGIN is the normal behavior that I have seen as well. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Update problem on large table
On Mon, Dec 06, 2010 at 03:24:31PM -0500, Josh Kupershmidt wrote: On Mon, Dec 6, 2010 at 2:48 PM, Jon Nelson jnelson+pg...@jamponi.net wrote: On Mon, Dec 6, 2010 at 1:46 PM, bricklen brick...@gmail.com wrote: Not sure if anyone replied about killing your query, but you can do it like so: select pg_cancel_backend(5902); ?-- assuming 5902 is the pid of the query you want canceled. How does this differ from just killing the pid? pg_cancel_backend(5902) does the same thing as: kill -SIGINT 5902 Josh Yes, but you can use it from within the database. The kill command requires shell access to the backend. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
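If you do not already know the pid, it can also be looked up from within the database (the column names shown are the pre-9.2 ones, procpid/current_query):

    SELECT procpid, usename, current_query, query_start
      FROM pg_stat_activity
     ORDER BY query_start;

    SELECT pg_cancel_backend(5902);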
Re: [PERFORM] SELECT INTO large FKyed table is slow
On Wed, Dec 01, 2010 at 12:15:19PM -0500, Mladen Gogala wrote: Mario Splivalo wrote: I'll try what Pierre suggested, on whole new filesystem. This one did get quite filled with thousands of files that I deleted while the database was working. Mario Yes, that is a good idea. That's the reason why we need a defragmentation tool on Linux. Unfortunately, the only file system that currently has a decent defragmentation tool is XFS and that is a paid option, at least with Red Hat. Greg Smith has recently posted a wonderful review of PostgreSQL on various file systems: http://blog.2ndquadrant.com/en/2010/04/the-return-of-xfs-on-linux.html There is a operating system which comes with a very decent extent based file system and a defragmentation tool, included in the OS. The file system is called NTFS and company is in the land of Redmond, WA where the shadows lie. One OS to rule them all... -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions Redhat6 comes with ext4 which is an extent based filesystem with decent performance. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] MVCC performance issue
On Fri, Nov 12, 2010 at 03:47:30PM +0200, Kyriacos Kyriacou wrote: This is my first post in this mailing list and I would like to raise an issue that in my opinion is causing performance issues of PostgreSQL especially in a transaction processing environment. In my company we are using PostgreSQL for the last 8 year for our in-house developed billing system (telecom). The last few months we started considering moving to another RDBMS just because of this issue. After all these years, I believe that the biggest improvement that could be done and will boost overall performance especially for enterprise application will be to improve Multiversion Concurrency Control (MVCC) mechanism. In theory this seems to be improving performance for SELECT queries but on tables with very intensive and frequent updates, even that is not fully true because of the fragmentation of data caused by MVCC. I saw cases were a SELECT COUNT(*) on an empty (!!!) table (used as a buffer) took more than 40min to return a result! VACUUM is not a solution in my opinion even though after the introduction of autovacuum daemon situation got much better. PROBLEM DECRIPTION -- By definition of MVCC, when an UPDATE is performed, PostgreSQL creates a new copy of the row in a new location. Any SELECT queries within the same session are accessing the new version of the raw and all other queries from other users are still accessing the old version. When transaction is COMMIT PostgreSQL makes the a new version of the row as the active row and expires the old row that remains dead and then is up to VACUUM procedure to recover the dead rows space and make it available to the database engine. In case that transaction is ROLLBACK then space reserved for the new version of the row is released. The result is to have huge fragmentation on table space, unnecessary updates in all affected indexes, unnecessary costly I/O operations, poor performance on SELECT that retrieves big record sets (i.e. reports etc) and slower updates. As an example, consider updating the live balance of a customer for each phone call where the entire customer record has to be duplicated again and again upon each call just for modifying a numeric value! SUGGESTION -- 1) When a raw UPDATE is performed, store all new raw versions either in separate temporary table space or in a reserved space at the end of each table (can be allocated dynamically) etc 2) Any SELECT queries within the same session will be again accessing the new version of the row 3) Any SELECT queries from other users will still be accessing the old version 4) When UPDATE transaction is ROLLBACK just release the space used in new temporary location 5) When UPDATE transaction is COMMIT then try to LOCK the old version and overwrite it at the same physical location (NO FRAGMENTATION). 6) Similar mechanism can be applied on INSERTS and DELETES 7) In case that transaction was COMMIT, the temporary location can be either released or archived/cleaned on a pre-scheduled basis. This will possibly allow the introduction of a TRANSACTION LOG backup mechanism as a next step. 8) After that VACUUM will have to deal only with deletions!!! I understand that my suggestion seems to be too simplified and also that there are many implementation details and difficulties that I am not aware. I strongly believe that the outcome of the discussion regarding this issue will be helpful. 
Best Regards, Kyriacos Kyriacou Senior Developer/DBA I cannot speak to your suggestion, but it sounds like you are not vacuuming enough and a lot of the bloat/randomization would be helped by making use of HOT updates in which the updates are all in the same page and are reclaimed almost immediately. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] MVCC performance issue
On Fri, Nov 12, 2010 at 07:34:36AM -0800, bricklen wrote: On Fri, Nov 12, 2010 at 5:52 AM, Kenneth Marshall k...@rice.edu wrote: I cannot speak to your suggestion, but it sounds like you are not vacuuming enough and a lot of the bloat/randomization would be helped by making use of HOT updates in which the updates are all in the same page and are reclaimed almost immediately. Regards, Ken IIRC, HOT only operates on non-indexed columns, so if you the tables are heavily indexed you won't get the full benefit of HOT. I could be wrong though. That is true, but if they are truly having as big a bloat problem as the message indicated, it would be worth designing the schema to leverage HOT for the very frequent updates. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
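On 8.3 or later, a sketch of what designing the schema to leverage HOT can look like for the per-call balance update described earlier; the table and column names are invented:

    -- Leave ~30% of each heap page free so a new row version can stay on the
    -- same page (a table rewrite such as CLUSTER is needed before the new
    -- fillfactor applies to existing pages):
    ALTER TABLE customer_balance SET (fillfactor = 70);

    -- HOT only applies when none of the updated columns are indexed, so the
    -- frequently changed balance column itself should carry no index:
    UPDATE customer_balance SET balance = balance - 0.05 WHERE customer_id = 42;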
Re: [PERFORM] MVCC performance issue
Ah, this is a very old version. If you can take advantage of a version with HOT support, you should be much, much happier. Cheers, Ken On Fri, Nov 12, 2010 at 06:14:00PM +0200, Kyriacos Kyriacou wrote: We are still using PostgreSQL 8.2.4. We are running a 24x7 system and database size is over 200Gb so upgrade is not an easy decision! I have it in my plans so in next few months I will setup new servers and upgrade to version 9. Which version of PostgreSQL are you basing this on? -- Thom Brown Twitter: @darkixion IRC (freenode): dark_ixion Registered Linux user: #516935 -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Wed, Nov 10, 2010 at 10:47:21PM -0500, Robert Haas wrote: On Wed, Nov 10, 2010 at 6:07 PM, Tom Lane t...@sss.pgh.pa.us wrote: Kevin Grittner kevin.gritt...@wicourts.gov writes: Robert Haas robertmh...@gmail.com wrote: Unfortunately, to know how much data we're going to grovel through, we need to know the plan; and to decide on the right plan, we need to know how much data we're going to grovel through. And that's where they've been ending. The only half-sane answer I've thought of is to apply a different cost to full-table or full-index scans based on the ratio with effective cache size. Kevin, yes, good point. Bravo! Let's do that. Details TBD, but suppose effective_cache_size = 1GB. What we know for sure is that a 4 GB table is not going to be fully cached but a 4 MB table may well be. In fact, I think we should assume that the 4 MB table IS cached, because the point is that if it's used at all, it soon will be. It's almost certainly a bad idea to build a plan around the idea of minimizing reads from that 4 MB table in favor of doing a substantial amount of additional work somewhere else. I suppose this could break down if you had hundreds and hundreds of 4 MB tables all of which were accessed regularly, but that's an unusual situation, and anyway it's not clear that assuming them all uncached is going to be any better than assuming them all cached. This might have some connection to some rather half-baked ideas I've been having in connection with the generalized-inner-indexscan problem. I don't have anything in the way of a coherent presentation to make yet, but the thing I'm being forced to realize is that sane modeling of a complex subplan that's on the inside of a nestloop join requires treating *every* scan type as having different costs the first time versus during rescan. ?If the total amount of data touched in the query is less than effective_cache_size, it's not unreasonable to suppose that I/O costs during rescan might be zero, even for a seqscan or a non-parameterized indexscan. ?In fact, only parameterized indexscans would be able to touch pages they'd not touched the first time, and so they ought to have higher not lower rescan costs in this environment. But once the total data volume exceeds effective_cache_size, you have to do something different since you shouldn't any longer assume the data is all cached from the first scan. ?(This isn't quite as hard as the case you're talking about, since I think the relevant data volume is the sum of the sizes of the tables used in the query; which is easy to estimate at the start of planning, unlike the portion of the tables that actually gets touched.) Well, we don't want the costing model to have sharp edges. effective_cache_size can't be taken as much more than an educated guess, and what actually happens will depend a lot on what else is going on on the system. If only one query is running on a system at a time and it is repeatedly seq-scanning a large table, the cost of reading pages in will be very small until the table grows large enough that you can't fit the whole thing in memory at once, and then will abruptly go through the roof. But realistically you're not going to know exactly where that edge is going to be, because you can't predict exactly how much concurrent activity there will be, for example, or how much work_mem allocations will push out of the OS buffer cache. 
So I'm thinking we should start the costs at something like 0.05/0.05 for tables that are much smaller than effective_cache_size and ramp up to 4/1 for tables that are larger than effective_cache_size. Maybe just by linearly ramping up, although that has a certain feeling of being without mathematical soundness. An idea that isn't even half-baked yet is that once we had a cost model like that, we might be able to produce plans that are well-tuned for a heavily cached environment by applying the rescan cost model even to the first scan for a particular query. So that might lead to some sort of assume_cached GUC parameter, and perhaps Kevin could tune his reporting queries by turning that off instead of messing with individual cost constants. I think the real goal here should be to try to avoid needing a GUC. A lot of people could benefit if the system could make some attempt to recognize on its own which queries are likely to be cached. We already have parameters you can hand-tune for each query as necessary. Being able to set some parameters system-wide and then get sensible behavior automatically would be much nicer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and indexes would contribute a smaller amount to the total, while larger indexes and tables would contribute a larger amount. Then by comparing this running total to the effective_cache_size, set the random and sequential costs for a query. This would allow the case of many 4MB tables to favor disk I/O more than memory I/O. The weighting could be a function of simultaneous users of the table. I know this is a bit of hand-waving but some sort of dynamic feedback needs to be provided to the planning process as system use increases. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] CREATE INDEX as bottleneck
On Thu, Nov 11, 2010 at 02:41:12PM +0100, Marc Mamin wrote: Hello, in the last years, we have successfully managed to cope with our data growth using partitioning and splitting large aggregation tasks on multiple threads. The partitioning is done logically by our application server, thus avoiding trigger overhead. There are a few places in our data flow where we have to wait for index creation before being able to distribute the process on multiple threads again. With the expected growth, create index will probably become a severe bottleneck for us. Is there any chance to see major improvement on it in the medium term? I guess the question is naive, but why can't postgres use multiple threads for large sort operations? best regards, Marc Mamin There has been a recent discussion on the hackers mailing list on using the infrastructure that is already in place to launch autovacuum processes to also launch other helper processes. Something like this could be used to offload the sort process to a much more parallelized version that could take advantage of multiple I/O streams and CPU cores. Many things are possible given the appropriate resources: funding, coding and development cycles... Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Thu, Nov 11, 2010 at 09:15:58AM -0500, Mladen Gogala wrote: Kenneth Marshall wrote: I agree with the goal of avoiding the need for a GUC. This needs to be as automatic as possible. One idea I had had was computing a value for the amount of cache data in the system by keeping a sum or a weighted sum of the table usage in the system. Smaller tables and indexes would contribute a smaller amount to the total, while larger indexes and tables would contribute a larger amount. Then by comparing this running total to the effective_cache_size, set the random and sequential costs for a query. This would allow the case of many 4MB tables to favor disk I/O more than memory I/O. The weighting could be a function of simultaneous users of the table. I know this is a bit of hand-waving but some sort of dynamic feedback needs to be provided to the planning process as system use increases. Regards, Ken Kenneth, you seem to be only concerned with the accuracy of the planning process, not with the plan stability. As a DBA who has to monitor real world applications, I find things like an execution plan changing with the use of the system to be my worst nightmare. The part where you say that this needs to be as automatic as possible probably means that I will not be able to do anything about it, if the optimizer, by any chance, doesn't get it right. That looks to me like an entirely wrong way to go. When application developer tunes the SQL both him and me expect that SQL to always perform that way, not to change the execution plan because the system is utilized more than it was 1 hour ago. Nobody seems to have taken my suggestion about having a parameter which would simply invent the percentage out of thin air seriously, because it's obviously not accurate. However, the planner accuracy is not the only concern. Running applications on the system usually requires plan stability. Means of external control of the execution plan, DBA knobs and buttons that can be turned and pushed to produce the desired plan are also very much desired. -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 www.vmsinfo.com Hi Mladen, I think in many ways, this is the same problem. Because we are not correctly modeling the system, the plan choices are not accurate either for some scenarios. This means that when plan costs are compared, the evaluation is not accurate. This is what causes the terrible plans being right next to the good plans and is what impacts the plan stability. If the costs are correct, then in fact the plan stability will be much better with the better costing, not worse. Plans with close costs should actually have close performance. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] anti-join chosen even when slower than old plan
On Thu, Nov 11, 2010 at 03:56:25PM -0500, gnuo...@rcn.com wrote: On a thread some time ago, on a similar subject, I opined that I missed the ability to assign tables to tablespaces and buffers to tablespaces, thus having the ability to isolate needed tables (perhaps a One True Lookup Table, for example; or a Customer table) to memory without fear of eviction. I was soundly beaten about the face and breast. It really is an Enterprise way of handling the situation. regards, Robert ALTER TABLE can be used to change the tablespace of a table and/or index. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
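A minimal sketch of that, with made-up names and paths; note that this only controls placement on disk, not residency in the buffer cache:

CREATE TABLESPACE fast_disk LOCATION '/mnt/fast/pgdata';
ALTER TABLE customer SET TABLESPACE fast_disk;
ALTER INDEX customer_pkey SET TABLESPACE fast_disk;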
Re: [PERFORM] Bufer cache replacement LRU algorithm?
Mladen, You would need to check the mailing lists. The release notes have it as being a clock sweep algorithm starting in version 8. Then additional changes were added to eliminate the cache blowout caused by a sequential scan and by vacuum/autovacuum. I do not believe that there are any parameters available other than total size of the pool and whether sequential scans are synchronized. Regards, Ken On Wed, Nov 03, 2010 at 12:35:33PM -0400, Mladen Gogala wrote: Where can I find the documentation describing the buffer replacement policy? Are there any parameters governing the page replacement policy? -- Mladen Gogala Sr. Oracle DBA 1500 Broadway New York, NY 10036 (212) 329-5251 http://www.vmsinfo.com The Leader in Integrated Media Intelligence Solutions -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
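The two knobs mentioned can be inspected directly from any session, for example:

SHOW shared_buffers;          -- total size of the shared buffer pool
SHOW synchronize_seqscans;    -- whether concurrent sequential scans piggyback on one another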
Re: [PERFORM] Select count(*), the sequel
On Wed, Oct 27, 2010 at 05:49:42PM -0400, Tom Lane wrote: Kenneth Marshall k...@rice.edu writes: Just keeping the hope alive for faster compression. Is there any evidence that that's something we should worry about? I can't recall ever having seen a code profile that shows the pg_lzcompress.c code high enough to look like a bottleneck compared to other query costs. Now, the benefits of 2X or 3X space savings would be pretty obvious, but I've seen no evidence that we could easily achieve that either. regards, tom lane One use is to allow substr() on toasted values without needing to decompress the entire contents. Another possibility is to keep larger fields compressed in memory for some value of larger. With faster compression, it might by useful to compress the WAL files to support faster data rates and therefore update rates for the same hardware. And there are always the in page common substring storage optimizations to reduce index/table sizes. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
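For context, sliced access is already cheap today when a value is stored out of line uncompressed; faster compression is what would make the same trick attractive for compressed values. A small example with made-up table and column names:

ALTER TABLE messages ALTER COLUMN body SET STORAGE EXTERNAL;  -- out-of-line, uncompressed storage
SELECT substr(body, 1, 200) FROM messages WHERE id = 42;      -- fetches only the TOAST chunks it needs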
Re: [PERFORM] Select count(*), the sequel
On Wed, Oct 27, 2010 at 09:52:49PM +0200, Pierre C wrote: Even if somebody had a great idea that would make things smaller without any other penalty, which I'm not sure I believe either. I'd say that the only things likely to bring an improvement significant enough to warrant the (quite large) hassle of implementation would be: - read-only / archive tables (get rid of row header overhead) - in-page compression using per-column delta storage for instance (no random access penalty, but hard to implement, maybe easier for read-only tables) - dumb LZO-style compression (license problems, needs parallel decompressor, random access penalty, hard to implement too) Different algorithms have been discussed before. A quick search turned up: quicklz - GPL or commercial fastlz - MIT works with BSD okay zippy - Google - no idea about the licensing lzf - BSD-type lzo - GPL or commercial zlib - current algorithm Of these, lzf can compress at almost 3.7X the speed of zlib and decompress at 1.7X the speed, and fastlz can compress at 3.1X the speed of zlib and decompress at 1.9X. The same comparison puts lzo at 3.0X for compression and 1.8X for decompression. The block design of lzf/fastlz may be useful to support substring access to toasted data among other ideas that have been floated here in the past. Just keeping the hope alive for faster compression. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Periodically slow inserts
Hi, There are a lot of details missing about your system: http://wiki.postgresql.org/wiki/SlowQueryQuestions Cheers, Ken On Thu, Oct 21, 2010 at 02:25:44PM +0200, Gael Le Mignot wrote: Hello, We are using PostgreSQL for storing data and full-text search indexes for the webiste of a daily newspaper. We are very happy overall with the results, but we have one weird behaviour that we would like to solve. ... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Select count(*), the sequel
Hi, Interesting data points. The amount of rows that you managed to insert into PostgreSQL before Oracle gave up the ghost is 95% of the rows in the Oracle version of the database. To count 5% fewer rows, it took PostgreSQL 24 seconds longer. Or adjusting for the missing rows, 52 seconds longer for the entire table or 18% longer than the full table scan in Oracle. This seems to be well within the table layout size differences, possibly due to the fillfactor used --not really bad at all. Now the timings due to algorithm changes are interesting as indicating the room for improvement due to those type of changes. A parallel sequential full-table scan in PostgreSQL could provide the same speed up. Currently that is not possible ... but development continues a pace... In fact, developing such functions in PostgreSQL could end up being less expensive long-term than licensing Oracle RAC. I think the point that you have helped make is that PostgreSQL performs very well for many use cases that have typically been relegated to expensive commecial databases such as Oracle, DB2,... Regards, Ken On Sat, Oct 16, 2010 at 12:53:50PM -0400, Mladen Gogala wrote: There was some doubt as for the speed of doing the select count(*) in PostgreSQL and Oracle. To that end, I copied the most part of the Oracle table I used before to Postgres. Although the copy wasn't complete, the resulting table is already significantly larger than the table it was copied from. The result still shows that Oracle is significantly faster: Oracle result: SQL alter system flush buffer_cache; System altered. SQL select /*+ full(NO) noparallel */ count(*) from ni_occurrence no; COUNT(*) -- 402062638 Elapsed: 00:03:16.45 Hints are necessary because Oracle table is declared as parallel and I didn't want the PK index to be used for counting. Oracle has a good habit of using PK's for counting, if available. SQL select bytes/1048576 as MB 2 from user_segments 3 where segment_name='NI_OCCURRENCE'; MB -- 35329 Elapsed: 00:00:00.85 SQL So, oracle stores 402 million records in 35GB and counts them in 3 minutes 16.45 seconds The very same table was partially copied to Postgres, copying died with ORA-01555 snapshot too old sometimes this morning. I ran vacuumdb -f -z on the database after the copy completed and the results are below. mgogala=# select count(*) from ni_occurrence; count --- 382400476 (1 row) Time: 221716.466 ms mgogala=# mgogala=# select 221/60::real; ?column? -- 3.68 (1 row) Time: 0.357 ms mgogala=# mgogala=# select pg_size_pretty(pg_table_size('ni_occurrence')); pg_size_pretty 46 GB (1 row) Time: 0.420 ms mgogala=# The database wasn't restarted, no caches were flushed, the comparison was done with a serious advantage for PostgreSQL. Postgres needed 3.68 minutes to complete the count which is about the same Oracle but still somewhat slower. Also, I am worried about the sizes. Postgres table is 11GB larger than the original, despite having less data. That was an unfair and unbalanced comparison because Oracle's cache was flushed and Oracle was artificially restrained to use the full table scan without the aid of parallelism. 
Here is the same result, with no hints and the autotrace on, which shows what happens if I turn the hints off: SQL select count(*) from ni_occurrence no; COUNT(*) -- 402062638 Elapsed: 00:00:52.61 Execution Plan -- Plan hash value: 53476935 | Id | Operation | Name | Rows | Cost (%CPU)| Time |TQ |IN-OUT| PQ Distrib | | 0 | SELECT STATEMENT | |1 | 54001 (19)| 00:01:08 | | | | | 1 | SORT AGGREGATE | |1 | | | | | | | 2 | PX COORDINATOR | | | | | | | | | 3 |PX SEND QC (RANDOM) | :TQ1 |1 | | | Q1,00 | P-S | QC (RAND) | | 4 | SORT AGGREGATE | |1 | | | Q1,00 | PCWP | | | 5 | PX BLOCK ITERATOR | | 402M| 54001 (19)| 00:01:08 | Q1,00 | PCWC | | | 6 | INDEX FAST FULL SCAN| IDX_NI_OCCURRENCE_PID | 402M| 54001 (19)| 00:01:08 | Q1,00 | PCWP | | It took just 52 seconds to count everything, but Oracle didn't
Re: [PERFORM] Query much faster with enable_seqscan=0
You DB is more than likely cached. You should adjust your page costs to better reflect reality and then the planner can make more accurate estimates and then choose the proper plan. Cheers, Ken On Tue, Sep 21, 2010 at 12:32:01PM -0500, Ogden wrote: Hello, I have received some help from the IRC channel, however, the problem still exists. When running the following query with enable_seqscan set to 0, it takes less than a second, whereas with it set to 1, the query returns in 14 seconds. The machine itself has 8GB Ram and is running PostgreSQL 9.0 on Debian Lenny. The database size is about 7GB. Query: SELECT tr.id, tr.sid FROM test_registration tr, INNER JOIN test_registration_result r on (tr.id = r.test_registration_id) WHERE. tr.test_administration_id='32a22b12-aa21-11df-a606-96551e8f4e4c'::uuid GROUP BY tr.id, tr.sid demo=# \d test_registration Table public.test_registration Column |Type | Modifiers +-+ id | uuid| not null sid| character varying(36) | not null created_date | timestamp without time zone | not null default now() modified_date | timestamp without time zone | not null test_administration_id | uuid| not null teacher_number | character varying(15) | test_version_id| uuid| Indexes: test_registration_pkey PRIMARY KEY, btree (id) test_registration_sid_key UNIQUE, btree (sid, test_administration_id) test_registration_teacher btree (teacher_number) test_registration_test_id btree (test_administration_id) demo=# \d test_registration_result Table public.test_registration_result Column| Type | Modifiers --+---+--- answer | character varying(15) | question_id | uuid | not null score| double precision | test_registration_id | uuid | not null Indexes: test_registration_result_pkey PRIMARY KEY, btree (question_id, test_registration_id) test_registration_result_answer btree (test_registration_id, answer, score) test_registration_result_test btree (test_registration_id) Explain Analyze: -- HashAggregate (cost=951169.97..951198.37 rows=2840 width=25) (actual time=14669.039..14669.843 rows=2972 loops=1) - Hash Join (cost=2988.07..939924.85 rows=2249024 width=25) (actual time=551.464..14400.061 rows=638980 loops=1) Hash Cond: (r.test_registration_id = tr.id) - Seq Scan on test_registration_result r (cost=0.00..681946.72 rows=37199972 width=16) (actual time=0.015..6073.101 rows=37198734 loops=1) - Hash (cost=2952.57..2952.57 rows=2840 width=25) (actual time=2.516..2.516 rows=2972 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 160kB - Bitmap Heap Scan on test_registration tr (cost=44.29..2952.57 rows=2840 width=25) (actual time=0.528..1.458 rows=2972 loops=1) Recheck Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) - Bitmap Index Scan on test_registration_test_administration_id (cost=0.00..43.58 rows=2840 width=0) (actual time=0.507..0.507 rows=2972 loops=1) Index Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) Total runtime: 14670.337 ms (11 rows) real 0m14.698s user 0m0.000s sys 0m0.008s With set enable_seqscan=0; SET QUERY PLAN - HashAggregate (cost=1225400.19..1225428.59 rows=2840 width=25) (actual time=748.397..749.160 rows=2972 loops=1) - Nested Loop (cost=0.00..1214155.07 rows=2249024 width=25) (actual time=0.107..465.165 rows=638980 loops=1) - Index Scan using test_registration_test_administration_id on test_registration tr (cost=0.00..4413.96 rows=2840 width=25) (actual time=0.050..1.610 rows=2972 loops=1) Index Cond: (test_administration_id = 'e26a165a-c19f-11df-be2f-778af560e5a2'::uuid) - Index Scan using
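A quick way to test the advice above without touching postgresql.conf is to adjust the costs for one session and re-run the EXPLAIN ANALYZE; the values below are only a starting guess for an 8GB machine with a mostly cached 7GB database:

SET effective_cache_size = '6GB';
SET random_page_cost = 2;   -- closer to seq_page_cost when most reads come from cache
-- then re-run the EXPLAIN ANALYZE above and compare the chosen plan and runtime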
Re: [PERFORM] now() gives same time within the session
On Mon, Jul 12, 2010 at 06:11:31AM -0700, Rob Wultsch wrote: On Mon, Jul 12, 2010 at 4:15 AM, A. Kretschmer andreas.kretsch...@schollglas.com wrote: In response to atul.g...@globaldatapoint.com : Hi, I need to log the start and end time of the procedures in a table. But the start and end time are the same. This is how I recreated the issue. create table test_time (time timestamp); delete from test_time; insert into test_time select now(); Use timeofday() instead, now() returns the transaction starting time. Is this part of the SQL standard? No, see section 9.9.4 of the manual. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
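A quick demonstration of the difference; clock_timestamp() behaves like timeofday() but returns a timestamp rather than text:

BEGIN;
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;
SELECT pg_sleep(2);
SELECT now() AS txn_start, clock_timestamp() AS wall_clock;  -- now() is unchanged, clock_timestamp() has advanced
COMMIT;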
Re: [PERFORM] Highly Efficient Custom Sorting
Hi Eliot, Would you mind posting your code for reference. It is nice to have working examples when trying to figure out how it all fits together. Regards, Ken On Wed, Jul 07, 2010 at 03:23:12PM -0400, Eliot Gable wrote: Thanks again for all the input and suggestions from people. I have this sorting algorithm re-implemented in C now and it is somewhere 2ms to run it now; though it is difficult to get a more accurate measure. There may be some additional optimizations I can come up with, but for now, this will work very well compared to the alternative methods. On Tue, Jul 6, 2010 at 6:21 PM, Tom Lane t...@sss.pgh.pa.us wrote: Eliot Gable egable+pgsql-performa...@gmail.comegable%2bpgsql-performa...@gmail.com writes: Do I need to somehow force the server to unload and then re-load this .so file each time I build a new version of it? If so, how do I do that? Start a new database session. regards, tom lane -- Eliot Gable We do not inherit the Earth from our ancestors: we borrow it from our children. ~David Brower I decided the words were too conservative for me. We're not borrowing from our children, we're stealing from them--and it's not even considered to be a crime. ~David Brower Esse oportet ut vivas, non vivere ut edas. (Thou shouldst eat to live; not live to eat.) ~Marcus Tullius Cicero -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Write performance
On Thu, Jun 24, 2010 at 02:43:33PM +0200, Janning wrote: Hi, at the moment we encounter some performance problems with our database server. We have a 12 GB RAM machine with intel i7-975 and using 3 disks Seagate Barracuda 7200.11, ST31500341AS (1.5 GB) One disk for the system and WAL etc. and one SW RAID-0 with two disks for postgresql data. Our database is about 24GB. Our munin graph reports at 9:00 a clock writes of 3000 blocks per second and reads of about 1000 blocks per second on our disk which holds the data directories of postgresql (WAL are on a different disk) 3000 blocks ~ about 3 MB/s write 1000 blocks ~ about 1 MB/s read At the same time we have nearly 50% CPU I/O wait and only 12% user CPU load (so 4 of 8 cpu cores are in use for io wait) We know, its a poor man disk setup (but we can not find a hoster with rather advanced disk configuration at an affordable price). Anyway, we ran some tests on it: # time sh -c dd if=/dev/zero of=bigfile bs=8k count=300 sync 300+0 records in 300+0 records out 2457600 bytes (25 GB) copied, 276.03 s, 89.0 MB/s real 4m48.658s user 0m0.580s sys 0m51.579s # time dd if=bigfile of=/dev/null bs=8k 300+0 records in 300+0 records out 2457600 bytes (25 GB) copied, 222.841 s, 110 MB/s real 3m42.879s user 0m0.468s sys 0m18.721s Of course, writing large chunks is quite a different usage pattern. But I am wondering that writing 3MB/s and reading 1 MB/s seams to be a limit if i can run a test with 89 MB/s writing and 110MB/s reading. Can you give some hints, if this numbers seems to be reasonable? kind regards Janning Yes, these are typical random I/O versus sequential I/O rates for hard drives. Your I/O is extremely under-powered relative to your CPU/memory. For DB servers, many times you need much more I/O instead. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] requested shared memory size overflows size_t
On Fri, Jun 18, 2010 at 12:46:11AM +0100, Tom Wilcox wrote: On 17/06/2010 22:41, Greg Smith wrote: Tom Wilcox wrote: Any suggestions for good monitoring software for linux? By monitoring, do you mean for alerting purposes or for graphing purposes? Nagios is the only reasonable choice for the former, while doing at best a mediocre job at the latter. For the later, I've found that Munin does a good job of monitoring Linux and PostgreSQL in its out of the box configuration, in terms of providing useful activity graphs. And you can get it to play nice with Nagios. Thanks Greg. Ill check Munin and Nagios out. It is very much for graphing purposes. I would like to be able to perform objective, platform-independent style performance comparisons. Cheers, Tom Zabbix-1.8+ is also worth taking a look at and it can run off our favorite database. It allows for some very flexible monitoring and trending data collection. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] slow query performance
Hi Anj, That is an indication that your system was less correctly modeled with a random_page_cost=2 which means that the system will assume that random I/O is cheaper than it is and will choose plans based on that model. If this is not the case, the plan chosen will almost certainly be slower for any non-trivial query. You can put a 200mph speedometer in a VW bug but it will never go 200mph. Regards, Ken On Thu, Jun 10, 2010 at 07:54:01PM -0700, Anj Adu wrote: I changed random_page_cost=4 (earlier 2) and the performance issue is gone I am not clear why a page_cost of 2 on really fast disks would perform badly. Thank you for all your help and time. On Thu, Jun 10, 2010 at 8:32 AM, Anj Adu fotogra...@gmail.com wrote: Attached Thank you On Thu, Jun 10, 2010 at 6:28 AM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 9, 2010 at 11:17 PM, Anj Adu fotogra...@gmail.com wrote: The plan is unaltered . There is a separate index on theDate as well as one on node_id I have not specifically disabled sequential scans. Please do SHOW ALL and attach the results as a text file. This query performs much better on 8.1.9 on a similar sized table.(althought the random_page_cost=4 on 8.1.9 and 2 on 8.4.0 ) Well that could certainly matter... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need to increase performance of a query
On Thu, Jun 10, 2010 at 12:34:07PM -0700, Anne Rosset wrote: Jochen Erwied wrote: Thursday, June 10, 2010, 8:36:08 PM you wrote: psrdb=# (SELECT psrdb(#MAX(item_rank.rank) AS maxRank psrdb(# FROM psrdb(#item_rank item_rank psrdb(# WHERE psrdb(#item_rank.project_id='proj2783' psrdb(# AND item_rank.pf_id IS NULL psrdb(# psrdb(# ) psrdb-# ORDER BY psrdb-# maxRank DESC; Don't think it does really matter, but why do you sort a resultset consisting of only one row? Sorry, I should have removed the ORDER by (the full query has a union). So without the ORDER by, here are the results: psrdb=# SELECT psrdb-#MAX(item_rank.rank) AS maxRank psrdb-# FROM psrdb-#item_rank item_rank psrdb-# WHERE psrdb-#item_rank.pf_id='plan1408'; maxrank - 2050400 (1 row) Time: 1.516 ms psrdb=# SELECT psrdb-#MAX(item_rank.rank) AS maxRank psrdb-# FROM psrdb-#item_rank item_rank psrdb-# WHERE psrdb-#item_rank.project_id='proj2783' psrdb-# AND item_rank.pf_id IS NULL; maxrank - 202 (1 row) Time: 13.177 ms Is there anything that can be done for the second one? Thanks, Anne What about an IS NULL index on pf_id? Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
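A sketch of such an index, using the column names from the query above; the index name and the inclusion of rank are assumptions:

CREATE INDEX item_rank_project_nopf_idx
    ON item_rank (project_id, rank)
    WHERE pf_id IS NULL;
-- MAX(rank) for a given project_id with pf_id IS NULL can then be answered largely from this small index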
Re: [PERFORM] debugging handle exhaustion and 15 min/ 5mil row delete
On Fri, May 07, 2010 at 09:37:42AM -0400, Mark Stosberg wrote: Hello, We've been a satified user of PostgreSQL for several years, and use it to power a national pet adoption website: http://www.adoptapet.com/ Recently we've had a regularly-timed middle-of-the-night problem where database handles are exhausted for a very brief period. In tracking it down, I have found that the event seems to correspond to a time when a cron script is deleting from a large logging table, but I'm not certain if this is the cause or a correlation. We are deleting about 5 million rows from a time-based logging table that is replicated by Slony. We are currently using a single delete statement, which takes about 15 minutes to run. There is no RI on the table, but the use of Slony means that a trigger call and action is made for every row deleted, which causes a corresponding insertion in another table so the deletion can be replicated to the slave. My questions: - Could this kind of activity lead to an upward spiral in database handle usage? Yes. - Would it be advisable to use several small DELETE statements instead, to delete rows in batches of 1,000. We could use the recipe for this that was posted earlier to this list: Yes, that is the method we use in several cases to avoid this behavior. Deletion is a more intensive process in PostgreSQL, so batching it will keep from dragging down other queries which results in your out-of-handles error. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
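A minimal sketch of the batched form, with made-up table and column names; run each statement as its own transaction so the Slony triggers and other sessions get room to breathe between batches:

-- repeat (e.g. from the cron script) until it reports DELETE 0
DELETE FROM activity_log
 WHERE id IN (SELECT id
                FROM activity_log
               WHERE logged_at < now() - interval '30 days'
               LIMIT 1000);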
Re: [PERFORM] autovacuum strategy / parameters
Check out the manual: http://www.postgresql.org/docs/8.4/static/routine-vacuuming.html#AUTOVACUUM Cheers, Ken On Wed, Apr 28, 2010 at 10:37:35AM -0400, akp geek wrote: Hi - don't want to side track the discussion. We have 8.4, which of AUTOVACUUM PARAMETERS can be set to handle individual table? I ran into bloat with small table only. Now the issue is being resolved. Regards On Wed, Apr 28, 2010 at 10:20 AM, Thomas Kellerer spam_ea...@gmx.netwrote: Rick, 22.04.2010 22:42: So, in a large table, the scale_factor is the dominant term. In a small table, the threshold is the dominant term. But both are taken into account. The default values are set for small tables; it is not being run for large tables. With 8.4 you can adjust the autovacuum settings per table... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
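For reference, the 8.4 per-table syntax looks like this; the table name and thresholds are only examples:

ALTER TABLE small_hot_table SET (
    autovacuum_vacuum_scale_factor  = 0.01,
    autovacuum_vacuum_threshold     = 500,
    autovacuum_analyze_scale_factor = 0.01,
    autovacuum_analyze_threshold    = 500
);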
Re: [PERFORM] mysql to postgresql, performance questions
If you expect this DB to be memory resident, you should update the cpu/disk cost parameters in postgresql.conf. There was a post earlier today with some more reasonable starting values. Certainly your test DB will be memory resident. Ken On Thu, Mar 18, 2010 at 03:31:18PM +0100, Corin wrote: Hi all, I'm running quite a large social community website (250k users, 16gb database). We are currently preparing a complete relaunch and thinking about switching from mysql 5.1.37 innodb to postgresql 8.4.2. The database server is a dual dualcore operton 2216 with 12gb ram running on debian amd64. For a first impression I ran a simple query on our users table (snapshot with only ~ 45.000 records). The table has an index on birthday_age [integer]. The test executes 10 times the same query and simply discards the results. I ran the tests using a php and a ruby script, the results are almost the same. Unluckily mysql seems to be around 3x as fast as postgresql for this simple query. There's no swapping, disc reading involved...everything is in ram. query select * from users where birthday_age between 12 and 13 or birthday_age between 20 and 22 limit 1000 mysql {select_type=SIMPLE, key_len=1, id=1, table=users, type=range, possible_keys=birthday_age, rows=7572, Extra=Using where, ref=nil, key=birthday_age} 15.104055404663 14.209032058716 18.857002258301 15.714883804321 14.73593711853 15.048027038574 14.589071273804 14.847040176392 15.192985534668 15.115976333618 postgresql {QUERY PLAN=Limit (cost=125.97..899.11 rows=1000 width=448) (actual time=0.927..4.990 rows=1000 loops=1)} {QUERY PLAN= - Bitmap Heap Scan on users (cost=125.97..3118.00 rows=3870 width=448) (actual time=0.925..3.420 rows=1000 loops=1)} {QUERY PLAN= Recheck Cond: (((birthday_age = 12) AND (birthday_age = 13)) OR ((birthday_age = 20) AND (birthday_age = 22)))} {QUERY PLAN= - BitmapOr (cost=125.97..125.97 rows=3952 width=0) (actual time=0.634..0.634 rows=0 loops=1)} {QUERY PLAN= - Bitmap Index Scan on birthday_age (cost=0.00..41.67 rows=1341 width=0) (actual time=0.260..0.260 rows=1327 loops=1)} {QUERY PLAN= Index Cond: ((birthday_age = 12) AND (birthday_age = 13))} {QUERY PLAN= - Bitmap Index Scan on birthday_age (cost=0.00..82.37 rows=2611 width=0) (actual time=0.370..0.370 rows=2628 loops=1)} {QUERY PLAN= Index Cond: ((birthday_age = 20) AND (birthday_age = 22))} {QUERY PLAN=Total runtime: 5.847 ms} 44.173002243042 41.156768798828 39.988040924072 40.470123291016 40.035963058472 40.077924728394 40.94386100769 40.183067321777 39.83211517334 40.256977081299 I also wonder why the reported runtime of 5.847 ms is so much different to the runtime reported of my scripts (both php and ruby are almost the same). What's the best tool to time queries in postgresql? Can this be done from pgadmin? Thanks, Corin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad query plan inside EXISTS clause
EXISTS matches NULLs too and since they are not indexed a sequential scan is needed to check for them. Try using IN instead. Cheers, Ken On Wed, Mar 10, 2010 at 02:26:20PM +0100, Benoit Delbosc wrote: Hi all, I am trying to understand why inside an EXISTS clause the query planner does not use the index: EXPLAIN ANALYZE SELECT 1 WHERE EXISTS (SELECT 1 FROM read_acls_cache WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'); QUERY PLAN Result (cost=1.19..1.20 rows=1 width=0) (actual time=466.317..466.318 rows=1 loops=1) One-Time Filter: $0 InitPlan 1 (returns $0) - Seq Scan on read_acls_cache (cost=0.00..62637.01 rows=52517 width=0) (actual time=466.309..466.309 rows=1 loops=1) Filter: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) Total runtime: 466.369 ms (6 rows) While it does use the index when executing only the subquery: EXPLAIN ANALYZE SELECT 1 FROM read_acls_cache WHERE users_md5 = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'; QUERY PLAN -- Bitmap Heap Scan on read_acls_cache (cost=2176.10..35022.98 rows=52517 width=0) (actual time=9.065..21.988 rows=51446 loops=1) Recheck Cond: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) - Bitmap Index Scan on read_acls_cache_users_md5_idx (cost=0.00..2162.97 rows=52517 width=0) (actual time=8.900..8.900 rows=51446 loops=1) Index Cond: ((users_md5)::text = '9bc9012eb29c0bb2ae3cc7b5e78c2acf'::text) Total runtime: 25.464 ms (5 rows) The table has been vacuumed, analyzed and reindexed. Thanks for your support. Regards ben Here are some more info : \d read_acls_cache Table public.read_acls_cache Column | Type | Modifiers ---+---+--- users_md5 | character varying(34) | not null acl_id| character varying(34) | Indexes: read_acls_cache_users_md5_idx btree (users_md5) SELECT COUNT(*) FROM read_acls_cache; count - 2520899 (1 row) SELECT COUNT(DISTINCT(users_md5)) FROM read_acls_cache ; count --- 49 (1 row) SELECT Version(); version -- PostgreSQL 8.4.2 on x86_64-pc-linux-gnu, compiled by GCC gcc-4.2.real (GCC) 4.2.4 (Ubuntu 4.2.4-1ubuntu4), 64 (1 row) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] prepared statements and partitioning (partition elimination not working)
On Mon, Mar 08, 2010 at 10:24:56AM -0700, Kevin Kempter wrote: Hi all; we've found that partition elimination is not happening for a prepared statement, however running the same statement in psql manually does give us partition elimination. Is this a known issue? Yes, see the recent threads on performance of prepared queries. It concerns the availability of information on the query inputs that is available to psql and not a pre-prepared query. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
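The effect is easy to reproduce against any range-partitioned table; the names below are hypothetical:

PREPARE part_count(date) AS
    SELECT count(*) FROM measurements WHERE logdate = $1;
EXPLAIN EXECUTE part_count('2010-03-01');   -- generic plan: every partition is scanned
EXPLAIN SELECT count(*) FROM measurements
         WHERE logdate = DATE '2010-03-01'; -- literal value: constraint exclusion prunes partitions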
Re: [PERFORM] partitioned tables query not using indexes
On Sun, Feb 28, 2010 at 12:29:14PM -0800, Josh Berkus wrote: However the same query against the base table when specifying the check constraint key in the where clause produces sequential scans: Does the master table have the same indexes as the slave partitions? --Josh Berkus Does this help? I have an empty base table without indexes and partitions underneath that do have the index. I did not think that an index on the parent table did anything. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] index usage in not like
On Thu, Feb 18, 2010 at 01:18:10PM +0100, A. Kretschmer wrote: In response to Thom Brown : On 18 February 2010 11:55, AI Rumman rumman...@gmail.com wrote: Not like operation does not use index. select * from vtiger_contactscf where lower(cf_1253) not like lower('Former%') I created index on lower(cf_1253). How can I ensure index usage in not like operation? Anyone please help. How many rows do you have in your table? If there are relatively few, it probably guesses it to be cheaper to do a sequential scan and calculate lower values on-the-fly rather than bother with the index. That's one reason, an other reason, i think, is, that a btree-index can't search with an 'not like' - operator. test=*# insert into words select 'fucking example' from generate_series(1,1); INSERT 0 1 test=*# insert into words select 'abc' from generate_series(1,10); INSERT 0 10 test=*# explain select * from words where lower(w) like lower('a%') or lower(w) like lower('b%'); QUERY PLAN - Bitmap Heap Scan on words (cost=1538.75..6933.39 rows=55643 width=36) Recheck Cond: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) Filter: ((lower(w) ~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) - BitmapOr (cost=1538.75..1538.75 rows=57432 width=0) - Bitmap Index Scan on idx_words (cost=0.00..1027.04 rows=39073 width=0) Index Cond: ((lower(w) ~=~ 'a'::text) AND (lower(w) ~~ 'b'::text)) - Bitmap Index Scan on idx_words (cost=0.00..483.90 rows=18359 width=0) Index Cond: ((lower(w) ~=~ 'b'::text) AND (lower(w) ~~ 'c'::text)) (8 rows) test=*# explain select * from words where lower(w) not like lower('a%') or lower(w) like lower('b%'); QUERY PLAN --- Seq Scan on words (cost=0.00..10624.48 rows=282609 width=36) Filter: ((lower(w) !~~ 'a%'::text) OR (lower(w) ~~ 'b%'::text)) (2 rows) In other words: revert your where-condition from 'not like' to multiple 'like' conditions for all letters except 'f%'. Andreas The 'not like' condition is likely to be extremely non-selective which would cause a sequential scan to be used in any event whether or not an index could be used. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad plan choice nestloop vs. hashjoin
On Mon, Jan 18, 2010 at 12:13:24PM -0500, Tom Lane wrote: Kenneth Marshall k...@rice.edu writes: We have just upgraded our monitoring server software and now the following query for graphing the data performs abysmally with the default settings. Here is the results of the EXPLAIN ANALYZE run with nestloops enabled: I poked at this a bit more and now think I see where the problem is. The thing that would be easiest for you to do something about is the misestimation here: - Nested Loop Anti Join (cost=94.07..10294.64 rows=1 width=8) (actual time=98.049..27907.702 rows=281 loops=1) Join Filter: (gii.graphid = g.graphid) - Bitmap Heap Scan on graphs g (cost=94.07..233.17 rows=1 width=8) (actual time=0.529..1.772 rows=281 loops=1) Recheck Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581! ,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[])) Filter: ((graphid / 100::bigint) = 0) - Bitmap Index Scan on graphs_pkey (cost=0.00..94.07 rows=246 width=0) (actual time=0.507..0.507 rows=294 loops=1) Index Cond: (graphid = ANY ('{2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835! ,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010}'::bigint[])) The estimate of the ANY condition is not too bad (246 vs 294 actual). 
But it hasn't got any ability to deal with the (graphid / 100::bigint) = 0 filter condition, and is falling back to a default selectivity estimate for that, which IIRC is just 0.005 --- but actually, that condition doesn't eliminate any rows at all. Do you need that condition in the first place? Can you persuade your client-side software to eliminate it when it's impossible based on the ANY list? Or at least recast it to something more easily estimatable, like graphid 100? If you really have to have the condition just like that, I'd advise creating an index on (graphid / 100::bigint). That would cause ANALYZE to accumulate statistics on that expression, which'd result in a far better estimate. The reason that this misestimate hammers it so hard is that the inside of the nestloop looks like - Nested Loop (cost=0.00..17449.43 rows=1954 width=8) (actual time=99.304..99.304 rows=0 loops=281) - Index Scan using graphs_items_2 on graphs_items gii (cost=0.00..69.83 rows=1954 width=16) (actual time=0.013..3.399 rows=1954 loops=281) - Index Scan using
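A sketch of that expression index (the index name is made up); the follow-up ANALYZE is what gathers statistics on the expression and improves the estimate:

CREATE INDEX graphs_graphid_bucket_idx ON graphs ((graphid / 100::bigint));
ANALYZE graphs;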
[PERFORM] Bad plan choice nestloop vs. hashjoin
Dear performance group: We have just upgraded our monitoring server software and now the following query for graphing the data performs abysmally with the default settings. Here is the results of the EXPLAIN ANALYZE run with nestloops enabled: SET enable_nestloop = 'on'; EXPLAIN SELECT g.graphid FROM graphs g,graphs_items gi,items i,hosts_groups hg,rights r,users_groups ug WHERE (g.graphid/100) in (0) AND gi.graphid=g.graphid AND i.itemid=gi.itemid AND hg.hostid=i.hostid AND r.id=hg.groupid AND r.groupid=ug.usrgrpid AND ug.userid=20 AND r.permission=2 AND NOT EXISTS( SELECT gii.graphid FROM graphs_items gii, items ii WHERE gii.graphid=g.graphid AND gii.itemid=ii.itemid AND EXISTS( SELECT hgg.groupid FROM hosts_groups hgg, rights rr, users_groups ugg WHERE ii.hostid=hgg.hostid AND rr.id=hgg.groupid AND rr.groupid=ugg.usrgrpid AND ugg.userid=20 AND rr.permission2)) AND (g.graphid IN (2,3,4,5,386,387,969,389,971,972,973,446,447,448,449,450,451,471,456,470,473,477,472,474,475,476,478,479,480,481,482,483,484,459,614,655,658,645,490,492,489,493,496,495,498,497,499,501,500,502,974,558,559,562,566,563,564,565,567,568,569,570,571,535,572,573,534,536,538,539,540,541,542,543,544,545,537,546,547,548,552,553,554,555,556,549,550,551,557,577,578,579,580,574,576,581,835,587,588,589,590,560,561,836,591,592,593,594,595,827,389,495,498,497,597,598,599,975,978,999,1004,604,605,606,679,616,634,635,636,637,638,618,629,630,631,632,633,671,682,669,670,678,679,680,674,672,676,673,675,677,681,682,683,683,644,652,829,681,687,698,685,686,705,706,707,708,830,945,946,710,716,712,714,713,709,718,721,720,719,723,724,747,749,750,730,731,732,733,734,735,736,737,738,739,740,741,742,743,744,745,772,774,775,755,756,757,758,759,760,761,762,763,764,765,766,767,768,769,770,777,776,977,824,823,826,825,829,832,833,835,581,836,842,852,854,839,840,838,853,855,847,848,944,846,859,850,899,901,902,903,864,865,866,867,976,979,939,941,942,943,906,907,908,909,910,868,969,991,950,955,964,966,952,953,962,965,967,959,961,968,1001,1002,1003,986,987,988,994,995,996,1008,1006,1007,1009,1010)); - QUERY PLAN -- Nested Loop (cost=94.07..10304.00 rows=1 width=8) (actual time=194.557..27975.338 rows=607 loops=1) Join Filter: (r.groupid = ug.usrgrpid) - Seq Scan on users_groups ug (cost=0.00..1.15 rows=1 width=8) (actual time=0.020..0.026 rows=1 loops=1) Filter: (userid = 20) - Nested Loop (cost=94.07..10302.65
Re: [PERFORM] Bad plan choice nestloop vs. hashjoin
On Fri, Jan 15, 2010 at 04:58:57PM -0600, Kevin Grittner wrote: Kenneth Marshall k...@rice.edu wrote: with the default settings Do you mean you haven't changed any settings in your postgresql.conf file from their defaults? -Kevin Sorry, here are the differences from the default: max_connections = 100 # (change requires restart) shared_buffers = 256MB # min 128kB or max_connections*16kB work_mem = 16MB # min 64kB maintenance_work_mem = 512MB# min 1MB synchronous_commit = off# immediate fsync at commit wal_buffers = 256kB # min 32kB checkpoint_segments = 30# in logfile segments, min 1, 16MB each seq_page_cost = 1.0 # measured on an arbitrary scale random_page_cost = 2.0 # same scale as above effective_cache_size = 12GB log_min_duration_statement = 5000 The machine has 16GB of RAM and the DB is currently about 8GB. It is going to grow much larger as information is acquired. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PG optimization question
On Sat, Jan 09, 2010 at 03:42:08PM +0300, Nickolay wrote: I do not see any way to normalize this table anymore. it's size is 4Gig for ~4M rows, i.e. 1Kb per row, i think it's ok. Also there are 2 indexes: by date_time and by a couple of service fields (total index size is 250Mb now). I think i'll be going to partition by months (approx. 1M rows or 1Gig per month), so it would be like 60 partitions for 5 years. Is that OK for postgres? Not a problem. We have a log server that has 64 daily partitions. Oh, btw, 95% of queries are searching rows for current date (last 24 hours). You may want to use a daily staging table and then flush to the monthly archive tables at the end of the day. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
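A sketch of one monthly partition plus the end-of-day flush from a daily staging table, using 8.x inheritance-style partitioning; all table, column, and index names are assumptions:

CREATE TABLE requests_2010_01 (
    CHECK (date_time >= DATE '2010-01-01' AND date_time < DATE '2010-02-01')
) INHERITS (requests);
CREATE INDEX requests_2010_01_date_time_idx ON requests_2010_01 (date_time);

-- nightly: move the day's rows from the staging table into the month's partition
INSERT INTO requests_2010_01 SELECT * FROM requests_staging;
TRUNCATE requests_staging;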
Re: [PERFORM] Massive table (500M rows) update nightmare
On Fri, Jan 08, 2010 at 12:38:46PM -0500, Carlo Stonebanks wrote: I thought that post mentioned that the plan was one statement in an iteration, and that the cache would have been primed by a previous query checking whether there were any rows to update. If that was the case, it might be worthwhile to look at the entire flow of an iteration. This is the only SQL query in the code in question - the rest of the code manages the looping and commit. The code was copied to PgAdminIII and values written in for the WHERE clause. In order for me to validate that rows would have been updated, I had to run a SELECT with the same WHERE clause in PgAdminIII first to see how many rows would have qualified. But this was for testing purposes only. The SELECT statement does not exist in the code. The vast majority of the rows that will be processed will be updated as this is a backfill to synch the old rows with the values being filled into new columns now being inserted. Also, if you ever responded with version and configuration information, I missed it. This is hosted on a new server the client set up so I am waiting for the exact OS and hardware config. PG Version is PostgreSQL 8.3.6, compiled by Visual C++ build 1400, OS appears to be Windows 2003 x64 Server. More than anything, I am more concerned with the long-term use of the system. This particular challenge with the 500M row update is one thing, but I am concerned about the exceptional effort required to do this. Is it REALLY this exceptional to want to update 500M rows of data in this day and age? Or is the fact that we are considering dumping and restoring and dropping indexes, etc to do all an early warning that we don't have a solution that is scaled to the problem? Config data follows (I am assuming commented values which I did not include are defaulted). Carlo Hi Carlo, It all boils down to resource management and constraints. For small problems relative to the amount of system resources available, little effort is needed to have satisfactory performance. As the problems consume more and more of the total resource capacity, you will need to know more and more in depth about the workings of every piece of the system to wring the most possible performance out of the system. Some of the discussions on this topic have covered a smattering of details that will become increasingly important as your system scales and determine whether or not it will scale. Many times the need for updates on such a massive scale do point to normalization problems. My two cents. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] big select is resulting in a large amount of disk writing by kjournald
Hint bit I/O? Ken On Wed, Dec 09, 2009 at 01:29:00PM -0500, Joseph S wrote: I just installed a shiny new database server with pg 8.4.1 running on CentOS 5.4. After using slony to replicate over my database I decided to do some basic performance tests to see how spiffy my shiny new server is. This machine has 32G ram, over 31 of which is used for the system file cache. So I run select count(*) from large_table and I see in xosview a solid block of write activity. Runtime is 28125.644 ms for the first run. The second run does not show a block of write activity and takes 3327.441 ms top shows that this writing is being done by kjournald. What is going on here? There is not a lot of write activity on this server so there should not be a significant number of dirty cache pages that kjournald would need to write out before it could read in my table. Certainly in the 31G being used for file cache there should be enough non-dirty pages that could be dropped to read in my table w/o having to flush anything to disk. My table size is 2,870,927,360 bytes. # cat /proc/sys/vm/dirty_expire_centisecs 2999 I restarted postgres and ran a count(*) on an even larger table. [local]= explain analyze select count(*) from et; QUERY PLAN --- Aggregate (cost=6837051.82..6837051.83 rows=1 width=0) (actual time=447240.157..447240.157 rows=1 loops=1) - Seq Scan on et (cost=0.00..6290689.25 rows=218545025 width=0) (actual time=5.971..400326.911 rows=218494524 loops=1) Total runtime: 447240.402 ms (3 rows) Time: 447258.525 ms [local]= explain analyze select count(*) from et; QUERY PLAN -- Aggregate (cost=6837113.44..6837113.45 rows=1 width=0) (actual time=103011.724..103011.724 rows=1 loops=1) - Seq Scan on et (cost=0.00..6290745.95 rows=218546995 width=0) (actual time=9.844..71629.497 rows=218496012 loops=1) Total runtime: 103011.832 ms (3 rows) Time: 103012.523 ms [local]= select pg_relation_size('et'); pg_relation_size -- 33631543296 (1 row) I posted xosview snapshots from the two runs at: http://www.tupari.net/2009-12-9/ This time the first run showed a mix of read/write activity instead of the solid write I saw before. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Unexpected sequential scan on an indexed column
On Mon, Nov 16, 2009 at 12:45:46PM -0800, Eddy Escardo-Raffo wrote: Yeah this kind of thing would probably work. Doing this in java with separate queries would be easy to code but require multiple round trips. Doing it as a stored procedure would be nicer but I'd have to think a little more about how to refactor the java code around the query to make this happen. Thanks for the suggestion. Eddy Hi Eddy, Here is a lookup wrapper that is used in DSPAM to work around a similar problem. Maybe you can use it as a template for your function: create function lookup_tokens(integer,bigint[]) returns setof dspam_token_data language plpgsql stable as ' declare v_rec record; begin for v_rec in select * from dspam_token_data where uid=$1 and token in (select $2[i] from generate_series(array_lower($2,1),array_upper($2,1)) s(i)) loop return next v_rec; end loop; return; end;'; Regards, Ken On Mon, Nov 16, 2009 at 9:44 AM, Dave Crooke dcro...@gmail.com wrote: Hi Eddy Perhaps a slightly naive suggestion have you considered converting the query to a small stored procedure ('function' in Postgres speak)? You can pull the location values, and then iterate over a query like this: select userid from users where location=:x which is more-or-less guaranteed to use the index. I had a somewhat similar situation recently, where I was passing in a list of id's (from outwith Postgres) and it would on occasion avoid the index in favour of a full table scan I changed this to iterate over the id's with separate queries (in Java, but using a function will achieve the same thing) and went from one 5 minute query doing full table scan to a handful of queries doing sub-millisecond direct index lookups. Cheers Dave -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
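An example call of the lookup_tokens() function defined above, with made-up uid and token values, showing how the array is passed in:

SELECT * FROM lookup_tokens(1, ARRAY[123456789012345, 987654321098765]::bigint[]);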
Re: [PERFORM] FTS performance with the Polish config
On Sat, Nov 14, 2009 at 12:25:05PM +0100, Wojciech Knapik wrote: Hello I just finished implementing a search engine for my site and found ts_headline extremely slow when used with a Polish tsearch configuration, while fast with English. All of it boils down to a simple testcase, but first some background. I tested on 8.3.1 on G5/OSX 10.5.8 and Xeon/Gentoo AMD64-2008.0 (2.6.21), then switched both installations to 8.3.8 (both packages compiled, but provided by the distro - port/emerge). The Polish dictionaries and config were created according to this article (it's in Polish, but the code is self-explanatory): http://www.depesz.com/index.php/2008/04/22/polish-tsearch-in-83-polski-tsearch-w-postgresie-83/ Now for the testcase: text = 'Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.' # explain analyze select ts_headline('polish', text, plainto_tsquery('polish', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=6.407..6.470 rows=1 loops=1) Total runtime: 6.524 ms (2 rows) # explain analyze select ts_headline('english', text, plainto_tsquery('english', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.861..0.895 rows=1 loops=1) Total runtime: 0.935 ms (2 rows) # explain analyze select ts_headline('simple', text, plainto_tsquery('simple', 'foobar')); QUERY PLAN Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.627..0.660 rows=1 loops=1) Total runtime: 0.697 ms (2 rows) # As you can see, the results differ by an order of magnitude between Polish and English. While in this simple testcase it's a non-issue, in the real world this translates into enormous overhead. One of the queries I ran testing my site's search function took 1870ms. When I took that query and changed all ts_headline(foo) calls to just foo, the time dropped below 100ms. That's the difference between something completely unacceptable and something quite useful. I can post various details about the hardware, software and specific queries, but the testcases speak for themselves. I'm sure you can easily reproduce my results. Hints would be very much appreciated, since I've already spent way more time on this, than I could afford. cheers, Wojciech Knapik PS. A few other details can be found here http://pastie.textmate.org/private/hqnqfnsfsknjyjlffzmog along with snippets of my conversations in #postgresql that lead to this testcase. Big thanks to RhodiumToad for helping me with fts for the last couple days ;] Hi, The documentation for ts_headline() states: ts_headline uses the original document, not a tsvector summary, so it can be slow and should be used with care. A typical mistake is to call ts_headline for every matching document when only ten documents are to be shown. SQL subqueries can help; here is an example: SELECT id, ts_headline(body, q), rank FROM (SELECT id, body, q, ts_rank_cd(ti, q) AS rank FROM apod, to_tsquery('stars') q WHERE ti @@ q ORDER BY rank DESC LIMIT 10) AS foo; It looks like you have proven that behavior. I have not looked at the ts_headline code, but it may also be slowed by the locale, so showing that it is faster for English is not really saying much. 
Maybe there is a better algorithm that could be used, but that would require code changes. It may be that you can change some of the parameters to speed it up. Good luck. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] limiting performance impact of wal archiving.
On Tue, Nov 10, 2009 at 12:55:42PM +0100, Laurent Laborde wrote: Hi ! We recently had a problem with wal archiving badly impacting the performance of our postgresql master. And i discovered cstream, that can limite the bandwidth of pipe stream. Here is our new archive command, FYI, that limit the IO bandwidth to 500KB/s : archive_command = '/bin/cat %p | cstream -i -o -t -500k | nice gzip -9 -c | /usr/bin/ncftpput etc...' PS : While writing that mail, i just found that i could replace : cat %p | cstream -i ... with cstream -i %p ... *grins* And here is a simple perl program that I have used for a similar reason. Obviously, it can be adapted to your specific needs. Regards, Ken throttle.pl--- #!/usr/bin/perl -w require 5.0;# written for perl5, hasta labyebye perl4 use strict; use Getopt::Std; # # This is an simple program to throttle network traffic to a # specified KB/second to allow a restore in the middle of the # day over the network. # my($file, $chunksize, $len, $offset, $written, $rate, $buf ); my($options, $blocksize, $speed, %convert, $inv_rate, $verbose); %convert = ( # conversion factors for $speed,$blocksize '', '1', 'w','2', 'W','2', 'b','512', 'B','512', 'k','1024', 'K','1024', ); $options = 'vhs:r:b:f:'; # # set defaults # $speed = '100k'; $rate = '5'; $blocksize = '120k'; # Works for the DLT drives under SunOS $file = '-'; $buf = ''; $verbose = 0; # default to quiet sub usage { my($usage); $usage = Usage: throttle [-s speed][-r rate/sec][-b blksize][-f file][-v][-h] (writes data to STDOUT) -s speed max data rate in B/s - defaults to 100k -r ratewrites/sec - defaults to 5 -b sizeread blocksize - defaults to 120k -f filefile to read for input - defaults to STDIN -h print this message -v print parameters used ; print STDERR $usage; exit(1); } getopts($options) || usage; if ($::opt_h || $::opt_h) { usage; } usage unless $#ARGV 0; $speed = $::opt_s if $::opt_s; $rate = $::opt_r if $::opt_r; $blocksize = $::opt_b if $::opt_b; $file = $::opt_f if $::opt_f; # # Convert $speed and $blocksize to bytes for use in the rest of the script if ( $speed =~ /^(\d+)([wWbBkK]*)$/ ) { $speed = $1 * $convert{$2}; } if ( $blocksize =~ /^(\d+)([wWbBkK]*)$/ ) { $blocksize = $1 * $convert{$2}; } $inv_rate = 1/$rate; $chunksize = int($speed/$rate); $chunksize = 1 if $chunksize == 0; if ($::opt_v || $::opt_v) { print STDERR speed = $speed B/s\nrate = $rate/sec\nblocksize = $blocksize B\nchunksize = $chunksize B\n; } # Return error if unable to open file open(FILE, $file) or die Cannot open $file: $!\n; # Read data from stdin and write it to stdout at a rate based # on $rate and $speed. # while($len = sysread(FILE, $buf, $blocksize)) { # # print out in chunks of $speed/$rate size to allow a smoother load $offset = 0; while ($len) { $written = syswrite(STDOUT, $buf, $chunksize, $offset); die System write error: $!\n unless defined $written; $len -= $written; $offset += $written; # # Now wait 1/$rate seconds before doing the next block # select(undef, undef, undef, $inv_rate); } } close(FILE); -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Are folks running 8.4 in production environments? and 8.4 and slon 1.2?
On Tue, Oct 13, 2009 at 01:03:10AM -0600, Scott Marlowe wrote: On Mon, Oct 12, 2009 at 1:06 PM, Tory M Blue tmb...@gmail.com wrote: Any issues, has it baked long enough, is it time for us 8.3 folks to deal with the pain and upgrade? I am running 8.4.1 for my stats and search databases, and it's working fine. Anymore updates regarding 8.4 and slon 1.2 as well, since I usually build/upgrade both at the same time. I don't think 1.2 supports 8.4 just yet, and 2.0.3 or so is still not stable enough for production (I had major unexplained outages with it) so for now, no 8.4 with slony. slony-1.2.17-rc2 works fine with version 8.4 in my limited testing. I have not been able to get replication to work reliably with any current release of slony-2.x. There was a recent comment that the latest version in CVS has the 2.x bug fixed but I have not had a chance to try. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Bad performance of SELECT ... where id IN (...)
On Fri, Oct 09, 2009 at 08:31:54PM +0800, Xia Qingran wrote: On Mon, Oct 5, 2009 at 9:58 AM, Omar Kilani omar.kil...@gmail.com wrote: Hi Xia, Try this patch: http://treehou.se/~omar/postgresql-8.4.1-array_sel_hack.patch It's a hack, but it works for us. I think you're probably spending most of your query time planning, and this patch helps speed things up 10x over here. Thanks! I am trying it. Regards, Xia Qingran

We have a similar situation when using DSPAM with a PostgreSQL backend. In that case we used a function like the following to speed up the lookups. I do not know if it would be useful in your situation, but I thought I would post it for the group. The original query was of the form:

    SELECT uid, token, spam_hits, innocent_hits FROM dspam_token_data
     WHERE uid = 'xxx' AND token IN (...);

The faster version of the query in the current code is:

    SELECT * FROM lookup_tokens(%d, '{...}');

where lookup_tokens is defined as follows:

    create function lookup_tokens(integer, bigint[])
    returns setof dspam_token_data
    language plpgsql stable as '
    declare
        v_rec record;
    begin
        for v_rec in
            select * from dspam_token_data
             where uid = $1
               and token in (select $2[i]
                               from generate_series(array_lower($2,1),
                                                    array_upper($2,1)) s(i))
        loop
            return next v_rec;
        end loop;
        return;
    end;';

Anyway, you may want to try a similar approach instead of the posted code change. Regards, Ken

Regards, Omar On Sun, Sep 27, 2009 at 5:13 PM, Xia Qingran qingran@gmail.com wrote: On Sat, Sep 26, 2009 at 10:59 PM, Craig James craig_ja...@emolecules.com wrote: If your user_id is always in a narrow range like this, or even in any range that is a small fraction of the total, then add a range condition, like this: select * from event where user_id <= 500 and user_id >= 0 and user_id in (...) I did this exact same thing in my application and it worked well. Craig It is a good idea. But in my application, most of the queries' user_id are random and difficult to range. Thanks anyway. -- Xia Qingran qingran@gmail.com Sent from Beijing, 11, China Charles de Gaulle - The better I get to know men, the more I find myself loving dogs. - http://www.brainyquote.com/quotes/authors/c/charles_de_gaulle.html -- Xia Qingran qingran@gmail.com Sent from Beijing, 11, China Stephen Leacock - I detest life-insurance agents: they always argue that I shall some day die, which is not so. - http://www.brainyquote.com/quotes/authors/s/stephen_leacock.html -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
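As a side note, on releases where "= ANY(array)" is planned efficiently (8.2 and later, if memory serves), the same idea can often be written without the generate_series() trick. A sketch only, reusing the dspam table above; the plpgsql version above is the one actually in use:

    create or replace function lookup_tokens(integer, bigint[])
    returns setof dspam_token_data
    language sql stable as $$
        select * from dspam_token_data
         where uid = $1
           and token = any ($2);
    $$;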
Re: [PERFORM] partition queries hitting all partitions even though check key is specified
The planner does not yet work as efficiently as it could with child tables. Check the recent mail archives for a long discussion of the same. Regards, Ken

On Wed, Sep 02, 2009 at 08:52:30AM -0600, Kevin Kempter wrote: Hi all; I can't figure out why we're scanning all of our partitions. We set up our tables like this:

Base Table:

    CREATE TABLE url_hits (
        id integer NOT NULL,
        content_type_id integer,
        file_extension_id integer,
        time integer,
        bytes integer NOT NULL,
        path_id integer,
        protocol public.protocol_enum
    );

Partitions:

    create table url_hits_2011_12 (
        check (
            time >= extract ('epoch' from timestamp '2011-12-01 00:00:00')::int4
            and time <= extract ('epoch' from timestamp '2011-12-31 23:59:59')::int4
        )
    ) INHERITS (url_hits);

    CREATE RULE url_hits_2011_12_insert as
    ON INSERT TO url_hits where
        ( time >= extract ('epoch' from timestamp '2011-12-01 00:00:00')::int4
          and time <= extract ('epoch' from timestamp '2011-12-31 23:59:59')::int4 )
    DO INSTEAD
        INSERT INTO url_hits_2011_12 VALUES (NEW.*);

    ...

    create table url_hits_2009_08 (
        check (
            time >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
            and time <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int4
        )
    ) INHERITS (url_hits);

    CREATE RULE url_hits_2009_08_insert as
    ON INSERT TO url_hits where
        ( time >= extract ('epoch' from timestamp '2009-08-01 00:00:00')::int4
          and time <= extract ('epoch' from timestamp '2009-08-31 23:59:59')::int4 )
    DO INSTEAD
        INSERT INTO url_hits_2009_08 VALUES (NEW.*);

    ...

the explain plan shows most any query scans/hits all partitions even if we specify the partition key:

    explain select * from pwreport.url_hits
     where time > date_part('epoch'::text, '2009-08-12'::timestamp without time zone)::integer;

                                       QUERY PLAN
    --------------------------------------------------------------------------------------
     Result  (cost=0.00..23766294.06 rows=816492723 width=432)
       ->  Append  (cost=0.00..23766294.06 rows=816492723 width=432)
             ->  Seq Scan on url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_12 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_11 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_10 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_09 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_08 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_07 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_06 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_05 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_04 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
             ->  Seq Scan on url_hits_2011_03 url_hits  (cost=0.00..12.12 rows=57 width=432)
                   Filter: (time > 1250035200)
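One thing worth double-checking in a setup like this (a general note, not a diagnosis of this particular report): constraint exclusion has to be enabled before the planner will use those CHECK constraints to prune children, and the comparison value must be something it can evaluate at plan time. Roughly:

    SET constraint_exclusion = on;   -- defaults to off before 8.4
    EXPLAIN SELECT * FROM pwreport.url_hits
     WHERE "time" > 1250035200;      -- the epoch value computed beforehand

If every child table is still listed after that, then it is the planner limitation discussed in the archives.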
Re: [PERFORM] Best settings to load a fresh database
On Thu, Aug 06, 2009 at 01:42:06PM -0500, Campbell, Lance wrote: PostgreSQL 8.3 Linux RedHat 4.X 24G of memory When loading a file generated from pg_dumpall, is there a key setting in the configuration file that would allow the load to work faster? Thanks, Lance Campbell Project Manager/Software Architect/DBA Web Services at Public Affairs 217-333-0382

I have found that increasing maintenance_work_mem speeds index rebuilds; turning off synchronous_commit or fsync also helps, if you can really afford to start over after a crash. Another big help is to use the parallel pg_restore from PostgreSQL 8.4.0 to perform the restore. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
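To make those suggestions concrete, the settings commonly raised just for the duration of a bulk reload look roughly like this (illustrative values only, not tuned for this machine; fsync = off is only acceptable if you can redo the whole restore after a crash):

    maintenance_work_mem = 1GB       # faster index builds during the restore
    synchronous_commit = off
    fsync = off                      # dangerous; restore-only
    checkpoint_segments = 64         # fewer, larger checkpoints while loading

and, from 8.4 on, a parallel restore of a custom-format dump (pg_dumpall output is plain SQL, so the per-database dumps need to be taken with pg_dump -Fc for this to apply):

    pg_restore -j 8 -d mydb mydb.dump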
Re: [PERFORM] cluster index on a table
Clustering reorganizes the layout of a table according to the ordering of a SINGLE index. This will place items that are adjacent in the index adjacent in the heap. So you need to cluster on the index that will help the locality of reference for the queries which will benefit you the most. Execution-time-sensitive queries are a good way to choose. Cheers, Ken

On Wed, Jun 24, 2009 at 08:32:14PM +0300, Ibrahim Harrani wrote: Hello, I have a table like the following. To increase the performance of this table, I would like to create a CLUSTER. First, which index should I use on this table for CLUSTER? Secondly, can I create multiple CLUSTERs on the same table? I will appreciate it if you can suggest other options to increase the performance of the table. I use this table to save metadata of the mails on my system.

    mail=# \d maillogs
                          Table "public.maillogs"
          Column        |            Type             |                       Modifiers
    --------------------+-----------------------------+---------------------------------------------------------
     id                 | bigint                      | not null default nextval('maillogs_id_seq'::regclass)
     queueid            | character varying(255)      | not null default '*'::character varying
     recvtime           | timestamp without time zone | default now()
     remoteip           | character varying(128)      | not null default '0.0.0.0'::character varying
     relayflag          | smallint                    | not null default (0)::smallint
     retaction          | integer                     |
     retval             | integer                     | not null default 0
     probspam           | double precision            | not null default (0)::double precision
     messageid          | text                        |
     fromaddress        | text                        | not null
     toaddress          | text                        | not null
     envelopesender     | text                        |
     enveloperecipients | text                        |
     messagesubject     | text                        |
     size               | bigint                      |
     logstr             | character varying(1024)     |
     destinationaddress | character varying(255)      |
     quarantinepath     | character varying(1024)     | not null default ''::character varying
     backuppath         | character varying(1024)     | not null default ''::character varying
     quarantineflag     | smallint                    | not null default (0)::smallint
     backupflag         | smallint                    | not null default (0)::smallint
     deletedflag        | smallint                    | not null default 0
     profileid          | integer                     | not null default 0
    Indexes:
        "maillogs_pkey" PRIMARY KEY, btree (id) CLUSTER
        "idx_maillogs_backupflag" btree (backupflag)
        "idx_maillogs_deletedflag" btree (deletedflag)
        "idx_maillogs_enveloperecipients" btree (enveloperecipients)
        "idx_maillogs_envelopesender" btree (envelopesender)
        "idx_maillogs_messagesubject" btree (messagesubject)
        "idx_maillogs_quarantineflag" btree (quarantineflag)
        "idx_maillogs_recvtime" btree (recvtime)
        "idx_maillogs_remoteip" btree (remoteip)
        "idx_maillogs_revtal" btree (retval)
    Foreign-key constraints:
        "maillogs_profileid_fkey" FOREIGN KEY (profileid) REFERENCES profiles(profileid)
    Triggers:
        maillogs_insert AFTER INSERT ON maillogs FOR EACH ROW EXECUTE PROCEDURE maillogs_insert()

    mail=#

-- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
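To make the choice concrete: if, for example, the recvtime-range queries are the time-critical ones, the 8.3-era commands would be as below (names taken from the table above; CLUSTER rewrites the whole table under an exclusive lock, so schedule it for a quiet period):

    CLUSTER idx_maillogs_recvtime ON maillogs;
    ANALYZE maillogs;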
Re: [PERFORM] 8.4 COPY performance regression on Solaris
Hi, Looking at XLogInsert() in 8.3 and 8.4, the 8.4 version includes a call to RecoveryInProgress() at the top as well as a call to TRACE_POSTGRESQL_XLOG_INSERT(). Could either of those have caused a context switch or cache flush, resulting in worse performance? Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Strange performance response for high load times
On Thu, Jun 18, 2009 at 08:27:02PM +0200, Peter Alban wrote: Hi All, We are having a reasonably powerful machine for supporting about 20 databases but in total they're not more then 4GB in size. The machine is 2 processor 8 core and 8 Gig or ram so I would expect that PG should cache the whole db into memory. Well actually it doesn't. What is more strange that a query that under zero load is running under 100ms during high load times it can take up to 15 seconds !! What on earth can make such difference ? here are the key config options that I set up : # - Memory - shared_buffers = 17 # min 16 or max_connections*2, 8KB each temp_buffers = 21000# min 100, 8KB each #max_prepared_transactions = 5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1048576 # min 64, size in KB maintenance_work_mem = 1048576 # min 1024, size in KB 1GB of work_mem is very high if you have more than a couple of queries that use it. Ken #max_stack_depth = 2048 # min 100, size in KB # - Free Space Map - max_fsm_pages = 524298 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 32768 # min 100, ~70 bytes each # - Kernel Resource Usage - max_files_per_process = 4000# min 25 #preload_libraries = '' any ideas ? cheers, Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Strange performance response for high load times
On Thu, Jun 18, 2009 at 09:42:47PM +0200, Peter Alban wrote: So Ken , What do you reckon it should be ? What is the rule of thumb here ? cheers, Peter It really depends on your query mix. The key to remember is that multiples (possibly many) of the work_mem value can be allocated in an individual query. You can set it on a per query basis to help manage it use, i.e. up it for only the query that needs it. With our systems, which run smaller number of queries we do use 256MB. I hope that this helps. Regards, Ken On Thu, Jun 18, 2009 at 8:30 PM, Kenneth Marshall k...@rice.edu wrote: On Thu, Jun 18, 2009 at 08:27:02PM +0200, Peter Alban wrote: Hi All, We are having a reasonably powerful machine for supporting about 20 databases but in total they're not more then 4GB in size. The machine is 2 processor 8 core and 8 Gig or ram so I would expect that PG should cache the whole db into memory. Well actually it doesn't. What is more strange that a query that under zero load is running under 100ms during high load times it can take up to 15 seconds !! What on earth can make such difference ? here are the key config options that I set up : # - Memory - shared_buffers = 17 # min 16 or max_connections*2, 8KB each temp_buffers = 21000# min 100, 8KB each #max_prepared_transactions = 5 # can be 0 or more # note: increasing max_prepared_transactions costs ~600 bytes of shared memory # per transaction slot, plus lock space (see max_locks_per_transaction). work_mem = 1048576 # min 64, size in KB maintenance_work_mem = 1048576 # min 1024, size in KB 1GB of work_mem is very high if you have more than a couple of queries that use it. Ken #max_stack_depth = 2048 # min 100, size in KB # - Free Space Map - max_fsm_pages = 524298 # min max_fsm_relations*16, 6 bytes each max_fsm_relations = 32768 # min 100, ~70 bytes each # - Kernel Resource Usage - max_files_per_process = 4000# min 25 #preload_libraries = '' any ideas ? cheers, Peter -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
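For example (plain SQL, no configuration file change needed), assuming only one big report query actually needs the large sort area:

    SET work_mem = '256MB';
    -- run the one query that needs the big sort/hash here
    RESET work_mem;

Everything else on that connection then goes back to the server default.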
Re: [PERFORM] Storing sensor data
On Thu, May 28, 2009 at 04:55:34PM +0200, Ivan Voras wrote: 2009/5/28 Heikki Linnakangas heikki.linnakan...@enterprisedb.com: Ivan Voras wrote: I need to store data about sensor readings. There is a known (but configurable) number of sensors which can send update data at any time. The current state needs to be kept but also all historical records. I'm trying to decide between these two designs: 1) create a table for current data, one record for each sensor, update this table when a sensor reading arrives, create a trigger that would transfer old record data to a history table (of basically the same structure) 2) write only to the history table, use relatively complex queries or outside-the-database magic to determine what the current values of the sensors are. 3) write only to the history table, but have an INSERT trigger to update the table with current data. This has the same performance characteristics as 1, but let's you design your application like 2. Excellent idea! I think I'd choose this approach (or 2), since it can handle out-of-order or delayed arrival of sensor readings gracefully (assuming they are timestamped at source). It seems like your approach is currently the winner. If you go with 2, I'd recommend to still create a view to encapsulate the complex query for the current values, to make the application development simpler. And if it gets slow, you can easily swap the view with a table, updated with triggers or periodically, without changing the application. The volume of sensor data is potentially huge, on the order of 500,000 updates per hour. Sensor data is few numeric(15,5) numbers. Whichever design you choose, you should also consider partitioning the data. I'll look into it, but we'll first see if we can get away with limiting the time the data needs to be available. Mr. Voras, One big benefit of partitioning is that you can prune old data with minimal impact to the running system. Doing a large bulk delete would be extremely I/O impacting without partion support. We use this for a DB log system and it allows us to simply truncate a day table instead of a delete -- much, much faster. Regards, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
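A rough sketch of option 3 from the discussion above, with invented table and column names (the real schema will differ, and concurrent first readings from the same sensor can still race in this simple form):

    CREATE TABLE sensor_current (
        sensor_id integer PRIMARY KEY,
        ts        timestamp with time zone NOT NULL,
        value     numeric(15,5) NOT NULL
    );

    CREATE OR REPLACE FUNCTION sensor_history_ins() RETURNS trigger AS $$
    BEGIN
        UPDATE sensor_current
           SET ts = NEW.ts, value = NEW.value
         WHERE sensor_id = NEW.sensor_id
           AND ts < NEW.ts;                  -- newer reading: refresh the current row
        IF NOT FOUND THEN
            -- either a brand new sensor, or the reading arrived out of order
            INSERT INTO sensor_current (sensor_id, ts, value)
            SELECT NEW.sensor_id, NEW.ts, NEW.value
             WHERE NOT EXISTS (SELECT 1 FROM sensor_current
                                WHERE sensor_id = NEW.sensor_id);
        END IF;
        RETURN NEW;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER sensor_history_current
        AFTER INSERT ON sensor_history
        FOR EACH ROW EXECUTE PROCEDURE sensor_history_ins();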
Re: [PERFORM] Storing sensor data
On Thu, May 28, 2009 at 05:24:33PM +0200, Ivan Voras wrote: 2009/5/28 Kenneth Marshall k...@rice.edu: One big benefit of partitioning is that you can prune old data with minimal impact to the running system. Doing a large bulk delete would be extremely I/O impacting without partion support. We use this for a DB log system and it allows us to simply truncate a day table instead of a delete -- much, much faster. Thanks. I'll need to investigate how much administrative overhead and fragility partitioning will introduce since the data will also be replicated between 2 servers (I'm thinking of using Slony). Any experience with this combination? We use Slony1 on a number of databases, but none yet on which we use data partitioning. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 04:01:03PM +0800, Craig Ringer wrote: Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; OK, so you're taking a simple: history INNER JOIN stat ON (stat.ref = history.ref_stat) then filtering for records with a particular value of history.ref_object and finally performing a sort. If I'm reading it right, the plan below does a sequential scan on the `stat' table. The stat table only has 1000 rows, so this isn't necessarily an unreasonable choice even if there is an appropriate index and even if not many of the rows will be needed. It then does an index scan of the history table looking for tuples with ref_object = '01' (text match). It hash joins the hashed results of the initial seq scan to the results of the index scan, and sorts the result. To me, that looks pretty reasonable. You might be able to avoid the hash join in favour of a nested loop scan of stat_ref_idx (looping over records from history.ref_stat where ref_object = '001') by providing a composite index on HISTORY(ref_stat, ref_object). I'm really not too sure, though; plan optimization isn't my thing, I'm just seeing if I can offer a few ideas. Table definitions: While not strictly necessary, it's a *REALLY* good idea to define a suitable PRIMARY KEY. Also, the `CHAR(n)' data type is evil. E.V.I.L. Use `varchar(n)' for bounded-length values, or `text' for unbounded fields, unless you REALLY want the crazy behaviour of `CHAR(n)'. I'm a little bit puzzled about why you seem to be doing lots of things with integer values stored in text strings, but that probably doesn't matter too much for the issue at hand. NOTE: The same query runs 2 times faster on MySQL. With InnoDB tables and proper transactional safety? Or using scary MyISAM tables and a just pray approach to data integrity? If you're using MyISAM tables I'm not surprised; MySQL with MyISAM is stunningly fast, but oh-my-god dangerous. -- Craig Ringer I just thought I would ask. Are you using the query cache in MySQL? If that is on, that could be the difference. Another thing to check, try issuing the selects concurrently: 2 at a time, 5 at a time, 10 at a time... and see if that has an effect on timing. In many of the benchmarks, MySQL will out perform PostgreSQL for very low numbers of clients. Once you are using more than a handful, PostgreSQL pulls ahead. Also, is this a completely static table? i.e. no updates or inserts. How is the performance with those happening? This should help you get a clearer picture of the performance. My two cents. Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
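A concrete version of the composite-index idea mentioned above (the index name is made up, and whether it actually helps depends on how selective ref_object is):

    CREATE INDEX history_stat_obj_idx ON history (ref_stat, ref_object);

That at least gives the planner the nested-loop option Craig describes as an alternative to hashing the whole stat table.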
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 02:49:23PM +0200, Dimitri wrote: The story is simple: for the launching of MySQL 5.4 I've done a testing comparing available on that time variations of InnoDB engines, and at the end by curiosity started the same test with PostgreSQL 8.3.7 to see if MySQL performance level is more close to PostgreSQL now (PG was a strong true winner before). For my big surprise MySQL 5.4 outpassed 8.3.7... However, analyzing the PostgreSQL processing I got a feeling something goes wrong on PG side.. So, now I've installed both 8.3.7 and 8.4beta1 to see more in depth what's going on. Currently 8.4 performs much better than 8.3.7, but there is still a room for improvement if such a small query may go faster :-) Rgds, -Dimitri On 5/6/09, Albe Laurenz laurenz.a...@wien.gv.at wrote: Dimitri wrote: I've run several tests before and now going in depth to understand if there is nothing wrong. Due such a single query time difference InnoDB is doing 2-3 times better TPS level comparing to PostgreSQL.. Why don't you use MySQL then? Or tune PostgreSQL? Yours, Laurenz Albe Another thought, have you tuned PostgreSQL for an in memory database? Those tuning options may be what is needed to improve the plan chosen by PostgreSQL. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
No. Ken On Wed, May 06, 2009 at 10:04:33PM +0800, Craig Ringer wrote: Dimitri wrote: Hi Chris, the only problem I see here is it's 2 times slower vs InnoDB, so before I'll say myself it's ok I want to be sure there is nothing else to do.. :-) Can the genetic query optimizer come into play on small queries? -- Craig Ringer -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 09:38:59AM +0200, Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; EXPLAIN ANALYZE output on 8.4: QUERY PLAN Sort (cost=4549.75..4555.76 rows=2404 width=176) (actual time=1.341..1.343 rows=20 loops=1) Sort Key: h.horder Sort Method: quicksort Memory: 30kB - Hash Join (cost=33.50..4414.75 rows=2404 width=176) (actual time=1.200..1.232 rows=20 loops=1) Hash Cond: (h.ref_stat = s.ref) - Index Scan using history_ref_idx on history h (cost=0.00..4348.20 rows=2404 width=135) (actual time=0.042..0.052 rows=20 loops=1) Index Cond: (ref_object = '01'::bpchar) - Hash (cost=21.00..21.00 rows=1000 width=45) (actual time=1.147..1.147 rows=1000 loops=1) - Seq Scan on stat s (cost=0.00..21.00 rows=1000 width=45) (actual time=0.005..0.325 rows=1000 loops=1) Total runtime: 1.442 ms (10 rows) Table HISTORY contains 200M rows, only 20 needed Table STAT contains 1000 rows, only 20 needed to be joined to HISTORY values. Table definitions: create table STAT ( REF CHAR(3)not null, NAMECHAR(40) not null, NUMBINTnot null ); create table HISTORY ( REF_OBJECT CHAR(10) not null, HORDER INT not null, REF_STATCHAR(3) not null, BEGIN_DATE CHAR(12) not null, END_DATECHAR(12) , NOTECHAR(100) ); create unique index stat_ref_idx on STAT( ref ); create index history_ref_idx on HISTORY( ref_object, horder ); NOTE: The same query runs 2 times faster on MySQL. Any idea?.. Rgds, -Dimitri Dimitri, Is there any chance of profiling the postgres backend to see where the time is used? Just an idea, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Any better plan for this query?..
On Wed, May 06, 2009 at 04:48:21PM +0200, Dimitri wrote: Hi Ken, yes, I may do it, but I did not expect to come into profiling initially :-) I expected there is just something trivial within a plan that I just don't know.. :-) BTW, is there already an integrated profiled within a code? or do I need external tools?.. Rgds, -Dimitri I only suggested it because it might have the effect of changing the sequential scan on the stat table to an indexed scan. Cheers, Ken On 5/6/09, Kenneth Marshall k...@rice.edu wrote: On Wed, May 06, 2009 at 09:38:59AM +0200, Dimitri wrote: Hi, any idea if there is a more optimal execution plan possible for this query: select S.REF as stref, S.NAME as stnm, H.HORDER as hord, H.BEGIN_DATE as hbeg, H.END_DATE as hend, H.NOTE as hnote from HISTORY H, STAT S where S.REF = H.REF_STAT and H.REF_OBJECT = '01' order by H.HORDER ; EXPLAIN ANALYZE output on 8.4: QUERY PLAN Sort (cost=4549.75..4555.76 rows=2404 width=176) (actual time=1.341..1.343 rows=20 loops=1) Sort Key: h.horder Sort Method: quicksort Memory: 30kB - Hash Join (cost=33.50..4414.75 rows=2404 width=176) (actual time=1.200..1.232 rows=20 loops=1) Hash Cond: (h.ref_stat = s.ref) - Index Scan using history_ref_idx on history h (cost=0.00..4348.20 rows=2404 width=135) (actual time=0.042..0.052 rows=20 loops=1) Index Cond: (ref_object = '01'::bpchar) - Hash (cost=21.00..21.00 rows=1000 width=45) (actual time=1.147..1.147 rows=1000 loops=1) - Seq Scan on stat s (cost=0.00..21.00 rows=1000 width=45) (actual time=0.005..0.325 rows=1000 loops=1) Total runtime: 1.442 ms (10 rows) Table HISTORY contains 200M rows, only 20 needed Table STAT contains 1000 rows, only 20 needed to be joined to HISTORY values. Table definitions: create table STAT ( REF CHAR(3)not null, NAMECHAR(40) not null, NUMBINTnot null ); create table HISTORY ( REF_OBJECT CHAR(10) not null, HORDER INT not null, REF_STATCHAR(3) not null, BEGIN_DATE CHAR(12) not null, END_DATECHAR(12) , NOTECHAR(100) ); create unique index stat_ref_idx on STAT( ref ); create index history_ref_idx on HISTORY( ref_object, horder ); NOTE: The same query runs 2 times faster on MySQL. Any idea?.. Rgds, -Dimitri Dimitri, Is there any chance of profiling the postgres backend to see where the time is used? Just an idea, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] partition question for new server setup
On Tue, Apr 28, 2009 at 11:56:25AM -0600, Scott Marlowe wrote: On Tue, Apr 28, 2009 at 11:48 AM, Whit Armstrong armstrong.w...@gmail.com wrote: Thanks, Scott. Just to clarify you said: postgres. So, my pg_xlog and all OS and logging stuff goes on the RAID-10 and the main store for the db goes on the RAID-10. Is that meant to be that the pg_xlog and all OS and logging stuff go on the RAID-1 and the real database (the /var/lib/postgresql/8.3/main/base directory) goes on the RAID-10 partition? Yeah, an extra 0 jumped in there. Faulty keyboard I guess. :) OS and everything but base is on the RAID-1. This is very helpful. Thanks for your feedback. Additionally, are there any clear choices w/ regard to filesystem types? Our choices would be xfs, ext3, or ext4. Well, there's a lot of people who use xfs and ext3. XFS is generally rated higher than ext3 both for performance and reliability. However, we run Centos 5 in production, and XFS isn't one of the blessed file systems it comes with, so we're running ext3. It's worked quite well for us.

The other optimizations are using data=writeback when mounting the ext3 filesystem for PostgreSQL and using elevator=deadline for the disk driver. I do not know how you specify that for Ubuntu. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
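By way of example, on a typical Linux box those two settings end up looking something like this (device names and mount points are placeholders; check your distribution's preferred mechanism):

    # /etc/fstab entry for the PostgreSQL data filesystem
    /dev/sdb1  /var/lib/postgresql  ext3  defaults,data=writeback  0 2

    # I/O scheduler, either at runtime ...
    echo deadline > /sys/block/sdb/queue/scheduler
    # ... or globally by adding elevator=deadline to the kernel boot line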
Re: [PERFORM] performance for high-volume log insertion
Hi, I just finished reading this thread. We are currently working on setting up a central log system using rsyslog and PostgreSQL. It works well once we patched the memory leak. We also looked at what could be done to improve the efficiency of the DB interface. On the rsyslog side, moving to prepared queries allows you to remove the escaping that needs to be done currently before attempting to insert the data into the SQL backend as well as removing the parsing and planning time from the insert. This is a big win for high insert rates, which is what we are talking about. The escaping process is also a big CPU user in rsyslog which then hands the escaped string to the backend which then has to undo everything that had been done and parse/plan the resulting query. This can use a surprising amount of additional CPU. Even if you cannot support a general prepared query interface, by specifying what the query should look like you can handle much of the low-hanging fruit query-wise. We are currently using a date based trigger to use a new partition each day and keep 2 months of logs currently. This can be usefully managed on the backend database, but if rsyslog supported changing the insert to the new table on a time basis, the CPU used by the trigger to support this on the backend could be reclaimed. This would be a win for any DB backend. As you move to the new partition, issuing a truncate to clear the table would simplify the DB interfaces. Another performance enhancement already mentioned, would be to allow certain extra fields in the DB to be automatically populated as a function of the log messages. For example, logging the mail queue id for messages from mail systems would make it much easier to locate particular mail transactions in large amounts of data. To sum up, eliminating the escaping in rsyslog through the use of prepared queries would reduce the CPU load on the DB backend. Batching the inserts will also net you a big performance increase. Some DB-based applications allow for the specification of several types of queries, one for single inserts and then a second to support multiple inserts (copy). Rsyslog already supports the queuing pieces to allow you to batch inserts. Just some ideas. Regards, Ken On Tue, Apr 21, 2009 at 09:56:23AM +0100, Richard Huxton wrote: da...@lang.hm wrote: On Tue, 21 Apr 2009, Stephen Frost wrote: * da...@lang.hm (da...@lang.hm) wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. It really is. You know your application, you know it's primary use cases, and probably have some data to play with. You're certainly in a much better situation to at least *try* and benchmark it than we are. rsyslog is a syslog server. it replaces (or for debian and fedora, has replaced) your standard syslog daemon. it recieves log messages from every app on your system (and possibly others), filters, maniulates them, and then stores them somewhere. among the places that it can store the logs are database servers (native support for MySQL, PostgreSQL, and Oracle. plus libdbi for others) Well, from a performance standpoint the obvious things to do are: 1. Keep a connection open, do NOT reconnect for each log-statement 2. Batch log statements together where possible 3. Use prepared statements 4. Partition the tables by day/week/month/year (configurable I suppose) The first two are vital, the third takes you a step further. The fourth is a long-term admin thing. And possibly 5. 
Have two connections, one for fatal/error etc and one for info/debug level log statements (configurable split?). Then you can use the synchronous_commit setting on the less important ones. Might buy you some performance on a busy system. http://www.postgresql.org/docs/8.3/interactive/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS other apps then search and report on the data after it is stored. what apps?, I don't know either. pick your favorite reporting tool and you'll be a step ahead of me (I don't know a really good reporting tool) as for sample data, you have syslog messages, just like I do. so you have the same access to data that I have. how would you want to query them? how would people far less experianced that you want to query them? I can speculate that some people would do two columns (time, everything else), others will do three (time, server, everything else), and others will go further (I know some who would like to extract IP addresses embedded in a message into their own column). some people will index on the time and host, others will want to do full-text searches of everything. Well, assuming it looks much like traditional syslog, I would do something like: (timestamp, host, facility, priority, message). It's easy enough to stitch back together if people want that. PostgreSQL's full-text indexing is quite well suited to logfiles I'd have thought, since it knows about filenames,
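To make the prepared-statement point above concrete, the SQL-level equivalent of what a driver does with a parameterised insert is roughly the following (table and column names invented for illustration):

    PREPARE log_ins (timestamptz, text, text, text, text) AS
        INSERT INTO syslog_lines (recv_at, host, facility, priority, message)
        VALUES ($1, $2, $3, $4, $5);

    EXECUTE log_ins (now(), 'host1', 'mail', 'info', 'connection from 192.0.2.1');

Because the values travel as parameters, nothing in the message has to be escaped, and the statement is parsed and planned once rather than once per row.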
Re: [PERFORM] performance for high-volume log insertion
On Tue, Apr 21, 2009 at 08:37:54AM -0700, da...@lang.hm wrote: Kenneth, could you join the discussion on the rsyslog mailing list? rsyslog-users rsys...@lists.adiscon.com I'm surprised to hear you say that rsyslog can already do batch inserts and am interested in how you did that. what sort of insert rate did you mange to get? David Lang David, I would be happy to join the discussion. I did not mean to say that rsyslog currently supported batch inserts, just that the pieces that provide stand-by queuing could be used to manage batching inserts. Cheers, Ken On Tue, 21 Apr 2009, Kenneth Marshall wrote: Date: Tue, 21 Apr 2009 08:33:30 -0500 From: Kenneth Marshall k...@rice.edu To: Richard Huxton d...@archonet.com Cc: da...@lang.hm, Stephen Frost sfr...@snowman.net, Greg Smith gsm...@gregsmith.com, pgsql-performance@postgresql.org Subject: Re: [PERFORM] performance for high-volume log insertion Hi, I just finished reading this thread. We are currently working on setting up a central log system using rsyslog and PostgreSQL. It works well once we patched the memory leak. We also looked at what could be done to improve the efficiency of the DB interface. On the rsyslog side, moving to prepared queries allows you to remove the escaping that needs to be done currently before attempting to insert the data into the SQL backend as well as removing the parsing and planning time from the insert. This is a big win for high insert rates, which is what we are talking about. The escaping process is also a big CPU user in rsyslog which then hands the escaped string to the backend which then has to undo everything that had been done and parse/plan the resulting query. This can use a surprising amount of additional CPU. Even if you cannot support a general prepared query interface, by specifying what the query should look like you can handle much of the low-hanging fruit query-wise. We are currently using a date based trigger to use a new partition each day and keep 2 months of logs currently. This can be usefully managed on the backend database, but if rsyslog supported changing the insert to the new table on a time basis, the CPU used by the trigger to support this on the backend could be reclaimed. This would be a win for any DB backend. As you move to the new partition, issuing a truncate to clear the table would simplify the DB interfaces. Another performance enhancement already mentioned, would be to allow certain extra fields in the DB to be automatically populated as a function of the log messages. For example, logging the mail queue id for messages from mail systems would make it much easier to locate particular mail transactions in large amounts of data. To sum up, eliminating the escaping in rsyslog through the use of prepared queries would reduce the CPU load on the DB backend. Batching the inserts will also net you a big performance increase. Some DB-based applications allow for the specification of several types of queries, one for single inserts and then a second to support multiple inserts (copy). Rsyslog already supports the queuing pieces to allow you to batch inserts. Just some ideas. Regards, Ken On Tue, Apr 21, 2009 at 09:56:23AM +0100, Richard Huxton wrote: da...@lang.hm wrote: On Tue, 21 Apr 2009, Stephen Frost wrote: * da...@lang.hm (da...@lang.hm) wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. It really is. You know your application, you know it's primary use cases, and probably have some data to play with. 
You're certainly in a much better situation to at least *try* and benchmark it than we are. rsyslog is a syslog server. it replaces (or for debian and fedora, has replaced) your standard syslog daemon. it recieves log messages from every app on your system (and possibly others), filters, maniulates them, and then stores them somewhere. among the places that it can store the logs are database servers (native support for MySQL, PostgreSQL, and Oracle. plus libdbi for others) Well, from a performance standpoint the obvious things to do are: 1. Keep a connection open, do NOT reconnect for each log-statement 2. Batch log statements together where possible 3. Use prepared statements 4. Partition the tables by day/week/month/year (configurable I suppose) The first two are vital, the third takes you a step further. The fourth is a long-term admin thing. And possibly 5. Have two connections, one for fatal/error etc and one for info/debug level log statements (configurable split?). Then you can use the synchronous_commit setting on the less important ones. Might buy you some performance on a busy system. http://www.postgresql.org/docs/8.3/interactive/runtime-config-wal.html#RUNTIME-CONFIG-WAL-SETTINGS other apps then search and report on the data after it is stored. what apps?, I don't know either. pick your favorite reporting tool and you'll
Re: [PERFORM] performance for high-volume log insertion
On Tue, Apr 21, 2009 at 11:09:18AM -0700, da...@lang.hm wrote: On Tue, 21 Apr 2009, Greg Smith wrote: On Mon, 20 Apr 2009, da...@lang.hm wrote: while I fully understand the 'benchmark your situation' need, this isn't that simple. in this case we are trying to decide what API/interface to use in a infrastructure tool that will be distributed in common distros (it's now the default syslog package of debian and fedora), there are so many variables in hardware, software, and load that trying to benchmark it becomes effectivly impossible. From your later comments, you're wandering a bit outside of what you were asking about here. Benchmarking the *query* side of things can be extremely complicated. You have to worry about memory allocation, cold vs. warm cache, scale of database relative to RAM, etc. You were asking specifically about *insert* performance, which isn't nearly as complicated. There are basically three setups: 1) Disk/controller has a proper write cache. Writes and fsync will be fast. You can insert a few thousand individual transactions per second. 2) Disk/controller has a lying write cache. Writes and fsync will be fast, but it's not safe for database use. But since (1) is expensive and this one you can get for free jut by using a regular SATA drive with its write cache enabled, you can use this case as a proxy for approximately how (1) would act. You'll still get a few thousand transactions per second, sustained writes may slow down relative to (1) if you insert enough that you hit a checkpoint (triggering lots of random I/O). 3) All write caches have been disabled because they were not battery-backed. This is the case if you have a regular SATA drive and you disable its write cache because you care about write durability. You'll get a bit less than RPM/60 writes/second, so 120 inserts/second with a typical 7200RPM drive. Here batching multiple INSERTs together is critical to get any sort of reasonable performance. in case #1 would you expect to get significant gains from batching? doesn't it suffer from problems similar to #2 when checkpoints hit? Even with a disk controller with a proper write cache, the latency for single-insert-at-a-time will keep the number of updates to the low thousands per second (on the controllers I have used). If you can batch them, it would not be unreasonable to increase performance by an order of magnitude or more. At the high end, other issues like CPU usage can restrict performance. Ken In (3), I'd expect that trivia like INSERT vs. COPY and COPY BINARY vs. COPY TEXT would be overwhelmed by the overhead of the commit itself. Therefore you probably want to test with case (2) instead, as it doesn't require any additional hardware but has similar performance to a production-worthy (1). All of the other things you're worried about really don't matter here; you can get an approximate measure of what the performance of the various INSERT/COPY schemes are that is somewhat platform dependant, but the results should be good enough to give you some rule of thumb suggestions for whether optimizations are significant enough to justify the coding effort to implement them or not. I'll see about setting up a test in the next day or so. should I be able to script this through psql? or do I need to write a C program to test this? I'm not sure whether you're familiar with all the fsync trivia here. In normal syslog use, there's an fsync call after every write. 
You can disable that by placing a - before the file name in /etc/syslog.conf The thing that is going to make database-based writes very different is that syslog's fsync'd writes are unlikely to leave you in a bad state if the drive lies about them, while database writes can. So someone using syslog on a standard SATA drive isn't getting the write guarantee they think they are, but the downside on a crash is minimal. If you've got a high-volume syslog environment (100 lines/second), you can't support those as individual database writes unless you've got a battery-backed write controller. A regular disk just can't process genuine fsync calls any faster than that. A serious syslog deployment that turns fsync on and expects it to really do its thing is already exposed to this issue though. I think it may be a the case that a lot of people think they have durable writes in their configuration but really don't. rsyslog is a little different, instead of just input - disk it does input - queue - output (where output can be many things, including disk or database) it's default is to use memory-based queues (and no fsync), but has config options to do disk based queues with a fsync after each update David Lang -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent
Re: [PERFORM] postgreSQL performance 8.2.6 vs 8.3.3
On Fri, Feb 20, 2009 at 04:34:23PM -0500, Battle Mage wrote: I have a server box that has 4GB of RAM, Quad core CPU AMD Opteron 200.152 Mhz (1024 KB cache size each) with plenty of hard drive space. I installed both postgresql 8.2.6 and 8.3.3 on it. I've created a basic test db and used pgbench -i -s 1 -U test -h localhost test to create a sample test db. Then, to benchmark the postgreSQLs, I executed this separately on each of them: pg_bench -h localhost -d test -t 2000 -c 50 -s 50 -U test (2000 transactions per client, 50 clients, scalability factor of 50) Using the above, I get on postgreSQL 8.2.6: Load average: Between 3.4 and 4.3 tps = 589 (including connections establishing) tps = 590 (excluding connections establishing) I get on postgreSQL 8.3.3 Load: Between 4.5 and 5.6 tps = 949 (including connections establishing) tps = 951 (excluding connections establishing) The amount of tps almost doubled, which is good, but i'm worried about the load. For my application, a load increase is bad and I'd like to keep it just like in 8.2.6 (a load average between 3.4 and 4.3). What parameters should I work with to decrease the resulting load average at the expense of tps? Down below is my 8.3.3 configuration file. I removed everything that is commented since if it's commented, it's default value. I also removed from the sample below parameters related to logging. Please evaluate your load on the 8.3.3 box at 590 tps. If the load is proportional to the tps than the scaled load will be: 2.8 to 3.5 for an equivalent tps. There is no free lunch but 8.3 performs much better than 8.2 and I suspect that this trend will continue. :) Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [HACKERS] [PERFORM] GIST versus GIN indexes for intarrays
On Fri, Feb 13, 2009 at 04:12:53PM +0300, Teodor Sigaev wrote:

  The short-term workaround for Rusty is probably to create his GIN index using the intarray-provided gin__int_ops opclass.

Right.

  But it seems to me that we ought to get rid of intarray's @> and <@ operators and have the module depend on the core anyarray operators, just as we have already done for = and <>. Comments?

Agree, will do. Although built-in anyarray operators have ~N^2 behaviour while intarray's version - only N*log(N)

Is there a way to have the built-in anyarray operators be N*log(N)? Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] strange index performance?
On Mon, Jan 26, 2009 at 10:10:13AM +0100, Thomas Finneid wrote: Scott Marlowe wrote: I'm guessing that you just had more data in the table or something by the time you tested that, or some cron job was running in the background, or some other issue, not the index. It starts from scratch and builds up. Every insert has constant time from the first to the last row, i.e. row 1 to row 1.2 billion. There are no background jobs or other disturbances. Quite a similar machine. write back cache with battery backed controller on the controller? A really old Areca like an 11xx series or a newer one 12xx, 16xx? It's an Areca 1220. Write back is enabled but it does not have a BBU, because it's a development machine and not a production machine. 0.12 seconds per insert is pretty slow. 10 inserts would take a second. I'm inserting 10,000 rows in about 2 seconds. Each insert is definitely in the 0.12 millisecond range. I see the confusion. I use COPY (JDBC), not INSERT, so one transaction contains 20,000 rows, which is copy inserted in 300 ms, so that gives a per-row insert time of 0.015 ms. So I actually have pretty decent write performance. If I remove the index, the copy insert only takes about 125 ms. So the index update time amounts to half the total update time. This still leaves me with the question of why the smaller index (id1,3,4) takes longer to update than the larger index (id1,2,3,4). Updating an index like id1,2,3 should take a shorter time (I have to test it first to verify), so a similar index, id1,3,4, should take approximately the same time. Could it have something to do with the smaller index being more complicated to fill in? Could the placing of the id2 field in the table have anything to say about it?

It may be that the smaller index has update contention for the same blocks that the larger index does not. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] strange index performance?
On Mon, Jan 26, 2009 at 03:49:00PM +0100, Thomas Finneid wrote: Kenneth Marshall wrote: It may be that the smaller index has update contention for the same blocks that the larger index does not. Is that an assumption based on both indexes existing? if so I might agree, but if you are talking about only one index existing at a time then could you explain what the basis for you conclusion is? regards thomas The small index blocks would look like: |abcd|efgh|ijkl|... and the large index: |axxx|...|bxxx|...|cxxx|... and so on. Now, if you try to update a-k, the small index will be trying to update and possibly rearrange/split/... items on the same disk blocks while the larger index would be updating without contention. It may not even be block level contention, the same argument applies to cachelines with in a block. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Question about clustering indexes and restores
On Thu, Jan 22, 2009 at 02:52:12PM -0500, Harold A. Giménez Ch. wrote: Hi list, Clustering my indexes dramatically improves the query performance of many of my queries. Also, the actual clustering takes a very long time for big databases, roughly 20 hours. I have two questions about how to improve this: 1. I've tweaked maintenance_mem_max and effective_cache_size to a point where the cluster operation uses a good chunk of my physical RAM, and the OS does not start swapping. Is there any other parameter I should look at? 2. Reading the documentation for cluster at http://www.postgresql.org/docs/8.3/interactive/sql-cluster.html, I see that all clustering does is reorder the data on disk to 'match' the order of the clustered index. My question is, if I dump a clustered database using pg_dump in custom format, is it necessary to cluster after restoring it? Or does a dump/restore not guarantee that the order of the data restored is the same as the original dumped database? 3. Somewhat related to #2, what is the best way to move data from a staging database on one server, to the production environment on a different server? I've been using pg_dump/pg_restore, but there must be a better way... Thanks for any pointers, -Harold

Harold, There have been discussions on the hackers list about the pessimal cluster performance. Here is a pointer to the discussion; it seems that a faster way is to build a new table with the desired order with CREATE TABLE AS ... ORDER BY ...: http://www.mail-archive.com/pgsql-hack...@postgresql.org/msg121205.html Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
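For completeness, the alternative discussed in that thread looks roughly like this (a sketch; table and column names are placeholders, and indexes, constraints and permissions have to be recreated on the new table before it takes over):

    BEGIN;
    CREATE TABLE big_table_sorted AS
        SELECT * FROM big_table ORDER BY the_cluster_column;
    -- recreate indexes/constraints on big_table_sorted here
    DROP TABLE big_table;
    ALTER TABLE big_table_sorted RENAME TO big_table;
    COMMIT;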
Re: [PERFORM] caching indexes and pages?
On Thu, Jan 22, 2009 at 10:58:25PM +0100, Thomas Finneid wrote: Thomas Markus wrote: try to reorganize your data with CLUSTER and create appropriate indixes (dont forget to check statistics). One question. Assume I have clustered and new data has been added after that, according to the docs that data is added outside of the clustered data. What happens when I run cluster again? I would assume its smart and to only clusteres the new data, i.e. adding it to the already created clusters, as apporpriate, so the execution time would be a lot lower, right? or would it run through and recluster everything from scratch again? thomas It reclusters again from scratch. You do get better performance on the reads from the data that is already clustered. Cheers, Ken -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] Need help with 8.4 Performance Testing
Josh, Since a number of these performance patches use our hash function, would it make sense to apply the last patch to upgrade the hash function mix() to the two function mix()/final()? Since the additional changes increases the performance of the hash function by another 50% or so. My two cents. Regards, Ken On Sun, Dec 07, 2008 at 11:38:01AM -0800, Josh Berkus wrote: Database performance geeks, We have a number of patches pending for 8.4 designed to improve database performance in a variety of circumstances. We need as many users as possible to build test versions of PostgreSQL with these patches, and test how well they perform, and report back in some detail. Particularly, users with unusual hardware architectures (16 or more cores, ARM, Power, SSD, NFS-mounted data) or operating systems (Solaris, OSX, Windows-64) are really helpful. Testers need to be familiar with building PostgreSQL from source and patching it, as well as basic PostgreSQL Tuning (except for the Wizard Patch) and have some kind of performance test available, ideally something based on your own application use. If you are going to use pgbench to test, *please* read Greg Smith's notes first: http://www.westnet.com/~gsmith/gregsmith/content/postgresql/pgbench-scaling.htm The Wiki (http://wiki.postgresql.org/wiki/CommitFest_2008-11) has a full list of patches, but below are the ones in particular we could use help with. You *do* need to read the entire mail threads which I link to below to understand the patches. Thanks for your help! Proposal of PITR performance improvement (Koichi Suzuki): http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Simple postgresql.conf wizard http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Improve Performance of Multi-Batch Hash Join for Skewed Data Sets http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Window Functions http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] parallel restore (especially need to test on 16+ cores) http://archives.postgresql.org/message-id/[EMAIL PROTECTED] B-Tree emulation for GIN http://archives.postgresql.org/message-id/[EMAIL PROTECTED] http://archives.postgresql.org/message-id/[EMAIL PROTECTED] Also, the following patches currently still have bugs, but when the bugs are fixed I'll be looking for performance testers, so please either watch the wiki or watch this space: -- Block-level CRC checks (Alvaro Herrera) -- Auto Partitioning Patch (Nikhil Sontakke) -- posix_fadvise (Gregory Stark) -- Hash Join-Filter Pruning using Bloom Filters -- On-disk bitmap indexes Please report your results, with the patchname in the subject line, on this mailing list or on -hackers. Thank you, and your help will get a better 8.4 out sooner. -- Josh Berkus PostgreSQL San Francisco -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] 8.3.1 vs 8.2.X on HP-UX PA-RISC 11.11/11.23
Are you using the same locales for both? Ken On Wed, Jun 11, 2008 at 09:40:20PM -0400, Josh Rovero wrote: We run GCC-compiled postgresql on a number of HP-UX and Linux boxes. Our measurements to date show 8.3.1 performance to be about 30% *worse* than 8.2 on HP-UX for the same "drink the firehose" insert/update/delete benchmarks. Linux performance is fine. Tweaking the new 8.3.1 synchronous_commit and bg writer delays that *should* speed things up actually makes them a bit worse, again only on HP-UX PA-RISC 11.11 and 11.23. Right now it's 32 bit, both for 8.2 and 8.3. Any hints? P. J. Rovero -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] shared_buffers in 8.3 w/ lots of RAM on dedicated PG machine
On Fri, Feb 15, 2008 at 01:35:29PM +0100, Peter Schuller wrote: Hello, my impression has been that in the past, there has been a general semi-consensus that upping shared_buffers to use the majority of RAM has not generally been recommended, with reliance on the buffer cache instead being the recommendation. Given the changes that have gone into 8.3, in particular with regards to minimizing the impact of large sequential scans, would it be correct to say that given that - enough memory is left for other PG bits (sort mems and whatnot else) - only PG is running on the machine - you're on 64 bit so do not run into address space issues - the database working set is larger than RAM it would be generally advisable to pump up shared_buffers pretty much as far as possible instead of relying on the buffer cache? -- / Peter Schuller PGP userID: 0xE9758B7D or 'Peter Schuller [EMAIL PROTECTED]' Key retrieval: Send an E-Mail to [EMAIL PROTECTED] E-Mail: [EMAIL PROTECTED] Web: http://www.scode.org Peter, PostgreSQL still depends on the OS for file access and caching. I think that the current recommendation is to have up to 25% of your RAM in the shared buffer cache. Ken ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
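To make the 25% guideline concrete, a starting point for a hypothetical dedicated 64-bit server with 16GB of RAM might look like the lines below; the values are illustrative only, not a recommendation for Peter's workload:

    # postgresql.conf (8.3) -- illustrative starting point, 16GB dedicated box
    shared_buffers = 4GB              # roughly 25% of RAM
    effective_cache_size = 12GB       # what the planner may assume the OS cache adds
    work_mem = 32MB                   # per sort/hash, per backend, so keep it modest
    maintenance_work_mem = 512MB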
Re: [PERFORM] Anyone using a SAN?
On Wed, Feb 13, 2008 at 10:56:54AM -0600, Peter Koczan wrote: Hi all, We're considering setting up a SAN where I work. Is there anyone using a SAN, for postgres or other purposes? If so I have a few questions for you. - Are there any vendors to avoid or ones that are particularly good? - What performance or reliability implications exist when using SANs? - Are there any killer features with SANs compared to local storage? Any other comments are certainly welcome. Peter Peter, The key is to understand your usage patterns, both I/O and query. SANs can be easily bandwidth limited which can tank your database performance. There have been several threads in the mailing list about performance problems caused by the use of a SAN for storage. Cheers, Ken ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PERFORM] Benchmark Data requested --- pgloader CE design ideas
On Thu, Feb 07, 2008 at 12:06:42PM -0500, Greg Smith wrote:
> On Thu, 7 Feb 2008, Dimitri Fontaine wrote:
>> I was thinking of not even reading the file content from the controller
>> thread, just decide splitting points in bytes (0..ST_SIZE/4 -
>> ST_SIZE/4+1..2*ST_SIZE/4 etc) and let the reading thread fine-tune by
>> beginning to process input after having read first newline, etc.
>
> The problem I was pointing out is that if chunk#2 moved forward a few
> bytes before it started reading in search of a newline, how will chunk#1
> know that it's supposed to read up to that further point? You have to
> stop #1 from reading further when it catches up with where #2 started.
> Since the start of #2 is fuzzy until some reading is done, what you're
> describing will need #2 to send some feedback to #1 after they've both
> started, and that sounds bad to me. I like designs where the boundaries
> between threads are clearly defined before any of them start and none of
> them ever talk to the others.

As long as both processes understand the start condition, there is not a
problem. p1 starts at beginning and processes through chunk2 offset until it
reaches the start condition. p2 starts loading from chunk2 offset plus the
amount needed to reach the start condition, ...

DBfile   |---------x|--------x---|------x-----|...
          chunk1---^
                    chunk2---^
                              chunk3----^ ...

As long as both pieces use the same test, they will each process
non-overlapping segments of the file and still process 100% of the file.

Ken

>> In both cases, maybe it would also be needed for pgloader to be able to
>> have a separate thread for COPYing the buffer to the server, allowing it
>> to continue preparing next buffer in the meantime?
>
> That sounds like a V2.0 design to me. I'd only chase after that level of
> complexity if profiling suggests that's where the bottleneck really is.
>
> --
> * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
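A minimal sketch of the boundary rule Ken spells out, written as a standalone C program rather than pgloader's actual code: every chunk's nominal start and end offsets are advanced to the next line boundary by the same test, so the chunks cover the file exactly once with no coordination between workers.

    #include <stdio.h>

    /* Return the first line boundary at or after 'start': offset 0, or the
     * first byte following a '\n' found at or after start - 1.  Both ends of
     * every chunk use this same test. */
    static long align_to_newline(FILE *f, long start)
    {
        int c;

        if (start == 0)
            return 0;                   /* chunk 1 always begins the file */
        fseek(f, start - 1, SEEK_SET);
        while ((c = fgetc(f)) != EOF && c != '\n')
            ;                           /* scan forward to the next newline */
        return ftell(f);                /* at EOF, end of file is the boundary */
    }

    int main(int argc, char **argv)
    {
        FILE *f;
        long  size;
        int   i, nchunks = 4;

        if (argc < 2 || (f = fopen(argv[1], "rb")) == NULL)
            return 1;
        fseek(f, 0, SEEK_END);
        size = ftell(f);

        for (i = 0; i < nchunks; i++)
        {
            long lo = align_to_newline(f, i * size / nchunks);
            long hi = align_to_newline(f, (i + 1) * size / nchunks);
            printf("chunk %d: bytes %ld .. %ld\n", i + 1, lo, hi);
        }
        fclose(f);
        return 0;
    }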
Re: [PERFORM] io storm on checkpoints, postgresql 8.2.4, linux
On Wed, Aug 22, 2007 at 07:33:35PM +0400, Dmitry Potapov wrote: Hello! We run a large (~66Gb) web-backend database on Postgresql 8.2.4 on Linux. The hardware is Dual Xeon 5130 with 16Gb ram, LSI Megaraid U320-2x scsi controller w/512Mb writeback cache and a BBU. Storage setup contains 3 raid10 arrays (data, xlog, indexes, each on different array), 12 HDDs total. Frontend application uses jdbc driver, connection pooling and threads. We've run into an issue of IO storms on checkpoints. Once in 20min (which is checkpoint_interval) the database becomes unresponsive for about 4-8 seconds. Query processing is suspended, server does nothing but writing a large amount of data to disks. Because of the db server being stalled, some of the web clients get timeout and disconnect, which is unacceptable. Even worse, as the new requests come at a pretty constant rate, by the time this storm comes to an end there is a huge amount of sleeping app. threads waiting for their queries to complete. After the db server comes back to life again, these threads wake up and flood it with queries, so performance suffer even more, for some minutes after the checkpoint. It seemed strange to me that our 70%-read db generates so much dirty pages that writing them out takes 4-8 seconds and grabs the full bandwidth. First, I started to tune bgwriter to a more aggressive settings, but this was of no help, nearly no performance changes at all. Digging into the issue further, I discovered that linux page cache was the reason. Dirty parameter in /proc/meminfo (which shows the amount of ready-to-write dirty data currently sitting in page cache) grows between checkpoints from 0 to about 100Mb. When checkpoint comes, all the 100mb got flushed out to disk, effectively causing a IO storm. I found this (http://www.westnet.com/~gsmith/content/linux-pdflush.htm http://www.westnet.com/%7Egsmith/content/linux-pdflush.htm) document and peeked into mm/page-writeback.c in linux kernel source tree. I'm not sure that I understand pdflush writeout semantics correctly, but looks like when the amount of dirty data is less than dirty_background_ratio*RAM/100, pdflush only writes pages in background, waking up every dirty_writeback_centisecs and writing no more than 1024 pages (MAX_WRITEBACK_PAGES constant). When we hit dirty_background_ratio, pdflush starts to write out more agressively. So, looks like the following scenario takes place: postgresql constantly writes something to database and xlog files, dirty data gets to the page cache, and then slowly written out by pdflush. When postgres generates more dirty pages than pdflush writes out, the amount of dirty data in the pagecache is growing. When we're at checkpoint, postgres does fsync() on the database files, and sleeps until the whole page cache is written out. By default, dirty_background_ratio is 2%, which is about 328Mb of 16Gb total. Following the curring pdflush logic, nearly this amount of data we face to write out on checkpoint effective stalling everything else, so even 1% of 16Gb is too much. My setup experience 4-8 sec pause in operation even on ~100Mb dirty pagecache... I temporaly solved this problem by setting dirty_background_ratio to 0%. This causes the dirty data to be written out immediately. It is ok for our setup (mostly because of large controller cache), but it doesn't looks to me as an elegant solution. Is there some other way to fix this issue without disabling pagecache and the IO smoothing it was designed to perform? 
-- Regards, Dmitry Dmitry, You are working at the correct level. The bgwriter performs the I/O smoothing function at the database level. Obviously, the OS level smoothing function needed to be tuned and you have done that within the parameters of the OS. You may want to bring this up on the Linux kernel lists and see if they have any ideas. Good luck, Ken ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
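For reference, the thresholds Dmitry is describing are the kernel's vm.dirty_* sysctls; the snippet below is only an illustration of his workaround (sysctl.conf syntax), not a recommendation for other setups:

    # Limit how much dirty page cache can accumulate between checkpoints, so
    # the fsync() at checkpoint time has far less to flush in one burst.
    vm.dirty_background_ratio = 0    # Dmitry's setting: pdflush writes dirty data out almost at once
    vm.dirty_ratio = 10              # companion knob: hard limit before writers are throttled
    # apply with: sysctl -p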
Re: [PERFORM] Update table performance
Mark, You are not alone in the fact that when you post your system specifications, CPU and memory are always listed while the disk I/O subsystem invariably is not. This is a very disk intensive operation and I suspect that your disk system is maxed-out. If you want it faster, you will need more I/O capacity. Regards, Ken On Tue, Aug 07, 2007 at 05:58:35AM -0700, Mark Makarowsky wrote: I have a table with 4,889,820 records in it. The table also has 47 fields. I'm having problems with update performance. Just as a test, I issued the following update: update valley set test='this is a test' This took 905641 ms. Isn't that kind of slow? There aren't any indexes, triggers, constraints or anything on this table. The version of Postgres is PostgreSQL 8.2.4 on i686-pc-mingw32, compiled by GCC gcc.exe (GCC) 3.4.2 (mingw-special). The operating environment is Windows 2003 Standard Edition w/service pack 2. It is 2.20 Ghz with 1.0 GB of RAM. Here is the results from Explain: Seq Scan on valley (cost=0.00..1034083.57 rows=4897257 width=601) Here are the settings in the postgresql.conf. Any ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
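Some rough arithmetic backs up Ken's diagnosis (assuming the planner's 601-byte row width is close to the on-disk row size): 4,889,820 rows times roughly 601 bytes is about 2.9 GB of new row versions, and with WAL on top of that the update pushes well past 3 GB through the disk, so 905 seconds works out to only a few MB/s of sustained write throughput -- about what a single desktop-class drive in a 1 GB Windows machine can be expected to deliver under load.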
Re: [PERFORM] upgraded to pgsql 8.2.4, getting worse performance then 7.4.x
On Sat, Jun 02, 2007 at 09:13:32AM -0400, Douglas J Hunley wrote:
> Hello great gurus of performance:
>
> Our 'esteemed' Engr group recently informed a customer that in their
> testing, upgrading to 8.2.x improved the performance of our J2EE
> application approximately 20%, so of course, the customer then tasked me
> with upgrading them. We dumped their db, removed pgsql, installed the
> 8.2.4 rpms from postgresql.org, did an initdb, and then pg_restored their
> data. It's been about a week now, and the customer is complaining that in
> their testing, they are seeing a 30% /decrease/ in general performance.
> Of course, our Engr group is being less than responsive, and I have a
> feeling all they're doing is googling for answers, so I'm turning to this
> group for actual assistance :)
>
> I'd like to start by examining the postgresql.conf file. Under 7.4.x, we
> had spent the better part of their 2 years as a customer tuning and
> tweaking settings. I've attached the file that was in place at the time
> of upgrade. I did some cursory googling of my own, and quickly realized
> that enough has changed in v8 that I'm not comfortable making the exact
> same modifications to their new config file as some options are new, some
> have gone away, etc. I've attached the existing v8 conf file as well.
>
> I'd really like it if someone could assist me in determining which of the
> v8 options need adjusting to be 'functionally equivalent' to the v7 file.
> Right now, my goal is to get the customer back to the previous level of
> performance, and only then pursue further optimization. I can provide any
> and all information needed, but didn't know what to include initially, so
> I've opted to include the minimal :)
>
> The DB server in question does nothing else, is running CentOS 4.5,
> kernel 2.6.9-55.ELsmp. Hyperthreading is disabled in the BIOS and there
> are 2 Xeon 3.4Ghz cpus. There is 8Gb of RAM in the machine, and another
> 8Gb of swap.
>
> Thank you in advance for any and all assistance you can provide.
> --
> Douglas J Hunley (doug at hunley.homeip.net) - Linux User #174778
> http://doug.hunley.homeip.net

Douglas,

If these are the current config files, it is no wonder that the performance
is worse. Here are the things that need to be changed right from the start.
The old 7.x value is on the left and the 8.2 value is on the right. Make them
the same to start and see how it looks then.

setting                       7.x       current 8.2
---------------------------------------------------
shared_buffers             = 25000   /  32MB  (=3906)
sort_mem/work_mem          = 15000   /  1MB   (=122)
vacuum_mem/maint_work_mem  = 10      /  16MB  (=1950)
effective_cache            = 196608  /  128MB (=15600)   should start between 200k-500k

These changes alone should get you back to the performance point you are
expecting. It would also be worth re-evaluating whether or not you should be
disabling enable_mergehashjoin in general, and not just for specific problem
queries. I would also tend to start with an effective_cache at the higher end
on a dedicated DB server. Good luck with your tuning. If the 8.2 config file
you posted is the one that has been in use, these few changes will restore
your performance and then some.

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
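Spelled out in 8.2 configuration syntax, the 7.x column above translates roughly to the following (illustrative; the vacuum_mem figure in the table is ambiguous, so it is omitted):

    shared_buffers = 200MB            # 7.x: 25000 x 8kB buffers
    work_mem = 15MB                   # 7.x: sort_mem = 15000 (kB)
    effective_cache_size = 1536MB     # 7.x: 196608 x 8kB pages; 200k-500k pages is roughly 1.6-4GB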
Re: [PERFORM] PostgreSQL and Xeon MP
On Thu, Mar 16, 2006 at 11:45:12AM +0100, Guillaume Smet wrote: Hello, We are experiencing performances problem with a quad Xeon MP and PostgreSQL 7.4 for a year now. Our context switch rate is not so high but the load of the server is blocked to 4 even on very high load and we have 60% cpu idle even in this case. Our database fits in RAM and we don't have any IO problem. I saw this post from Tom Lane http://archives.postgresql.org/pgsql-performance/2004-04/msg00249.php and several other references to problem with Xeon MP and I suspect our problems are related to this. We tried to put our production load on a dual standard Xeon on monday and it performs far better with the same configuration parameters. I know that work has been done by Tom for PostgreSQL 8.1 on multiprocessor support but I didn't find any information on if it solves the problem with Xeon MP or not. My question is should we expect a resolution of our problem by switching to 8.1 or will we still have problems and should we consider a hardware change? We will try to upgrade next tuesday so we will have the real answer soon but if anyone has any experience or information on this, he will be very welcome. Thanks for your help. -- Guillaume Guillaume, We had a similar problem with poor performance on a Xeon DP and PostgreSQL 7.4.x. 8.0 came out in time for preliminary testing but it did not solve the problem and our production systems went live using a different database product. We are currently testing against 8.1.x and the seemingly bizarre lack of performance is gone. I would suspect that a quad-processor box would have the same issue. I would definitely recommend giving 8.1 a try. Ken ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PERFORM] Extremely irregular query performance
On Thu, Jan 12, 2006 at 09:48:41AM +, Simon Riggs wrote:
> On Wed, 2006-01-11 at 22:23 -0500, Tom Lane wrote:
>> Jean-Philippe Côté [EMAIL PROTECTED] writes:
>>> Thanks a lot for this info, I was indeed exceeding the genetic
>>> optimizer's threshold. Now that it is turned off, I get a very stable
>>> response time of 435ms (more or less 5ms) for the same query. It is
>>> about three times slower than the best I got with the genetic optimizer
>>> on, but the overall average is much lower.
>>
>> Hmm. It would be interesting to use EXPLAIN ANALYZE to confirm that the
>> plan found this way is the same as the best plan found by GEQO, and the
>> extra couple hundred msec is the price you pay for the exhaustive plan
>> search. If GEQO is managing to find a plan better than the regular
>> planner then we need to look into why ...
>
> It seems worth noting in the EXPLAIN whether GEQO has been used to find
> the plan, possibly along with other factors influencing the plan such as
> enable_* settings.

Is it the plan that is different in the fastest case with GEQO or is it the
time needed to plan that is causing the GEQO to beat the exhaustive search?

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?
       http://archives.postgresql.org
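The comparison Tom asks for can be made per-session with standard commands (the SELECT below is only a placeholder for the actual many-way join):

    SHOW geqo_threshold;          -- joins at or above this count use the genetic optimizer
    SET geqo = off;               -- force the exhaustive (standard) planner
    EXPLAIN ANALYZE SELECT ...;
    SET geqo = on;                -- genetic optimizer
    EXPLAIN ANALYZE SELECT ...;   -- compare both the plan shape and the total runtime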
Re: [PERFORM] Extremely irregular query performance
On Thu, Jan 12, 2006 at 03:23:14PM -0500, Jean-Philippe Cote wrote:
> Can I actually know whether a given plan is executed with GEQO on? In
> other words, if I launch 'explain query', I'll get a given plan, but if I
> re-launch the query (without the 'explain' keyword), could I get a
> different plan given that GEQO induces some randomness?
>
>> Is it the plan that is different in the fastest case with GEQO or is it
>> the time needed to plan that is causing the GEQO to beat the exhaustive
>> search?

GEQO will be used if the number of joins is over the GEQO limit in the
configuration file. The GEQO process is an iterative random process to find a
query plan. The EXPLAIN results are the plan for that query, but not
necessarily for subsequent runs. GEQO's advantage is a much faster plan time
than the exhaustive search method normally used. If the resulting plan time
is low enough, then for short queries the GEQO run can finish more quickly
overall than the exhaustive search would. Of course, if you PREPARE the query
the plan time drops out.

Ken

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
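A short illustration of Ken's closing point (statement name and parameter are hypothetical): with a prepared statement, whichever planner ran -- GEQO or exhaustive -- only runs once.

    PREPARE big_report(int) AS
        SELECT ...;               -- the expensive many-way join is planned here

    EXECUTE big_report(2005);     -- later executions reuse the stored plan,
    EXECUTE big_report(2006);     -- so planning time drops out of each run

    DEALLOCATE big_report;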
Re: [PERFORM] large table vs multiple smal tables
Nicolas,

These sizes would not be considered large. I would leave them as single
tables.

Ken

On Wed, Jul 13, 2005 at 12:08:54PM +0200, Nicolas Beaume wrote:
> Hello
>
> I have a large database with 4 large tables (each containing at least
> 200,000 rows, perhaps even 1 or 2 million) and I ask myself if it's better
> to split them into small tables (e.g. tables of 2000 rows) to speed up the
> access and the update of those tables (considering that I will have few
> updates but a lot of reading). Do you think it would be efficient?
>
> Nicolas, wondering if he hadn't been too greedy
>
> --
> "Be what you would like to seem to be" -- Lewis Carroll

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?
       http://www.postgresql.org/docs/faq
Re: [PERFORM] two queries and dual cpu (perplexed)
On Thu, Apr 21, 2005 at 08:24:15AM -0400, Jeff wrote: On Apr 21, 2005, at 7:49 AM, Shoaib Burq (VPAC) wrote: Now I have not touch the $PGDATA/postgresql.conf (As I know very little about memory tuning) Have run VACCUM ANALYZE. You should really, really bump up shared_buffers and given you have 8GB of ram this query would likely benefit from more work_mem. and the time taken is *twice* that for the original. The modification was minor. The queries do make use of both CPUs: Is this an IO intensive query? If running both in parellel results in 2x the run time and you have sufficient cpus it would (to me) indicate you don't have enough IO bandwidth to satisfy the query. I would add to Jeff's comments, that the default configuration parameters are fairly-to-very conservative which tends to produce plans with more I/O. Bumping your shared_buffers, work_mem, and effective_cache_size should allow the planner to favor plans that utilize more memory but require less I/O. Also, with small amounts of work_mem, hash joins cannot be used and the planner will resort to nested loops. Ken ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
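Ken's last point about work_mem and join methods is easy to check in a single session; the table names below are hypothetical and the values are only examples:

    SET work_mem = 1024;          -- kB; with this little memory the planner tends to
                                  -- avoid hash joins and fall back to nested loops
    EXPLAIN SELECT count(*) FROM big_a JOIN big_b USING (id);

    SET work_mem = 262144;        -- 256MB, for this session only
    EXPLAIN SELECT count(*) FROM big_a JOIN big_b USING (id);
                                  -- with room for the hash table the plan usually flips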
Re: [PERFORM] What about utility to calculate planner cost constants?
On Tue, Mar 22, 2005 at 08:09:40AM -0500, Christopher Browne wrote: Martha Stewart called it a Good Thing when [EMAIL PROTECTED] (Greg Stark) wrote: I don't think it would be very hard at all actually. It's just a linear algebra problem with a bunch of independent variables and a system of equations. Solving for values for all of them is a straightforward problem. Of course in reality these variables aren't actually independent because the costing model isn't perfect. But that wouldn't be a problem, it would just reduce the accuracy of the results. Are you certain it's a linear system? I'm not. If it was a matter of minimizing a linear expression subject to some set of linear equations, then we could model this as a Linear Program for which there are some perfectly good solvers available. (Few with BSD-style licenses, but we could probably get some insight out of running for a while with something that's there...) I think there's good reason to consider it to be distinctly NON-linear, which makes it way more challenging to solve the problem. Non-linear optimization works very well in many cases. Issues such as local minima can be addressed. In a sense, the planner output can be treated as a blackbox function and the goodness of the solution is how well it approximates the actual query times. In this case, it will be imperative to constrain some of the values to prevent crazy configurations. Ken ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
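In the black-box framing Ken describes, the quantity being minimized would look something like this (the notation is ours, not from the thread):

    error(c) = sum over sample queries i of ( k * est_cost_i(c) - runtime_i )^2

    c = (random_page_cost, cpu_tuple_cost, cpu_index_tuple_cost, cpu_operator_cost, ...)

where k converts planner cost units into milliseconds and est_cost_i(c) is the cost of whichever plan the planner picks for query i under the settings c. For a fixed plan that cost is linear in c, but the planner takes the minimum over many candidate plans, so est_cost_i(c) is only piecewise-linear and the overall fit is neither linear nor convex -- which is where the local minima Ken mentions come from, and why the variables need to be constrained to sensible ranges.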
Re: [PERFORM] Wierd context-switching issue on Xeon
On Wed, Apr 21, 2004 at 02:51:31PM -0400, Tom Lane wrote:
> The context swap storm is happening because of contention at the next
> level up (LWLocks rather than spinlocks). It could be an independent
> issue that just happens to be triggered by the same sort of access
> pattern. I put forward a hypothesis that the cache miss storm caused by
> the test-and-set ops induces the context swap storm by making the code
> more likely to be executing in certain places at certain times ... but
> it's only a hypothesis.

If the context swap storm derives from LWLock contention, maybe using a
random order to assign buffer locks in buf_init.c would prevent simple
adjacency of buffer allocation from causing the storm. Just offsetting the
assignment by the cache line size should work.

I notice that when initializing the buffers in shared memory, both the
buf->meta_data_lock and the buf->cntx_lock are immediately adjacent in
memory. I am not familiar enough with the flow through postgres to see if
there could be fighting for those two locks. If so, offsetting those by the
cache line size would also stop the context swap storm.

--Ken

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings
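A sketch of what Ken's suggestion amounts to in C (illustrative only: slock_t is a stand-in for the platform spinlock type and this is not PostgreSQL's real buffer header layout): pad the per-buffer locks so that meta_data_lock and cntx_lock -- and the locks of neighbouring buffers -- never share a cache line, removing the false sharing that turns spinlock traffic into a cache-miss storm.

    #include <stdio.h>

    typedef volatile int slock_t;    /* stand-in for the real spinlock type */
    #define CACHE_LINE_SIZE 64       /* common line size; adjust per platform */

    typedef struct
    {
        slock_t meta_data_lock;
        char    pad1[CACHE_LINE_SIZE - sizeof(slock_t)];
        slock_t cntx_lock;
        char    pad2[CACHE_LINE_SIZE - sizeof(slock_t)];
        /* ... remaining per-buffer fields would follow ... */
    } PaddedBufferHeader;

    int main(void)
    {
        /* Provided the shared array itself starts on a cache-line boundary,
         * each lock in every element now sits on its own cache line. */
        printf("padded header size: %d bytes\n", (int) sizeof(PaddedBufferHeader));
        return 0;
    }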