[PERFORM] Big IN() clauses etc : feature proposal
> The moral of the story is that you're probably better off running a bunch of small selects than trying to optimize things with one gargantuan select.

> Ever experiment with loading the parameters into a temp table and joining to that?

> Also, it might be worth re-testing that conclusion with PG CVS tip (or 8.2 when it comes out). The reimplementation of IN as = ANY that I did a couple months ago might well change the results.

Long mail, but I think it's interesting...

I think this is a generic problem, which is often encountered: selecting a bunch of records based on a list of primary keys (or another indexed, unique field), said list being anything from very short to quite large. Here are a few occurrences of this need:

1. The application supplies a list of ids (the case of the OP of this thread).

2. A query Q1 yields a list of selected objects that we wish to use in several subsequent queries, and Q1 is a query we don't wish to run several times, either because it's slow, complicated (advanced search, for instance), or because it acts on a constantly moving dataset, so the results would be different each time. So we store the result of Q1 in the application, or in a temp table, or in an array in a plpgsql variable, whatever, to reuse it. Then, for each of these objects, we will often make more queries to resolve foreign keys (get category name, owner name, from the categories and users tables, etc.).

I have encountered both cases quite often, and they both pose a few problems. I think it would be a good opportunity for a new feature (see below).

A typical use case for point 2: consider an objects table. Each object...
- is related to one or several rows from the categories table via an objects_categories link table;
- has an owner_id referencing the users table.

I do an advanced search query on objects, which returns a list of objects. I can join directly to users to get the owner's name, but joining to categories is already problematic because of the many-to-many relationship.

I wish to do this: fetch all objects matching the search criteria; fetch the owner users; fetch the categories; build in my application object space a clean and straightforward data representation for all this. Also:
- I do not wish to complicate the search query.
- The row estimates for the search query results are likely to be not so good (because it's a complex query), so the joins to users and categories are likely to use suboptimal plans based on those not-so-good estimates.
- The rows from objects are large, so moving them around through a lot of small joins hurts performance.

The obvious solution is this:

BEGIN;
CREATE TEMPORARY TABLE results ON COMMIT DROP AS SELECT * FROM <advanced search query>;
ANALYZE results;

-- get the results to the application
SELECT * FROM results;

-- get object owners' info
SELECT * FROM users WHERE id IN (SELECT user_id FROM results);

-- get category info
SELECT * FROM categories WHERE id IN
    (SELECT category_id FROM objects_to_categories WHERE object_id IN (SELECT id FROM results));

-- get object/category relations (the ORM will use this to link objects in the application)
SELECT * FROM objects_to_categories WHERE object_id IN (SELECT id FROM results);
COMMIT;

You might wonder why I do it this way on the categories table. This is because I use an object-relational mapper which will instantiate a User or Category class object for each row I fetch from these tables.
I do not want to fetch just the username using a simple join; I want the full object, because:
- I want to instantiate these objects (they have useful methods to process rights etc.);
- I do not want to mix columns from objects and users.

And I do not wish to instantiate each category more than once. This would waste memory, but more importantly, it is a lot cleaner to have only one instance per row, because my ORM then translates the foreign key relations into object relations (pointers). Each instantiated Category will contain a list of Object instances; each Object instance will contain a list of the categories it belongs to, and point to its owner User.

Back to the point: I can't use the temp table method, because temp tables are too slow. Creating a temp table, filling it, analyzing it and then dropping it takes about 100 ms. The search query, on average, takes 10 ms.

So I have to move this logic to the application, or to plpgsql, and jump through hoops and use big IN() clauses, which has the following drawbacks:
- slow
- ugly
- very hard for the ORM to auto-generate

*** Feature proposal:

A way to store query results in a named buffer and reuse them in the next queries. This should be as fast as possible, store results in RAM if possible, and be usable like a table in subsequent queries.
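To make the proposal concrete, here is a purely hypothetical sketch of how such a named buffer might be used. The STORE RESULTS / BUFFER syntax is invented for illustration and exists in no PostgreSQL version; it just shows the intended shape of the feature:

BEGIN;
-- run the expensive search once, keeping the rows in a RAM buffer (hypothetical syntax)
STORE RESULTS search_buf AS SELECT * FROM <advanced search query>;
-- reuse the buffered rows like a table, with an exact rowcount known to the planner
SELECT * FROM users WHERE id IN (SELECT user_id FROM BUFFER search_buf);
COMMIT;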
[PERFORM] Arguments Pro/Contra Software Raid
Hi, I've just had some discussion with colleagues regarding the usage of hardware or software RAID 1/10 for our Linux-based database servers. I myself can't see much reason to spend $500 on high-end controller cards for a simple RAID 1. Any arguments pro or contra would be appreciated. From my experience and what I've read here:

+ Hardware RAIDs might be a bit easier to manage, if you have never spent a few hours learning the software RAID tools.

+ There are situations in which software RAIDs are faster, as CPU power has advanced dramatically in the last years and even high-end controller cards cannot keep up with that.

+ Using SATA drives is always a bit of a risk, as some drives lie about whether they are caching or not.

+ Using hardware controllers, the array becomes locked to a particular vendor. You can't switch controller vendors, as the array meta information is stored in a proprietary format. In case the RAID is broken to a degree the controller can't recover automatically, this might complicate manual recovery by specialists.

+ Even battery-backed controllers can't guarantee that data written to the drives is consistent after a power outage, nor that the drive did not corrupt something during the involuntary shutdown / power irregularities. (This is theoretical, as any server will be UPS-backed.)

-- Regards, Hannes Dorbath
Re: [PERFORM] Big IN() clauses etc : feature proposal
> You might consider just selecting your primary key or a set of primary keys to involved relations in your search query. If you currently use select * this can make your result set very large. Copying all the result set to the temp costs you additional IO that you probably don't need.

It is a bit of a catch: I need this information, because the purpose of the query is to retrieve these objects. I can first store the ids, then retrieve the objects, but it's one more query.

> Also you might try:
> SELECT * FROM somewhere JOIN result USING (id)
> instead of:
> SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)

Yes, you're right in this case; however the query to retrieve the owners needs to eliminate duplicates, which IN() does.

> On the other hand if your search query runs in 10 ms it seems to be fast enough for you to run it multiple times. There's probably no point in optimizing anything in such a case.

I don't think so:
- 10 ms is a mean time; sometimes it can take much more time, sometimes it's faster.
- Repeating the query might yield different results if records were added or deleted in the meantime.
- Complex search queries have imprecise rowcount estimates; hence the joins that I would add to them will get suboptimal plans.

Using a temp table is really the cleanest solution now; but it's too slow, so I reverted to generating big IN() clauses in the application.
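For reference, the two forms being compared here, plus a JOIN variant that also eliminates duplicates (the table names are the placeholders used in this thread):

-- IN() deduplicates the subquery's ids implicitly
SELECT * FROM users WHERE id IN (SELECT user_id FROM results);

-- a plain join would return one row per duplicate id, so deduplicate first
SELECT u.*
FROM users u
JOIN (SELECT DISTINCT user_id FROM results) r ON r.user_id = u.id;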
Re: [PERFORM] Big IN() clauses etc : feature proposal
Hi, On Tue, 9 May 2006, PFC wrote:

<snip>

> Back to the point: I can't use the temp table method, because temp tables are too slow. Creating a temp table, filling it, analyzing it and then dropping it takes about 100 ms. The search query, on average, takes 10 ms.

Just some thoughts:

You might consider just selecting your primary key or a set of primary keys to involved relations in your search query. If you currently use select * this can make your result set very large. Copying all the result set to the temp costs you additional IO that you probably don't need.

Also you might try:

SELECT * FROM somewhere JOIN result USING (id)

instead of:

SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)

Joins should be a lot faster than large IN clauses. Here it will also help if result only contains the primary keys and not all the other data. The join will be much faster.

On the other hand if your search query runs in 10 ms it seems to be fast enough for you to run it multiple times. There's probably no point in optimizing anything in such a case.

Greetings Christian

-- Christian Kratzer [EMAIL PROTECTED]
CK Software GmbH  http://www.cksoft.de/
Phone: +49 7452 889 135  Fax: +49 7452 889 136
Re: [PERFORM] Big IN() clauses etc : feature proposal
Hi, On Tue, 9 May 2006, PFC wrote:

> > You might consider just selecting your primary key or a set of primary keys to involved relations in your search query. If you currently use select * this can make your result set very large. Copying all the result set to the temp costs you additional IO that you probably don't need.
> It is a bit of a catch: I need this information, because the purpose of the query is to retrieve these objects. I can first store the ids, then retrieve the objects, but it's one more query.

yes, but depending on what you really need that can be faster. Additionally to your query you are already transferring the whole result set multiple times. First you copy it to the result table. Then you read it again. Your subsequent queries will also have to read over all the unneeded tuples just to get your primary key.

> > Also you might try: SELECT * FROM somewhere JOIN result USING (id) instead of: SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)
> Yes, you're right in this case; however the query to retrieve the owners needs to eliminate duplicates, which IN() does.

then why useth thy not the DISTINCT clause when building thy result table, and thou shalt have no duplicates.

> > On the other hand if your search query runs in 10 ms it seems to be fast enough for you to run it multiple times. There's probably no point in optimizing anything in such a case.
> I don't think so:
> - 10 ms is a mean time; sometimes it can take much more time, sometimes it's faster.
> - Repeating the query might yield different results if records were added or deleted in the meantime.

which is a perfect reason to use a temp table. Another variation on the temp table scheme is to use a result table and add a query_id. We do something like this in our web application when users submit complex queries. For each query we store tuples of (query_id, result_id) in a result table. It's then easy for the web application to page the result set.

> - Complex search queries have imprecise rowcount estimates; hence the joins that I would add to them will get suboptimal plans.
> Using a temp table is really the cleanest solution now; but it's too slow so I reverted to generating big IN() clauses in the application.

A cleaner solution usually pays off in the long run, whereas a hackish or overly complex solution will bite you in the behind for sure as time goes by.

Greetings Christian

-- Christian Kratzer [EMAIL PROTECTED]
CK Software GmbH  http://www.cksoft.de/
Phone: +49 7452 889 135  Fax: +49 7452 889 136
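A minimal sketch of the (query_id, result_id) pattern Christian describes; the table name, columns and the query_id value are illustrative, not taken from his application:

-- one row per (search execution, matching object)
CREATE TABLE query_results (
    query_id  integer NOT NULL,
    result_id integer NOT NULL,
    PRIMARY KEY (query_id, result_id)
);

-- store the ids produced by one run of the complex search (query_id 42 is made up)
INSERT INTO query_results (query_id, result_id)
SELECT 42, id FROM <complex search query>;

-- page through the stored result set cheaply
SELECT o.*
FROM query_results r
JOIN objects o ON o.id = r.result_id
WHERE r.query_id = 42
ORDER BY o.id
LIMIT 20 OFFSET 40;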
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
> Additionally to your query you are already transferring the whole result set multiple times. First you copy it to the result table. Then you read it again. Your subsequent queries will also have to read over all the unneeded tuples just to get your primary key.

Considering that the result set is not very large and will be cached in RAM, this shouldn't be a problem.

> then why useth thy not the DISTINCT clause when building thy result table and thou shalt have no duplicates.

Because the result table contains no duplicates ;) I need to remove duplicates in this type of query:

-- get object owners' info
SELECT * FROM users WHERE id IN (SELECT user_id FROM results);

And in this case I find IN() easier to read than DISTINCT (what I posted was a simplification of my real use case...).

> which is a perfect reason to use a temp table. Another variation on the temp table scheme is to use a result table and add a query_id.

True. Doesn't solve my problem though: it's still complex, doesn't have good rowcount estimation, bloats a table (I only need these records for the duration of the transaction), etc.

> We do something like this in our web application when users submit complex queries. For each query we store tuples of (query_id, result_id) in a result table. It's then easy for the web application to page the result set.

Yes, that is about the only sane way to page big result sets.

> A cleaner solution usually pays off in the long run whereas a hackish or overly complex solution will bite you in the behind for sure as time goes by.

Yes, but in this case temp tables add too much overhead. I wish there were RAM-based temp tables like in MySQL. However I guess the current temp table slowness comes from the need to mark their existence in the system catalogs or something. That's why I proposed using cursors...
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
On Tue, May 09, 2006 at 12:10:37PM +0200, PFC wrote:
> Yes, but in this case temp tables add too much overhead. I wish there were RAM-based temp tables like in MySQL. However I guess the current temp table slowness comes from the need to mark their existence in the system catalogs or something. That's why I proposed using cursors...

It would be interesting to know what the bottleneck is for temp tables for you. They do not go via the buffer-cache, they are stored in private memory in the backend, they are not xlogged. Nor flushed to disk on backend exit. They're about as close to in-memory tables as you're going to get...

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
From each according to his ability. To each according to his ability to litigate.
Re: [PERFORM] Arguments Pro/Contra Software Raid
Hannes Dorbath wrote:
> Hi, I've just had some discussion with colleagues regarding the usage of hardware or software RAID 1/10 for our Linux-based database servers. I myself can't see much reason to spend $500 on high-end controller cards for a simple RAID 1. Any arguments pro or contra would be appreciated.

One pro and one con off the top of my head.

Hotplug. Depending on your platform, SATA may or may not be hotpluggable (I know AHCI mode is the only one promising some kind of hotplug, which means ICH6+ and Silicon Image controllers, last I heard). SCSI isn't hotpluggable without the use of special hotplug backplanes and disks. You lose that in software RAID, which effectively means you need to shut the box down and do maintenance. Hassle.

CPU. It's cheap. Much cheaper than your average hardware RAID card. For the 5-10% overhead usually imposed by software RAID, you can throw in a faster CPU and never even notice it. Most cases aren't CPU-bound anyway, or at least most cases are I/O-bound for the better part. This does raise the question of the I/O bandwidth your standard SATA or SCSI controller comes with, though. If you're careful about that and handle hotplug sufficiently, you're probably never going to notice you're not running on metal.

Kind regards,
-- Grega Bremec gregab at p0f dot net
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
> It would be interesting to know what the bottleneck is for temp tables for you. They do not go via the buffer-cache, they are stored in private memory in the backend, they are not xlogged. Nor flushed to disk on backend exit. They're about as close to in-memory tables as you're going to get...

Hum...

Timings are a mean over 100 queries, including roundtrip to localhost, via a Python script.

0.038 ms BEGIN
0.057 ms SELECT 1
0.061 ms COMMIT

0.041 ms BEGIN
0.321 ms SELECT count(*) FROM bookmarks
0.080 ms COMMIT

(this test table contains about 250 rows)

0.038 ms BEGIN
0.378 ms SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20
0.082 ms COMMIT

(the ORDER BY uses an index)

0.042 ms BEGIN
0.153 ms DECLARE tmp SCROLL CURSOR WITHOUT HOLD FOR SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20
0.246 ms FETCH ALL FROM tmp
0.048 ms MOVE FIRST IN tmp
0.246 ms FETCH ALL FROM tmp
0.048 ms CLOSE tmp
0.084 ms COMMIT

(the CURSOR is about as fast as a simple query)

0.101 ms BEGIN
1.451 ms CREATE TEMPORARY TABLE tmp ( a INTEGER NOT NULL, b INTEGER NOT NULL, c TIMESTAMP NOT NULL, d INTEGER NOT NULL ) ON COMMIT DROP
0.450 ms INSERT INTO tmp SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20
0.443 ms ANALYZE tmp
0.365 ms SELECT * FROM tmp
0.310 ms DROP TABLE tmp
32.918 ms COMMIT

CREATING the table is OK, but what happens on COMMIT? I hear the disk seeking frantically. With fsync=off, I get this:

0.090 ms BEGIN
1.103 ms CREATE TEMPORARY TABLE tmp ( a INTEGER NOT NULL, b INTEGER NOT NULL, c TIMESTAMP NOT NULL, d INTEGER NOT NULL ) ON COMMIT DROP
0.439 ms INSERT INTO tmp SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20
0.528 ms ANALYZE tmp
0.364 ms SELECT * FROM tmp
0.313 ms DROP TABLE tmp
0.688 ms COMMIT

Getting closer? I'm betting on system catalog updates. I get the same timings with ROLLBACK instead of COMMIT. Temp tables have a row in pg_class...

Another temporary table wart:

BEGIN;
CREATE TEMPORARY TABLE tmp ( a INTEGER NOT NULL, b INTEGER NOT NULL, c TIMESTAMP NOT NULL, d INTEGER NOT NULL ) ON COMMIT DROP;
INSERT INTO tmp SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20;

EXPLAIN ANALYZE SELECT * FROM tmp;
 Seq Scan on tmp (cost=0.00..25.10 rows=1510 width=20) (actual time=0.003..0.006 rows=20 loops=1)
 Total runtime: 0.030 ms
(2 lignes)

ANALYZE tmp;
EXPLAIN ANALYZE SELECT * FROM tmp;
 Seq Scan on tmp (cost=0.00..1.20 rows=20 width=20) (actual time=0.003..0.008 rows=20 loops=1)
 Total runtime: 0.031 ms

We see that the temp table has a very wrong estimated rowcount until it has been ANALYZEd. However, temporary tables do not support concurrent access (obviously); and in the case of on-commit-drop tables, inserts can't be rolled back (obviously), so an accurate rowcount could be maintained via a simple counter...
Re: [PERFORM] Memory and/or cache issues?
Ok, thank you all again for your help in this matter. Yes, Michael, I (the original poster) did say, or "imply" I guess is a better word for it, that a combo of training and hands-on is the best way for one to learn PostgreSQL or just about anything, for that matter. Thank you for recognizing the true intention of my statements. One does need some sort of basis from which to grow. I will say that nothing can replace the hands-on, real-world training one can get in this business, as it is the best way to learn and remember. Just my opinion.

For example, I stated I was a SysAdmin for 20 years. I was then thrust into the Oracle world as a DBA about 2 years ago while still maintaining my SysAdmin responsibilities. I have yet to receive any formal Oracle training and have had to learn that on my own via manuals, Google searches and begging the Oracle Database Architect here for assistance. However, with PostgreSQL I initially started down the very same track but was fortunate enough to receive the OK for that week-long PG boot camp. Although I didn't take all that much away from the boot camp, it did provide an excellent base from which I continue to grow as a PG DBA, and it has helped me to understand postgres a lot easier and quicker than Oracle.

So please, let's just not throw emails back and forth amongst the group. Since joining I have found the group as a whole to be a great resource of information and PG knowledge and do not want us to get testy with each other over something I said or someone's interpretation of what I said. Case closed.

BTW - I am still working towards getting the knowledge out here about what I learned from the posts, mainly that the buffers/cache row of information from the free command is the one we need most be concerned with.

Thank you, Tim McElroy

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Michael Stone
Sent: Monday, May 08, 2006 5:17 PM
To: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Memory and/or cache issues?

On Mon, May 08, 2006 at 03:38:23PM -0400, Vivek Khera wrote:
> On May 8, 2006, at 1:30 PM, Jim C. Nasby wrote:
> > Yeah, I prefer my surgeons to work this way too. Training is for the birds.
> I think you read too quickly past the part where Tim said he'd taken a week-long training class.

s/training/apprenticeship/g;

Of course, the original poster did say that hands-on was the best way to learn. What is apprenticeship but a combination of training and experience? Are you just sniping for fun?

Mike Stone
Re: [PERFORM] Big IN() clauses etc : feature proposal
PFC [EMAIL PROTECTED] writes:
> Feature proposal: a way to store query results in a named buffer and reuse them in the next queries.

Why not just fix the speed issues you're complaining about with temp tables? I see no reason to invent a new concept. (Now, "just fix" might be easier said than done, but inventing an essentially duplicate facility would be a lot of work too.)

regards, tom lane
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
On May 9, 2006, at 2:16 AM, Hannes Dorbath wrote:
> Hi, I've just had some discussion with colleagues regarding the usage of hardware or software RAID 1/10 for our Linux-based database servers. I myself can't see much reason to spend $500 on high-end controller cards for a simple RAID 1. Any arguments pro or contra would be appreciated. From my experience and what I've read here:
> + Hardware RAIDs might be a bit easier to manage, if you have never spent a few hours learning the software RAID tools.
> + There are situations in which software RAIDs are faster, as CPU power has advanced dramatically in the last years and even high-end controller cards cannot keep up with that.
> + Using SATA drives is always a bit of a risk, as some drives lie about whether they are caching or not.

Don't buy those drives. That's unrelated to whether you use hardware or software RAID.

> + Using hardware controllers, the array becomes locked to a particular vendor. You can't switch controller vendors, as the array meta information is stored in a proprietary format. In case the RAID is broken to a degree the controller can't recover automatically, this might complicate manual recovery by specialists.

Yes. Fortunately we're using the RAID for database work, rather than file storage, so we can use all the nice postgresql features for backing up and replicating the data elsewhere, which avoids most of this issue.

> + Even battery-backed controllers can't guarantee that data written to the drives is consistent after a power outage, nor that the drive did not corrupt something during the involuntary shutdown / power irregularities. (This is theoretical, as any server will be UPS-backed.)

fsync of the WAL log. If you have a battery-backed writeback cache then you can get the reliability of fsyncing the WAL for every transaction, and the performance of not needing to hit the disk for every transaction.

Also, if you're not doing that, you'll need to dedicate a pair of spindles to the WAL log if you want to get good performance, so that there'll be no seeking on the WAL. With a writeback cache you can put the WAL on the same spindles as the database and not lose much, if anything, in the way of performance. If that saves you the cost of two additional spindles, and the space on your drive shelf for them, you've just paid for a reasonably priced RAID controller.

Given those advantages... I can't imagine speccing a large system that didn't have a battery-backed write-back cache in it. My dev systems mostly use software RAID, if they use RAID at all. But my production boxes all use SATA RAID (and I tell my customers to use controllers with BB cache, whether it be SCSI or SATA). My usual workloads are write-heavy. If yours are read-heavy that will move the sweet spot around significantly, and I can easily imagine that for a read-heavy load software RAID might be a much better match.

Cheers, Steve
Re: [PERFORM] [HACKERS] Big IN() clauses etc : feature proposal
PFC [EMAIL PROTECTED] writes:
> I really like this. It's clean, efficient, and easy to use. This would be a lot faster than using temp tables. Creating cursors is very fast so we can create two, and avoid doing twice the same work (ie. hashing the ids from the results to grab categories only once).

Creating cursors for a simple plan like a single sequential scan is fast because it's using the original data from the table. But your example was predicated on this part of the job being a complex query. If it's a complex query involving joins and groupings, etc, then it will have to be materialized, and there's no (good) reason for that to be any faster than a temporary table, which is effectively the same thing.

-- greg
Re: [PERFORM] Arguments Pro/Contra Software Raid
On Tue, May 09, 2006 at 12:10:32 +0200, Jean-Yves F. Barbier [EMAIL PROTECTED] wrote:
> Naa, you can find ATA | SATA ctrlrs for about EUR 30!

But those are the ones that you would generally be better off not using.

> Definitely NOT, however if your server doesn't have a heavy load, the software overhead can't be noticed (essentially cache managing and syncing).

It is fairly common for database machines to be IO-bound rather than CPU-bound, and so the CPU impact of software RAID is low.

> Some hardware ctrlrs are able to avoid the loss of a disk if you turn out to have some faulty sectors (by relocating them internally); software RAID doesn't, as sectors *must* be @ the same (linear) addresses.

That is not true. Software RAID works just fine on drives that have internally remapped sectors.
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
> Don't buy those drives. That's unrelated to whether you use hardware or software RAID.

Sorry, that is an extremely misleading statement. SATA RAID is perfectly acceptable if you have a hardware RAID controller with a battery backup controller. And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.

Sincerely, Joshua D. Drake

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Re: [PERFORM] [HACKERS] Big IN() clauses etc : feature proposal
> Creating cursors for a simple plan like a single sequential scan is fast because it's using the original data from the table.

I used the following query:

SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20

It's a backward index scan + limit... not a seq scan. And it's damn fast:

0.042 ms BEGIN
0.153 ms DECLARE tmp SCROLL CURSOR WITHOUT HOLD FOR SELECT * FROM bookmarks ORDER BY annonce_id DESC LIMIT 20
0.246 ms FETCH ALL FROM tmp
0.048 ms MOVE FIRST IN tmp
0.246 ms FETCH ALL FROM tmp
0.048 ms CLOSE tmp
0.084 ms COMMIT

> But your example was predicated on this part of the job being a complex query. If it's a complex query involving joins and groupings, etc, then it will have to be materialized and there's no (good) reason for that to be any faster than a temporary table which is effectively the same thing.

You mean the cursors' storage is in fact the same internal machinery as a temporary table? In that case, this raises an interesting question: why is the cursor faster?

Let's try a real-life example from my website: it is a search query (quite complex) which is then joined to a lot of tables to resolve FKeys. To that query I must add an application-made join using a big IN() clause extracted from the data. Timings include the time to fetch the results into Python. The running total column is the sum of all timings since the BEGIN.

query_time   running_total   rows   query
0.061 ms     0.061 ms        -1     BEGIN
23.420 ms    23.481 ms       85     SELECT * FROM (huge query with a lot of joins)
4.318 ms     27.799 ms       2      SELECT l.*, u.login, u.bg_color FROM annonces_log l, users u WHERE u.id=l.user_id AND l.annonce_id IN (list of ids from previous query) ORDER BY annonce_id, added
0.241 ms     28.040 ms       -1     COMMIT

(Just in case you want to hurt yourself, here's the EXPLAIN ANALYZE output: http://peufeu.com/temp/big_explain.txt)

Using a cursor takes about the same time. Also, doing just the search query takes about 12 ms; the joins take up the rest.

Now, I'll rewrite my query, eliminating the joins and using a temp table. Storing the whole result in the temp table would be too slow, because there are too many columns. Therefore I will only store the primary and foreign key columns, and join again to the main table to get the full records.
query_time   running_total   rows   query
0.141 ms     0.141 ms        -1     BEGIN

Do the search:
8.229 ms     8.370 ms        -1     CREATE TEMPORARY TABLE tmp AS SELECT id, city_id, zipcode, contact_id, contact_group_id, price/terrain as sort FROM (stripped down search query)
0.918 ms     9.287 ms        -1     ANALYZE tmp

Fetch the main data to display:
7.663 ms     16.951 ms       85     SELECT a.* FROM tmp t, annonces_display a WHERE a.id=t.id ORDER BY t.sort

Fetch log entries associated with each row (one row to many log entries):
1.021 ms     17.972 ms       2      SELECT l.*, u.login, u.bg_color FROM annonces_log l, users u, tmp t WHERE u.id=l.user_id AND l.annonce_id = t.id ORDER BY annonce_id, added
3.468 ms     21.440 ms       216    SELECT annonce_id, array_accum(list_id) AS list_ids, array_accum(COALESCE(user_id,0)) AS list_added_by, max(added) AS added_to_list FROM bookmarks GROUP BY annonce_id

Resolve foreign key relations:
1.034 ms     22.474 ms       37     SELECT r.annonce_id FROM read_annonces r, tmp t WHERE r.annonce_id = t.id
0.592 ms     23.066 ms       9      SELECT * FROM cities_dist_zipcode WHERE zipcode IN (SELECT zipcode FROM tmp)
0.716 ms     23.782 ms       11     SELECT * FROM cities_dist WHERE id IN (SELECT city_id FROM tmp)
1.125 ms     24.907 ms       45     SELECT * FROM contacts WHERE id IN (SELECT contact_id FROM tmp)
0.799 ms     25.705 ms       42     SELECT * FROM contact_groups WHERE id IN (SELECT contact_group_id FROM tmp)
0.463 ms     26.169 ms       -1     DROP TABLE tmp
32.208 ms    58.377 ms       -1     COMMIT

From this we see: using a temporary table is FASTER than doing the large query with all the joins (26 ms versus 28 ms). It's also nicer and cleaner. However the COMMIT takes as much time as all the queries together!

Let's run with fsync=off:

query_time   running_total   rows   query
0.109 ms     0.109 ms        -1     BEGIN
8.321 ms     8.430 ms        -1     CREATE TEMPORARY TABLE tmp AS SELECT id, city_id, zipcode, contact_id, contact_group_id, price/terrain as sort FROM (stripped down search query)
0.849 ms     9.280 ms        -1     ANALYZE tmp
7.360 ms     16.640 ms       85     SELECT a.* FROM tmp t, annonces_display a WHERE a.id=t.id ORDER BY t.sort
1.067 ms     17.707 ms       2      SELECT l.*, u.login, u.bg_color FROM annonces_log l, users u,
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
> Does the time for commit change much if you leave out the analyze?

Yes: when I don't ANALYZE the temp table, commit time changes from 30 ms to about 15 ms; but the queries get horrible plans (see below).

Fun thing is, the rowcount from a temp table (which is the problem here) should be available without ANALYZE; as the temp table is not concurrent, it would be simple to inc/decrement a counter on INSERT/DELETE...

I like the temp table approach: it can replace a large, complex query with a batch of smaller and easier-to-optimize queries...

EXPLAIN ANALYZE SELECT a.* FROM tmp t, annonces_display a WHERE a.id=t.id ORDER BY t.sort;
 Sort (cost=3689.88..3693.15 rows=1310 width=940) (actual time=62.327..62.332 rows=85 loops=1)
   Sort Key: t.sort
   -> Merge Join (cost=90.93..3622.05 rows=1310 width=940) (actual time=5.595..61.373 rows=85 loops=1)
        Merge Cond: ("outer".id = "inner".id)
        -> Index Scan using annonces_pkey on annonces (cost=0.00..3451.39 rows=10933 width=932) (actual time=0.012..6.620 rows=10916 loops=1)
        -> Sort (cost=90.93..94.20 rows=1310 width=12) (actual time=0.098..0.105 rows=85 loops=1)
             Sort Key: t.id
             -> Seq Scan on tmp t (cost=0.00..23.10 rows=1310 width=12) (actual time=0.004..0.037 rows=85 loops=1)
 Total runtime: 62.593 ms

EXPLAIN ANALYZE SELECT * FROM contacts WHERE id IN (SELECT contact_id FROM tmp);
 Hash Join (cost=28.88..427.82 rows=200 width=336) (actual time=0.156..5.019 rows=45 loops=1)
   Hash Cond: ("outer".id = "inner".contact_id)
   -> Seq Scan on contacts (cost=0.00..349.96 rows=9396 width=336) (actual time=0.009..3.373 rows=9396 loops=1)
   -> Hash (cost=28.38..28.38 rows=200 width=4) (actual time=0.082..0.082 rows=46 loops=1)
        -> HashAggregate (cost=26.38..28.38 rows=200 width=4) (actual time=0.053..0.064 rows=46 loops=1)
             -> Seq Scan on tmp (cost=0.00..23.10 rows=1310 width=4) (actual time=0.001..0.015 rows=85 loops=1)
 Total runtime: 5.092 ms

ANALYZE tmp;

EXPLAIN ANALYZE SELECT a.* FROM tmp t, annonces_display a WHERE a.id=t.id ORDER BY t.sort;
 Sort (cost=508.63..508.84 rows=85 width=940) (actual time=1.830..1.832 rows=85 loops=1)
   Sort Key: t.sort
   -> Nested Loop (cost=0.00..505.91 rows=85 width=940) (actual time=0.040..1.188 rows=85 loops=1)
        -> Seq Scan on tmp t (cost=0.00..1.85 rows=85 width=12) (actual time=0.003..0.029 rows=85 loops=1)
        -> Index Scan using annonces_pkey on annonces (cost=0.00..5.89 rows=1 width=932) (actual time=0.003..0.004 rows=1 loops=85)
             Index Cond: (annonces.id = "outer".id)
 Total runtime: 2.053 ms
(7 lignes)

EXPLAIN ANALYZE SELECT * FROM contacts WHERE id IN (SELECT contact_id FROM tmp);
 Nested Loop (cost=2.06..139.98 rows=36 width=336) (actual time=0.072..0.274 rows=45 loops=1)
   -> HashAggregate (cost=2.06..2.51 rows=45 width=4) (actual time=0.052..0.065 rows=46 loops=1)
        -> Seq Scan on tmp (cost=0.00..1.85 rows=85 width=4) (actual time=0.003..0.016 rows=85 loops=1)
   -> Index Scan using contacts_pkey on contacts (cost=0.00..3.04 rows=1 width=336) (actual time=0.003..0.004 rows=1 loops=46)
        Index Cond: (contacts.id = "outer".contact_id)
 Total runtime: 0.341 ms
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
On 5/9/06, PFC [EMAIL PROTECTED] wrote:
> > You might consider just selecting your primary key or a set of primary keys to involved relations in your search query. If you currently use select * this can make your result set very large. Copying all the result set to the temp costs you additional IO that you probably don't need.
> It is a bit of a catch: I need this information, because the purpose of the query is to retrieve these objects. I can first store the ids, then retrieve the objects, but it's one more query.
> > Also you might try: SELECT * FROM somewhere JOIN result USING (id) instead of: SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)
> Yes, you're right in this case; however the query to retrieve the owners needs to eliminate duplicates, which IN() does.

Well, you can either:

SELECT * FROM somewhere JOIN (SELECT id FROM result GROUP BY id) AS a USING (id);

or even, for a large number of ids:

CREATE TEMPORARY TABLE result_ids AS SELECT id FROM result GROUP BY id;
SELECT * FROM somewhere JOIN result_ids USING (id);

> > On the other hand if your search query runs in 10 ms it seems to be fast enough for you to run it multiple times. There's probably no point in optimizing anything in such a case.
> I don't think so:
> - 10 ms is a mean time; sometimes it can take much more time, sometimes it's faster.
> - Repeating the query might yield different results if records were added or deleted in the meantime.

You may SET TRANSACTION ISOLATION LEVEL SERIALIZABLE; though locking might bite you. :)

> - Complex search queries have imprecise rowcount estimates; hence the joins that I would add to them will get suboptimal plans.
> Using a temp table is really the cleanest solution now; but it's too slow so I reverted to generating big IN() clauses in the application.

A thought, haven't checked it though, but... You might want to use PL to store values, say PL/Perl, or even C, say:

create or replace function perl_store(name text, val int) returns void as $$
    my $name = shift;
    push @{ $_SHARED{$name} }, shift;  # %_SHARED persists for the session
    return;
$$ LANGUAGE plperl;

select perl_store('someids', id) from something group by id;
(you may need to wrap it inside count())

Then use it:

create or replace function perl_retr(name text) returns setof int as $$
    my $name = shift;
    return $_SHARED{$name};
$$ LANGUAGE plperl;

select * from someother join perl_retr('someids') AS a(id) using (id);

All is in the memory. Of course, you need to do some cleanup, test it, etc, etc, etc. :)

Should work faster than an in-application solution :)

Regards, Dawid
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
> > SELECT * FROM somewhere WHERE id IN (SELECT id FROM result)
> Well, you can either: SELECT * FROM somewhere JOIN (SELECT id FROM result GROUP BY id) AS a USING (id);

It's the same thing (and postgres knows it).

> You might want to use PL to store values, say PL/Perl, or even C, say:

I tried. The problem is that you need a set-returning function to retrieve the values. SRFs don't have rowcount estimates, so the plans suck.

> Should work faster than an in-application solution :)

Should, but doesn't, because of what I said above... With the version in CVS tip, supporting a fast =ANY( array ), this should be doable, though.
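For reference, a minimal sketch of the =ANY(array) form mentioned here; this is real PostgreSQL syntax, but the table and ids are placeholders:

-- literal array of ids
SELECT * FROM users WHERE id = ANY (ARRAY[1, 2, 3, 5, 8]);

-- or, from application code, the whole list can travel as a single parameter
SELECT * FROM users WHERE id = ANY ($1::int[]);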
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
On May 9, 2006, at 8:51 AM, Joshua D. Drake wrote:
> > > Using SATA drives is always a bit of a risk, as some drives lie about whether they are caching or not.
> > Don't buy those drives. That's unrelated to whether you use hardware or software RAID.
> Sorry, that is an extremely misleading statement. SATA RAID is perfectly acceptable if you have a hardware RAID controller with a battery backup controller.

If the drive says it's hit the disk and it hasn't, then the RAID controller will have flushed the data from its cache (or flagged it as correctly written). At that point the only place the data is stored is in the non-battery-backed cache on the drive itself. If something fails then you'll have lost data. You're not suggesting that a hardware RAID controller will protect you against drives that lie about sync, are you?

> And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.

Yup. That's why I use SATA RAID for all my databases.

Cheers, Steve
Re: [PERFORM] Arguments Pro/Contra Software Raid
On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
> Sorry, that is an extremely misleading statement. SATA RAID is perfectly acceptable if you have a hardware RAID controller with a battery backup controller. And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.

Does this hold true still under heavy concurrent-write loads? I'm preparing yet another big DB server, and if SATA is a better option, I'm all (elephant) ears.
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
Vivek Khera [EMAIL PROTECTED] writes:
> On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
> > And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.
> Does this hold true still under heavy concurrent-write loads? I'm preparing yet another big DB server, and if SATA is a better option, I'm all (elephant) ears.

Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

-Doug
Re: [PERFORM] Arguments Pro/Contra Software Raid
Vivek Khera wrote:
> On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
> > Sorry, that is an extremely misleading statement. SATA RAID is perfectly acceptable if you have a hardware RAID controller with a battery backup controller. And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.
> Does this hold true still under heavy concurrent-write loads? I'm preparing yet another big DB server, and if SATA is a better option, I'm all (elephant) ears.

I didn't say better :). If you can afford it, SCSI is the way to go. However, SATA with a good controller (I am fond of the LSI 150 series) can provide some great performance. I have not used, but have heard good things about, Areca as well. Oh, and make sure they are SATA-II drives.

Sincerely, Joshua D. Drake

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
> You're not suggesting that a hardware RAID controller will protect you against drives that lie about sync, are you?

Of course not, but which drives lie about sync that are SATA? Or more specifically SATA-II?

Sincerely, Joshua D. Drake

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote:
> > You're not suggesting that a hardware RAID controller will protect you against drives that lie about sync, are you?
> Of course not, but which drives lie about sync that are SATA? Or more specifically SATA-II?

SATA-II, none that I'm aware of, but there's a long history of dodgy behaviour designed to pump up benchmark results down in the consumer drive space, and the low-end consumer space is where a lot of SATA drives are. I wouldn't be surprised to see that behaviour there still.

I was responding to the original poster's assertion that drives lying about sync were a reason not to buy SATA drives, by telling him not to buy drives that lie about sync. You seem to have read this as "don't buy SATA drives", which is not what I said and not what I meant.

Cheers, Steve
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
Douglas McNaught wrote:
> Vivek Khera [EMAIL PROTECTED] writes:
> > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
> > > And dollar for dollar, SCSI will NOT be faster nor have the hard drive capacity that you will get with SATA.
> > Does this hold true still under heavy concurrent-write loads? I'm preparing yet another big DB server, and if SATA is a better option, I'm all (elephant) ears.
> Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

Best I have seen is 10k, but if I can put 4x the number of drives in the array at the same cost... I don't need 15k.

Joshua D. Drake

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
On Tue, 2006-05-09 at 12:52, Steve Atkins wrote:
> If the drive says it's hit the disk and it hasn't, then the RAID controller will have flushed the data from its cache (or flagged it as correctly written). At that point the only place the data is stored is in the non-battery-backed cache on the drive itself. If something fails then you'll have lost data. You're not suggesting that a hardware RAID controller will protect you against drives that lie about sync, are you?

Actually, in the case of the Escalades at least, the answer is yes. Last year (maybe a bit more) someone was testing an IDE Escalade controller with drives that were known to lie, and it passed the power-plug-pull test repeatedly. Apparently, the Escalades tell the drives to turn off their cache. While most all IDEs and a fair number of SATA drives lie about cache fsyncing, they all seem to turn off the cache when you ask. And, since a hardware RAID controller with BBU cache has its own cache, it's not like it really needs the one on the drives anyway.
Re: [HACKERS] [PERFORM] Big IN() clauses etc : feature proposal
PFC [EMAIL PROTECTED] writes:
> Fun thing is, the rowcount from a temp table (which is the problem here) should be available without ANALYZE; as the temp table is not concurrent, it would be simple to inc/decrement a counter on INSERT/DELETE...

No, because MVCC rules still apply.

regards, tom lane
[PERFORM] Slow C Function
Hi, We've got a C function that we use here, and we find that for every connection, the first run of the function is much slower than any subsequent runs (around 50 ms compared to 8 ms). Besides using connection pooling, are there any options to improve performance? By the way, we are using pg version 8.1.3.

-Adam
Re: [PERFORM] Slow C Function
Adam Palmblad wrote:
> Hi, We've got a C function that we use here, and we find that for every connection, the first run of the function is much slower than any subsequent runs (around 50 ms compared to 8 ms).

That is fairly standard: the shared library containing the function is loaded on the first call in each connection, and stays loaded (effectively cached) after that.

> Besides using connection pooling, are there any options to improve performance?

Not that I know of, but then again I am not a C programmer.

> By the way, we are using pg version 8.1.3.

-- === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/
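One possible mitigation, assuming the first-call cost really is the per-connection library load, is to preload the library when the server starts, so every backend inherits it. A sketch (the library name is a placeholder; in the 8.1 era the parameter was spelled preload_libraries, later renamed shared_preload_libraries, so check your version's docs):

# postgresql.conf
preload_libraries = '$libdir/myfuncs'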
[PERFORM] Postgres gets stuck
I'm having a rare but deadly problem. On our web servers, a process occasionally gets stuck, and can't be unstuck. Once it's stuck, all Postgres activities cease. kill -9 is required to kill it -- signals 2 and 15 don't work, and /etc/init.d/postgresql stop fails.

Here's what the process table looks like:

$ ps -ef | grep postgres
postgres 30713     1  0 Apr24 ?  00:02:43 /usr/local/pgsql/bin/postmaster -p 5432 -D /disk3/postgres/data
postgres 25423 30713  0 May08 ?  00:03:34 postgres: writer process
postgres 25424 30713  0 May08 ?  00:00:02 postgres: stats buffer process
postgres 25425 25424  0 May08 ?  00:00:02 postgres: stats collector process
postgres 11918 30713 21 07:37 ?  02:00:27 postgres: production webuser 127.0.0.1(21772) SELECT
postgres 31624 30713  0 16:11 ?  00:00:00 postgres: production webuser [local] idle
postgres 31771 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12422) idle
postgres 31772 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12421) idle
postgres 31773 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12424) idle
postgres 31774 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12425) idle
postgres 31775 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12426) idle
postgres 31776 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12427) idle
postgres 31777 30713  0 16:12 ?  00:00:00 postgres: production webuser 127.0.0.1(12428) idle

The SELECT process is the one that's stuck. top(1) and other indicators show that nothing is going on at all (no CPU usage, normal memory usage); the process seems to be blocked waiting for something. (The idle processes are attached to a FastCGI program.)

This has happened on *two different machines*, both doing completely different tasks. The first one is essentially a read-only warehouse that serves lots of queries, and the second one is the server we use to load the warehouse. In both cases, Postgres has been running for a long time, and is issuing SELECT statements that it's issued millions of times before with no problems. No other processes are accessing Postgres, just the web services.

This is a deadly bug, because our web site goes dead when this happens, and it requires an administrator to log in and kill the stuck postgres process, then restart Postgres. We've installed a failover system so that the web site is diverted to a backup server, but since this has happened twice in one week, we're worried.

Any ideas?

Details:
Postgres 8.0.3
Linux 2.6.12-1.1381_FC3smp i686 i386
Dell 2-CPU Xeon system (hyperthreading is enabled)
4 GB memory
2 120 GB disks (SATA on machine 1, IDE on machine 2)

Thanks, Craig
Re: [PERFORM] Postgres gets stuck
> This is a deadly bug, because our web site goes dead when this happens, and it requires an administrator to log in and kill the stuck postgres process, then restart Postgres. We've installed a failover system so that the web site is diverted to a backup server, but since this has happened twice in one week, we're worried. Any ideas?

Sounds like a deadlock issue. Do you have query logging turned on? Also, edit your postgresql.conf file and add (or uncomment):

stats_command_string = true

and restart postgresql. Then you'll be able to:

select * from pg_stat_activity;

to see what queries postgres is running, and that might give you some clues.

-- Postgresql php tutorials http://www.designmagick.com/
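If it is a lock problem, pg_locks is another place to look. A quick sketch (pg_locks and pg_stat_activity exist in 8.0; procpid and current_query are the column names of that era, worth double-checking against your version's docs):

-- lock requests that have not been granted
SELECT * FROM pg_locks WHERE NOT granted;

-- tie waiting backends to the query they are running
-- (needs stats_command_string = true, as above)
SELECT l.pid, l.locktype, l.mode, a.current_query
FROM pg_locks l
JOIN pg_stat_activity a ON a.procpid = l.pid
WHERE NOT l.granted;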
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
Scott Marlowe wrote:
> Actually, in the case of the Escalades at least, the answer is yes. Last year (maybe a bit more) someone was testing an IDE Escalade controller with drives that were known to lie, and it passed the power-plug-pull test repeatedly. Apparently, the Escalades tell the drives to turn off their cache. While most all IDEs and a fair number of SATA drives lie about cache fsyncing, they all seem to turn off the cache when you ask. And, since a hardware RAID controller with BBU cache has its own cache, it's not like it really needs the one on the drives anyway.

You do if the controller thinks the data is already on the drives and removes it from its cache.

-- Bruce Momjian http://candle.pha.pa.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
Re: [PERFORM] Arguments Pro/Contra Software Raid
William Yu wrote:
> We upgraded our disk system for our main data processing server earlier this year. After pricing out all the components, basically we had the choice of:
> LSI MegaRaid 320-2 w/ 1GB RAM+BBU + 8 15K 150GB SCSI
> or
> Areca 1124 w/ 1GB RAM+BBU + 24 7200RPM 250GB SATA

My mistake -- I keep doing calculations and they don't add up. So I looked again on Pricewatch, and it turns out the actual comparison was for 4 SCSI drives, not 8! ($600 for a 15K 145GB versus $90 for a 7200 250GB.) No wonder our decision seemed so much more decisive back then.
Re: [PERFORM] Lot'sa joins - performance tip-up, please?
On Wed, 2006-05-03 at 10:20 -0500, Dave Dutcher wrote:
> > -> Nested Loop (cost=0.00..176144.30 rows=57925 width=26) (actual time=68.322..529472.026 rows=57925 loops=1)
> >      -> Seq Scan on ticketing_codes_played (cost=0.00..863.25 rows=57925 width=8) (actual time=0.042..473.881 rows=57925 loops=1)
> >      -> Index Scan using ticketing_codes_pk on ticketing_codes (cost=0.00..3.01 rows=1 width=18) (actual time=9.102..9.108 rows=1 loops=57925)
> >           Index Cond: (ticketing_codes.code_id = "outer".code_id)
> > Total runtime: 542000.093 ms
> > (27 rows)
> >
> > I'll be more than happy to provide any additional information that I may be able to gather. I'd be most happy if someone would scream something like "four joins, smells like a poor design", because the design is poor, but the system is in production, and I have to bear with it.
>
> It looks like that nested loop which is joining ticketing_codes_played to ticketing_codes is the slow part. I'm curious how many rows are in the ticketing_codes table? Four or five joins does not seem like a lot to me, but it can be slow if you are joining big tables with other big tables.

The ticketing_codes table has 1100 records, and it's expected to grow. I tried playing with the JOIN order as Tom suggested, but performance is the same.

Mario
Re: [PERFORM] Lot'sa joins - performance tip-up, please?
On Wed, 2006-05-03 at 13:58 -0400, Tom Lane wrote:
> Mario Splivalo [EMAIL PROTECTED] writes:
> > I have a quite large query that takes over a minute to run on my laptop.
> The EXPLAIN output you provided doesn't seem to agree with the stated query. Where'd the "service_id = 1102" condition come from?

I guess I copy-pasted the additional WHERE into the EXPLAIN ANALYZE query. This is the correct one, without the WHERE:

 Hash Left Join (cost=198628.35..202770.61 rows=121 width=264) (actual time=998008.264..999645.322 rows=5706 loops=1)
   Hash Cond: ("outer".message_id = "inner".message_id)
   -> Merge Left Join (cost=21943.23..21950.96 rows=121 width=238) (actual time=4375.510..4540.772 rows=5706 loops=1)
        Merge Cond: ("outer".message_id = "inner".message_id)
        -> Sort (cost=21847.62..21847.92 rows=121 width=230) (actual time=3304.787..3378.515 rows=5706 loops=1)
             Sort Key: messages.id
             -> Hash Join (cost=20250.16..21843.43 rows=121 width=230) (actual time=1617.370..3102.470 rows=5706 loops=1)
                  Hash Cond: ("outer".message_id = "inner".id)
                  -> Seq Scan on ticketing_messages (cost=0.00..1212.37 rows=75937 width=14) (actual time=10.554..609.967 rows=75937 loops=1)
                  -> Hash (cost=20244.19..20244.19 rows=2391 width=216) (actual time=1572.889..1572.889 rows=5706 loops=1)
                       -> Nested Loop (cost=1519.21..20244.19 rows=2391 width=216) (actual time=385.582..1449.207 rows=5706 loops=1)
                            -> Seq Scan on services (cost=0.00..4.20 rows=3 width=54) (actual time=20.829..20.859 rows=2 loops=1)
                                 Filter: (type_id = 10)
                            -> Bitmap Heap Scan on messages (cost=1519.21..6726.74 rows=1594 width=162) (actual time=182.346..678.800 rows=2853 loops=2)
                                 Recheck Cond: (("outer".id = messages.service_id) AND (messages.receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (messages.receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
                                 -> BitmapAnd (cost=1519.21..1519.21 rows=1594 width=0) (actual time=164.311..164.311 rows=0 loops=2)
                                      -> Bitmap Index Scan on idx_service_id (cost=0.00..84.10 rows=14599 width=0) (actual time=66.809..66.809 rows=37968 loops=2)
                                           Index Cond: ("outer".id = messages.service_id)
                                      -> Bitmap Index Scan on idx_messages_receiving_time (cost=0.00..1434.87 rows=164144 width=0) (actual time=192.633..192.633 rows=184741 loops=1)
                                           Index Cond: ((receiving_time >= '2006-02-12 00:00:00+01'::timestamp with time zone) AND (receiving_time <= '2006-03-18 23:00:00+01'::timestamp with time zone))
        -> Sort (cost=95.62..99.17 rows=1421 width=8) (actual time=1070.678..1072.999 rows=482 loops=1)
             Sort Key: ticketing_winners.message_id
             -> Seq Scan on ticketing_winners (cost=0.00..21.21 rows=1421 width=8) (actual time=424.836..1061.834 rows=1421 loops=1)
   -> Hash (cost=176144.30..176144.30 rows=57925 width=26) (actual time=993592.980..993592.980 rows=57925 loops=1)
        -> Nested Loop (cost=0.00..176144.30 rows=57925 width=26) (actual time=1074.984..992536.243 rows=57925 loops=1)
             -> Seq Scan on ticketing_codes_played (cost=0.00..863.25 rows=57925 width=8) (actual time=74.479..2047.993 rows=57925 loops=1)
             -> Index Scan using ticketing_codes_pk on ticketing_codes (cost=0.00..3.01 rows=1 width=18) (actual time=17.044..17.052 rows=1 loops=57925)
                  Index Cond: (ticketing_codes.code_id = "outer".code_id)
 Total runtime: 999778.981 ms

> In general, I'd suggest playing around with the join order. Existing releases of PG tend to throw up their hands when faced with a mixture of outer joins and regular joins, and just join the tables in the order listed. 8.2 will be smarter about this, but for now you have to do it by hand...

No luck for me there.
But I found out that if I first do the join on ticketing_codes and ticketing_codes_played, put the result into a temporary table, and then join that temporary table with the rest of the query (the SELECT that is in parentheses is transferred to a temporary table), the query is almost twice as fast. As mentioned before, ticketing_codes has 1100 records.

Mario

P.S. Is it just me, or is posting to pgsql-performance lagged quite a bit?
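A minimal sketch of the rewrite Mario describes (the column lists are assumptions inferred from the plan above, not his actual schema):

BEGIN;

-- materialize the inner join once, so the planner works with a small,
-- analyzed table instead of a misestimated subselect
CREATE TEMPORARY TABLE codes_played ON COMMIT DROP AS
    SELECT tc.code_id, tcp.message_id
      FROM ticketing_codes tc
      JOIN ticketing_codes_played tcp USING (code_id);

-- give the planner real statistics for the materialized rows
ANALYZE codes_played;

-- then join the temp table into the rest of the query
SELECT tm.message_id, cp.code_id
  FROM ticketing_messages tm
  LEFT JOIN codes_played cp USING (message_id);

COMMIT;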
[PERFORM] slow variable against int??
I have a question about my function. I must get the user rating by game result. This probably isn't a perfect solution, but I have one question about it:

select into inGameRating count(game_result)+1
  from users
 where game_result > inRow.game_result;

This query, inside the function, takes about 1100 ms. inRow.game_result is an integer (2984). Now, if I replace inRow.game_result with the integer itself:

select into inGameRating count(game_result)+1
  from users
 where game_result > 2984;

the query takes about 100 ms. There is probably a reason for this, but can you tell me what it is, because I can't find one.

My function:

create or replace function ttt_result(int, int) returns setof tparent_result
language plpgsql volatile as $$
declare
    inOffset alias for $1;
    inLimit alias for $2;
    inRow tparent_result%rowtype;
    inGameResult int := -1;
    inGameRating int := -1;
begin
    for inRow in
        select email, wynik_gra
          from konkurs_uzytkownik
         order by wynik_gra desc
         limit inLimit offset inOffset
    loop
        if inGameResult < 0 then  -- only for first iteration
            /* this is fast, ~100 ms:
               select into inGameRating count(game_result)+1
                 from users where game_result > 2984;
            */
            /* even if inRow.game_result = 2984 this is very slow, ~1100 ms!
               select into inGameRating count(game_result)+1
                 from users where game_result > inRow.game_result;
            */
            inGameResult := inRow.game_result;
        end if;
        if inGameResult > inRow.game_result then
            inGameRating := inGameRating + 1;
        end if;
        inRow.game_rating := inGameRating;
        inGameResult := inRow.game_result;
        return next inRow;
    end loop;
    return;
end;
$$;

-- Witold Strzelczyk [EMAIL PROTECTED]
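One common workaround, sketched under the assumption that the slowdown comes from the generic parameterized plan plpgsql prepares for statements that reference variables: build the statement dynamically, so the planner sees the literal value and can use the column statistics. (EXECUTE ... INTO needs PostgreSQL 8.1 or later.)

-- inside the loop, in place of the slow select into:
-- the query string carries the actual value, so a fresh,
-- value-specific plan is made for it
execute 'select count(game_result)+1 from users'
     || ' where game_result > ' || inRow.game_result::text
  into inGameRating;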
Re: [PERFORM] Performance Issues on Opteron Dual Core
I installed Ubuntu 5.10 on the production server (64-bit version), and sure enough the performance is like I expected. Opening up that table (320,000 records) takes 6 seconds, with CPU usage of one of the cores going up to 90%-100% for the 6 seconds. I assume only one core is being used per user / session / query?

Gregory

-----Original Message-----
From: Jim C. Nasby [mailto:[EMAIL PROTECTED]]
Sent: Thursday, May 04, 2006 12:47 PM
To: Gregory Stewart
Cc: Mark Kirkwood; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core

All the machines I've been able to replicate this on have been SMP w2k3 machines running SP1. I've been unable to replicate it on anything not running w2k3, but the only 'SMP' machine I've tested in that manner was an Intel with HT enabled. I now have an Intel with HT and running w2k3 sitting in my office, but I haven't had a chance to fire it up and try it yet. Once I test that machine it should help narrow down if this problem exists with HT machines (which someone on -hackers mentioned they had access to and could do testing with). If it does affect HT machines then I suspect that this is not an issue for XP...

On Tue, May 02, 2006 at 11:27:02PM -0500, Gregory Stewart wrote:
> Jim,
> Have you seen this happening only on W2k3? I am wondering if I should
> try out 2000 Pro or XP Pro. Not my first choice, but if it works...

-----Original Message-----
From: Jim C. Nasby [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, May 02, 2006 3:29 PM
To: Mark Kirkwood
Cc: Gregory Stewart; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Performance Issues on Opteron Dual Core

On Sun, Apr 30, 2006 at 10:59:56PM +1200, Mark Kirkwood wrote:
> Pgadmin can give misleading times for queries that return large result
> sets over a network, due to: 1/ It takes time to format the (large)
> result set for display. 2/ It has to count the time spent waiting for
> the (large) result set to travel across the network. You aren't running
> Pgadmin off the dev server are you? If not, check your network link to
> dev and prod - is one faster than the other? (etc). To eliminate Pgadmin
> and the network as factors, try wrapping your query in a 'SELECT
> count(*) FROM (your query here) AS a', and see if it changes anything!

FWIW, I've found problems running PostgreSQL on Windows in a multi-CPU environment on w2k3. It runs fine for some period, and then CPU and throughput drop to zero. So far I've been unable to track down any more information than that, other than the fact that I haven't been able to reproduce this on any single-CPU machines.

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com  work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf  cell: 512-569-9461
[PERFORM] Dynamically loaded C function performance
Hi,

We've got a C function that we use here and we find that for every connection, the first run of the function is much slower than any subsequent runs (>50 ms compared to 8 ms). Besides using connection pooling, are there any options to improve performance?

-Adam
[PERFORM]
Hi,

We've got a C function that we use here and we find that for every connection, the first run of the function is much slower than any subsequent runs (>50 ms compared to 8 ms). Besides using connection pooling, are there any options to improve performance? By the way, we are using pg version 8.1.3.

-Adam
[PERFORM] VACUUM killing my CPU
Hi all!

I am running PostgreSQL 7.3.2 on Linux 2.6.13... What I see when the VACUUM process is running is:

Cpu(s): 0.0% us, 3.2% sy, 0.0% ni, 0.0% id, 93.5% wa, 3.2% hi, 0.0% si

What I am worried about is the 93.5% wa... Could someone explain to me what the VACUUM process is waiting for?

Best regards,
David
[PERFORM] PostgreSQL VACUUM killing CPU
Hi all!

I have got such a problem. I'm running PostgreSQL 7.3.2 on Linux 2.6.13. What I see when VACUUM is running and killing my CPU is:

Cpu(s): 3.2% us, 0.0% sy, 0.0% ni, 0.0% id, 96.8% wa, 0.0% hi, 0.0% si

What I am worried about is the 96.8% wa. Why is it like that? What is the process waiting for? Could someone explain that to me, please? :)

Best regards,
David
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
Hi Hannes,

Hannes Dorbath wrote:
> Hi, I've just had some discussion with colleagues regarding the usage of
> hardware or software raid 1/10 for our linux based database servers. I
> myself can't see much reason to spend $500 on high end controller cards
> for a simple Raid 1.

Naa, you can find ATA | SATA controllers for about EUR 30!

> Any arguments pro or contra would be desirable.

From my experience and what I've read here:

> + Hardware Raids might be a bit easier to manage, if you never spend a
> few hours to learn Software Raid Tools.

I'd say the same (mostly because you still have to punch a command line for most of the controllers).

> + There are situations in which Software Raids are faster, as CPU power
> has advanced dramatically in the last years and even high end controller
> cards cannot keep up with that.

Definitely NOT. However, if your server doesn't have a heavy load, the software overhead won't be noticed (it's essentially cache management and syncing). For dual-core CPUs, it might be true.

> + Using SATA drives is always a bit of risk, as some drives are lying
> about whether they are caching or not.

?? Do you intend to use your server without a UPS ??

> + Using hardware controllers, the array becomes locked to a particular
> vendor. You can't switch controller vendors as the array meta
> information is stored proprietarily. In case the Raid is broken to a
> level the controller can't recover automatically, this might complicate
> manual recovery by specialists.

?? Do you intend not to make backups ??

> + Even battery backed controllers can't guarantee that data written to
> the drives is consistent after a power outage, nor that the drive does
> not corrupt something during the involuntary shutdown / power
> irregularities. (This is theoretical, as any server will be UPS backed.)

RAID's laws:

1- RAID prevents you from losing data on healthy disks, not from faulty disks.
1b- So format and reformat your RAID disks (whether SCSI, ATA or SATA) several times, with destructive tests (see the -c -c option in the mke2fs man page). It will ensure that the disks are safe, and it also makes a kind of burn-in test (which might turn into... days of formatting!).
2- RAID doesn't protect you from power supply breakage or an electricity breakdown, so use a (LARGE) UPS.
2b- A LARGE UPS, because HDs are the components with the highest power consumption (a 700 VA UPS gives me about 10-12 minutes on a machine with an XP2200+, 1 GB RAM and a 40 GB HD; however, this falls to... less than 25 seconds with seven HDs! all ATA).
2c- Use a server box with redundant power supplies.
3- As for any sensitive data, make regular backups or you'll be a sitting duck.

Some hardware controllers are able to avoid the loss of a disk that turns out to have some faulty sectors (by relocating them internally); software RAID can't, as sectors *must* be at the same (linear) addresses. BUT a hardware controller is about EUR 2000 and an (ATA/SATA) 500 GB HD is ~EUR 350. That means you have to consider:

* The server availability (time to change a power supply if there is no redundancy, time to exchange a non-hot-swap HD... in fact, how much downtime you can afford),
* The volume of the data (on which the size of the backup device depends),
* The backup device you'll use (tape or other HDs),
* The load on the server (and the number of simultaneous users = Soft|Hard, ATA/SATA|SCSI...),
* The money you can spend on such a server,
* And most important, the color of your boss' tie the day you'll take the decision.

Hope it will help you,
Jean-Yves
[PERFORM] UNSUBSCRIBE
UNSUBSCRIBE
Re: [PERFORM] PostgreSQL VACUUM killing CPU
The wa means waiting on IO. Vacuum is a very IO intensive process. You can use tools like vmstat and iostat to see how much disk IO is occurring. Also, sar is very helpful for trending these values over time.

-- Will Reese  http://blog.rezra.com

On May 9, 2006, at 5:19 AM, [EMAIL PROTECTED] wrote:
> Hi all! I have got such a problem. I'm running PostgreSQL 7.3.2 on
> Linux 2.6.13. What I see when VACUUM is running and killing my CPU is:
> Cpu(s): 3.2% us, 0.0% sy, 0.0% ni, 0.0% id, 96.8% wa, 0.0% hi, 0.0% si
> What I am worried about is the 96.8% wa. Why is it like that? What is
> the process waiting for?
Re: [PERFORM] UNSUBSCRIBE
Shoaib Burq wrote:
> UNSUBSCRIBE

To unsubscribe:
List-Unsubscribe: mailto:[EMAIL PROTECTED]

Email admins - could we add this above or below the random tips that get appended to every email?

--
Postgresql php tutorials
http://www.designmagick.com/
Re: [PERFORM] PostgreSQL VACUUM killing CPU
> I have got such a problem. I'm running PostgreSQL 7.3.2 on Linux 2.6.13.

Also, you should seriously consider upgrading. 8.1.3 is the current PostgreSQL release. If you must remain on 7.3, at least upgrade to 7.3.14, which contains *many* bugfixes.

Michael Glaesemann
grzm seespotcode net
Re: [PERFORM] Slow C Function
Adam Palmblad [EMAIL PROTECTED] writes:
> We've got a C function that we use here and we find that for every
> connection, the first run of the function is much slower than any
> subsequent runs (>50 ms compared to 8 ms).

Perhaps that represents the time needed to load the dynamic library into the backend? If so, the preload_libraries parameter might help you fix it. Or consider pooling connections. Or build a custom executable with the function linked in permanently.

regards, tom lane
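A hedged sketch of the preload route ('myfuncs' is a placeholder library name, not something from this thread):

-- in postgresql.conf (8.1 syntax), load the library into the postmaster
-- at server start, so every backend inherits it already linked:
--   preload_libraries = '$libdir/myfuncs'

-- alternatively, pay the dynamic-link cost once, right after connecting,
-- instead of on the first real call, with the SQL LOAD command:
LOAD 'myfuncs';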
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
On May 9, 2006, at 11:26 AM, Joshua D. Drake wrote:
> Of course not, but which drives lie about sync that are SATA? Or more
> specifically SATA-II?

I don't know the answer to this question, but have you seen this tool?

http://brad.livejournal.com/2116715.html

It attempts to experimentally determine if, with your operating system version, controller, and hard disk, fsync() does as claimed. Of course, experimentation can't prove the system is correct, but it can sometimes prove the system is broken. I say it's worth running on any new model of disk, any new controller, or after the Linux kernel people rewrite everything (i.e. on every point release).

I have to admit to hypocrisy, though... I'm running with systems that other people ordered and installed, I doubt they were this thorough, and I don't have identical hardware to run tests on. So there's no real way to do this.

Regards, Scott

--
Scott Lamb http://www.slamb.org/
Re: [PERFORM] VACUUM killing my CPU
On May 9, 2006 02:45 am, [EMAIL PROTECTED] wrote:
> What I am worried about is the 93.5% wa... Could someone explain to me
> what the VACUUM process is waiting for?

Disk I/O.

--
In a truly free society, Alcohol, Tobacco and Firearms would be a convenience store chain.
Re: [PERFORM] [GENERAL] Arguments Pro/Contra Software Raid
Douglas McNaught [EMAIL PROTECTED] writes:
> Vivek Khera [EMAIL PROTECTED] writes:
> > On May 9, 2006, at 11:51 AM, Joshua D. Drake wrote:
> > > And dollar for dollar, SCSI will NOT be faster nor have the hard
> > > drive capacity that you will get with SATA.
> > Does this still hold true under heavy concurrent-write loads? I'm
> > preparing yet another big DB server and if SATA is a better option,
> > I'm all (elephant) ears.
> Correct me if I'm wrong, but I've never heard of a 15kRPM SATA drive.

Well, dollar for dollar you would get the best performance from slower drives anyway, since it would give you more spindles. 15kRPM drives are *expensive*.

-- greg
Re: [PERFORM] UNSUBSCRIBE
Chris [EMAIL PROTECTED] writes:
> Email admins - could we add this above or below the random tips that
> get appended to every email?

You mean like these headers that already get added to every list message (these copied-and-pasted from your own message):

List-help: mailto:[EMAIL PROTECTED]
List-owner: mailto:[EMAIL PROTECTED]
List-subscribe: mailto:[EMAIL PROTECTED]
List-unsubscribe: mailto:[EMAIL PROTECTED]

Plus there are at least two of the random tips that deal with how to unsubscribe. My feeling is that the people who can't figure this out still won't figure it out, no matter how thick the cluebat we swing at them :-( Maybe the real problem is at the other end of the process, ie we should require some evidence of a greater-than-room-temp IQ to subscribe in the first place?

regards, tom lane
Re: [PERFORM] UNSUBSCRIBE
Tom Lane wrote:
> Chris [EMAIL PROTECTED] writes:
> > Email admins - could we add this above or below the random tips that
> > get appended to every email?
> You mean like these headers that already get added to every list
> message (these copied-and-pasted from your own message):

The headers aren't the first place you'd go looking for such info... once you know they are there, it's OK.

> Maybe the real problem is at the other end of the process, ie we should
> require some evidence of a greater-than-room-temp IQ to subscribe in
> the first place?

Maybe :) The php-general list has "To unsubscribe, visit: http://www.php.net/unsub.php" at the bottom of every email, and there are still random unsubscribe requests. Ah well :)

--
Postgresql php tutorials
http://www.designmagick.com/
Re: [PERFORM] UNSUBSCRIBE
Chris [EMAIL PROTECTED] writes:
> Tom Lane wrote:
> > Maybe the real problem is at the other end of the process, ie we
> > should require some evidence of a greater-than-room-temp IQ to
> > subscribe in the first place?
> Maybe :) The php-general list has "To unsubscribe, visit:
> http://www.php.net/unsub.php" at the bottom of every email, and there
> are still random unsubscribe requests.

That's depressing, indeed :-( I'm not against spending a little bandwidth to provide unsub instructions, but somehow I can't see putting an 8x10 color glossy photograph with circles and arrows and a paragraph on the back [1] of every list message to do it.

regards, tom lane

[1] http://www.guntheranderson.com/v/data/alicesre.htm