Re: [PERFORM] [IMPORTANT] - My application performance
On 7/26/05, Roberto Germano Vieweg Neto [EMAIL PROTECTED] wrote:
> My application is using Firebird 1.5.2. My question is: will PostgreSQL be faster than Firebird? How much (in percent)? I need about 20% to 50% more performance in my application. Can I get this by migrating to PostgreSQL?

The answer is: maybe. There's nothing which stops PostgreSQL from being faster, and likewise there is nothing that stops it from being slower. YMMV.

Your route should be:
* migrate the most speed-demanding part to PostgreSQL
* benchmark it, trying to emulate real-world load
* if it is slower than Firebird, post it here, together with EXPLAIN ANALYZEs, and ask if there's something you can do to speed it up.

Regards,
Dawid

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster
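If you do end up posting here, a minimal sketch of the kind of output the list asks for — the table and column names below are invented purely for illustration:

```sql
-- Hypothetical query; substitute your real tables and predicates.
EXPLAIN ANALYZE
SELECT o.id, o.total
FROM orders o
WHERE o.customer_id = 42;
-- EXPLAIN ANALYZE prints the planner's cost estimates next to the
-- actual row counts and timings, which is what lets readers spot
-- misestimates and missing indexes.
```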
[PERFORM] Inherited Table Query Planning (fwd)
Is there a way to make the query planner consider pulling inner appends outside joins?

Example: natural_person inherits from person (obviously)

admpostgres3=# explain analyze select u.name, p.name from users u, person p
  where p.user_id = u.id and u.name = 's_ohl';

QUERY PLAN
----------
Hash Join  (cost=8.01..3350.14 rows=3 width=36) (actual time=107.391..343.657 rows=10 loops=1)
  Hash Cond: (outer.user_id = inner.id)
  ->  Append  (cost=0.00..2461.34 rows=117434 width=20) (actual time=0.007..264.910 rows=117434 loops=1)
        ->  Seq Scan on person p  (cost=0.00..575.06 rows=31606 width=20) (actual time=0.005..38.911 rows=31606 loops=1)
        ->  Seq Scan on natural_person p  (cost=0.00..1886.28 rows=85828 width=19) (actual time=0.003..104.338 rows=85828 loops=1)
  ->  Hash  (cost=8.01..8.01 rows=2 width=24) (actual time=0.096..0.096 rows=0 loops=1)
        ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.041..0.081 rows=10 loops=1)
              Index Cond: ((name)::text = 's_ohl'::text)
Total runtime: 343.786 ms
(9 rows)

admpostgres3=# explain analyze select u.name, p.name from users u, only person p
  where p.user_id = u.id and u.name = 's_ohl'
  union all
  select u.name, p.name from users u, only natural_person p
  where p.user_id = u.id and u.name = 's_ohl';

QUERY PLAN
----------
Append  (cost=0.00..28.19 rows=3 width=28) (actual time=0.197..0.366 rows=10 loops=1)
  ->  Subquery Scan *SELECT* 1  (cost=0.00..14.12 rows=1 width=28) (actual time=0.159..0.159 rows=0 loops=1)
        ->  Nested Loop  (cost=0.00..14.11 rows=1 width=28) (actual time=0.157..0.157 rows=0 loops=1)
              ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.039..0.075 rows=10 loops=1)
                    Index Cond: ((name)::text = 's_ohl'::text)
              ->  Index Scan using person_user_idx on person p  (cost=0.00..3.03 rows=2 width=8) (actual time=0.006..0.006 rows=0 loops=10)
                    Index Cond: (p.user_id = outer.id)
  ->  Subquery Scan *SELECT* 2  (cost=0.00..14.08 rows=2 width=28) (actual time=0.036..0.193 rows=10 loops=1)
        ->  Nested Loop  (cost=0.00..14.06 rows=2 width=28) (actual time=0.033..0.171 rows=10 loops=1)
              ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.018..0.049 rows=10 loops=1)
                    Index Cond: ((name)::text = 's_ohl'::text)
              ->  Index Scan using natural_person_user_idx on natural_person p  (cost=0.00..3.01 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=10)
                    Index Cond: (p.user_id = outer.id)
Total runtime: 0.475 ms
(14 rows)

With kind regards,
Jens Schicke
-- 
Jens Schicke          [EMAIL PROTECTED]
asco GmbH             http://www.asco.de
Mittelweg 7           Tel 0531/3906-127
38106 Braunschweig    Fax 0531/3906-400
Re: [PERFORM] Inherited Table Query Planning (fwd)
Jens-Wolfhard Schicke [EMAIL PROTECTED] writes:
> Is there a way to make the query planner consider pulling inner appends outside joins?

Not at the moment.

regards, tom lane
Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible
Luke,

> Well - now that I test it, it appears you are correct, temp table COPY bypasses WAL - thanks for pointing it out!

Right. The problem is bypassing WAL for loading new scratch tables which aren't TEMPORARY tables. We need to do this for multi-threaded ETL, since:
a) Temp tables can't be shared by several writers, and
b) you can't index a temp table.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible
On Wed, 27 Jul 2005, Josh Berkus wrote:
> b) you can't index a temp table.

jurka# create temp table t (a int);
CREATE
jurka# create index myi on t(a);
CREATE
[PERFORM] wal_buffer tests in
Folks,

I ran a wal_buffers test series. It appears that increasing wal_buffers is indeed very important for OLTP applications, potentially resulting in as much as a 15% average increase in transaction processing. What's interesting is that this is not just true for 8.1, it's true for 8.0.3 as well.

More importantly, 8.1 performance is somehow back up to above-8.0 levels. Something was broken in June that's got fixed (this test series is based on July 3 CVS) but I don't know what. Clues?

Test results are here:
http://pgfoundry.org/docman/view.php/141/79/wal_buffer_test.pdf

As always, detailed test results are available from OSDL, just use:
http://khack.osdl.org/stp/#
where # is the test number.

-- 
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
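For anyone who wants to experiment with the same knob, it is a one-line postgresql.conf change; the value below is purely illustrative, not a recommendation derived from the test series:

```
# postgresql.conf excerpt (illustrative value, assuming 8.0.x defaults)
wal_buffers = 64        # default is 8; changing it requires a server restart
```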
Re: [PERFORM] Mirroring PostgreSQL database
Shashi Kanth Boddula wrote:
> Hi,
> I have one customer who is using PostgreSQL 7.4.8 on Linux. He has some problems with database mirroring. The details follow.
> The customer is using Linux, on which PostgreSQL 7.4.8 along with JBoss 3.2.3 is running. He has 2 servers: one is acting as a live server (primary) and another is acting as a fail-over (secondary) server. The secondary server is placed in a remote location. These servers act as an attendance server for daily activities. Nearly 50,000 employees depend on the live server.
> The customer is using the DBmirror tool to mirror the database records from primary to secondary. The customer is complaining that there is a one-day (24-hour) delay between primary and secondary for database synchronization. They have a dedicated line and bandwidth, but the problem still exists.
> I just want to know, for immediate data mirroring, what is the best way for PostgreSQL? PostgreSQL offers many mirroring tools, but which one is the best? Is there any other way to accomplish the task?
> Thank you. Waiting for your reply.
>
> Thanks  Regards,
> Shashi Kanth
> Consultant - Linux
> RHCE, LPIC-2
> Onward Novell - Bangalore
> 9886455567

For a Java-based solution you could also have a look at x-jdbc or xjdbc. But first you should find out what the reason for the delay actually is. If the backup server is too slow, it may not matter which mirroring tool you use.

-- 
Best Regards

Sebastian Hennebrueder
http://www.laliluna.de
Tutorials for JSP, JavaServer Faces, Struts, Hibernate and EJB
Get support, education and consulting for these technologies - uncomplicated and cheap.
Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible
On Wed, 2005-07-27 at 09:29 -0700, Josh Berkus wrote:
> Luke,
> > Well - now that I test it, it appears you are correct, temp table COPY bypasses WAL - thanks for pointing it out!
> Right. The problem is bypassing WAL for loading new scratch tables which aren't TEMPORARY tables. We need to do this for multi-threaded ETL, since:
> a) Temp tables can't be shared by several writers, and
> b) you can't index a temp table.

The description of scratch tables might need some slight clarification. It kind of makes it sound like temp tables.

I had in mind the extra tables that an application sometimes needs to operate faster: denormalisations, pre-joined tables, pre-calculated results, aggregated data. These are not temporary tables, just part of the application - multi-user tables that stay across shutdown/restart.

If you have gallons of GB, you will probably be looking to make use of such tables. You can use such tables for the style of ETL known as ELT, but that is not the only use.

Best Regards, Simon Riggs
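A minimal sketch of one such scratch table — the table and column names are invented for illustration, not taken from any real schema:

```sql
-- A pre-aggregated summary table: an ordinary persistent table,
-- not TEMPORARY, so several sessions can share it and it stays
-- across shutdown/restart.
CREATE TABLE sales_daily_summary AS
SELECT sale_date, product_id, sum(amount) AS total_amount
FROM sales
GROUP BY sale_date, product_id;

-- Indexed like any other application table.
CREATE INDEX sales_daily_summary_idx
    ON sales_daily_summary (sale_date);
```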
Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible
> I had in mind the extra tables that an application sometimes needs to operate faster: denormalisations, pre-joined tables, pre-calculated results, aggregated data. These are not temporary tables, just part of the application - multi-user tables that stay across shutdown/restart.

You could also add caching of search results, for easy pagination without re-running, in its entirety on every page, the Big Slow Search Query that every website has...
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.
Matt,

> After playing with various indexes and whatnot I simply am unable to make this procedure perform any better. Perhaps someone on the list can spot the bottleneck and reveal why this procedure isn't performing that well, or ways to make it better.

Well, my first thought is that this is a pretty complicated procedure for something you want to perform well. Is all this logic really necessary? How does it get done for MySQL?

-- 
--Josh
Josh Berkus
Aglio Database Solutions
San Francisco
Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible
Josh Berkus josh@agliodbs.com writes:
> Right. The problem is bypassing WAL for loading new scratch tables which aren't TEMPORARY tables. We need to do this for multi-threaded ETL, since:
> a) Temp tables can't be shared by several writers, and
> b) you can't index a temp table.

This may not matter given point (a), but: point (b) is completely wrong.

regards, tom lane
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
Josh Berkus wrote:
> Matt,
> > After playing with various indexes and whatnot I simply am unable to make this procedure perform any better. Perhaps someone on the list can spot the bottleneck and reveal why this procedure isn't performing that well, or ways to make it better.
> Well, my first thought is that this is a pretty complicated procedure for something you want to perform well. Is all this logic really necessary? How does it get done for MySQL?

I'm not sure if it's all needed; in MySQL they have this simple schema:

===
CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) TYPE=MyISAM;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY (variable)
) TYPE=MyISAM;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY (id,msgid)
) TYPE=MyISAM;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token char(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY (id, token),
  INDEX bayes_token_idx1 (token),
  INDEX bayes_token_idx2 (id, atime)
) TYPE=MyISAM;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY (id),
  UNIQUE bayes_vars_idx1 (username)
) TYPE=MyISAM;
===

Then they do this to insert the token:

INSERT INTO bayes_token (
  id, token, spam_count, ham_count, atime
) VALUES (?, ?, ?, ?, ?)
ON DUPLICATE KEY UPDATE
  spam_count = GREATEST(spam_count + ?, 0),
  ham_count = GREATEST(ham_count + ?, 0),
  atime = GREATEST(atime, ?)

Or update the token:

UPDATE bayes_vars SET
  $token_count_update
  newest_token_age = GREATEST(newest_token_age, ?),
  oldest_token_age = LEAST(oldest_token_age, ?)
WHERE id = ?

I think the reason the procedure was written for Postgres was because the GREATEST and LEAST statements performed poorly. Honestly, I'm not real up on writing procs; I was hoping the problem would be obvious to someone.

schu
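For comparison, the usual PostgreSQL substitute for ON DUPLICATE KEY UPDATE is an update-then-insert function in PL/pgSQL. This is only a sketch: the function name and argument types are assumptions, and the GREATEST logic from the MySQL version is omitted for brevity:

```sql
-- Hypothetical upsert for bayes_token, relying on the (id, token)
-- primary key. Assumes the plpgsql language is installed.
CREATE OR REPLACE FUNCTION put_token(_id integer, _token bytea,
                                     _spam integer, _ham integer,
                                     _atime integer)
RETURNS void AS '
BEGIN
    UPDATE bayes_token
       SET spam_count = spam_count + _spam,
           ham_count  = ham_count  + _ham,
           atime      = _atime
     WHERE id = _id AND token = _token;
    IF NOT FOUND THEN
        INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
        VALUES (_id, _token, _spam, _ham, _atime);
    END IF;
    RETURN;
END;
' LANGUAGE plpgsql;
```

Note this is the UPDATE-first pattern; without extra locking, two concurrent callers can still race on the INSERT, so treat it as a sketch rather than a complete solution.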
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.
Matt,

> UPDATE bayes_vars SET
>   $token_count_update
>   newest_token_age = GREATEST(newest_token_age, ?),
>   oldest_token_age = LEAST(oldest_token_age, ?)
> WHERE id = ?
> I think the reason the procedure was written for Postgres was because the GREATEST and LEAST statements performed poorly.

Well, it might be because we don't have a built-in GREATEST or LEAST prior to 8.1. However, it's pretty darned easy to construct one.

> Honestly, I'm not real up on writing procs; I was hoping the problem would be obvious to someone.

Well, there's the general performance tuning stuff of course (postgresql.conf), which if you've not done any of it will pretty dramatically affect your throughput rates. And vacuum, analyze, indexes, etc.

You should also look at ways to make the SP simpler. For example, you have a cycle that looks like:

  SELECT
  IF NOT FOUND
    INSERT
  ELSE
    UPDATE

Which could be made shorter as:

  UPDATE
  IF NOT FOUND
    INSERT

... saving you one index scan.

Also, I don't quite follow it, but the procedure seems to be doing at least two steps that the MySQL version isn't doing at all. If the PG version is doing more things, of course it's going to take longer.

Finally, when you have a proc you're happy with, I suggest having an expert re-write it in C, which should double the procedure performance.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco
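For the record, hand-rolled stand-ins for servers before 8.1 can look like this — the function names and the choice of two integer arguments are illustrative, not something the original procedure requires:

```sql
-- Two-argument substitutes for the built-in GREATEST/LEAST added in 8.1.
CREATE OR REPLACE FUNCTION greatest_int(integer, integer)
RETURNS integer AS '
    SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END;
' LANGUAGE sql IMMUTABLE;

CREATE OR REPLACE FUNCTION least_int(integer, integer)
RETURNS integer AS '
    SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
' LANGUAGE sql IMMUTABLE;
```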
Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0
Josh Berkus wrote:
> Matt,
> Well, it might be because we don't have a built-in GREATEST or LEAST prior to 8.1. However, it's pretty darned easy to construct one.

I was more talking about min() and max(), but yeah, I think you knew where I was going with it...

> Well, there's the general performance tuning stuff of course (postgresql.conf), which if you've not done any of it will pretty dramatically affect your throughput rates. And vacuum, analyze, indexes, etc.

I have gone through all that.

> You should also look at ways to make the SP simpler. For example, you have a cycle that looks like:
>   SELECT
>   IF NOT FOUND
>     INSERT
>   ELSE
>     UPDATE
> Which could be made shorter as:
>   UPDATE
>   IF NOT FOUND
>     INSERT
> ... saving you one index scan.
> Also, I don't quite follow it, but the procedure seems to be doing at least two steps that the MySQL version isn't doing at all. If the PG version is doing more things, of course it's going to take longer.
> Finally, when you have a proc you're happy with, I suggest having an expert re-write it in C, which should double the procedure performance.

Sounds like I need to completely understand what the proc is doing and work on a rewrite. I'll look into writing it in C; I need to do some reading about how that works and exactly what it buys you.

Thanks for the helpful comments.

schu
[PERFORM] Left joining against two empty tables makes a query SLOW
Hi all;

I have a customer who currently uses an application which had become slow. After doing some digging, I found the slow query:

SELECT c.accno, c.description, c.link, c.category, ac.project_id,
       p.projectnumber, a.department_id, d.description AS department
FROM chart c
JOIN acc_trans ac ON (ac.chart_id = c.id)
JOIN ar a ON (a.id = ac.trans_id)
LEFT JOIN project p ON (ac.project_id = p.id)
LEFT JOIN department d ON (d.id = a.department_id)
WHERE a.customer_id = 11373
  AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373);

(reformatted for readability)

This is taking 10 seconds to run. Interestingly, both the project and department tables are blank, and if I omit them, the query becomes:

SELECT c.accno, c.description, c.link, c.category, ac.project_id
FROM chart c
JOIN acc_trans ac ON (ac.chart_id = c.id)
JOIN ar a ON (a.id = ac.trans_id)
WHERE a.customer_id = 11373
  AND a.id IN (SELECT max(id) FROM ar WHERE customer_id = 11373);

This takes 139ms — 1% of the previous query's runtime.

The plan for the long query is:

QUERY PLAN
----------
Hash IN Join  (cost=87337.25..106344.93 rows=41 width=118) (actual time=7615.843..9850.209 rows=10 loops=1)
  Hash Cond: (outer.trans_id = inner.max)
  ->  Merge Right Join  (cost=86620.57..100889.85 rows=947598 width=126) (actual time=7408.830..9200.435 rows=177769 loops=1)
        Merge Cond: (outer.id = inner.department_id)
        ->  Index Scan using department_id_key on department d  (cost=0.00..52.66 rows=1060 width=36) (actual time=0.090..0.090 rows=0 loops=1)
        ->  Sort  (cost=86620.57..87067.55 rows=178792 width=94) (actual time=7408.709..7925.843 rows=177769 loops=1)
              Sort Key: a.department_id
              ->  Merge Right Join  (cost=45871.18..46952.83 rows=178792 width=94) (actual time=4962.122..6671.319 rows=177769 loops=1)
                    Merge Cond: (outer.id = inner.project_id)
                    ->  Index Scan using project_id_key on project p  (cost=0.00..49.80 rows=800 width=36) (actual time=0.007..0.007 rows=0 loops=1)
                    ->  Sort  (cost=45871.18..46318.16 rows=178792 width=62) (actual time=4962.084..5475.636 rows=177769 loops=1)
                          Sort Key: ac.project_id
                          ->  Hash Join  (cost=821.20..13193.43 rows=178792 width=62) (actual time=174.905..4295.685 rows=177769 loops=1)
                                Hash Cond: (outer.chart_id = inner.id)
                                ->  Hash Join  (cost=817.66..10508.02 rows=178791 width=20) (actual time=173.952..2840.824 rows=177769 loops=1)
                                      Hash Cond: (outer.trans_id = inner.id)
                                      ->  Seq Scan on acc_trans ac  (cost=0.00..3304.38 rows=181538 width=12) (actual time=0.062..537.753 rows=181322 loops=1)
                                      ->  Hash  (cost=659.55..659.55 rows=22844 width=8) (actual time=173.625..173.625 rows=0 loops=1)
                                            ->  Seq Scan on ar a  (cost=0.00..659.55 rows=22844 width=8) (actual time=0.022..101.828 rows=22844 loops=1)
                                                  Filter: (customer_id = 11373)
                                ->  Hash  (cost=3.23..3.23 rows=123 width=50) (actual time=0.915..0.915 rows=0 loops=1)
                                      ->  Seq Scan on chart c  (cost=0.00..3.23 rows=123 width=50) (actual time=0.013..0.528 rows=123 loops=1)
  ->  Hash  (cost=716.67..716.67 rows=1 width=4) (actual time=129.037..129.037 rows=0 loops=1)
        ->  Subquery Scan IN_subquery  (cost=716.66..716.67 rows=1 width=4) (actual time=129.017..129.025 rows=1 loops=1)
              ->  Aggregate  (cost=716.66..716.66 rows=1 width=4) (actual time=129.008..129.011 rows=1 loops=1)
                    ->  Seq Scan on ar  (cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..73.266 rows=22844 loops=1)
                          Filter: (customer_id = 11373)
Total runtime: 9954.133 ms
(28 rows)

The shorter query's plan is:

QUERY PLAN
----------
Hash Join  (cost=728.42..732.96 rows=8 width=50) (actual time=130.908..131.593 rows=10 loops=1)
  Hash Cond: (outer.id = inner.chart_id)
  ->  Seq Scan on chart c  (cost=0.00..3.23 rows=123 width=50) (actual time=0.006..0.361 rows=123 loops=1)
  ->  Hash  (cost=728.40..728.40 rows=8 width=8) (actual time=130.841..130.841 rows=0 loops=1)
        ->  Nested Loop