Re: [PERFORM] [IMPORTANT] - My application performance

2005-07-27 Thread Dawid Kuroczko
On 7/26/05, Roberto Germano Vieweg Neto [EMAIL PROTECTED] wrote:
 My application is using Firebird 1.5.2
 
 My question is:
 
 Will PostgreSQL be faster than Firebird? By how much (in percent)?
 
 I need about 20% to 50% more performance in my application.
 Can I get this by migrating to PostgreSQL?

The answer is: maybe.  There's nothing which stops PostgreSQL from
being faster, and likewise there is nothing that stops it from being
slower.  YMMV.

Your route should be:
 * migrate most speed-demanding part to PostgreSQL
 * benchmark it, trying to emulate real-world load.
 * if it is slower than Firebird, post it here, together with EXPLAIN ANALYZEs
   and ask if there's something you can do to speed it up.

   Regards,
  Dawid

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PERFORM] Inherited Table Query Planning (fwd)

2005-07-27 Thread Jens-Wolfhard Schicke
Is there a way to make the query planner consider pulling inner appends 
outside joins?


Example:
natural_person inherits from person (obviously)

admpostgres3=# explain analyze select u.name, p.name from users u, person p 
where p.user_id = u.id and u.name = 's_ohl';

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=8.01..3350.14 rows=3 width=36) (actual time=107.391..343.657 rows=10 loops=1)
   Hash Cond: ("outer".user_id = "inner".id)
   ->  Append  (cost=0.00..2461.34 rows=117434 width=20) (actual time=0.007..264.910 rows=117434 loops=1)
         ->  Seq Scan on person p  (cost=0.00..575.06 rows=31606 width=20) (actual time=0.005..38.911 rows=31606 loops=1)
         ->  Seq Scan on natural_person p  (cost=0.00..1886.28 rows=85828 width=19) (actual time=0.003..104.338 rows=85828 loops=1)
   ->  Hash  (cost=8.01..8.01 rows=2 width=24) (actual time=0.096..0.096 rows=0 loops=1)
         ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.041..0.081 rows=10 loops=1)
               Index Cond: ((name)::text = 's_ohl'::text)
 Total runtime: 343.786 ms
(9 rows)

admpostgres3=# explain analyze select u.name, p.name from users u, only 
person p where p.user_id = u.id and u.name = 's_ohl' union all select 
u.name, p.name from users u, only natural_person p where p.user_id = u.id 
and u.name = 's_ohl';

                                                              QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.00..28.19 rows=3 width=28) (actual time=0.197..0.366 rows=10 loops=1)
   ->  Subquery Scan "*SELECT* 1"  (cost=0.00..14.12 rows=1 width=28) (actual time=0.159..0.159 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..14.11 rows=1 width=28) (actual time=0.157..0.157 rows=0 loops=1)
               ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.039..0.075 rows=10 loops=1)
                     Index Cond: ((name)::text = 's_ohl'::text)
               ->  Index Scan using person_user_idx on person p  (cost=0.00..3.03 rows=2 width=8) (actual time=0.006..0.006 rows=0 loops=10)
                     Index Cond: (p.user_id = "outer".id)
   ->  Subquery Scan "*SELECT* 2"  (cost=0.00..14.08 rows=2 width=28) (actual time=0.036..0.193 rows=10 loops=1)
         ->  Nested Loop  (cost=0.00..14.06 rows=2 width=28) (actual time=0.033..0.171 rows=10 loops=1)
               ->  Index Scan using users_name_idx on users u  (cost=0.00..8.01 rows=2 width=24) (actual time=0.018..0.049 rows=10 loops=1)
                     Index Cond: ((name)::text = 's_ohl'::text)
               ->  Index Scan using natural_person_user_idx on natural_person p  (cost=0.00..3.01 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=10)
                     Index Cond: (p.user_id = "outer".id)
 Total runtime: 0.475 ms
(14 rows)


With kind regards
Jens Schicke
--
Jens Schicke             [EMAIL PROTECTED]
asco GmbH                http://www.asco.de
Mittelweg 7              Tel 0531/3906-127
38106 Braunschweig       Fax 0531/3906-400



Re: [PERFORM] Inherited Table Query Planning (fwd)

2005-07-27 Thread Tom Lane
Jens-Wolfhard Schicke [EMAIL PROTECTED] writes:
 Is there a way to make the query planner consider pulling inner appends 
 outside joins?

Not at the moment.

regards, tom lane



Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread Josh Berkus
Luke,

 Well - now that I test it, it appears you are correct, temp table COPY
 bypasses WAL - thanks for pointing it out!

Right.  The problem is bypassing WAL for loading new scratch tables which 
aren't TEMPORARY tables.   We need to do this for multi-threaded ETL, since:
a) Temp tables can't be shared by several writers, and
b) you can't index a temp table.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread Kris Jurka


On Wed, 27 Jul 2005, Josh Berkus wrote:

 b) you can't index a temp table.
 

jurka# create temp table t (a int);
CREATE
jurka# create index myi on t(a);
CREATE

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[PERFORM] wal_buffer tests in

2005-07-27 Thread Josh Berkus
Folks,

I ran a wal_buffer test series.   It appears that increasing the 
wal_buffers is indeed very important for OLTP applications, potentially 
resulting in as much as a 15% average increase in transaction processing.  
What's interesting is that this is not just true for 8.1, it's true for 
8.0.3 as well.   
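For reference, the knob lives in postgresql.conf; the fragment below is purely illustrative and is not the tested configuration from this series:

```
# postgresql.conf -- illustrative fragment only, not the tested setting.
# wal_buffers counts 8kB WAL pages; the 8.0 default is 8 (64kB total).
wal_buffers = 64
```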

More importantly, 8.1 performance is somehow back up to above-8.0 levels.  
Something was broken in June that has since been fixed (this test series is 
based on July 3 CVS), but I don't know what.  Clues?

Test results are here:
http://pgfoundry.org/docman/view.php/141/79/wal_buffer_test.pdf

As always, detailed test results are available from OSDL, just use:
http://khack.osdl.org/stp/#
where # is the test number.

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Mirroring PostgreSQL database

2005-07-27 Thread Sebastian Hennebrueder
Shashi Kanth Boddula wrote:

 Hi,
 I have one customer who is using PostgreSQL 7.4.8 on Linux. He has
 some problems with database mirroring. The details follow.
 The customer is using Linux, on which PostgreSQL 7.4.8 is running along
 with Jboss 3.2.3. He has 2 servers: one acting as the live (primary)
 server and another acting as the fail-over (secondary) server. The
 secondary server is placed in a remote location. These servers act as
 an attendance server for daily activities. Nearly 50,000 employees
 depend on the live server.
 
 The customer is using the DBmirror tool to mirror the database records
 from primary to secondary. The customer is complaining that there is a
 one-day (24-hour) delay between primary and secondary for database
 synchronization. They have a dedicated line and bandwidth, but still
 the problem exists.
 
 I just want to know: for immediate data mirroring, what is the best
 way for PostgreSQL? PostgreSQL offers many mirroring tools, but which
 one is the best? Is there any other way to accomplish the task?
 
 Thank you. Waiting for your reply.
  

 Thanks & Regards,
 Shashi Kanth
 Consultant - Linux
 RHCE , LPIC-2
 Onward Novell - Bangalore
 9886455567



For a Java-based solution you could also have a look at x-jdbc or xjdbc.

But before that you should find out what the reason for the delay actually
is. If the backup server is too slow, it may not matter much which
mirroring tool you use.


-- 
Best Regards

Sebastian Hennebrueder



http://www.laliluna.de

Tutorials for JSP, JavaServer Faces, Struts, Hibernate and EJB 

Get support, education and consulting for these technologies - uncomplicated 
and cheap.




Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread Simon Riggs
On Wed, 2005-07-27 at 09:29 -0700, Josh Berkus wrote:
 Luke,
 
  Well - now that I test it, it appears you are correct, temp table COPY
  bypasses WAL - thanks for pointing it out!
 
  Right.  The problem is bypassing WAL for loading new scratch tables which 
 aren't TEMPORARY tables.   We need to do this for multi-threaded ETL, since:
 a) Temp tables can't be shared by several writers, and
 b) you can't index a temp table.

The description of scratch tables might need some slight clarification.
It kind of makes them sound like temp tables.

I had in mind the extra tables that an application sometimes needs to
operate faster. Denormalisations, pre-joined tables, pre-calculated
results, aggregated data. These are not temporary tables, just part of
the application - multi-user tables that stay across shutdown/restart.

If you have gallons of GB, you will probably be looking to make use of
such tables.

You can use such tables for the style of ETL known as ELT, but that is
not the only use.

Best Regards, Simon Riggs




Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread PFC




I had in mind the extra tables that an application sometimes needs to
operate faster. Denormalisations, pre-joined tables, pre-calculated
results, aggregated data. These are not temporary tables, just part of
the application - multi-user tables that stay across shutdown/restart.


	You could also use such tables to cache search results for easy
pagination, instead of re-running on every page the Big Slow Search Query
that every website has...




Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.

2005-07-27 Thread Josh Berkus
Matt,

 After playing with various indexes and what not I simply am unable to
 make this procedure perform any better.  Perhaps someone on the list can
 spot the bottleneck and reveal why this procedure isn't performing that
 well or ways to make it better.

Well, my first thought is that this is a pretty complicated procedure for 
something you want to perform well.  Is all this logic really necessary?
How does it get done for MySQL?

-- 
--Josh

Josh Berkus
Aglio Database Solutions
San Francisco

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [Bizgres-general] Re: [PERFORM] faster INSERT with possible

2005-07-27 Thread Tom Lane
Josh Berkus josh@agliodbs.com writes:
 Right.  The problem is bypassing WAL for loading new scratch tables which 
 aren't TEMPORARY tables.   We need to do this for multi-threaded ETL, since:
 a) Temp tables can't be shared by several writers, and
 b) you can't index a temp table.

This may not matter given point (a), but: point (b) is completely wrong.

regards, tom lane



Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-27 Thread Matthew Schumacher
Josh Berkus wrote:
 Matt,
 
 
After playing with various indexes and what not I simply am unable to
make this procedure perform any better.  Perhaps someone on the list can
spot the bottleneck and reveal why this procedure isn't performing that
well or ways to make it better.
 
 
 Well, my first thought is that this is a pretty complicated procedure for 
 something you want to perform well.  Is all this logic really necessary?
 How does it get done for MySQL?
 

I'm not sure if it's all needed; in MySQL they have this simple schema:

===
CREATE TABLE bayes_expire (
  id int(11) NOT NULL default '0',
  runtime int(11) NOT NULL default '0',
  KEY bayes_expire_idx1 (id)
) TYPE=MyISAM;

CREATE TABLE bayes_global_vars (
  variable varchar(30) NOT NULL default '',
  value varchar(200) NOT NULL default '',
  PRIMARY KEY  (variable)
) TYPE=MyISAM;

INSERT INTO bayes_global_vars VALUES ('VERSION','3');

CREATE TABLE bayes_seen (
  id int(11) NOT NULL default '0',
  msgid varchar(200) binary NOT NULL default '',
  flag char(1) NOT NULL default '',
  PRIMARY KEY  (id,msgid)
) TYPE=MyISAM;

CREATE TABLE bayes_token (
  id int(11) NOT NULL default '0',
  token char(5) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  atime int(11) NOT NULL default '0',
  PRIMARY KEY  (id, token),
  INDEX bayes_token_idx1 (token),
  INDEX bayes_token_idx2 (id, atime)
) TYPE=MyISAM;

CREATE TABLE bayes_vars (
  id int(11) NOT NULL AUTO_INCREMENT,
  username varchar(200) NOT NULL default '',
  spam_count int(11) NOT NULL default '0',
  ham_count int(11) NOT NULL default '0',
  token_count int(11) NOT NULL default '0',
  last_expire int(11) NOT NULL default '0',
  last_atime_delta int(11) NOT NULL default '0',
  last_expire_reduce int(11) NOT NULL default '0',
  oldest_token_age int(11) NOT NULL default '2147483647',
  newest_token_age int(11) NOT NULL default '0',
  PRIMARY KEY  (id),
  UNIQUE bayes_vars_idx1 (username)
) TYPE=MyISAM;
===

Then they do this to insert the token:

INSERT INTO bayes_token (
  id,
  token,
  spam_count,
  ham_count,
  atime
) VALUES (
  ?,
  ?,
  ?,
  ?,
  ?
) ON DUPLICATE KEY
  UPDATE
spam_count = GREATEST(spam_count + ?, 0),
ham_count = GREATEST(ham_count + ?, 0),
atime = GREATEST(atime, ?)

Or update the token:

UPDATE bayes_vars SET
  $token_count_update
  newest_token_age = GREATEST(newest_token_age, ?),
  oldest_token_age = LEAST(oldest_token_age, ?)
  WHERE id = ?


I think the reason why the procedure was written for postgres was
because of the greatest and least statements performing poorly.

Honestly, I'm not real up on writing procs, I was hoping the problem
would be obvious to someone.

schu



Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0 Bayes module.

2005-07-27 Thread Josh Berkus
Matt,

 UPDATE bayes_vars SET
   $token_count_update
   newest_token_age = GREATEST(newest_token_age, ?),
   oldest_token_age = LEAST(oldest_token_age, ?)
   WHERE id = ?


 I think the reason why the procedure was written for postgres was
 because of the greatest and least statements performing poorly.

Well, it might be because we don't have a built-in GREATEST or LEAST prior to 
8.1.   However, it's pretty darned easy to construct one.
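For instance, a minimal substitute is a pair of one-line SQL functions.  This
is only a sketch; the function names and the integer-only signatures are
assumptions, not anything from SpamAssassin's schema:

```sql
-- Hypothetical two-argument stand-ins for GREATEST/LEAST on pre-8.1 servers.
-- Note: unlike 8.1's built-ins, these make no attempt to special-case NULLs.
CREATE FUNCTION greatest_int(integer, integer) RETURNS integer AS '
    SELECT CASE WHEN $1 > $2 THEN $1 ELSE $2 END;
' LANGUAGE sql IMMUTABLE;

CREATE FUNCTION least_int(integer, integer) RETURNS integer AS '
    SELECT CASE WHEN $1 < $2 THEN $1 ELSE $2 END;
' LANGUAGE sql IMMUTABLE;
```

Being plain SQL and IMMUTABLE, these should inline cheaply into queries.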

 Honestly, I'm not real up on writing procs, I was hoping the problem
 would be obvious to someone.

Well, there's the general performance tuning stuff of course (postgresql.conf), 
which if you've not done any of it will pretty dramatically affect your 
throughput rates.   And vacuum, analyze, indexes, etc.

You should also look at ways to make the SP simpler.  For example, you have a 
cycle that looks like:

SELECT
IF NOT FOUND
INSERT
ELSE
UPDATE

Which could be made shorter as:

UPDATE
IF NOT FOUND
INSERT

... saving you one index scan.
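In PL/pgSQL, the shorter cycle would look roughly like this.  This is only a
sketch reusing the bayes_token columns from the MySQL schema quoted earlier;
the _id, _token, _spam, _ham, and _atime parameter names are illustrative
assumptions, not names from Matt's actual proc:

```sql
-- UPDATE-first upsert cycle: one index scan instead of two when the
-- row already exists.
UPDATE bayes_token
   SET spam_count = spam_count + _spam,
       ham_count  = ham_count  + _ham,
       atime      = _atime
 WHERE id = _id AND token = _token;
IF NOT FOUND THEN
    INSERT INTO bayes_token (id, token, spam_count, ham_count, atime)
    VALUES (_id, _token, _spam, _ham, _atime);
END IF;
```

One caveat: with concurrent writers, two sessions can both miss the UPDATE and
race on the INSERT, so a unique key plus a retry loop may still be needed.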

Also, I don't quite follow it, but the procedure seems to be doing at least 
two steps that the MySQL version isn't doing at all.  If the PG version is 
doing more things, of course it's going to take longer.

Finally, when you have a proc you're happy with, I suggest having an expert 
re-write it in C, which should double the procedure performance.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Performance problems testing with Spamassassin 3.1.0

2005-07-27 Thread Matthew Schumacher
Josh Berkus wrote:
 Matt,
 

 Well, it might be because we don't have a built-in GREATEST or LEAST prior to 
 8.1.   However, it's pretty darned easy to construct one.

I was more talking about min() and max(), but yeah, I think you knew where
I was going with it...

 
 Well, there's the general performance tuning stuff of course 
 (postgresql.conf), which if you've not done any of it will pretty 
 dramatically affect your throughput rates.   And vacuum, analyze, indexes, etc.

I have gone though all that.

 You should also look at ways to make the SP simpler.  For example, you have a 
 cycle that looks like:
 
 SELECT
   IF NOT FOUND
   INSERT
   ELSE
   UPDATE
 
 Which could be made shorter as:
 
 UPDATE
   IF NOT FOUND
   INSERT
 
 ... saving you one index scan.
 
 Also, I don't quite follow it, but the procedure seems to be doing at least 
 two steps that the MySQL version isn't doing at all.  If the PG version is 
 doing more things, of course it's going to take longer.
 
 Finally, when you have a proc you're happy with, I suggest having an expert 
 re-write it in C, which should double the procedure performance.
 

Sounds like I need to completely understand what the proc is doing and
work on a rewrite.  I'll look into writing it in C, I need to do some
reading about how that works and exactly what it buys you.

Thanks for the helpful comments.

schu



[PERFORM] Left joining against two empty tables makes a query SLOW

2005-07-27 Thread Chris Travers

Hi all;

I have a customer who currently uses an application which had become 
slow.  After doing some digging, I found the slow query:


SELECT c.accno, c.description, c.link, c.category, ac.project_id, 
p.projectnumber,

   a.department_id, d.description AS department
FROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id)
   JOIN ar a ON (a.id = ac.trans_id)
   LEFT JOIN project p ON (ac.project_id = p.id)
   LEFT JOIN department d ON (d.id = a.department_id)
WHERE a.customer_id = 11373 AND a.id IN (
   SELECT max(id) FROM ar WHERE customer_id = 11373
);

(reformatted for readability)

This is taking 10 seconds to run.

Interestingly, both the project and department tables are blank, and if 
I omit them, the query becomes:

SELECT c.accno, c.description, c.link, c.category, ac.project_id
FROM chart c JOIN acc_trans ac ON (ac.chart_id = c.id)
   JOIN ar a ON (a.id = ac.trans_id)
WHERE a.customer_id = 11373 AND a.id IN (
   SELECT max(id) FROM ar WHERE customer_id = 11373
);

This takes 139 ms, about 1% of the previous query's runtime.

The plan for the long query is:

                                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
 Hash IN Join  (cost=87337.25..106344.93 rows=41 width=118) (actual time=7615.843..9850.209 rows=10 loops=1)
   Hash Cond: ("outer".trans_id = "inner".max)
   ->  Merge Right Join  (cost=86620.57..100889.85 rows=947598 width=126) (actual time=7408.830..9200.435 rows=177769 loops=1)
         Merge Cond: ("outer".id = "inner".department_id)
         ->  Index Scan using department_id_key on department d  (cost=0.00..52.66 rows=1060 width=36) (actual time=0.090..0.090 rows=0 loops=1)
         ->  Sort  (cost=86620.57..87067.55 rows=178792 width=94) (actual time=7408.709..7925.843 rows=177769 loops=1)
               Sort Key: a.department_id
               ->  Merge Right Join  (cost=45871.18..46952.83 rows=178792 width=94) (actual time=4962.122..6671.319 rows=177769 loops=1)
                     Merge Cond: ("outer".id = "inner".project_id)
                     ->  Index Scan using project_id_key on project p  (cost=0.00..49.80 rows=800 width=36) (actual time=0.007..0.007 rows=0 loops=1)
                     ->  Sort  (cost=45871.18..46318.16 rows=178792 width=62) (actual time=4962.084..5475.636 rows=177769 loops=1)
                           Sort Key: ac.project_id
                           ->  Hash Join  (cost=821.20..13193.43 rows=178792 width=62) (actual time=174.905..4295.685 rows=177769 loops=1)
                                 Hash Cond: ("outer".chart_id = "inner".id)
                                 ->  Hash Join  (cost=817.66..10508.02 rows=178791 width=20) (actual time=173.952..2840.824 rows=177769 loops=1)
                                       Hash Cond: ("outer".trans_id = "inner".id)
                                       ->  Seq Scan on acc_trans ac  (cost=0.00..3304.38 rows=181538 width=12) (actual time=0.062..537.753 rows=181322 loops=1)
                                       ->  Hash  (cost=659.55..659.55 rows=22844 width=8) (actual time=173.625..173.625 rows=0 loops=1)
                                             ->  Seq Scan on ar a  (cost=0.00..659.55 rows=22844 width=8) (actual time=0.022..101.828 rows=22844 loops=1)
                                                   Filter: (customer_id = 11373)
                                 ->  Hash  (cost=3.23..3.23 rows=123 width=50) (actual time=0.915..0.915 rows=0 loops=1)
                                       ->  Seq Scan on chart c  (cost=0.00..3.23 rows=123 width=50) (actual time=0.013..0.528 rows=123 loops=1)
   ->  Hash  (cost=716.67..716.67 rows=1 width=4) (actual time=129.037..129.037 rows=0 loops=1)
         ->  Subquery Scan "IN_subquery"  (cost=716.66..716.67 rows=1 width=4) (actual time=129.017..129.025 rows=1 loops=1)
               ->  Aggregate  (cost=716.66..716.66 rows=1 width=4) (actual time=129.008..129.011 rows=1 loops=1)
                     ->  Seq Scan on ar  (cost=0.00..659.55 rows=22844 width=4) (actual time=0.020..73.266 rows=22844 loops=1)
                           Filter: (customer_id = 11373)
 Total runtime: 9954.133 ms
(28 rows)

The shorter query's plan is:

                                            QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Hash Join  (cost=728.42..732.96 rows=8 width=50) (actual time=130.908..131.593 rows=10 loops=1)
   Hash Cond: ("outer".id = "inner".chart_id)
   ->  Seq Scan on chart c  (cost=0.00..3.23 rows=123 width=50) (actual time=0.006..0.361 rows=123 loops=1)
   ->  Hash  (cost=728.40..728.40 rows=8 width=8) (actual time=130.841..130.841 rows=0 loops=1)
         ->  Nested Loop