Re: [PERFORM] Looking for tips

2005-07-26 Thread Vivek Khera


On Jul 19, 2005, at 3:01 PM, Tom Lane wrote:

> You could possibly get some improvement if you can re-use prepared plans
> for the queries; but this will require some fooling with the client code
> (I'm not sure if DBD::Pg even has support for it at all).



DBD::Pg 1.40+ by default uses server-side prepared statements when  
you do $dbh->prepare() against an 8.x database server.
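
As a minimal Perl/DBD::Pg sketch of what that buys you (the table name, columns and connection details below are illustrative, not from the thread): prepare once, execute many times, and the plan is only built on the server the first time. The pg_server_prepare attribute is a real DBD::Pg flag; per the note above it is already the default with 1.40+ against an 8.x server, it is spelled out here only to make the intent explicit.

    use strict;
    use warnings;
    use DBI;

    # placeholder connection details
    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                           { AutoCommit => 0, RaiseError => 1 });

    # prepared once; becomes a server-side PREPARE with a new enough driver/server
    my $sth = $dbh->prepare('SELECT a FROM items WHERE b = ?',
                            { pg_server_prepare => 1 });

    for my $b ('foo', 'bar', 'baz') {
        $sth->execute($b);                      # re-uses the same plan each time
        while (my ($a) = $sth->fetchrow_array) {
            print "$b -> $a\n";
        }
    }
    $dbh->commit;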


Vivek Khera, Ph.D.
+1-301-869-4449 x806






Re: [PERFORM] Looking for tips

2005-07-23 Thread Marc Mamin
Hi,
I have a similar application,
but instead of adding new items to the db one at a time,
I retrieve new IDs from a sequence (actually only every 10'000
times) and write a csv file from perl.
When finished, I load all the new records in one run with COPY.
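
A rough sketch of that batch pattern in Perl/DBD::Pg; everything named here is an assumption for illustration: an items(a,b,c) table, a sequence items_a_seq created with INCREMENT BY 1000 so a single nextval() reserves a whole block of ids, and tab-separated b/c pairs arriving on stdin. A real script would fetch a fresh block whenever the current one runs out.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '', { RaiseError => 1 });

    # one round trip reserves ids $block .. $block + 999 for local use
    my ($block) = $dbh->selectrow_array(q{SELECT nextval('items_a_seq')});
    my $next_id = $block;

    open my $out, '>', '/tmp/items.tsv' or die "open: $!";
    while (my $line = <STDIN>) {          # parse the incoming events however you like
        chomp $line;
        my ($b, $c) = split /\t/, $line, 2;
        print {$out} join("\t", $next_id++, $b, $c), "\n";
    }
    close $out or die "close: $!";

    # one bulk load instead of thousands of single-row INSERTs
    system('psql', '-d', 'testdb', '-c',
           q{\copy items (a, b, c) from '/tmp/items.tsv'}) == 0
        or die "psql copy failed";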
 
hth,
 
Marc Mamin



From: 
[EMAIL PROTECTED] on behalf of Oliver 
CrosbySent: Wed 7/20/2005 3:50 AMTo: PFCCc: 
Sven Willenberger; Dawid Kuroczko; Kevin Grittner; [EMAIL PROTECTED]; 
pgsql-performance@postgresql.orgSubject: Re: [PERFORM] Looking for 
tips

Sorry for the lack of specifics...We have a file 
generated as a list of events, one per line. Supposelines 1,2,3,5,7,11,etc 
were related, then the last one would specifythat it's the last event. 
Gradually this gets assembled by a perlscript and when the last event is 
encountered, it gets inserted intothe db. For a given table, let's say it's 
of the form (a,b,c) where'a' is a pkey, 'b' is indexed, and 'c' is other 
related information.The most common 'b' values are cached locally with the 
perl script tosave us having to query the db. So what we end up having 
is:if 'b' exists in cache, use cached 'a' value and continueelse if 
'b' exists in the db, use the associated 'a' value and continueelse add a 
new line with 'b', return the new 'a' and continueThe local cache was a 
huge time saver with mysql. I've tried making aplpgsql function that handles 
everything in one step on the db side,but it didn't show any improvement. 
Time permitting, I'll try some newapproaches with changing the scripts and 
queries, though right now Iwas just hoping to tune postgresql.conf to work 
better with thehardware available.Thanks to everyone for your help. 
Very much appreciated.---(end of 
broadcast)---TIP 5: don't forget to increase your 
free space map settings




Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
Sorry for the lack of specifics...

We have a file generated as a list of events, one per line. Suppose
lines 1,2,3,5,7,11,etc were related, then the last one would specify
that it's the last event. Gradually this gets assembled by a perl
script and when the last event is encountered, it gets inserted into
the db. For a given table, let's say it's of the form (a,b,c) where
'a' is a pkey, 'b' is indexed, and 'c' is other related information.
The most common 'b' values are cached locally with the perl script to
save us having to query the db. So what we end up having is:

if 'b' exists in cache, use cached 'a' value and continue
else if 'b' exists in the db, use the associated 'a' value and continue
else add a new line with 'b', return the new 'a' and continue
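
In Perl/DBD::Pg terms, a minimal sketch of that lookup might look like the following; the items(a,b,c) shape comes from the description above, but the table name, sequence name and connection details are made up for the example.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '', { RaiseError => 1 });

    my %cache;    # b-value => a-value, for the most common 'b' values
    my $sel = $dbh->prepare('SELECT a FROM items WHERE b = ?');
    my $ins = $dbh->prepare(
        q{INSERT INTO items (a, b, c) VALUES (nextval('items_a_seq'), ?, ?)});

    sub id_for {
        my ($b, $c) = @_;
        return $cache{$b} if exists $cache{$b};            # 1) local cache hit
        my ($a) = $dbh->selectrow_array($sel, undef, $b);  # 2) already in the db?
        unless (defined $a) {                              # 3) insert, grab the new id
            $ins->execute($b, $c);
            ($a) = $dbh->selectrow_array(q{SELECT currval('items_a_seq')});
        }
        return $cache{$b} = $a;
    }

    print id_for('some-b-value', 'some-c-value'), "\n";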

The local cache was a huge time saver with mysql. I've tried making a
plpgsql function that handles everything in one step on the db side,
but it didn't show any improvement. Time permitting, I'll try some new
approaches with changing the scripts and queries, though right now I
was just hoping to tune postgresql.conf to work better with the
hardware available.

Thanks to everyone for your help. Very much appreciated.



Re: [PERFORM] Looking for tips

2005-07-19 Thread PFC


	You could have a program pre-parse your log and put it in a format  
understandable by COPY, then load it in a temporary table and write a part  
of your application simply as a plpgsql function, reading from this table  
and doing queries (or a plperl function)...



> So... (bear with me here.. trying to make sense of this)..
> With triggers there's a way I can do the parsing I need to on a log
> file and react to completed events in non-sequential order (you can
> ignore that part.. it's just how we piece together different related
> events) and then have perl/DBD::Pg invoke a copy command (which, from
> what I can tell, has to operate on a file...) and the copy command can
> feed the ID I need back to perl so I can work with it...
> If that doesn't hurt my brain, then I'm at least kinda confused...
> Anyway. Heading home now. I'll think about this more tonight/tomorrow.








Re: [PERFORM] Looking for tips

2005-07-19 Thread Sven Willenberger
On Tue, 2005-07-19 at 17:04 -0400, Oliver Crosby wrote:
> > since triggers work with COPY, you could probably write a trigger that
> > looks for this condition and does the ID processing you need; you could
> > thereby enjoy the enormous speed gain resulting from COPY and maintain
> > your data continuity.
> 
> So... (bear with me here.. trying to make sense of this)..
> With triggers there's a way I can do the parsing I need to on a log
> file and react to completed events in non-sequential order (you can
> ignore that part.. it's just how we piece together different related
> events) and then have perl/DBD::Pg invoke a copy command (which, from
> what I can tell, has to operate on a file...) and the copy command can
> feed the ID I need back to perl so I can work with it...
> If that doesn't hurt my brain, then I'm at least kinda confused...
> Anyway. Heading home now. I'll think about this more tonight/tomorrow.
> 

Well, without knowing the specifics of what you are actually trying to
accomplish I cannot say yes or no to your question. I am not sure where
the data you are inserting into the db is coming from. However,
if the scenario is this: a) attempt to insert a row, b) if the row exists
already, grab the ID and do other db selects/inserts/deletes based on
that ID, then there is no need to feed this information back to the
perl script. Is your perl script parsing a file and then using the parsed
information to insert rows? If so, how is the ID that is returned used?
Can you have the trigger use the ID that may be returned to perform
whatever it is that your perl script is trying to accomplish with that
ID?

It's all kind of vague, so my answers may or may not help, but based on
the [lack of] specifics you have provided, I fear that is the best
suggestion I can offer at this point.

Sven




Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> since triggers work with COPY, you could probably write a trigger that
> looks for this condition and does the ID processing you need; you could
> thereby enjoy the enormous speed gain resulting from COPY and maintain
> your data continuity.

So... (bear with me here.. trying to make sense of this)..
With triggers there's a way I can do the parsing I need to on a log
file and react to completed events in non-sequential order (you can
ignore that part.. it's just how we piece together different related
events) and then have perl/DBD::Pg invoke a copy command (which, from
what I can tell, has to operate on a file...) and the copy command can
feed the ID I need back to perl so I can work with it...
If that doesn't hurt my brain, then I'm at least kinda confused...
Anyway. Heading home now. I'll think about this more tonight/tomorrow.



Re: [PERFORM] Looking for tips

2005-07-19 Thread Sven Willenberger
On Tue, 2005-07-19 at 16:28 -0400, Oliver Crosby wrote:
> > If it is possible try:
> > 1) wrapping many inserts into one transaction
> > (BEGIN;INSERT;INSERT;...INSERT;COMMIT;).  As PostgreSQL will need to
> > handle fewer transactions per second (each of your inserts is a transaction), it
> > may work faster.
> 
> Aye, that's what I have it doing right now. The transactions do save a
> HUGE chunk of time. (Cuts it down by about 40%).
> 
> > 2) If you can do 1, you could go further and use a COPY command which is
> > the fastest way to bulk-load a database.
> 
> I don't think I can use COPY in my case because I need to do
> processing on a per-line basis, and I need to check if the item I want
> to insert is already there, and if it is, I need to get its ID so I
> can use that for further processing.
> 

since triggers work with COPY, you could probably write a trigger that
looks for this condition and does the ID processing you need; you could
thereby enjoy the enormous speed gain resulting from COPY and maintain
your data continuity.
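
A sketch of that idea from Perl, with everything hypothetical except the (a,b,c) table shape described earlier in the thread: a BEFORE INSERT trigger on the table being COPYed into that silently drops rows whose 'b' is already known and hands out fresh ids to the rest, so the dedup stays entirely server-side. The function body uses 8.0-style dollar quoting.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '', { RaiseError => 1 });

    $dbh->do(q{
        CREATE OR REPLACE FUNCTION items_dedup() RETURNS trigger AS $$
        BEGIN
            PERFORM 1 FROM items WHERE b = NEW.b;
            IF FOUND THEN
                RETURN NULL;                 -- 'b' already known: skip this COPY row
            END IF;
            NEW.a := nextval('items_a_seq'); -- otherwise assign a fresh id
            RETURN NEW;
        END;
        $$ LANGUAGE plpgsql
    });

    $dbh->do(q{
        CREATE TRIGGER items_dedup_trg BEFORE INSERT ON items
            FOR EACH ROW EXECUTE PROCEDURE items_dedup()
    });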

Sven




Re: [PERFORM] Looking for tips

2005-07-19 Thread Christopher Weimann
On 07/19/2005-02:41PM, Oliver Crosby wrote:
> 
> No queries in particular appear to be a problem. 

That could mean they are ALL a problem. Let's see some EXPLAIN ANALYZE
results just to rule it out.

> At the moment it's just one user, 

With 1 user, PostgreSQL will probably never beat MySQL,
but with hundreds it will.




Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> If it is possible try:
> 1) wrapping many inserts into one transaction
> (BEGIN;INSERT;INSERT;...INSERT;COMMIT;).  As PostgreSQL will need to
> handle fewer transactions per second (each of your inserts is a transaction), it
> may work faster.

Aye, that's what I have it doing right now. The transactions do save a
HUGE chunk of time. (Cuts it down by about 40%).

> 2) If you can do 1, you could go further and use a COPY command which is
> the fastest way to bulk-load a database.

I don't think I can use COPY in my case because I need to do
processing on a per-line basis, and I need to check if the item I want
to insert is already there, and if it is, I need to get its ID so I
can use that for further processing.



Re: [PERFORM] Looking for tips

2005-07-19 Thread PFC



> PS: Where can I find benchmarks comparing PHP vs Perl vs Python in
> terms of speed of executing prepared statements?


I'm afraid you'll have to do these yourself!

And, I don't think the Python drivers support real prepared statements
(the speed of psycopg is really good though).
I don't think PHP does either; they don't even provide a database interface
to speak of (i.e. you have to build the query string by hand, including
quoting).





Re: [PERFORM] Looking for tips

2005-07-19 Thread Dawid Kuroczko
On 7/19/05, Oliver Crosby <[EMAIL PROTECTED]> wrote:
> > We had low resource utilization and poor throughput on inserts of
> > thousands of rows within a single database transaction.  There were a
> > lot of configuration parameters we changed, but the one which helped the
> > most was wal_buffers -- we wound up setting it to 1000.  This may be
> > higher than it needs to be, but when we got to something which ran well,
> > we stopped tinkering.  The default value clearly caused a bottleneck.
> 
> I just tried wal_buffers = 1000, sort_mem at 10% and
> effective_cache_size at 75%.
> The performance refuses to budge.. I guess that's as good as it'll go?

If it is possible try:
1) wrapping many inserts into one transaction
(BEGIN;INSERT;INSERT;...INSERT;COMMIT;).  As PostgreSQL will need to
handle fewer transactions per second (each of your inserts is a transaction), it
may work faster.

2) If you can do 1, you could go further and use a COPY command which is
the fastest way to bulk-load a database.

Sometimes I insert data into a temporary table, and then do:
INSERT INTO sometable SELECT * FROM tmp_table;
(but I do it when I want to do some selects, updates, etc on
the data before "committing" them to the main table; dropping the
temporary table is much cheaper than vacuuming a many-a-row
table).
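
Both ideas in a short Perl/DBD::Pg sketch; the events table, its columns and the sample rows are invented for the example, the point is one commit per batch and one INSERT ... SELECT per batch rather than per-row transactions.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                           { AutoCommit => 0, RaiseError => 1 });

    my @rows = (['foo', 'x'], ['bar', 'y'], ['baz', 'z']);

    # 1) many inserts, one transaction: commit once per batch, not once per row
    my $ins = $dbh->prepare('INSERT INTO events (b, c) VALUES (?, ?)');
    $ins->execute(@$_) for @rows;
    $dbh->commit;

    # 2) stage into a temporary table, massage it with SQL, then move it in one go
    $dbh->do('CREATE TEMP TABLE tmp_events (b text, c text)');
    $ins = $dbh->prepare('INSERT INTO tmp_events (b, c) VALUES (?, ?)');
    $ins->execute(@$_) for @rows;
    # ... selects/updates/deletes against tmp_events go here ...
    $dbh->do('INSERT INTO events (b, c) SELECT b, c FROM tmp_events');
    $dbh->do('DROP TABLE tmp_events');
    $dbh->commit;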

  Regards,
 Dawid

PS: Where can I find benchmarks comparing PHP vs Perl vs Python in
terms of speed of executing prepared statements?



Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> We had low resource utilization and poor throughput on inserts of
> thousands of rows within a single database transaction.  There were a
> lot of configuration parameters we changed, but the one which helped the
> most was wal_buffers -- we wound up setting it to 1000.  This may be
> higher than it needs to be, but when we got to something which ran well,
> we stopped tinkering.  The default value clearly caused a bottleneck.

I just tried wal_buffers = 1000, sort_mem at 10% and
effective_cache_size at 75%.
The performance refuses to budge.. I guess that's as good as it'll go?



Re: [PERFORM] Looking for tips

2005-07-19 Thread Jeff Trout


On Jul 19, 2005, at 3:36 PM, Steinar H. Gunderson wrote:


> On Tue, Jul 19, 2005 at 03:16:31PM -0400, Tom Lane wrote:
>> Ah, but are they really prepared, or is DBD::Pg faking it by inserting
>> parameter values into the query text and then sending the assembled
>> string as a fresh query?
>
> They are really prepared.


That depends on what version you are using. Older versions did what  
Tom mentioned rather than sending PREPARE & EXECUTE.


Not sure what version that changed in.

--
Jeff Trout <[EMAIL PROTECTED]>
http://www.jefftrout.com/
http://www.stuarthamm.net/





Re: [PERFORM] Looking for tips

2005-07-19 Thread Steinar H. Gunderson
On Tue, Jul 19, 2005 at 03:16:31PM -0400, Tom Lane wrote:
> Ah, but are they really prepared, or is DBD::Pg faking it by inserting
> parameter values into the query text and then sending the assembled
> string as a fresh query?  

They are really prepared.

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> If you're running only a single query at a time (no multiple clients),
> then this is pretty much the definition of a MySQL-friendly workload;
> I'd have to say we are doing really well if we are only 50% slower.
> Postgres doesn't have any performance advantages until you get into
> complex queries or a significant amount of concurrency.

The original port was actually twice as slow. It improved quite a bit
after I added transactions and trimmed the schema a bit.

> You could possibly get some improvement if you can re-use prepared plans
> for the queries; but this will require some fooling with the client code
> (I'm not sure if DBD::Pg even has support for it at all).

Aye. We have prepared statements.



Re: [PERFORM] Looking for tips

2005-07-19 Thread Tom Lane
Oliver Crosby <[EMAIL PROTECTED]> writes:
>> You could possibly get some improvement if you can re-use prepared plans
>> for the queries; but this will require some fooling with the client code
>> (I'm not sure if DBD::Pg even has support for it at all).

> Aye. We have prepared statements.

Ah, but are they really prepared, or is DBD::Pg faking it by inserting
parameter values into the query text and then sending the assembled
string as a fresh query?  It wasn't until about 7.4 that we had adequate
backend support to let client libraries support prepared queries
properly, and I'm unsure that DBD::Pg has been updated to take advantage
of that support.

regards, tom lane



Re: [PERFORM] Looking for tips

2005-07-19 Thread PFC


I can't say whether MySQL is faster for very small queries (like
SELECT'ing one row based on an indexed field).

That's why I was asking you about the language...

I assume you're using a persistent connection.
For simple queries like this, PG 8.x seemed to be a lot faster than PG
7.x. Have you tried 8?
I was asking you which language, because for such really small queries
you have to take into account the library overhead. For instance, in PHP a
simple query can be 10 times slower in Postgres than in MySQL, and I
believe it is because php's MySQL driver has seen a lot of optimization
whereas the postgres driver has not. Interestingly, the situation is
reversed with Python: its best postgres driver (psycopg 2) is a lot
faster than the MySQL adapter, and faster than both php adapters (a lot
faster).


The same query can take roughly (this is from the back of my head):

PHP + Postgres      3-5 ms
Python + MySQL      1 ms
PHP + MySQL         0.5 ms
Python + Postgres   0.15 ms

And yes, I had queries executing in 150 microseconds or so; this includes
the time to convert the results to native python objects! This was in a loop
running the same query over and over. But psycopg2 is fast. The overhead for
parsing a simple query and fetching just a row is really small.


This is on my Centrino 1.6G laptop.



Re: [PERFORM] Looking for tips

2005-07-19 Thread Steinar H. Gunderson
On Tue, Jul 19, 2005 at 03:01:00PM -0400, Tom Lane wrote:
> You could possibly get some improvement if you can re-use prepared plans
> for the queries; but this will require some fooling with the client code
> (I'm not sure if DBD::Pg even has support for it at all).

Newer versions have it, when compiled against the 8.0 client libraries and
used against an 8.0 server (AFAIK).

/* Steinar */
-- 
Homepage: http://www.sesse.net/



Re: [PERFORM] Looking for tips

2005-07-19 Thread Tom Lane
Oliver Crosby <[EMAIL PROTECTED]> writes:
> The queries are all simple insert or select statements on single tables.
> Eg. select x from table where y=?; or insert into table (a, b, c)
> values (?, ?, ?);
> In the case of selects where it's a large table, there's an index on
> the column being searched, so in terms of the example above, x is
> either a pkey column or other related field, and y is a non-pkey
> column.

If you're running only a single query at a time (no multiple clients),
then this is pretty much the definition of a MySQL-friendly workload;
I'd have to say we are doing really well if we are only 50% slower.
Postgres doesn't have any performance advantages until you get into
complex queries or a significant amount of concurrency.

You could possibly get some improvement if you can re-use prepared plans
for the queries; but this will require some fooling with the client code
(I'm not sure if DBD::Pg even has support for it at all).

regards, tom lane



Re: [PERFORM] Looking for tips

2005-07-19 Thread Kevin Grittner
Hi Oliver,

We had low resource utilization and poor throughput on inserts of
thousands of rows within a single database transaction.  There were a
lot of configuration parameters we changed, but the one which helped the
most was wal_buffers -- we wound up setting it to 1000.  This may be
higher than it needs to be, but when we got to something which ran well,
we stopped tinkering.  The default value clearly caused a bottleneck.
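
For reference, a postgresql.conf excerpt pulling together the values mentioned in this thread for the 1.5 GB box. These are 7.4-era settings (shared_buffers and effective_cache_size are counted in 8 kB pages), and they are starting points taken from the discussion, not tuned recommendations.

    # roughly 10% of RAM (~120 MB here), per John's advice
    shared_buffers = 15000
    # roughly 75% of RAM on a dedicated box (~1.2 GB)
    effective_cache_size = 150000
    # the value that removed Kevin's bulk-insert bottleneck; possibly higher than needed
    wal_buffers = 1000
    # per-sort memory in kB (the value Oliver is already using)
    sort_mem = 8192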
 
You might find this page useful:
 
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
 
-Kevin
 
 
>>> Oliver Crosby <[EMAIL PROTECTED]> 07/19/05 1:21 PM >>>
I was hoping to start with tuning postgres to match the hardware, but
in any case..

The queries are all simple insert or select statements on single tables.
Eg. select x from table where y=?; or insert into table (a, b, c)
values (?, ?, ?);
In the case of selects where it's a large table, there's an index on
the column being searched, so in terms of the example above, x is
either a pkey column or other related field, and y is a non-pkey
column.

I'm not sure what you mean by structure.

I tried explain analyse on the individual queries, but I'm not sure
what can be done to manipulate them when they don't do much.

My test environment has about 100k - 300k rows in each table, and for
production I'm expecting this to be in the order of 1M+.

The OS is Redhat Enterprise 3.

I'm using a time command when I call the scripts to get a total
running time from start to finish.

I don't know what we have for RAID, but I suspect it's just a single
10k or 15k rpm hdd.

I'll try your recommendations for shared_buffers and
effective_cache_size. Thanks John!

We're trying to improve performance on a log processing script to the
point where it can be run as close as possible to realtime. A lot of
what gets inserted depends on what's already in the db, and it runs
item-by-item... so unfortunately I can't take advantage of copy.

We tried dropping indices, copying data in, then rebuilding. It works
great for a bulk import, but the processing script went a lot slower
without them. (Each insert is preceeded by a local cache check and
then a db search to see if an ID already exists for an item.)

We have no foreign keys at the moment. Would they help?


On 7/19/05, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> Oliver Crosby wrote:
> > Hi,
> > I'm running Postgres 7.4.6 on a dedicated server with about 1.5gigs
of ram.
> > Running scripts locally, it takes about 1.5x longer than mysql, and
the
> > load on the server is only about 21%.
> 
> What queries?
> What is your structure?
> Have you tried explain analyze?
> How many rows in the table?
> Which OS?
> How are you testing the speed?
> What type of RAID?
> 
> 
> 
> --
> Your PostgreSQL solutions company - Command Prompt, Inc.
1.800.492.2240
> PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
> Managed Services, Shared and Dedicated Hosting
> Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
>





Re: [PERFORM] Looking for tips

2005-07-19 Thread PFC



What programming language are these scripts written in ?



Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> What programming language are these scripts written in ?

perl, using the DBD::Pg interface instead of command-lining it through psql



Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
> Identify what the problem is first of all. Some things to consider:
>  - Are there particular queries giving you trouble?
>  - Is your load mostly reads or mostly writes?
>  - Do you have one user or 100?
>  - Are you block-loading data efficiently where necessary?
>  - Have you indexed both sides of your foreign-keys where sensible?
>  - Are your disks being used effectively?
>  - Are your statistics accurate/up to date?

No queries in particular appear to be a problem. I think it's just the
overall speed. If any of the configuration settings will help make the
simple select queries go faster, that would be ideal.
The load is about 50/50 read/write.
At the moment it's just one user, but the goal is to have a cluster of
servers (probably less than a dozen) updating to a central db.
Indices exist for the fields being searched, but we don't have any foreign keys.
I'm not too familiar with effective disk usage or statistics...



Re: [PERFORM] Looking for tips

2005-07-19 Thread Oliver Crosby
I was hoping to start with tuning postgres to match the hardware, but
in any case..

The queries are all simple insert or select statements on single tables.
Eg. select x from table where y=?; or insert into table (a, b, c)
values (?, ?, ?);
In the case of selects where it's a large table, there's an index on
the column being searched, so in terms of the example above, x is
either a pkey column or other related field, and y is a non-pkey
column.

I'm not sure what you mean by structure.

I tried explain analyse on the individual queries, but I'm not sure
what can be done to manipulate them when they don't do much.

My test environment has about 100k - 300k rows in each table, and for
production I'm expecting this to be in the order of 1M+.

The OS is Redhat Enterprise 3.

I'm using a time command when I call the scripts to get a total
running time from start to finish.

I don't know what we have for RAID, but I suspect it's just a single
10k or 15k rpm hdd.

I'll try your recommendations for shared_buffers and
effective_cache_size. Thanks John!

We're trying to improve performance on a log processing script to the
point where it can be run as close as possible to realtime. A lot of
what gets inserted depends on what's already in the db, and it runs
item-by-item... so unfortunately I can't take advantage of copy.

We tried dropping indices, copying data in, then rebuilding. It works
great for a bulk import, but the processing script went a lot slower
without them. (Each insert is preceded by a local cache check and
then a db search to see if an ID already exists for an item.)

We have no foreign keys at the moment. Would they help?


On 7/19/05, Joshua D. Drake <[EMAIL PROTECTED]> wrote:
> Oliver Crosby wrote:
> > Hi,
> > I'm running Postgres 7.4.6 on a dedicated server with about 1.5gigs of ram.
> > Running scripts locally, it takes about 1.5x longer than mysql, and the
> > load on the server is only about 21%.
> 
> What queries?
> What is your structure?
> Have you tried explain analyze?
> How many rows in the table?
> Which OS?
> How are you testing the speed?
> What type of RAID?
> 
> 
> 
> --
> Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
> PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
> Managed Services, Shared and Dedicated Hosting
> Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
>



Re: [PERFORM] Looking for tips

2005-07-19 Thread Richard Huxton

Oliver Crosby wrote:

> Hi,
> I'm running Postgres 7.4.6 on a dedicated server with about 1.5gigs of ram.
> Running scripts locally, it takes about 1.5x longer than mysql, and the load
> on the server is only about 21%.


What scripts? What do they do?
Oh, and 7.4.8 is the latest release - worth upgrading for the fixes.

> I upped the sort_mem to 8192 (kB), and shared_buffers and
> effective_cache_size to 65536 (512MB), but neither the timing nor the server
> load have changed at all.


Well, effective_cache_size is the amount of RAM being used by the OS to 
cache your files, so take a look at top/free and set it based on that 
(pick a steady load).


What sort_mem should be will obviously depend on how much sorting you do.

Drop shared_buffers down to about 1 - 2 (at a guess)

You may find the following useful
 http://www.varlena.com/varlena/GeneralBits/Tidbits/index.php
Read the Performance Tuning article, there is an updated one for version 
8 at:

 http://www.powerpostgresql.com/PerfList

> FYI, I'm going to be working on data sets in the order of GB.


Fair enough.

> I think I've gone about as far as I can with google.. can anybody give me
> some advice on how to improve the raw performance before I start looking at
> code changes?


Identify what the problem is first of all. Some things to consider:
 - Are there particular queries giving you trouble?
 - Is your load mostly reads or mostly writes?
 - Do you have one user or 100?
 - Are you block-loading data efficiently where necessary?
 - Have you indexed both sides of your foreign-keys where sensible?
 - Are your disks being used effectively?
 - Are your statistics accurate/up to date?

Bear in mind that MySQL will probably be quicker for simple queries for 
one user and always will be. If you have multiple users running a mix of 
multi-table joins and updates then PG will have a chance to stretch its 
legs.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Looking for tips

2005-07-19 Thread John A Meinel

Oliver Crosby wrote:

> Hi,
> I'm running Postgres 7.4.6 on a dedicated server with about 1.5gigs of ram.
> Running scripts locally, it takes about 1.5x longer than mysql, and the
> load on the server is only about 21%.
> I upped the sort_mem to 8192 (kB), and shared_buffers and
> effective_cache_size to 65536 (512MB), but neither the timing nor the
> server load have changed at all. FYI, I'm going to be working on data
> sets in the order of GB.
>
> I think I've gone about as far as I can with google.. can anybody give
> me some advice on how to improve the raw performance before I start
> looking at code changes?
>
> Thanks in advance.


First, try to post in plain-text rather than html, it is easier to read. :)

Second, if you can determine what queries are running slow, post the
result of EXPLAIN ANALYZE on them, and we can try to help you tune
them/postgres to better effect.

Just a blanket question like this is hard to answer. Your new
shared_buffers are probably *way* too high. They should be at most
around 10% of ram. Since this is a dedicated server, effective_cache_size
should probably be ~75% of ram, or close to 1.2GB.

There are quite a few things that you can tweak, so the more information
you can give, the more we can help.

For instance, if you are loading a lot of data into a table, if
possible, you want to use COPY not INSERT.
If you have a lot of indexes and are loading a significant portion, it
is sometimes faster to drop the indexes, COPY the data in, and then
rebuild the indexes.

For tables with a lot of inserts/updates, you need to watch out for
foreign key constraints. (Generally, you will want an index on both
sides of the foreign key. One is required, the other is recommended for
faster update/deletes).
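
As a small illustration of those last two points (two independent examples, all object names invented): the referenced side of a foreign key is already covered by its primary key index, but the referencing column gets nothing automatically; and for a one-off bulk import it can pay to drop such secondary indexes around the COPY and rebuild them afterwards.

    use strict;
    use warnings;
    use DBI;

    my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '', { RaiseError => 1 });

    # index the referencing column of a FK so updates/deletes on the parent stay cheap
    $dbh->do('CREATE INDEX events_item_id_idx ON events (item_id)');

    # pure bulk-import pattern: drop secondary indexes, COPY, then rebuild
    $dbh->do('DROP INDEX events_item_id_idx');
    # ... COPY the data in here (see the earlier sketches) ...
    $dbh->do('CREATE INDEX events_item_id_idx ON events (item_id)');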

John
=:->





Re: [PERFORM] Looking for tips

2005-07-19 Thread Joshua D. Drake

Oliver Crosby wrote:

> Hi,
> I'm running Postgres 7.4.6 on a dedicated server with about 1.5gigs of ram.
> Running scripts locally, it takes about 1.5x longer than mysql, and the
> load on the server is only about 21%.


What queries?
What is your structure?
Have you tried explain analyze?
How many rows in the table?
Which OS?
How are you testing the speed?
What type of RAID?



--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/
