Re: [PERFORM] Performance tuning for postgres

2010-06-10 Thread Robert Haas
On Fri, Jun 4, 2010 at 12:40 AM, Yogesh Naik
yogesh_n...@persistent.co.in wrote:
 I am performing a DB insertion and update for 3000+ records, and while
 doing so I get CPU utilization of 100%, with 67% of the CPU used by
 postgres.

That sounds normal to me.  What would you expect to happen?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company



Re: [PERFORM] Performance tuning for postgres

2010-06-04 Thread Kevin Grittner
Yogesh Naik yogesh_n...@persistent.co.in wrote:
 
 I am performing a DB insertion and update for 3000+ records, and
 while doing so I get CPU utilization of 100%, with 67% of the CPU used
 by postgres.
 
 I have also done optimization on the queries...
 
 Is there any way to optimize the CPU utilization for postgres?
 
We'd need a lot more information before we could make useful
suggestions.  Knowing something about your hardware, OS, exact
PostgreSQL version, postgresql.conf contents, the table definition,
any foreign keys or other constraints, and exactly how you're doing
the inserts would all be useful.  Please read this and repost:
 
http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
-Kevin



Re: [PERFORM] Performance tuning for postgres

2010-06-04 Thread Michael Gould
In my opinion it depends on the application, the priority of the application,
and whether or not it is a commercially sold product, but depending on your
needs you might want to consider having a third-party vendor with expertise
in this process review and help tune the application.  One vendor that I
know does this is EnterpriseDB.  I've worked with other SQL engines and have
a lot of experience tuning queries in a couple of those environments, but
PostgreSQL isn't one of them.  Having an experienced DBA review your system
can make the difference between night and day.

Best Regards

Michael Gould

Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 Yogesh Naik yogesh_n...@persistent.co.in wrote:
  
  I am performing a DB insertion and update for 3000+ records, and
  while doing so I get CPU utilization of 100%, with 67% of the CPU used
  by postgres.
  
  I have also done optimization on the queries...
  
  Is there any way to optimize the CPU utilization for postgres?
  
 We'd need a lot more information before we could make useful
 suggestions.  Knowing something about your hardware, OS, exact
 PostgreSQL version, postgresql.conf contents, the table definition,
 any foreign keys or other constraints, and exactly how you're doing
 the inserts would all be useful.  Please read this and repost:
  
 http://wiki.postgresql.org/wiki/SlowQueryQuestions
  
 -Kevin
 
 

--
Michael Gould, Managing Partner
Intermodal Software Solutions, LLC
904.226.0978
904.592.5250 fax





Re: [PERFORM] Performance tuning for postgres

2010-06-04 Thread Bryan Hinton
Is this a bulk insert?  Are you wrapping your statements in a transaction
(or several)?  How many columns are in the table?  What do the table
statistics look like?
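
For reference, minimal sketches of the two patterns being asked about (table,
column, and file names are hypothetical, not from the thread):

BEGIN;
INSERT INTO mytable (col1, col2) VALUES ('a', 1);
INSERT INTO mytable (col1, col2) VALUES ('b', 2);
-- ... many more rows, then a single commit ...
COMMIT;

COPY mytable (col1, col2) FROM '/path/to/data.csv' WITH CSV;

Wrapping the inserts in one transaction avoids a WAL flush per row; COPY
additionally avoids per-statement parse and plan overhead.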



On Fri, Jun 4, 2010 at 9:21 AM, Michael Gould 
mgo...@intermodalsoftwaresolutions.net wrote:

  In my opinion it depends on the application, the priority of the
  application, and whether or not it is a commercially sold product, but
  depending on your needs you might want to consider having a third-party
  vendor with expertise in this process review and help tune the
  application.  One vendor that I know does this is EnterpriseDB.  I've
  worked with other SQL engines and have a lot of experience tuning
  queries in a couple of those environments, but PostgreSQL isn't one of
  them.  Having an experienced DBA review your system can make the
  difference between night and day.

 Best Regards

 Michael Gould

 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
  Yogesh Naik yogesh_n...@persistent.co.in wrote:
 
   I am performing a DB insertion and update for 3000+ records, and
   while doing so I get CPU utilization of 100%, with 67% of the CPU used
   by postgres.
  
   I have also done optimization on the queries...
  
   Is there any way to optimize the CPU utilization for postgres?
 
  We'd need a lot more information before we could make useful
  suggestions.  Knowing something about your hardware, OS, exact
  PostgreSQL version, postgresql.conf contents, the table definition,
  any foreign keys or other constraints, and exactly how you're doing
  the inserts would all be useful.  Please read this and repost:
 
  http://wiki.postgresql.org/wiki/SlowQueryQuestions
 
  -Kevin
 
 

 --
 Michael Gould, Managing Partner
 Intermodal Software Solutions, LLC
 904.226.0978
 904.592.5250 fax






Re: [PERFORM] Performance Tuning Large PL/PGSQL Stored Procedure

2010-03-26 Thread Merlin Moncure
On Thu, Mar 25, 2010 at 11:56 PM, Eliot Gable
egable+pgsql-performa...@gmail.com wrote:

 How many rows min/max/avg are coming back in your refcursors?  Are you
 using cursors in order to return multiple complex data structures
 (sets, etc) in a single function call?


 I think the largest number of rows is around 40. Most are substantially
 smaller. However, most of them have about two dozen or more columns, and I
 have already shortened the list of columns to the minimum possible. The
 average number of rows is around 10, but the largest sets of rows also have
 the most columns. I'm using the cursors in order to obtain multiple complex
 data structures in a single function call.

ok, small sets.  yes, passing them back to the client as arrays is
probably going to be faster.  It's a trivial change to your proc.  you
have to define a type for your array element the way we are going to
use it.  you can use a composite type or a table (I prefer a table).

create table mystuff_t
(
  a text,
  b int,
  c timestamptz
);

create function myproc([...], mystuffs out mystuff_t[])
[inside proc]

replace your cursor declaration with this:

select array
(
   select (a,b,c)::mystuff_t from [...]
) into mystuffs;
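
For concreteness, a self-contained sketch of how the finished function might
look (p_id and the some_source table and its columns are hypothetical
stand-ins, not from the thread):

create function myproc(p_id int, mystuffs out mystuff_t[])
as $$
begin
  -- build the whole result set as one array of composites
  mystuffs := array
  (
    select (s.name, s.hits, s.created)::mystuff_t
    from some_source s
    where s.id = p_id
  );
end;
$$ language plpgsql;

The client then reads back a single row containing an array column instead of
fetching from a refcursor.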

Code an alternate version of the function, then from libpq execute the
query in binary mode and discard the results, timing that and comparing
it to how you run your query now (also discarding the results).  We want
to time it this way because timing it from psql includes the time to
print out the array in text format, which we can avoid with libpqtypes
(which we are not going to mess with until we know there is a reason to
go in this direction).  We do need to include the time to turn around and
fetch the data from the refcursors.  If you see at least a 10-20%
improvement, it warrants further effort IMO (and say goodbye to
refcursors forever).

 WITH clauses can make your queries much easier to read and yield great
 speedups if you need to access the table expression multiple times
 from other parts of the query.  however, in some cases you can get
 into trouble because a standard set of joins is going to give the
 planner the most flexibility in terms of query optimization.


 So far, every case I have converted to WITH clauses has resulted in more
 than double the speed (half the time required to perform the query). The
 main reason appears to be from avoiding calculating JOIN conditions multiple
 times in different parts of the query due to the UNION and EXCEPT clauses.

I have a hard time believing that unless there are other factors
compromising the planner, like bad statistics or a non-optimal query, or
you are dealing with a relatively special case.

'EXCEPT', btw, is also an optimization target.  Maybe think about
converting to 'left join where rightcol is null' or something like
that.  Not 100% sure; I think some work was done recently on EXCEPT, so
this advice may not be as true as it used to be, and it's possibly moot
if the number of rows being considered by EXCEPT is very small.
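
To make that rewrite concrete, a minimal sketch (candidates and excluded are
hypothetical names):

-- EXCEPT form:
select id from candidates
except
select id from excluded;

-- anti-join form:
select c.id
from candidates c
left join excluded e on e.id = c.id
where e.id is null;

The two are only equivalent when id is non-null and you are not relying on
EXCEPT's duplicate elimination, so verify that before converting.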

 So, you are saying that I can return a complex type as a result which
 contains arrays of other complex types and just use my single SELECT command
 to retrieve the whole data set? That would be much simpler and I imagine
 much faster.

yes, however you will want to receive as few complex types as
possible, meaning your result set should still have multiple columns.
reducing the number of columns is not an optimization target.  in
other words, do the minimal amount of stacking necessary to allow
single query extraction of data.

 I really am chasing milliseconds here, and I appreciate all your feedback.
 You've given me a relatively large number of possible optimizations I can
 try out. I will definitely try out the libpqtypes. That sounds like a
 promising way to further cut down on execution time. I think most of my
 performance penalty is in transferring the results back to the C++
 application.

yes.  I've suggested libpqtypes to a number of people on the lists,
and you are what I'd consider the ideal candidate. libpqtypes will
completely transform the way you think about postgresql and libpq.
good luck. if you need help setting it up you can email me privately
or on the libpqtypes list.

merlin



Re: [PERFORM] Performance Tuning Large PL/PGSQL Stored Procedure

2010-03-26 Thread Richard Huxton

On 26/03/10 03:56, Eliot Gable wrote:


I really am chasing milliseconds here, and I appreciate all your feedback.
You've given me a relatively large number of possible optimizations I can
try out. I will definitely try out the libpqtypes. That sounds like a
promising way to further cut down on execution time. I think most of my
performance penalty is in transferring the results back to the C++
application.


In addition to all of Merlin's good advice, if the client is on a 
different machine to the server then try sticking wireshark or similar 
onto the connection. That should make it pretty clear where the main 
costs are in getting your data back.


--
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] Performance Tuning Large PL/PGSQL Stored Procedure

2010-03-25 Thread Merlin Moncure
On Tue, Mar 23, 2010 at 5:00 PM, Eliot Gable
egable+pgsql-performa...@gmail.com wrote:
 The complex type contains roughly 25 fields, mostly text, plus another 10
 REFCURSORs.

How many rows min/max/avg are coming back in your refcursors?  Are you
using cursors in order to return multiple complex data structures
(sets, etc) in a single function call?

 The application that calls the stored procedure was also written by me in
 C++ and uses asynchronous libpq API commands to execute a single SQL
 transaction which calls the stored procedure and also performs a FETCH ALL
 on all open cursors. It then returns all results into various structures.
 All rows of all cursors that are open are always used for every call to the
 stored procedure.

 The stored procedure implements various logic which determines which tables
 in the database to query and how to filter the results from those queries to
 return only the relevant information needed by the C++ application.

 Currently, in terms of optimization, I have taken the following approaches
 based on the following reasoning:

 1. For all queries whose results need to return to the C++ application, I
 utilize cursors so that all results can be readied and generated by the
 stored procedure with just one call to the PostgreSQL backend. I accomplish
 this using asynchronous libpq API calls to issue a single transaction to the
 server. The first command after the BEGIN is a SELECT * FROM
 MyStoredProc(blah), which is then followed by FETCH ALL commands for each
 cursor that the stored procedure leaves open. I then follow up with multiple
 API calls to return the results and retrieve the rows from those results.
 This minimizes the amount of back-and-forth between my C++ application and
 the database backend.

 1a. Incidentally, I am also using cursors for most queries inside the stored
 procedure that do not return results to the C++ application. I am unsure
 whether this incurs a performance penalty compared to doing, for example, a
 SELECT ... INTO (var1, var2, ...) within the body of the stored procedure.
 Instead of using SELECT ... INTO, I am using OPEN cursor_name; FETCH
 cursor_name INTO (var1, var2).

 2. I have built indexes on all columns that are used in where clauses and
 joins.

 3. I use lots of joins to pull in data from various tables (there are around
 60 tables that are queried with each call to the stored procedure).

 4. When performing joins, the first table listed is the one that returns the
 most context-specific results, which always also means that it has the
 most-specific and fewest number of relevant rows. I then join them in order
 of least number of result rows with all inner joins preceding left outer
 joins.

 5. Instead of using UNIONs and EXCEPT clauses, I use multiple WITH clauses
 to define several different query-specific views. I order them such that I
 can join additional tables in later WITH clauses to the views created

WITH clauses can make your queries much easier to read and yield great
speedups if you need to access the table expression multiple times
from other parts of the query.  however, in some cases you can get
into trouble because a standard set of joins is going to give the
planner the most flexibility in terms of query optimization.

 previously in a way that minimizes the number of rows involved in the JOIN
 operations while still providing provably accurate result sets. The EXCEPT
 clauses are then replaced by also defining one view which contains a set of
 IDs that I want filtered from the final result set and using a WHERE id NOT
 IN (SELECT id FROM filtered_view). Typically, this approach leaves me with
 just one UNION of two previously defined views (the union is required


UNION is always an optimization target (did you mean UNION ALL?)

 7. When I have a query I need to execute whose results will be used in
 several other queries, I currently open the cursor for that query using the
 FOR ... LOOP construct to retrieve all records in the result set and build a
 result array using the array_append() method. I then do an unnest(my_array)

Do not use array_append.  Always do array(select ...) whenever it is
possible; when it isn't, rethink your problem until it is possible.  The
only exception is to use the array_agg aggregate if your problem really
is an aggregation type of thing.  As a matter of fact, any FOR ... LOOP
is an optimization target, because a rethink will probably yield a query
that does the same thing without paying for the loop.
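
A before-and-after sketch of that advice (some_table is a hypothetical
stand-in):

create function collect_ids() returns int[] as $$
declare
  ids int[] := '{}';
  r record;
begin
  -- the loop + array_append pattern warned against above:
  for r in select id from some_table loop
    ids := array_append(ids, r.id);
  end loop;

  -- the single-statement replacement:
  ids := array(select id from some_table);
  return ids;
end;
$$ language plpgsql;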


 For most of the joins, they simply join on foreign key IDs and no additional
 filtering criteria are used on their information. Only a handful of the
 joined tables bring in additional criteria by which to filter the result
 set.

 The approach used in 7 with cursors and building a result array which is
 then unnested has me worried in terms of performance. It seems to me there
 should be some better way to accomplish the same thing.

 The stored procedure does not perform 

Re: [PERFORM] Performance Tuning Large PL/PGSQL Stored Procedure

2010-03-25 Thread Eliot Gable
On Thu, Mar 25, 2010 at 10:00 PM, Merlin Moncure mmonc...@gmail.com wrote:

 On Tue, Mar 23, 2010 at 5:00 PM, Eliot Gable
  egable+pgsql-performa...@gmail.com
 wrote:
  The complex type contains roughly 25 fields, mostly text, plus another 10
  REFCURSORs.

 How many rows min/max/avg are coming back in your refcursors?  Are you
 using cursors in order to return multiple complex data structures
 (sets, etc) in a single function call?


I think the largest number of rows is around 40. Most are substantially
smaller. However, most of them have about two dozen or more columns, and I
have already shortened the list of columns to the minimum possible. The
average number of rows is around 10, but the largest sets of rows also have
the most columns. I'm using the cursors in order to obtain multiple complex
data structures in a single function call.



  The application that calls the stored procedure was also written by me in
  C++ and uses asynchronous libpq API commands to execute a single SQL
  transaction which calls the stored procedure and also performs a FETCH
 ALL
  on all open cursors. It then returns all results into various structures.
  All rows of all cursors that are open are always used for every call to
 the
  stored procedure.
 
  The stored procedure implements various logic which determines which
 tables
  in the database to query and how to filter the results from those queries
 to
  return only the relevant information needed by the C++ application.
 
  Currently, in terms of optimization, I have taken the following
 approaches
  based on the following reasoning:
 
  1. For all queries whose results need to return to the C++ application, I
  utilize cursors so that all results can be readied and generated by the
  stored procedure with just one call to the PostgreSQL backend. I
 accomplish
  this using asynchronous libpq API calls to issue a single transaction to
 the
  server. The first command after the BEGIN is a SELECT * FROM
  MyStoredProc(blah), which is then followed by FETCH ALL commands for each
  cursor that the stored procedure leaves open. I then follow up with
 multiple
  API calls to return the results and retrieve the rows from those results.
  This minimizes the amount of back-and-forth between my C++ application
 and
  the database backend.
 
  1a. Incidentally, I am also using cursors for most queries inside the
 stored
  procedure that do not return results to the C++ application. I am unsure
  whether this incurs a performance penalty compared to doing, for example,
 a
  SELECT ... INTO (var1, var2, ...) within the body of the stored
 procedure.
  Instead of using SELECT ... INTO, I am using OPEN cursor_name; FETCH
  cursor_name INTO (var1, var2).
 
  2. I have built indexes on all columns that are used in where clauses and
  joins.
 
  3. I use lots of joins to pull in data from various tables (there are
 around
  60 tables that are queried with each call to the stored procedure).
 
  4. When performing joins, the first table listed is the one that returns
 the
  most context-specific results, which always also means that it has the
  most-specific and fewest number of relevant rows. I then join them in
 order
  of least number of result rows with all inner joins preceding left outer
  joins.
 
  5. Instead of using UNIONs and EXCEPT clauses, I use multiple WITH
 clauses
  to define several different query-specific views. I order them such that
 I
  can join additional tables in later WITH clauses to the views created

 WITH clauses can make your queries much easier to read and yield great
 speedups if you need to access the table expression multiple times
 from other parts of the query.  however, in some cases you can get
 into trouble because a standard set of joins is going to give the
 planner the most flexibility in terms of query optimization.


So far, every case I have converted to WITH clauses has resulted in more
than double the speed (half the time required to perform the query). The
main reason appears to be from avoiding calculating JOIN conditions multiple
times in different parts of the query due to the UNION and EXCEPT clauses.


  previously in a way that minimizes the number of rows involved in the
 JOIN
  operations while still providing provably accurate result sets. The
 EXCEPT
  clauses are then replaced by also defining one view which contains a set
 of
  IDs that I want filtered from the final result set and using a WHERE id
 NOT
  IN (SELECT id FROM filtered_view). Typically, this approach leaves me
 with
  just one UNION of two previously defined views (the union is required


 UNION is always an optimization target (did you mean UNION ALL?)


Thanks for the suggestion on UNION ALL; I indeed do not need elimination of
duplicates, so UNION ALL is a better option.
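
For readers following along, the difference in miniature (view names are
hypothetical):

-- UNION sorts or hashes the combined result to remove duplicates:
select id from view_a union select id from view_b;

-- UNION ALL simply concatenates the two results:
select id from view_a union all select id from view_b;

When the inputs are already disjoint, the deduplication step in plain UNION
is pure overhead.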


  7. When I have a query I need to execute whose results will be used in
  several other queries, I currently open the cursor for that query using
 

Re: [PERFORM] performance tuning queries

2008-11-27 Thread PFC



First off, any thoughts on tuning inserts into large tables? I have a
large table with an insert like this:

insert into public.bigtab1 (text_col1, text_col2, id) values ...

QUERY PLAN
--
 Result  (cost=0.00..0.01 rows=1 width=0)
(1 row)

The query cost is low, but this is one of the slowest statements per
pgfouine.


Possible Causes of slow inserts :

- slow triggers ?
- slow foreign key checks ? (missing index on referenced table ?)
- functional index on a slow function ?
- crummy hardware (5 MB/s RAID cards, etc)
- too many indexes ?


Next we have a select count(*) that is also one of the top offenders:

select count(*) from public.tab3  where user_id=31
and state='A'
and amount>0;

 QUERY PLAN
-
 Aggregate  (cost=3836.53..3836.54 rows=1 width=0)
   ->  Index Scan using order_user_indx on tab3 user_id  (cost=0.00..3834.29 rows=897 width=0)
         Index Cond: (idx_user_id = 31406948::numeric)
         Filter: ((state = 'A'::bpchar) AND (amount > 0::numeric))
(4 rows)

We have an index on the user_id but not on the state or amount,

add index to amount ?


Can we see EXPLAIN ANALYZE ?

	In this case the ideal index would be multicolumn: (user_id, state),
(user_id, amount), or (user_id, state, amount), but choosing between the
three depends on your data...
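
For example (index names are hypothetical; pick the column set your data
supports):

CREATE INDEX tab3_user_state_idx ON public.tab3 (user_id, state);

-- or, if rows with amount>0 are rare, a partial index stays smaller:
CREATE INDEX tab3_user_state_amt_idx ON public.tab3 (user_id, state)
WHERE amount > 0;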


You could do :

SELECT count(*), state, amount>0  FROM public.tab3  where user_id=31
GROUP BY state, amount>0;

And post the results.



Re: [PERFORM] performance tuning queries

2008-11-27 Thread A. Kretschmer
On Wed, 26.11.2008 at 21:21:04 -0700, Kevin Kempter wrote the following:
 Next we have a select count(*) that is also one of the top offenders:
 
 select count(*) from public.tab3  where user_id=31
 and state='A'
 and amount>0;
 
  QUERY PLAN
 -
  Aggregate  (cost=3836.53..3836.54 rows=1 width=0)
    ->  Index Scan using order_user_indx on tab3 user_id  (cost=0.00..3834.29 rows=897 width=0)
          Index Cond: (idx_user_id = 31406948::numeric)
          Filter: ((state = 'A'::bpchar) AND (amount > 0::numeric))
 (4 rows)
 
 We have an index on the user_id but not on the state or amount,
 
 add index to amount ?

Depends.

- Is the index on user_id a unique index?
- How many different values are in the table for state? I.e., maybe an
  index on state can help.
- How many rows are in the table with amount > 0? If almost all rows
  contain an amount > 0, an index can't help in this case.


Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG-ID:   0x3FFF606C, privat 0x7F4584DA   http://wwwkeys.de.pgp.net



Re: [PERFORM] performance tuning queries

2008-11-27 Thread Mario Weilguni

Kevin Kempter wrote:

Hi All;

I'm looking for tips / ideas for performance tuning some specific queries. 
These are generally large tables on a highly active OLTP system 
(100,000 - 200,000 plus queries per day)


First off, any thoughts on tuning inserts into large tables? I have a large 
table with an insert like this:


insert into public.bigtab1 (text_col1, text_col2, id) values ...

QUERY PLAN
--

 Result  (cost=0.00..0.01 rows=1 width=0)
(1 row)

The query cost is low but this is one of the slowest statements per pgfouine
  
Do you insert multiple values in one transaction, or one transaction per 
insert?
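
To illustrate why that question matters, a sketch using the table from the
original post (the values are placeholders):

BEGIN;
INSERT INTO public.bigtab1 (text_col1, text_col2, id) VALUES ('a', 'b', 1);
INSERT INTO public.bigtab1 (text_col1, text_col2, id) VALUES ('c', 'd', 2);
-- ... many more rows ...
COMMIT;

One transaction per insert forces a WAL flush for every row; batching many
inserts into a single transaction amortizes that flush.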





Re: [PERFORM] Performance tuning on FreeBSD

2008-03-04 Thread Matthew

On Tue, 4 Mar 2008, Ivan Voras wrote:

I'm curious about the math behind this - is ~4000 burst or sustained
rate? For common BBU cache sizes (256M, 512M), filling that amount with
data is pretty trivial. When the cache is full, new data can enter the
cache only at a rate at which old data is evacuated from the cache (to
the drive), which is at normal, uncached disk drive speeds.


Should be sustained rate. The reason is if you have no BBU cache, then 
each transaction needs to wait for the disc to rotate around to the bit 
where you want to write, even though each transaction is going to be 
writing in approximately the same place each time. However, with a BBU 
cache, the system no longer needs to wait for the disc to rotate, and the 
writes can be made from the cache to the disc in large groups of 
sequential writes, which is much faster. Several transactions worth can be 
written on each rotation instead of just one.
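
As a rough worked example (assuming, per the above, one committing
transaction per platter rotation when there is no cache): a 7,200 RPM disk
rotates 7200/60 = 120 times per second and a 15,000 RPM disk 15000/60 = 250
times per second, so uncached commit rates cap out near 120 and 250 per
second respectively. Once a BBU cache acknowledges writes immediately,
commits decouple from rotation entirely, which is how sustained rates in the
thousands become possible.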


Matthew

--
People who love sausages, respect the law, and work with IT standards 
shouldn't watch any of them being made.  -- Peter Gutmann




Re: [PERFORM] Performance tuning on FreeBSD

2008-03-04 Thread Greg Smith

On Tue, 4 Mar 2008, Ivan Voras wrote:


I'm curious about the math behind this - is ~4000 burst or sustained
rate?


Average, which is not quite burst or sustained.  No math behind it, just 
looking at a few samples of pgbench data on similar hardware.  A system 
like this one is profiled at 
http://www.kaltenbrunner.cc/blog/index.php?/archives/21-8.3-vs.-8.2-a-simple-benchmark.html 
for example.


For common BBU cache sizes (256M, 512M), filling that amount with data 
is pretty trivial.


I don't have any good numbers handy but I think the burst is 6000, you 
only get that for a few seconds before all the caches fill and the rate 
drops considerably.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread Bill Moran
alan bryan [EMAIL PROTECTED] wrote:

 I've got a new server and am myself new to tuning postgres.
 
 Server is an 8 core Xeon 2.33GHz, 8GB RAM, RAID 10 on a 3ware 9550SX-4LP w/ 
 BBU.
 
 It's serving as the DB for a fairly write intensive (maybe 25-30%) Web
 application in PHP.  We are not using persistent connections, thus the
 high max connections.
 
 I've done the following so far:
 
  cat /boot/loader.conf
 kern.ipc.semmni=256
 kern.ipc.semmns=512
 kern.ipc.semmnu=256
 
  cat /etc/sysctl.conf
 kern.ipc.shmall=393216
 kern.ipc.shmmax=1610612736

I would just set this to 2G (which is the max).  It doesn't really hurt
anything if you don't use it all.
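
Concretely, 2G expressed in bytes would be:

kern.ipc.shmmax=2147483648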

 kern.ipc.semmap=256
 kern.ipc.shm_use_phys=1
 
 postgresql.conf settings (changed from Default):
 max_connections = 180
 shared_buffers = 1024MB

Why not 2G, which would be 25% of total memory?

 maintenance_work_mem = 128MB
 wal_buffers = 1024kB
 
 I then set up a test database for running pgbench with scaling factor
 100. I then ran:
  pgbench -c 100 -t 1000 testdb
 and got:
 starting vacuum...end.
 transaction type: TPC-B (sort of)
 scaling factor: 100
 number of clients: 100
 number of transactions per client: 1000
 number of transactions actually processed: 100000/100000
 tps = 557.095867 (including connections establishing)
 tps = 558.013714 (excluding connections establishing)
 
 Just for testing, I tried turning off fsync and got:
  pgbench -c 100 -t 1000 testdb
 starting vacuum...end.
 transaction type: TPC-B (sort of)
 scaling factor: 100
 number of clients: 100
 number of transactions per client: 1000
 number of transactions actually processed: 100000/100000
 tps = 4014.075114 (including connections establishing)
 tps = 4061.662041 (excluding connections establishing)
 
 Do these numbers sound inline with what I should be seeing?  What else
 can I do to try to get better performance in the more general sense
 (knowing that specifics are tied to real world data and testing).  Any
 hints for FreeBSD specific tuning would be helpful.

Are you running FreeBSD 7?  If performance is of the utmost importance,
then you need to be running the 7.X branch.

Based on your pgbench results, I'm guessing you didn't get battery-backed
cache on your systems?  That makes a big difference no matter what OS
you're using.

Besides that, I can't think of any FreeBSD-specific things to do.  Basically,
general tuning advice applies to FreeBSD as well as to most other OSes.

-- 
Bill Moran
Collaborative Fusion Inc.

[EMAIL PROTECTED]
Phone: 412-422-3463x4023



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread alan bryan
On Mon, Mar 3, 2008 at 4:26 PM, Bill Moran
[EMAIL PROTECTED] wrote:

cat /boot/loader.conf
   kern.ipc.semmni=256
   kern.ipc.semmns=512
   kern.ipc.semmnu=256
  
cat /etc/sysctl.conf
   kern.ipc.shmall=393216
   kern.ipc.shmmax=1610612736

  I would just set this to 2G (which is the max).  It doesn't really hurt
  anything if you don't use it all.

I'll try that and report back.


   kern.ipc.semmap=256
   kern.ipc.shm_use_phys=1
  
   postgresql.conf settings (changed from Default):
   max_connections = 180
   shared_buffers = 1024MB

  Why not 2G, which would be 25% of total memory?


Ditto - I'll report back.



  Are you running FreeBSD 7?  If performance is of the utmost importance,
  then you need to be running the 7.X branch.

  Based on your pgbench results, I'm guessing you didn't get battery-backed
  cache on your systems?  That makes a big difference no matter what OS
  you're using.

  Besides that, I can't think of any FreeBSD-specific things to do.  Basically,
  general tuning advice applies to FreeBSD as well as to most other OSes.

Yes, FreeBSD 7.0-Release.  Tried both the 4BSD and ULE schedulers and
didn't see much difference with this test.
I do have the battery for the 3ware and it is enabled.  I'll do some
bonnie++ benchmarks and make sure the disk performs near where it should.

Should turning off fsync make things roughly 8x-10x faster?  Or is
that indicative of something not being correct or tuned quite right in
the rest of the system?  I'll have to run in production with fsync on
but was just testing to see how much of an effect it had.

Thanks,
Alan



Re: [PERFORM] Performance tuning on FreeBSD

2008-03-03 Thread Greg Smith

On Mon, 3 Mar 2008, alan bryan wrote:


pgbench -c 100 -t 1000 testdb

tps = 558.013714 (excluding connections establishing)

Just for testing, I tried turning off fsync and got:
tps = 4061.662041 (excluding connections establishing)


This is odd.  ~500 is what I expect from this test when there is no write 
cache to accelerate fsync, while ~4000 is normal for your class of 
hardware when you have such a cache.  Since you say your 3Ware card is 
set up with a cache and a BBU, that's suspicious--you should be able to get 
around 4000 with fsync on.  Any chance you have the card set to 
write-through instead of write-back?  That's the only thing that comes to 
mind that would cause this.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [PERFORM] Performance Tuning and Disk Cache

2007-03-18 Thread hubert depesz lubaczewski

On 3/18/07, Barry Moore [EMAIL PROTECTED] wrote:

Does anyone know how I can repeatedly run the same query in the
worst case scenario of no postgres data in the disk cache (e.g.,
clear the page cache or force it to be ignored)?


Try to disconnect from PostgreSQL, reconnect, and rerun the query.
If that doesn't help, you can try unmounting the filesystem which contains
the PostgreSQL data and remounting it again, of course with PostgreSQL
shut down.

depesz

--
http://www.depesz.com/ - nowy, lepszy depesz



Re: [PERFORM] Performance Tuning and Disk Cache

2007-03-18 Thread David Boreham

Barry Moore wrote:

I have a very slow query that I'm trying to tune.  I think my  
performance tuning is being complicated by the system's page cache.


If I run the query after the system has been busy with other tasks  
for quite a long time then the query can take up to 8-10 minutes to  
complete.  If I then rerun the same query it will complete in a  
couple of seconds.


Does anyone know how I can repeatedly run the same query in the  
worst case scenario of no postgres data in the disk cache (e.g.,  
clear the page cache or force it to be ignored)?


In my experience the only 100% reliable way to do this is to reboot the 
machine.






Re: [PERFORM] Performance Tuning and Disk Cache

2007-03-18 Thread Rangarajan Vasudevan
If you are running on a Linux kernel, try /proc/sys/vm/drop_caches. I
believe the appropriate command is echo 3 > /proc/sys/vm/drop_caches.
Since Postgres has its own cache of data, the above followed by a PG
restart should do what you are looking for.

Ranga


 Barry Moore wrote:

 I have a very slow query that I'm trying to tune.  I think my
 performance tuning is being complicated by the system's page cache.

  If I run the query after the system has been busy with other tasks
 for quite a long time then the query can take up to 8-10 minutes to
 complete.  If I then rerun the same query it will complete in a
 couple of seconds.

 Does anyone know how I can repeatedly run the same query in the
 worst case scenario of no postgres data in the disk cache (e.g.,
 clear the page cache or force it to be ignored)?

 In my experience the only 100% reliable way to do this is to reboot the
 machine.









Re: [PERFORM] Performance Tuning and Disk Cache

2007-03-18 Thread Michael Stone

On Sun, Mar 18, 2007 at 06:45:34AM -0600, Barry Moore wrote:
Does anyone know how I can repeatedly run the same query in the  
worst case scenario of no postgres data in the disk cache (e.g.,  
clear the page cache or force it to be ignored)?


Depends on your OS. On Linux you can run:
echo 1 > /proc/sys/vm/drop_caches

Mike Stone



Re: [PERFORM] Performance Tuning Article

2005-06-24 Thread Dmitri Bichko
Hi,

The article seems to dismiss RAID5 a little too quickly.  For many
application types, using fast striped mirrors for the index space and
RAID5 for the data can offer quite good performance (provided a
sufficient number of spindles for the RAID5 - 5 or 6 disks or more).  In
fact, random read (i.e., most webapps) performance of RAID5 isn't
necessarily worse than that of RAID10, and can in fact be better in some
circumstances.  And, using the cheaper RAID5 might allow you to do that
separation between index and data in the first place.

Just thought I'd mention it,
Dmitri

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Frank Wiles
Sent: Wednesday, June 22, 2005 10:52 AM
To: pgsql-performance@postgresql.org
Subject: [PERFORM] Performance Tuning Article



  Hi Everyone, 

  I've put together a short article and posted it online regarding
  performance tuning PostgreSQL in general.  I believe it helps to bring
  together the info in an easy-to-digest manner. I would appreciate any
  feedback, comments, and especially any technical corrections.  

  The article can be found here: 

  http://www.revsys.com/writings/postgresql-performance.html

  Thanks! 

 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org
 -





Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Radu-Adrian Popescu


There have been problems with Xeon processors.



Can you elaborate on that, please?

Thanks,
--
Radu-Adrian Popescu
CSA, DBA, Developer
Aldrapay MD
Aldratech Ltd.
+40213212243




Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Dave Cramer
My understanding is that it isn't particularly Xeon processors that
are the problem.

Any dual-processor system will exhibit the problem; Xeons with
hyperthreading exacerbate it, though.

And the good news is that it has been fixed in 8.1.

Dave
On 23-Jun-05, at 8:16 AM, Keith Worthington wrote:


Radu-Adrian Popescu wrote:



There have been problems with Xeon processors.



Can you elaborate on that please ?
Thanks,



Not really, as I do not understand the issue.

Here is one post from the archives.
http://archives.postgresql.org/pgsql-performance/2005-05/msg00441.php

If you search the archives for xeon, sooner or later you will bump
into something relevant.


--
Kind Regards,
Keith







Dave Cramer
[EMAIL PROTECTED]
www.postgresintl.com
ICQ #14675561
jabber [EMAIL PROTECTED]
ph (519 939 0336 )




Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Frank Wiles
On Wed, 22 Jun 2005 22:31:29 -0400
Keith Worthington [EMAIL PROTECTED] wrote:

 Frank,
 
  A couple of things I wish I had been told when I started asking how to
  configure a new machine.
 
 Use RAID 10 (striping across mirrored disks)
   or RAID 0+1 (mirror a striped array) for your data.
 Use RAID 1 (mirror) for your OS
 Use RAID 1 (mirror) for the WAL.
 
 Don't put anything else on the array holding the WAL.
 
 There have been problems with Xeon processors.

  I believe all of these issues are covered in the article, but
  obviously not clearly enough.  I'll work on rewording that section.

 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org
 -




Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Radu-Adrian Popescu

Dave Cramer wrote:
My understanding is that it isn't particularly Xeon processors that are
the problem.

Any dual-processor system will exhibit the problem; Xeons with
hyperthreading exacerbate it, though.

And the good news is that it has been fixed in 8.1.



Where's that?  The only information I have is a message from Tom Lane saying
the buffer manager (or something like that) locking has been redone for 8.0.
Any pointers?



Dave


Thanks,
--
Radu-Adrian Popescu
CSA, DBA, Developer
Aldrapay MD
Aldratech Ltd.
+40213212243




Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Michael Stone

On Wed, Jun 22, 2005 at 10:31:29PM -0400, Keith Worthington wrote:

Use RAID 10 (striping across mirrored disks)
 or RAID 0+1 (mirror a striped array) for your data.


Yikes! Never tell an unsuspecting person to use mirrored stripes--that
configuration has lower reliability and performance than striped mirrors,
with no redeeming qualities.

Mike Stone



Re: [PERFORM] Performance Tuning Article

2005-06-23 Thread Tom Lane
Dave Cramer [EMAIL PROTECTED] writes:
 AFAIK, the problem was the buffer manager

The buffer manager was the place that seemed to be hit hardest by Xeon's
problems with spinlock contention.  I think we've partially fixed that
issue in 8.1, but as we continue to improve the system's performance,
it's likely to surface as a bottleneck again in other places.

regards, tom lane



Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Josh Berkus
Frank,

   I've put together a short article and posted it online regarding
   performance tuning PostgreSQL in general.  I believe it helps to bring
   together the info in an easy-to-digest manner. I would appreciate any
   feedback, comments, and especially any technical corrections.

Looks nice. You should mark the link to the perf tips at Varlena.com as 
PostgreSQL 7.4 and augment it with the current version here:
www.powerpostgresql.com/PerfList
as well as the Annotated .Conf File:
www.powerpostgresql.com/Docs

For my part, I've generally seen that SATA disks still suck for read-write 
applications.   I generally rate 1 UltraSCSI = 2 SATA disks for anything but 
a 99% read application.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Frank Wiles
On Wed, 22 Jun 2005 10:16:03 -0700
Josh Berkus josh@agliodbs.com wrote:

 Frank,
 
I've put together a short article and posted it online regarding
performance tuning PostgreSQL in general.  I believe it helps to
   bring together the info in an easy-to-digest manner. I would
appreciate any feedback, comments, and especially any technical
corrections.
 
 Looks nice. You should mark the link to the perf tips at Varlena.com
 as  PostgreSQL 7.4 and augment it with the current version here:
 www.powerpostgresql.com/PerfList
 as well as the Annotated .Conf File:
 www.powerpostgresql.com/Docs

  Thanks! These changes have been incorporated. 
 
 For my part, I've generally seen that SATA disks still suck for
 read-write  applications.   I generally rate 1 UltraSCSI = 2 SATA
 disks for anything but  a 99% read application.

  I'll work this bit of wisdom in later tonight. Thanks again for the
  feedback. 

 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org
 -




Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Karim Nassar
On Wed, 2005-06-22 at 09:52 -0500, Frank Wiles wrote:
  I've put together a short article and posted it online regarding
   performance tuning PostgreSQL in general. 

Nice work! Some minor issues I saw:

* section Understanding the process, para 5:

Now that PostgreSQL has a plan of what it believes to be the best way
to retrieve the hardware it is time to actually get it.

Do you mean retrieve the data instead of retrieve the hardware?


* Perhaps some examples under Disk Configuration? 


* section Database Design and Layout, after new table layout:

Take for example the employee table above. Your probably only display
active employees throughout the majority of the application...

Do you mean You're probably only displaying?


HTH,
-- 
Karim Nassar [EMAIL PROTECTED]




Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Tobias Brox
[Frank Wiles - Wed at 09:52:27AM -0500]
   I've put together a short article and posted it online regarding
   performance tuning PostgreSQL in general.  I believe it helps to bring
   together the info in an easy-to-digest manner. I would appreciate any
   feedback, comments, and especially any technical corrections.  

I did not read through the whole article, but I already have some comments:

work_mem was formerly sort_mem.  As many of us still use pg7, you should
probably have a note about it.

There are already quite a few short articles on the web about this issue, and
that was actually my starting point when I was assigned the task of tweaking
the database performance.  I think diversity is a good thing; some of the
short articles were relatively outdated, others were not very well written.
Also, I never had the chance to do proper benchmarking of the impact of my
changes in the configuration file; I just chose to trust some of the advice
when I saw almost the same advice repeated in several articles.

I think we need a comprehensive chapter about this in the manual, with
plenty of pointers, or perhaps some separate well-organized pages covering
all known issues.  It seems to me that many of the standard tips here are
repeated over and over again.

-- 
Tobias Brox, +86-13521622905
Nordicbet, IT dept



Re: [PERFORM] Performance Tuning Article

2005-06-22 Thread Keith Worthington




 I've put together a short article and posted it online regarding
 performance tuning PostgreSQL in general.  I believe it helps to
 bring together the info in an easy-to-digest manner. I would
 appreciate any feedback, comments, and especially any technical
 corrections.


Looks nice. You should mark the link to the perf tips at Varlena.com
as  PostgreSQL 7.4 and augment it with the current version here:
www.powerpostgresql.com/PerfList
as well as the Annotated .Conf File:
www.powerpostgresql.com/Docs



  Thanks! These changes have been incorporated. 
 


For my part, I've generally seen that SATA disks still suck for
read-write  applications.   I generally rate 1 UltraSCSI = 2 SATA
disks for anything but  a 99% read application.



  I'll work this bit of wisdom in later tonight. Thanks again for the
  feedback. 


 -
   Frank Wiles [EMAIL PROTECTED]
   http://www.wiles.org
 -


Frank,

A couple of things I wish I had been told when I started asking how to 
configure a new machine.


Use RAID 10 (striping across mirrored disks)
 or RAID 0+1 (mirror a striped array) for your data.
Use RAID 1 (mirror) for your OS
Use RAID 1 (mirror) for the WAL.

Don't put anything else on the array holding the WAL.

There have been problems with Xeon processors.

--
Kind Regards,
Keith



Re: [PERFORM] Performance Tuning

2005-02-09 Thread John Arbash Meinel
Chris Kratz wrote:
Hello All,
In contrast to what we hear from most others on this list, we find our
database servers are mostly CPU bound.  We are wondering if this is because
we have postgres configured incorrectly in some way, or if we really need
more powerful processor(s) to gain more performance from postgres.

If everything is cached in RAM, it's pretty easy to be CPU bound. You
very easily could be at this point if your database is only 2.6G and you
don't touch all the tables often.
I do believe that when CPU bound, the best thing to do is get faster CPUs.
...
Our question is simply this: is it better to invest in a faster processor at
this point, or are there configuration changes to make it faster?  I've done
some testing with 4x SCSI 10k drives and the performance didn't improve; in
fact it was actually marginally slower than the SATA drives.  One of our
developers is suggesting we should compile postgres from scratch for this
particular processor, and we may try that.  Any other ideas?
-Chris
On this particular development server, we have:
Athlon XP,3000
1.5G Mem
4x Sata drives in Raid 0

I'm very surprised you are doing RAID 0. You realize that if 1 drive
goes out, your entire array is toast, right? I would recommend doing
either RAID 10 (0+1), or even Raid 5 if you don't do a lot of writes.
Probably most important, though, is to look at the individual queries and
see what they are doing.
Postgresql 7.4.5 installed via RPM running on Linux kernel 2.6.8.1
Items changed in the postgresql.conf:
tcpip_socket = true
max_connections = 32
port = 5432
shared_buffers = 12288  # min 16, at least max_connections*2, 8KB each
sort_mem=16384
vacuum_mem = 32768  # min 1024, size in KB
max_fsm_pages = 6   # min max_fsm_relations*16, 6 bytes each
max_fsm_relations = 1000# min 100, ~50 bytes each
effective_cache_size = 115200   # typically 8KB each
random_page_cost = 1# units are one sequential page fetch cost

Most of these seem okay to me, but random_page_cost is *way* too low.
This should never be tuned below 2.  I think this says an index scan of
*all* rows is as cheap as a sequential scan of all rows, and that
should never be true.
What could actually be happening is that you are getting index scans
when a sequential scan would be faster.
I don't know what you would see, but what does explain analyze select
count(*) from blah; say?  If it is an index scan, you have your machine
mistuned.  select count(*) always grabs every row, and this is always
cheaper with a sequential scan.
John
=:-




Re: [PERFORM] Performance Tuning

2005-02-09 Thread Chris Kratz
On Wednesday 09 February 2005 03:38 pm, John Arbash Meinel wrote:
...
 I'm very surprised you are doing RAID 0. You realize that if 1 drive
 goes out, your entire array is toast, right? I would recommend doing
 either RAID 10 (0+1), or even Raid 5 if you don't do a lot of writes.

<grin>  Yeah, we know.  This is a development server and we drop and reload 
databases regularly (sometimes several times a day).  In this case we don't 
really care about the integrity of the data since it's for our developers to 
test code against.  Also, the system is on a mirrored set of drives.  On our 
live servers we have hardware raid 1 at this point for the data drives.  When 
I/O becomes a bottleneck, we are planning on moving to Raid 10 for the data 
and Raid 1 for the transaction log with as many drives as I can twist arms 
for.  Up to this point it has been easier just to stuff the servers full of 
memory and let the OS cache the db in memory.  We know that at some point 
this will no longer work, but for now it is.

As a side note, I learned something very interesting for our developers here.
We had been doing a drop database and then a reload off a db dump from our
live server for test data.  This takes 8-15 minutes depending on the server
(the one above takes about 8 minutes).  I learned through testing that I can
use create database ... template some_other_database and make a duplicate in
about 2.5 minutes, which is a huge gain for us.  We can load a pristine copy,
make a duplicate, do our testing on the duplicate, drop the duplicate and
create a new duplicate in less than five minutes.

Cool.
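
For reference, the pattern Chris describes looks like this (database names
are hypothetical):

CREATE DATABASE test_copy TEMPLATE pristine_db;
-- ... run destructive tests against test_copy, then:
DROP DATABASE test_copy;

Note that CREATE DATABASE will refuse to copy a template database while
anyone else is connected to it.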

 Probably most important, though is to look at the individual queries and
 see what they are doing.

 Postgresql 7.4.5 installed via RPM running on Linux kernel 2.6.8.1
 
 Items changed in the postgresql.conf:
 
 tcpip_socket = true
 max_connections = 32
 port = 5432
 shared_buffers = 12288   # min 16, at least max_connections*2, 
 8KB each
 sort_mem=16384
 vacuum_mem = 32768   # min 1024, size in KB
 max_fsm_pages = 6# min max_fsm_relations*16, 6 bytes each
 max_fsm_relations = 1000 # min 100, ~50 bytes each
 effective_cache_size = 115200# typically 8KB each
 random_page_cost = 1 # units are one sequential page fetch cost

  Most of these seem okay to me, but random_page_cost is *way* too low.
  This should never be tuned below 2.  I think this says an index scan of
  *all* rows is as cheap as a sequential scan of all rows, and that
  should never be true.

You caught me.  I actually tweaked that today after finding a page that 
suggested doing that if the data was mostly in memory.  I have been running 
it at 2, and since we didn't notice any improvement, it will be going back to 
2.  

 What could actually be happening is that you are getting index scans
 when a sequential scan would be faster.

 I don't know what you would see, but what does explain analyze select
  count(*) from blah; say?  If it is an index scan, you have your machine
  mistuned.  select count(*) always grabs every row, and this is always
  cheaper with a sequential scan.

 John
 =:-
With a random_page_cost set to 1, on a larger table a select count(*) nets 
this...

 QUERY PLAN
--
 Aggregate  (cost=9848.12..9848.12 rows=1 width=0) (actual time=4916.869..4916.872 rows=1 loops=1)
   ->  Seq Scan on answer  (cost=0.00..8561.29 rows=514729 width=0) (actual time=0.011..2624.202 rows=514729 loops=1)
 Total runtime: 4916.942 ms
(3 rows)

Now here is a very curious thing.  If I turn on timing and run the count 
without explain analyze, I get...

 count

 514729
(1 row)

Time: 441.539 ms

How odd.  Running the explain adds 4.5s to it.  Running the explain again goes 
back to almost 5s.  Now I wonder why that would be different.

Changing random_page_cost back to 2 nets little difference (4991.940 ms for
explain and 496 ms), but we will leave it at that for now.

-- 
Chris Kratz
Systems Analyst/Programmer
VistaShare LLC
www.vistashare.com




Re: [PERFORM] Performance Tuning

2005-02-09 Thread Chris Kratz
On Wednesday 09 February 2005 03:59 pm, Greg Stark wrote:
 Chris Kratz [EMAIL PROTECTED] writes:
  We continue to tune our individual queries where we can, but it seems we
  still are waiting on the db a lot in our app.  When we run most queries,
  top shows the postmaster running at 90%+ constantly during the duration
  of the request. The disks get touched occasionally, but not often.  Our
  database on disk is around 2.6G and most of the working set remains
  cached in memory, hence the few disk accesses.  All this seems to point
  to the need for faster processors.

 I would suggest looking at the top few queries that are taking the most
 cumulative time on the processor. It sounds like the queries are doing a
 ton of logical i/o on data that's cached in RAM. A few indexes might cut
 down on the memory bandwidth needed to churn through all that data.

Hmmm, yes we continue to use indexes judiciously.  I actually think we've 
overdone it in some cases since inserts are starting to slow in some critical 
areas.

  Items changed in the postgresql.conf:
  ...
  random_page_cost = 1# units are one sequential page fetch 
  cost

 This makes it nigh impossible for the server to ever make a sequential
 scan when an index would suffice. What query made you do this? What plan
 did it fix?

Yes, it got set back to 2.  I was testing various settings suggested by a 
posting in the archives and that one didn't get reset.

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PERFORM] Performance Tuning

2005-02-09 Thread Chris Kratz
On Wednesday 09 February 2005 05:08 pm, Merlin Moncure wrote:
  Hello All,
 
  In contrast to what we hear from most others on this list, we find our
  database servers are mostly CPU bound.  We are wondering if this is
  because
  we have postgres configured incorrectly in some way, or if we really

 need

  more powerfull processor(s) to gain more performance from postgres.

 Yes, many apps are not I/O bound (mine isn't).  Here are factors that
 are likely to make your app CPU bound:

 1. Your cache hit ratio is very high
 2. You have a lot of concurrency.
 3. Your queries are complex, for example, doing sorting or statistics
 analysis

For now, it's number 3.  Relatively low usage, but very complex sql.

 4. Your queries are simple, but the server has to process a lot of them
 (transaction overhead becomes significant) sequentially.
 5. You have context switching problems, etc.

 On the query side, you can tune things down considerably...try to keep
 sorting to a minimum (order on keys, avoid distinct where possible,
 use 'union all', not 'union').  Basically, reduce individual query time.
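
To illustrate that last point (table and column names here are made up):

    -- UNION must sort or hash the combined result to remove duplicates;
    -- UNION ALL simply concatenates the two result sets.  If the branches
    -- are known to be disjoint, or duplicates are acceptable, UNION ALL
    -- skips that CPU-heavy pass entirely.
    SELECT id, name FROM active_customers
    UNION ALL
    SELECT id, name FROM archived_customers;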

 Other stuff:
 For complex queries, use views to cut out plan generation.
 For simple but frequently run queries (select a,b,c from t where k), use
 parameterized prepared statements for a 50% cpu savings, this may not be
 an option in some client interfaces.

Prepared statements are not something we've tried yet.  Perhaps we should look 
into that in cases where it makes sense.
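
For reference, a minimal sketch of the parameterized-prepared-statement
idea, using the shape of the example query quoted above (table and column
names are illustrative):

    PREPARE get_row (int) AS
        SELECT a, b, c FROM t WHERE k = $1;

    EXECUTE get_row(42);   -- parse and plan work is skipped on each call
    EXECUTE get_row(43);

    DEALLOCATE get_row;    -- prepared statements only live for the session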


 On the hardware side, you will get improvements by moving to Opteron,
 etc.

 Merlin

Well, that's what we were looking for.  

---

It sounds like our configuration as it stands is probably about as good as we 
are going to get with the hardware we have at this point.

We are CPU bound, reflecting the fact that we tend to have complex statements 
doing aggregates, sorts and group bys.

The solutions appear primarily to be:
1. Going to faster hardware, of which Opterons would probably be about the 
only choice.  And even that probably won't be a huge difference.
2. Moving to more materialized views and prepared statements where we can 
(see the sketch after this list).
3. Continuing to tweak the sql behind our app.
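
Native materialized views did not exist yet when this thread was written
(CREATE MATERIALIZED VIEW arrived in PostgreSQL 9.3); on a modern server
the idea looks roughly like this, with made-up table and column names:

    CREATE MATERIALIZED VIEW daily_answer_counts AS
        SELECT date_trunc('day', created_at) AS day, count(*) AS n
          FROM answer
         GROUP BY 1;

    -- Re-run the aggregate whenever slightly stale data is acceptable.
    REFRESH MATERIALIZED VIEW daily_answer_counts;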

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] Performance Tuning

2005-02-09 Thread PFC

As a side note, I learned something very interesting for our developers here.
We had been doing a drop database and then a reload off a db dump from our
live server for test data.  This takes 8-15 minutes depending on the server
(the one above takes about 8 minutes).  I learned through testing that I can
use create database ... template some_other_database and make a duplicate in
about 2.5 minutes, which is a huge gain for us.  We can load a pristine copy,
make a duplicate, do our testing on the duplicate, drop the duplicate and
create a new duplicate in less than five minutes.
	I think that's because postgres just makes a file copy from the template.
Thus you could make it 2x faster if you put the template in another
tablespace on another drive.
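
A minimal sketch of the trick, with made-up database names (the source
database must have no other active connections while it is being copied):

    CREATE DATABASE test_copy TEMPLATE pristine_db;

    -- ...run destructive tests against test_copy, then discard it:
    DROP DATABASE test_copy;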

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] Performance Tuning

2005-02-09 Thread Mike Rylander
On Wed, 9 Feb 2005 17:30:41 -0500, Chris Kratz
[EMAIL PROTECTED] wrote:
 The solutions appear to primarily be:
 1. Going to faster hardware of which probably Opterons would be about the only
 choice.  And even that probably won't be a huge difference.

I'd beg to differ on that last part.  The difference between a 3.6GHz
Xeon and a 2.8GHz Opteron is a ~150% speed increase for the Opteron in my
CPU-bound app.  This is because the memory bandwidth on the Opteron is
ENORMOUS compared to the Xeon's.  Add to that the fact that you
actually get to use more than about 2G of RAM directly, and you've got
the perfect platform for a high speed database on a budget.

 2. Moving to more materialized views and prepared statements where we can.

Definitely worth investigating.  I wish I could, but I can't get my
customers to even consider slightly out of date stats :(

 3. Continue to tweak the sql behind our app.

Short of an Opteron based system, this is by far your best bet.

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] Performance Tuning

2005-02-09 Thread Christopher Browne
The world rejoiced as [EMAIL PROTECTED] (PFC) wrote:
 As a side note, I learned something very interesting for our
 developers here.
 We had been doing a drop database and then a reload off a db dump from
 our live server for test data.  This takes 8-15 minutes depending on
 the server (the one above takes about 8 minutes).  I learned through
 testing that I can use create database ... template some_other_database
 and make a duplicate in about 2.5 minutes, which is a huge gain for us.
 We can load a pristine copy, make a duplicate, do our testing on the
 duplicate, drop the duplicate and create a new duplicate in less than
 five minutes.

   I think that's because postgres just makes a file copy from the
 template.  Thus you could make it 2x faster if you put the template
 in another tablespace on another drive.

I had some small amusement trying this feature out in one of our
environments today...

We needed to make a copy of one of the databases we're replicating for
the sysadmins to use for some testing.

I figured using the template capability was:
 a) Usefully educational to one of the other DBAs, and
 b) Probably a quick way to copy the data over.

We shortly discovered that we had to shut off the Slony-I daemon in
order to get exclusive access to the database; no _big_ deal.

At that point, he hit ENTER, and rather quickly saw...
CREATE DATABASE.

We then discovered that the sysadmins wanted the test DB to be on one
of the other servers.  Oops.  Oh, well, we'll have to do this on the
other server; no big deal.

Entertainment ensued...  My, that's taking a while...  At about the
point that we started thinking there might be a problem...

CREATE DATABASE

The entertainment was that the first box is one of those spiffy new
4-way Opteron boxes, whilst the slow one was a 4-way Xeon...  Boy,
those Opterons are faster...
-- 
output = reverse(moc.liamg @ enworbbc)
http://cbbrowne.com/info/rdbms.html
No matter how far you have gone on the wrong road, turn back.
-- Turkish proverb

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PERFORM] Performance Tuning

2005-02-09 Thread Rod Taylor
On Wed, 2005-02-09 at 15:01 -0500, Chris Kratz wrote:
 Hello All,
 
 In contrast to what we hear from most others on this list, we find our 
 database servers are mostly CPU bound.  We are wondering if this is because 
 we have postgres configured incorrectly in some way, or if we really need 
 more powerfull processor(s) to gain more performance from postgres.  

Not necessarily.  I had a very disk-bound system, bought a bunch of
higher-end equipment (which focuses on IO) and now have a (faster) but
CPU-bound system.

It's just the way the cookie crumbles.

Some things to watch for are large calculations which are easy to move
client side, such as queries that sort for display purposes, or data
types which aren't really required (using numeric where an integer would
do).
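
A quick hypothetical illustration of that last point, as a psql session
(\timing is a psql meta-command):

    \timing on
    -- Integer arithmetic is done in hardware...
    SELECT sum(i)          FROM generate_series(1, 1000000) AS s(i);
    -- ...while numeric is arbitrary precision and computed in software,
    -- so the same sum burns noticeably more CPU.
    SELECT sum(i::numeric) FROM generate_series(1, 1000000) AS s(i);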

 We continue to tune our individual queries where we can, but it seems we
 still are waiting on the db a lot in our app.  When we run most queries,
 top shows the postmaster running at 90%+ constantly during the duration
 of the request.

Is this for the duration of a single request or 90% constantly?

If it's a single request, odds are you're going through much more
information than you need to.  Lots of aggregate work (max/min) perhaps,
or count(*)s where an approximation would do?
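
One common approximation, for reference: the planner's own row estimate
from pg_class, refreshed by VACUUM and ANALYZE (the table name is borrowed
from earlier in the thread):

    -- Reads a single catalog row instead of scanning the whole table.
    SELECT reltuples::bigint AS approx_rows
      FROM pg_class
     WHERE relname = 'answer';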

 Our question is simply this, is it better to invest in a faster processor at 
 this point, or are there configuration changes to make it faster?  I've done 

If it's for a single request, you cannot buy a single processor much
faster than the one you describe.

Want to send us a few EXPLAIN ANALYZE's of your longer running queries?

-- 
Rod Taylor [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster