Re: [PERFORM] performance config help

2010-01-14 Thread Dimitri Fontaine
Bob Dusek  writes:
> So, pgBouncer is pretty good.  It doesn't appear to be as good as
> limiting TCON and using pconnect, but since we can't limit TCON in a
> production environment, we may not have a choice.

You can still use pconnect() with pgbouncer, in transaction mode, if
your application is compatible with that (no advisory locks or other
session level tricks). 
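
For reference, a minimal pgbouncer.ini sketch for transaction pooling looks
something like this (database name, paths and pool sizes are only illustrative,
not taken from Bob's setup); pconnect() then points at port 6432 instead of 5432:

; illustrative pgbouncer.ini for transaction-level pooling
[databases]
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20
max_client_conn = 500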

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com
Hello together,

I need to increase the write performance when inserting
bytea of 8MB. I am using 8.2.4 on windows with libpq.

The test setting is simple:

I write a byte array (bytea) of 8 MB of random data 100 times
into a table having a binary column (and oids and 3 other
int columns; the oids are indexed). I realized that writing 8 MB
of 0-bytes is optimized away. With random data, the disk
space is filled with 800MB on each run, as expected. I use a
transaction around the insert command.

This takes about 50s, so, 800MB/50s = 16MB/s.

However the harddisk (sata) could write 43 MB/s in the worst
case! Why is write performance limited to 16 MB/s?


Some more hints what I do:

I use PQexecParams() and the INSERT ... $1 notation so that I do
NOT have to create an additional escaped string from the data, but
instead pass a pointer to the 8MB data buffer.

I altered the binary column to STORAGE EXTERNAL.

Some experiments with postgresql.conf (fsync off,
shared_buffers=1000MB, checkpoint_segments=256) did not
change the ~50s much (sometimes 60s, sometimes a little less).

4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.


Do you have any further idea why 16MB/s seems to be the
limit here?

Thank You
 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Andy Colson

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Robert Haas wrote:


I'm kind of surprised that there are disk I/O subsystems that are so
bad that a single thread doing non-stop I/O can take down the whole
server.  Is that normal?


No.


Does it happen on non-Windows operating
systems?


Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note: on Linux, would using the deadline scheduler resolve that?

-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Ivan Voras

fka...@googlemail.com wrote:

Hello together,

I need to increase the write performance when inserting
bytea of 8MB. I am using 8.2.4 on windows with libpq.




This takes about 50s, so, 800MB/50s = 16MB/s.

However the harddisk (sata) could write 43 MB/s in the worst
case! Why is write performance limited to 16 MB/s?


Do you have any further idea why 16MB/s seems to be the
limit here?


Are you doing it locally or over a network? If you are accessing the
server over a network, then that could be where the bottleneck is.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread tom
Hi,

=== Problem ===

I have a db table "data_measurand" with about 60,000,000 (60 million)
rows, and the following query takes about 20-30 seconds (with psql):

mydb=# select count(*) from data_measurand;
  count
----------
 60846187
(1 row)


=== Question ===

- What can I do to improve the performance of the data_measurand table?
 
=== Background ===

I created an application with Django 1.1 ( http://djangoproject.com ) to
collect, analyze and visualize measurement data.

=== My System ===

= Postgres Version =
postgres=# select version();
                                           version
-----------------------------------------------------------------------------------------------
 PostgreSQL 8.3.9 on i486-pc-linux-gnu, compiled by GCC gcc-4.3.real (Debian 4.3.2-1.1) 4.3.2
(1 row)

I installed postgres with apt-get from debian lenny without any
modifications.

= Debian Lenny Kernel Version =
lenny:~# uname  -a
Linux or.ammonit.com 2.6.26-2-686-bigmem #1 SMP Wed Nov 4 21:12:12 UTC
2009 i686 GNU/Linux

= Hardware = 
model name  : AMD Athlon(tm) 64 X2 Dual Core Processor 6000+
cpu MHz : 1000.000
cache size  : 512 KB
MemTotal: 8281516 kB (8 GB)

I use software RAID and LVM for Logical Volume Management. The filesystem
is ext3.



=== My Table Definitions ===

mydb=# \d data_measurand;
                                     Table "public.data_measurand"
     Column      |          Type          |                          Modifiers
-----------------+------------------------+--------------------------------------------------------------
 id              | integer                | not null default nextval('data_measurand_id_seq'::regclass)
 entry_id        | integer                | not null
 sensor_id       | integer                | not null
 avg_value       | numeric(10,4)          |
 avg_count_value | integer                |
 min_value       | numeric(10,4)          |
 max_value       | numeric(10,4)          |
 sigma_value     | numeric(10,4)          |
 unit            | character varying(20)  | not null
 status          | integer                | not null
 comment         | character varying(255) | not null
Indexes:
    "data_measurand_pkey" PRIMARY KEY, btree (id)
    "data_measurand_entry_id_68e2e3fe" UNIQUE, btree (entry_id, sensor_id)
    "data_measurand_avg_count_value" btree (avg_count_value)
    "data_measurand_avg_value" btree (avg_value)
    "data_measurand_comment" btree (comment)
    "data_measurand_entry_id" btree (entry_id)
    "data_measurand_max_value" btree (max_value)
    "data_measurand_min_value" btree (min_value)
    "data_measurand_sensor_id" btree (sensor_id)
    "data_measurand_sigma_value" btree (sigma_value)
    "data_measurand_status" btree (status)
    "data_measurand_unit" btree (unit)
Foreign-key constraints:
    "entry_id_refs_id_50fa9bdf" FOREIGN KEY (entry_id) REFERENCES data_entry(id) DEFERRABLE INITIALLY DEFERRED
    "sensor_id_refs_id_5ed84c7c" FOREIGN KEY (sensor_id) REFERENCES sensor_sensor(id) DEFERRABLE INITIALLY DEFERRED



mydb=# \d data_entry;
                                   Table "public.data_entry"
      Column      |           Type           |                        Modifiers
------------------+--------------------------+----------------------------------------------------------
 id               | integer                  | not null default nextval('data_entry_id_seq'::regclass)
 project_id       | integer                  | not null
 logger_id        | integer                  | not null
 original_file_id | integer                  | not null
 datetime         | timestamp with time zone | not null
Indexes:
    "data_entry_pkey" PRIMARY KEY, btree (id)
    "data_entry_logger_id_197f5d41" UNIQUE, btree (logger_id, datetime)
    "data_entry_datetime" btree (datetime)
    "data_entry_logger_id" btree (logger_id)
    "data_entry_original_file_id" btree (original_file_id)
    "data_entry_project_id" btree (project_id)
Foreign-key constraints:
    "logger_id_refs_id_5f73cf46" FOREIGN KEY (logger_id) REFERENCES logger_logger(id) DEFERRABLE INITIALLY DEFERRED
    "original_file_id_refs_id_44e8d3b1" FOREIGN KEY (original_file_id) REFERENCES data_originalfile(id) DEFERRABLE INITIALLY DEFERRED
    "project_id_refs_id_719fb302" FOREIGN KEY (project_id) REFERENCES project_project(id) DEFERRABLE INITIALLY DEFERRED



mydb=# \d project_project;
                                  Table "public.project_project"
    Column     |          Type          |                          Modifiers
---------------+------------------------+---------------------------------------------------------------
 id            | integer                | not null default nextval('project_project_id_seq'::regclass)
 auth_group_id | integer                | not null
 name          | character varying(200) | not null
 timezone      | character varying(200) |
 longitude     | double precision       |
 latitude      | double precision       |
 altitude      | double precision       |
 co

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Matthew Wakeling

On Thu, 14 Jan 2010, fka...@googlemail.com wrote:

This takes about 50s, so, 800MB/50s = 16MB/s.

However the harddisk (sata) could write 43 MB/s in the worst
case! Why is write performance limited to 16 MB/s?


Several reasons:

The data needs to be written first to the WAL, in order to provide 
crash-safety. So you're actually writing 1600MB, not 800.


Postgres needs to update a few other things on disc (indexes on the large 
object table maybe?), and needs to call fsync a couple of times. That'll 
add a bit of time.


Your discs can't write 43MB/s in the *worst case* - the worst case is lots 
of little writes scattered over the disc, where it would be lucky to 
manage 1MB/s. Not all of the writes Postgres makes are sequential. A handy 
way of knowing how sequential the writes are is to listen to the disc as 
it writes - the clicking sounds are where it has to waste time moving the 
disc head from one part of the disc to another.


Matthew

--
No trees were killed in the sending of this message.  However a large
number of electrons were terribly inconvenienced.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Aidan Van Dyk
* fka...@googlemail.com  [100114 09:29]:
 
> This takes about 50s, so, 800MB/50s = 16MB/s.
> 
> However the harddisk (sata) could write 43 MB/s in the worst
> case! Why is write performance limited to 16 MB/s?
 
> I altered the binary column to STORAGE EXTERNAL.
> 
> Some experiments with postgresql.conf (fsync off,
> shared_buffers=1000MB, checkpoint_segments=256) did not
> change the 50s- much (somtimes 60s sometimes a little less).
> 
> 4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.
> 
> 
> Do you have any further idea why 16MB/s seems to be the
> limit here?

So, your SATA disk can do 43MB/s of sequential writes, but your example
is doing:
1) Sequential writes to WAL
2) Random writes to your index
3) Sequential writes to the table heap
4) Sequential writes to the table's toast heap
5) Any other OS-based FS overhead

Now, writes #2, 3 and 4 don't happen completely concurrently with your
WAL (some of them are still in postgres buffers), but they easily
interrupt the stream of WAL enough to make it believable that,
with everything going on on the disk, you can only write WAL at a
*sustained* 16 MB/s.

If you're running a whole system on a single SATA disk which can stream
43MB/s, remember that for *every* other read/write sent to the disk, you
lose up to 1MB/s (12ms seek time, read/write, and back).  And in that
"every other", you have FS metadata updates, any other file writes the
FS flushes, etc...  20 additional blocks that are either read or
written to disk are going to completely chop your 43MB/s rate...

a.

-- 
Aidan Van Dyk Create like a god,
ai...@highrise.ca   command like a king,
http://www.highrise.ca/   work like a slave.




Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Matthew Wakeling

On Thu, 14 Jan 2010, tom wrote:

I have a db table "data_measurand" with about 60,000,000 (60 million)
rows, and the following query takes about 20-30 seconds (with psql):

mydb=# select count(*) from data_measurand;
  count
----------
 60846187
(1 row)


Sounds pretty reasonable to me. Looking at your table, the rows are maybe 
200 bytes wide? That's 12GB of data for Postgres to munch through. 30 
seconds is really rather quick for that (400MB/s). What sort of RAID array 
is managing to give you that much?



I use a software raid and LVM for Logical Volume Management. Filesystem
is ext3


Ditch lvm.


This is an FAQ. Counting the rows in a table is an expensive operation in 
Postgres. It can't be answered directly from an index. If you want, you 
can keep track of the number of rows yourself with triggers, but beware 
that this will slow down write access to the table.
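
A minimal sketch of that trigger-based approach (the helper table and trigger
names here are made up, and note that the single counter row will serialize
concurrent writers):

CREATE TABLE data_measurand_count (n bigint NOT NULL);
INSERT INTO data_measurand_count SELECT count(*) FROM data_measurand;

CREATE FUNCTION track_measurand_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE data_measurand_count SET n = n + 1;
    ELSE  -- DELETE
        UPDATE data_measurand_count SET n = n - 1;
    END IF;
    RETURN NULL;  -- AFTER trigger, return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER data_measurand_counter
    AFTER INSERT OR DELETE ON data_measurand
    FOR EACH ROW EXECUTE PROCEDURE track_measurand_count();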


Matthew

--
Nog: Look! They've made me into an ensign!
O'Brien: I didn't know things were going so badly.
Nog: Frightening, isn't it?

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Florian Weimer
> Do you have any further idea why 16MB/s seems to be the limit here?

BYTEA deserialization is very slow, and this could be a factor here.
Have you checked that you are in fact I/O bound?

You can speed things up by sending the data in binary, by passing
appropriate parameters to PQexecParams().
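
A minimal sketch of what that looks like (table and column names are made up,
error handling is reduced to the bare minimum; the key point is
paramFormats[0] = 1, which ships the buffer as a binary parameter and skips
the textual bytea escaping/parsing on both client and server):

/* illustrative libpq example, not the original poster's code */
#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void)
{
    PGconn *conn = PQconnectdb("dbname=test");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    size_t len = 8 * 1024 * 1024;          /* the 8 MB payload */
    char *buf = malloc(len);               /* fill with real data in practice */

    const char *paramValues[1]  = { buf };
    int         paramLengths[1] = { (int) len };
    int         paramFormats[1] = { 1 };   /* 1 = binary, 0 = text */

    PGresult *res = PQexecParams(conn,
                                 "INSERT INTO images (data) VALUES ($1)",
                                 1, NULL,
                                 paramValues, paramLengths, paramFormats,
                                 0);
    if (PQresultStatus(res) != PGRES_COMMAND_OK)
        fprintf(stderr, "insert failed: %s", PQerrorMessage(conn));

    PQclear(res);
    free(buf);
    PQfinish(conn);
    return 0;
}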

-- 
Florian Weimer
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread A. Kretschmer
In response to tom :
> Hi,
> 
> === Problem ===
> 
> I have a db table "data_measurand" with about 60,000,000 (60 million)
> rows, and the following query takes about 20-30 seconds (with psql):
> 
> mydb=# select count(*) from data_measurand;
>   count   
> --
>  60846187
> (1 row)
> 
> 
> === Question ===
> 
> - What can i do to improve the performance for the data_measurand table?

Short answer: nothing.

Long answer: PG has to check the visibility of each record, so it
forces a seq scan.

But you can get an estimate: ask pg_class (a system catalog); its
reltuples column contains an estimated row count.
http://www.postgresql.org/docs/current/static/catalog-pg-class.html
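
For example, something along these lines returns instantly (but is only as
accurate as the last ANALYZE/autovacuum run):

SELECT reltuples::bigint AS estimated_rows
  FROM pg_class
 WHERE relname = 'data_measurand';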

If you really need the correct row count, you should create a TRIGGER
and count all INSERTs and DELETEs with it.


Regards, Andreas
-- 
Andreas Kretschmer
Kontakt:  Heynitz: 035242/47150,   D1: 0160/7141639 (mehr: -> Header)
GnuPG: 0x31720C99, 1006 CCB4 A326 1D42 6431  2EB0 389D 1DC2 3172 0C99

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Kevin Grittner
Matthew Wakeling  wrote:
 
> This is an FAQ.
 
I just added it to the wiki FAQ page:
 
http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F
 
-Kevin


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] performance config help

2010-01-14 Thread Pierre Frédéric Caillau d



So, pgBouncer is pretty good.  It doesn't appear to be as good as
limiting TCON and using pconnect, but since we can't limit TCON in a
production environment, we may not have a choice.


Actually, you can: use lighttpd and PHP/FastCGI.

	Lighttpd handles the network stuff, and funnels/queues any number of  
client connections into a limited number of PHP fastcgi processes. You can  
configure this process pool to your tastes.


	Rather than instantiating 1 PHP interpreter (and 1 postgres backend) per client
connection, you can set it up for a maximum of N PHP procs. If PHP waits a lot
on IO (you use url fopen, that kind of thing) you can set N=5..10 per
core, but if you don't use that, N=2-3 per core is good. It needs to be
tuned to your application's needs.
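
For illustration, a minimal lighttpd.conf sketch of such a pool (paths and
pool sizes are made up; tune max-procs and PHP_FCGI_CHILDREN to the 2-3
processes per core rule above):

server.modules += ( "mod_fastcgi" )

fastcgi.server = ( ".php" =>
  (( "bin-path"        => "/usr/bin/php-cgi",
     "socket"          => "/tmp/php.fastcgi.socket",
     "max-procs"       => 4,
     "bin-environment" => ( "PHP_FCGI_CHILDREN"     => "2",
                            "PHP_FCGI_MAX_REQUESTS" => "1000" )
  ))
)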


	The idea is that if you have enough processes to keep your CPU busy,
adding more will just fill your RAM, trash your CPU cache, add more
context switches, and generally lower your total throughput. The same is true
for Postgres, too.


	I've switched from apache to lighttpd on a rather busy community site and
the difference in performance and memory usage was quite noticeable.
Also, this site used MySQL (argh), so the occasional locking on some MyISAM
tables would become really itchy unless the number of concurrent processes
was kept to a manageable level.


	When you bring your number of postgres processes down to some manageable
level (plot a curve of throughput versus processes and select the
maximum), if postgres still spends idle time waiting for locks, you'll
need to do some exploration:


- use the lock view facility in postgres (see the example query after this list)
	- check your triggers: are you using some trigger that updates a count
as rows are modified? This can be a point of contention.

- check your FKs too.
- try fsync=off
- try to put the WAL and tables on a ramdisk.
	If you have even a few % iowait, maybe that hides the fact that 1  
postmaster is fsyncing and perhaps 10 others are waiting on it to finish,  
which doesn't count as iowait...


- recompile postgres and enable lwlock timing
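
As an example of the lock view mentioned in the first point, something like
this (8.3/8.4 column names, where pg_stat_activity still has procpid and
current_query) lists backends waiting on ungranted locks:

SELECT w.pid, w.locktype, w.relation::regclass, w.mode, a.current_query
  FROM pg_locks w
  JOIN pg_stat_activity a ON a.procpid = w.pid
 WHERE NOT w.granted;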



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Ivan Voras

Kevin Grittner wrote:

Matthew Wakeling  wrote:
 

This is an FAQ.
 
I just added it to the wiki FAQ page:
 
http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F


Maybe you could add a short note on why an estimate like the one from the
pg_class table is usually enough.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance



Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Kevin Grittner
Ivan Voras  wrote:
 
> Maybe you could add a short note why an estimation like from the 
> pg_class table is usually enough.
 
OK.  Will do.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Pierre Frédéric Caillau d



However the harddisk (sata) could write 43 MB/s in the worst
case! Why is write performance limited to 16 MB/s?

Some more hints what I do:

I use PQexecParams() and the INSERT ... $1 notation so that I do
NOT have to create an additional escaped string from the data, but
instead pass a pointer to the 8MB data buffer.

I altered the binary column to STORAGE EXTERNAL.

Some experiments with postgresql.conf (fsync off,
shared_buffers=1000MB, checkpoint_segments=256) did not
change the ~50s much (sometimes 60s, sometimes a little less).

4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.


Big CPU and slow disk...

	You should add another disk just for the WAL -- disks are pretty cheap  
these days.
	Writing the WAL on a second disk is the first thing to do on a  
configuration like yours, if you are limited by writes.

It also reduces the fsync lag a lot since the disk is only doing WAL.
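
A sketch of the usual way to do that on a Unix-like system (paths here are
made up; on Windows a directory junction plays the role of the symlink):

pg_ctl -D /var/lib/pgsql/data stop
mv /var/lib/pgsql/data/pg_xlog /mnt/wal/pg_xlog
ln -s /mnt/wal/pg_xlog /var/lib/pgsql/data/pg_xlog
pg_ctl -D /var/lib/pgsql/data start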

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Pierre Frédéric Caillau d



"high CPU usage" It might very well be "high IO usage".


Try this :

Copy (using explorer, the shell, whatever) a huge file.
This will create load similar to ALTER TABLE.
Measure throughput: how much is it?

	If your server blows up just like it did on ALTER TABLE, you've got an IO
system problem.

If everything is smooth, you can look into other things.

	How's your fragmentation? Did the disk ever get full? What does the
task manager say (swap in/out, disk queue length, etc.)?


PS : try a separate tablespace on another disk.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Andreas Kretschmer
Hi,

version: 8.4.2


I have a table called values:

test=*# \d values
    Table "public.values"
 Column |  Type   | Modifiers
--------+---------+-----------
 id     | integer |
 value  | real    |
Indexes:
    "idx_id" btree (id)

The table contains 10 random rows and is analysed.



And i have 2 queries, both returns the same result:

test=*# explain analyse select id, avg(value) over (partition by value) from values where id = 50 order by id;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 WindowAgg  (cost=531.12..549.02 rows=1023 width=8) (actual time=2.032..4.165 rows=942 loops=1)
   ->  Sort  (cost=531.12..533.68 rows=1023 width=8) (actual time=2.021..2.270 rows=942 loops=1)
         Sort Key: value
         Sort Method:  quicksort  Memory: 53kB
         ->  Bitmap Heap Scan on "values"  (cost=24.19..479.98 rows=1023 width=8) (actual time=0.269..1.167 rows=942 loops=1)
               Recheck Cond: (id = 50)
               ->  Bitmap Index Scan on idx_id  (cost=0.00..23.93 rows=1023 width=0) (actual time=0.202..0.202 rows=942 loops=1)
                     Index Cond: (id = 50)
 Total runtime: 4.454 ms
(9 rows)
 

Time: 4.859 ms
test=*# explain analyse select * from (select id, avg(value) over (partition by value) from values order by id) foo where id = 50;
                                                            QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan foo  (cost=22539.64..24039.64 rows=500 width=12) (actual time=677.196..722.975 rows=942 loops=1)
   Filter: (foo.id = 50)
   ->  Sort  (cost=22539.64..22789.64 rows=10 width=8) (actual time=631.991..690.411 rows=10 loops=1)
         Sort Key: "values".id
         Sort Method:  external merge  Disk: 2528kB
         ->  WindowAgg  (cost=6.32..12866.32 rows=10 width=8) (actual time=207.462..479.330 rows=10 loops=1)
               ->  Sort  (cost=6.32..11366.32 rows=10 width=8) (actual time=207.442..281.546 rows=10 loops=1)
                     Sort Key: "values".value
                     Sort Method:  external merge  Disk: 1752kB
                     ->  Seq Scan on "values"  (cost=0.00..1443.00 rows=10 width=8) (actual time=0.010..29.742 rows=10 loops=1)
 Total runtime: 725.362 ms
(11 rows)



No question, this is a silly query, but the problem is the 2nd query: it
is obviously not possible for the planner to put the where-condition
into the subquery. That's bad if I want to create a view:

test=*# create view view_values as select id, avg(value) over (partition by 
value) from values  order by id;
CREATE VIEW
Time: 41.280 ms
test=*# commit;
COMMIT
Time: 0.514 ms
test=# explain analyse select * from view_values where id=50;

It is the same bad plan with the Seq Scan on "values".


Is this a bug or PEBKAC or something else?




Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-performance mailing lis

Re: [PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Tom Lane
Andreas Kretschmer  writes:
> No question, this is a silly query, but the problem is the 2nd query: it
> is obviously not possible for the planner to put the where-condition
> into the subquery.

Well, yeah: it might change the results of the window functions.
I see no bug here.  Your second query asks for a much more complicated
computation, it's not surprising it takes longer.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Andreas Kretschmer
Tom Lane  wrote:

> Andreas Kretschmer  writes:
> > No question, this is a silly query, but the problem is the 2nd query: it
> > is obviously not possible for the planner to put the where-condition
> > into the subquery.
> 
> Well, yeah: it might change the results of the window functions.
> I see no bug here.  Your second query asks for a much more complicated
> computation, it's not surprising it takes longer.

Thank you for the fast answer.

But sorry, I disagree. It is the same query with the same result. I can't see
how the queries should return different results.

What have i overlooked?


tia, Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Tom Lane
Andreas Kretschmer  writes:
> Tom Lane  wrote:
>> I see no bug here.  Your second query asks for a much more complicated
>> computation, it's not surprising it takes longer.

> But sorry, I disagree. It is the same query with the same result. I can't see
> how the queries should return different results.

In the first query

select id, avg(value) over (partition by value) from values where id = 50 order 
by id;

the avg() calculations are being done over only rows with id = 50.  In
the second query

select * from (select id, avg(value) over (partition by value) from values  
order by id) foo where id = 50;

they are being done over all rows.  In this particular example you
happen to get the same result, but that's just because "avg(foo) over
partition by foo" is a dumb example --- it will necessarily just yield
identically foo.  In more realistic computations the results would be
different.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Greg Smith

Kevin Grittner wrote:

Matthew Wakeling  wrote:
 
  

This is an FAQ.

 
I just added it to the wiki FAQ page:
 
http://wiki.postgresql.org/wiki/FAQ#Why_is_.22SELECT_count.28.2A.29_FROM_bigtable.3B.22_slow.3F
  
The content was already there, just not linked into the main FAQ yet: 
http://wiki.postgresql.org/wiki/Slow_Counting


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com



Re: [PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Andreas Kretschmer
Tom Lane  wrote:

> Andreas Kretschmer  writes:
> > Tom Lane  wrote:
> >> I see no bug here.  Your second query asks for a much more complicated
> >> computation, it's not surprising it takes longer.
> 
> > But sorry, I disagree. It is the same query with the same result. I can't 
> > see
> > how the queries should return different results.
> 
> In the first query
> 
> select id, avg(value) over (partition by value) from values where id = 50 
> order by id;
> 
> the avg() calculations are being done over only rows with id = 50.  In
> the second query
> 
> select * from (select id, avg(value) over (partition by value) from values  
> order by id) foo where id = 50;
> 
> they are being done over all rows.  In this particular example you
> happen to get the same result, but that's just because "avg(foo) over
> partition by foo" is a dumb example --- it will necessarily just yield
> identically foo.  In more realistic computations the results would be
> different.

Okay, i believe you now ;-)

I will try to find a case with different results ...

Thx for your fast help!


Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Greg Smith

Andy Colson wrote:

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note, on linux, would using the deadline scheduler resolve 
that?


I've never seen the deadline scheduler resolve anything.  If you're out 
of I/O capacity and that's blocking other work, performance is dominated 
by the policies of the underlying controller/device caches.  Think about 
it a minute:  disks nowadays can easily have 32MB of buffer in them, 
right?  And random read/write operations are lucky to clear 2MB/s on 
cheap drives.  So once the drive is filled with requests, you can 
easily sit there for ten seconds before the scheduler even has any input 
on resolving the situation.  That's even more true if you've got a 
larger controller cache in the mix.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Kevin Grittner
Greg Smith  wrote:
 
> The content was already there, just not linked into the main FAQ
> yet: 
> http://wiki.postgresql.org/wiki/Slow_Counting
 
For a question asked this frequently, it should probably be in the
FAQ.  I'll add a link from there to the more thorough write-up.
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Andy Colson

On 1/14/2010 12:07 PM, Greg Smith wrote:

Andy Colson wrote:

On 1/13/2010 11:36 PM, Craig Ringer wrote:

Yes. My 3ware 8500-8 on a Debian Sarge box was so awful that launching a
terminal would go from a 1/4 second operation to a 5 minute operation
under heavy write load by one writer. I landed up having to modify the
driver to partially mitigate the issue, but a single user on the
terminal server performing any sort of heavy writing would still
absolutely nuke performance.


On a side note, on linux, would using the deadline scheduler resolve
that?


I've never seen the deadline scheduler resolve anything. If you're out
of I/O capacity and that's blocking other work, performance is dominated
by the policies of the underlying controller/device caches. Think about
it a minute: disks nowadays can easily have 32MB of buffer in them,
right? And random read/write operations are lucky to clear 2MB/s on
cheap drives. So once the drive is filled with requests, you can easily
sit there for ten seconds before the scheduler even has any input on
resolving the situation. That's even more true if you've got a larger
controller cache in the mix.



That makes sense.  So if there is very little io, or if there is way way 
too much, then the scheduler really doesn't matter.  So there is a slim 
middle ground where the io is within a small percent of the HD capacity 
where the scheduler might make a difference?


-Andy

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Greg Smith

Kevin Grittner wrote:

Greg Smith  wrote:
 
  

The content was already there, just not linked into the main FAQ
yet: 
http://wiki.postgresql.org/wiki/Slow_Counting

 
For a question asked this frequently, it should probably be in the

FAQ.  I'll add a link from there to the more thorough write-up.
 


There's a whole list of FAQs that are documented on the wiki but not in 
the main FAQ yet leftover from before the main FAQ was hosted there. You 
can see them all at 
http://wiki.postgresql.org/wiki/Frequently_Asked_Questions


I just haven't had time to merge those all usefully into the main FAQ.

--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com



Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Greg Smith

Andy Colson wrote:
So if there is very little io, or if there is way way too much, then 
the scheduler really doesn't matter.  So there is a slim middle ground 
where the io is within a small percent of the HD capacity where the 
scheduler might make a difference?


That's basically how I see it.  There seem to be people who run into 
workloads in the middle ground where the scheduler makes a world of 
difference.  I've never seen one myself, and suspect that some of the 
reports of deadline being a big improvement just relate to some bugginess 
in the default CFQ implementation that I just haven't encountered.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] bad execution plan for subselects containing windowing-function

2010-01-14 Thread Andreas Kretschmer
Andreas Kretschmer  wrote:
> > they are being done over all rows.  In this particular example you
> > happen to get the same result, but that's just because "avg(foo) over
> > partition by foo" is a dumb example --- it will necessarily just yield
> > identically foo.  In more realistic computations the results would be
> > different.
> 
> Okay, i believe you now ;-)
> 
> I will try to find a case with different results ...

I have got it!


test=# select * from values;
 id | value
----+-------
  1 |    10
  2 |    20
  3 |    30
  4 |    40
  5 |    50
  6 |    60
  7 |    70
  8 |    80
  9 |    90
(9 rows)

Time: 0.240 ms
test=*# select id, sum(value) over (order by id) from values where id = 5;
 id | sum
----+-----
  5 |  50
(1 row)

Time: 0.352 ms
test=*# select * from (select id, sum(value) over (order by id) from values) foo where id = 5;
 id | sum
----+-----
  5 | 150
(1 row)

Time: 0.383 ms



Andreas
-- 
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect.  (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly."   (unknown)
Kaufbach, Saxony, Germany, Europe.  N 51.05082°, E 13.56889°

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Kevin Grittner
Greg Smith  wrote:
 
> There's a whole list of FAQs that are documented on the wiki but
> not in the main FAQ yet leftover from before the main FAQ was
> hosted there. You can see them all at 
> http://wiki.postgresql.org/wiki/Frequently_Asked_Questions
> 
> I just haven't had time to merge those all usefully into the main
> FAQ.
 
Well, unless you object to the way I did it, there's one down. 
Should I remove it from the list of "Other FAQs" on the page you
cite?
 
(Of course, it goes without saying that you're welcome to improve
upon anything I put in there.)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Slow "Select count(*) ..." query on table with 60 Mio. rows

2010-01-14 Thread Greg Smith

Kevin Grittner wrote:

Greg Smith  wrote:
 
  

There's a whole list of FAQs that are documented on the wiki but
not in the main FAQ yet leftover from before the main FAQ was
hosted there. You can see them all at 
http://wiki.postgresql.org/wiki/Frequently_Asked_Questions


I just haven't had time to merge those all usefully into the main
FAQ.

 
Well, unless you object to the way I did it, there's one down. 
Should I remove it from the list of "Other FAQs" on the page you

cite?
 


Sure; everyone should feel free to assimilate into the main FAQ and wipe 
out anything on that smaller list. Those are mainly topics where the 
discussion of workarounds and approaches can be much longer than 
standard FAQ length, so I suspect many of the answers are going to be a 
very brief summary with a link to longer discussion. If you come across 
a really small one, we might even wipe out the original page once it's 
merged in.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com



[PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Carlo Stonebanks

My client just informed me that new hardware is available for our DB server.

. Intel Core 2 Quads Quad
. 48 GB RAM
. 4 Disk RAID drive (RAID level TBD)

I have put the ugly details of what we do with our DB below, as well as the 
postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB 
with very large tables and the server is always busy serving a constant 
stream of single-row UPDATEs and INSERTs from parallel automated processes.


There are less than 10 users, as the server is devoted to the KB production 
system.


My questions:

1) Which RAID level would you recommend
2) Which Windows OS would you recommend? (currently 2008 x64 Server)
3) If we were to port to a *NIX flavour, which would you recommend? (which 
support trouble-free PG builds/makes please!)

4) Is this the right PG version for our needs?

Thanks,

Carlo

The details of our use:

. The DB hosts a data warehouse and a knowledgebase (KB) tracking the 
professional information of 1.3M individuals.

. The KB tables related to these 130M individuals are naturally also large
. The DB is in a perpetual state of serving TCL-scripted Extract, Transform 
and Load (ETL) processes

. These ETL processes typically run 10 at-a-time (i.e. in parallel)
. We would like to run more, but the server appears to be the bottleneck
. The ETL write processes are 99% single row UPDATEs or INSERTs.
. There are few, if any DELETEs
. The ETL source data are "import tables"
. The import tables are permanently kept in the data warehouse so that we 
can trace the original source of any information.

. There are 6000+ import tables, and counting
. The import tables number from dozens to hundreds of thousands of rows. 
They rarely require more than a pkey index.
. Linking the KB to the source import date requires an "audit table" of 500M 
rows, and counting.
. The size of the audit table makes it very difficult to manage, especially 
if we need to modify the design.
. Because we query the audit table different ways to audit the ETL processes 
decisions, almost every column in the audit table is indexed.
. The maximum number of physical users is 10 and these users RARELY perform 
any kind of write

. By contrast, the 10+ ETL processes are writing constantly
. We find that internal stats drift, for whatever reason, causing row seq 
scans instead of index scans.
. So far, we have never seen a situation where a seq scan has improved 
performance, which I would attribute to the size of the tables
. We believe our requirements are exceptional, and we would benefit 
immensely from setting up the PG planner to always favour index-oriented 
decisions - which seems to contradict everything that PG advice suggests as 
best practice.


Current non-default conf settings are:

autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 250
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 500
bgwriter_lru_maxpages = 100
checkpoint_segments = 64
checkpoint_warning = 290
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
log_destination = 'stderr'
log_line_prefix = '%t '
logging_collector = on
maintenance_work_mem = 16MB
max_connections = 200
max_fsm_pages = 204800
max_locks_per_transaction = 128
port = 5432
shared_buffers = 500MB
vacuum_cost_delay = 100
work_mem = 512MB



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-14 Thread Carlo Stonebanks
Guys, I want to thank you for all of the advice - my client has just made a 
surprise announcement that he would like to start from scratch with a 
new server, so I am afraid that all of this great advice has to be seen in 
the context of whatever decision is made on that. I am out there, 
hat-in-hand, looking for advice under the PERFORM post: "New server to 
improve performance on our large and busy DB - advice?"


Thanks again!

Carlo


"Scott Marlowe"  wrote in message 
news:dcc563d11001071740q572cdae2re410788fe790d...@mail.gmail.com...

On Thu, Jan 7, 2010 at 2:48 PM, Carlo Stonebanks
 wrote:

Doing the updates in smaller chunks resolved these apparent freezes - or,
more specifically, when the application DID freeze, it didn't do it for more
than 30 seconds. In all likelihood, this is the OS and the DB thrashing.


It might well be checkpoints.  Have you tried cranking up checkpoint
segments to something like 100 or more and seeing how it behaves then?

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Joshua D. Drake
On Thu, 14 Jan 2010 14:17:13 -0500, "Carlo Stonebanks"
 wrote:
> My client just informed me that new hardware is available for our DB
> server.
> 
> . Intel Core 2 Quads Quad
> . 48 GB RAM
> . 4 Disk RAID drive (RAID level TBD)
> 
> I have put the ugly details of what we do with our DB below, as well as
> the 
> postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB

> with very large tables and the server is always busy serving a constant 
> stream of single-row UPDATEs and INSERTs from parallel automated
processes.
> 
> There are less than 10 users, as the server is devoted to the KB
> production 
> system.
> 
> My questions:
> 
> 1) Which RAID level would you recommend

10

> 2) Which Windows OS would you recommend? (currently 2008 x64 Server)

If you have to run Windows... that works.

> 3) If we were to port to a *NIX flavour, which would you recommend?
(which 
> support trouble-free PG builds/makes please!)

Community driven:
Debian Stable
CentOS 5

Commercial:
Ubuntu LTS
RHEL 5

> 4) Is this the right PG version for our needs?

You want to run at least the latest stable 8.3 series which I believe is
8.3.9.
With the imminent release of 8.5 (6 months), it may be time to move to
8.4.2 instead.


Joshua D. Drake


> 
> Thanks,
> 
> Carlo
> 
> The details of our use:
> 
> . The DB hosts is a data warehouse and a knowledgebase (KB) tracking the

> professional information of 1.3M individuals.
> . The KB tables related to these 130M individuals are naturally also
large
> . The DB is in a perpetual state of serving TCL-scripted Extract,
> Transform 
> and Load (ETL) processes
> . These ETL processes typically run 10 at-a-time (i.e. in parallel)
> . We would like to run more, but the server appears to be the bottleneck
> . The ETL write processes are 99% single row UPDATEs or INSERTs.
> . There are few, if any DELETEs
> . The ETL source data are "import tables"
> . The import tables are permanently kept in the data warehouse so that
we 
> can trace the original source of any information.
> . There are 6000+ and counting
> . The import tables number from dozens to hundreds of thousands of rows.

> They rarely require more than a pkey index.
> . Linking the KB to the source import date requires an "audit table" of
> 500M 
> rows, and counting.
> . The size of the audit table makes it very difficult to manage,
> especially 
> if we need to modify the design.
> . Because we query the audit table different ways to audit the ETL
> processes 
> decisions, almost every column in the audit table is indexed.
> . The maximum number of physical users is 10 and these users RARELY
> perform 
> any kind of write
> . By contrast, the 10+ ETL processes are writing constantly
> . We find that internal stats drift, for whatever reason, causing row
seq 
> scans instead of index scans.
> . So far, we have never seen a situation where a seq scan has improved 
> performance, which I would attribute to the size of the tables
> . We believe our requirements are exceptional, and we would benefit 
> immensely from setting up the PG planner to always favour index-oriented

> decisions - which seems to contradict everything that PG advice suggests
> as 
> best practice.
> 
> Current non-default conf settings are:
> 
> autovacuum = on
> autovacuum_analyze_scale_factor = 0.1
> autovacuum_analyze_threshold = 250
> autovacuum_naptime = 1min
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_vacuum_threshold = 500
> bgwriter_lru_maxpages = 100
> checkpoint_segments = 64
> checkpoint_warning = 290
> datestyle = 'iso, mdy'
> default_text_search_config = 'pg_catalog.english'
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
> log_destination = 'stderr'
> log_line_prefix = '%t '
> logging_collector = on
> maintenance_work_mem = 16MB
> max_connections = 200
> max_fsm_pages = 204800
> max_locks_per_transaction = 128
> port = 5432
> shared_buffers = 500MB
> vacuum_cost_delay = 100
> work_mem = 512MB

-- 
PostgreSQL - XMPP: jdrake(at)jabber(dot)postgresql(dot)org
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] new server I/O setup

2010-01-14 Thread Fernando Hevia
Hi all,
 
I've just received this new server:
1 x XEON 5520 Quad Core w/ HT
8 GB RAM 1066 MHz
16 x SATA II Seagate Barracuda 7200.12
3ware 9650SE w/ 256MB BBU
 
It will run an Ubuntu 8.04 LTS Postgres 8.4 dedicated server. Its database
will be getting between 100 and 1000 inserts per second (those are call
detail records of ~300 bytes each) from around 20 clients (VoIP gateways).
Other activity is mostly read-only, plus some non-time-critical writes,
generally at off-peak hours.
 
So my first choice was:
 
2 discs in RAID 1 for OS + pg_xlog partitioned with ext2.
12 discs in RAID 10 for postgres data, sole partition with ext3.
2 spares
 
 
My second choice is:
 
4 discs in RAID 10 for OS + pg_xlog partitioned with ext2
10 discs in RAID 10 for postgres, ext3
2 spares.
 
The BBU cache will be enabled for both RAID volumes.
 
I justified my first choice on the grounds that WAL writes are sequential, and
OS writes pretty much are too, so a RAID 1 would probably hold its ground
against a 12-disc RAID 10 handling random writes.
 
I don't know in advance if I will manage to gather enough time to try out
both setups so I wanted to know what you guys think of these 2 alternatives.
Do you think a single RAID 1 will become a bottleneck? Feel free to suggest
a better setup I hadn't considered, it would be most welcome.
 
PS: any clue whether hdparm works to deactivate the disks' write cache even if
they are behind the 3ware controller?
 
Regards,
Fernando.
 


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com
Thank You for your reply.

Ivan Voras:

> Are you doing it locally or over a network? If you are accessing the 
> server over a network then it could be the location of the bottleneck.

All is done locally (for now).
 
 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com

Thanks a lot for the detailed reply.

Matthew Wakeling:

> On Thu, 14 Jan 2010, fka...@googlemail.com wrote:
> > This takes about 50s, so, 800MB/50s = 16MB/s.
> >
> > However the harddisk (sata) could write 43 MB/s in the worst
> > case! Why is write performance limited to 16 MB/s?
> 
> Several reasons:
> 
> The data needs to be written first to the WAL, in order to provide 
> crash-safety. So you're actually writing 1600MB, not 800.

I understand. So the actual throughput is 32MB/s which is
closer to 43 MB/s, of course.

Can I verify that by temporarily disabling WAL writes
completely and seeing if the throughput is then doubled?

 Felix


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com

Thanks a lot for your reply.

Hannu Krosing:

> > 4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.
> 
> try inserting the same data using 4 parallel connections or even 8
> parallel ones.

Interesting idea -- I forgot to mention though that 2-3
cores will be occupied soon with other tasks.

 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the hardware the system is running on:

It's an HP Proliant DL-180 G5 server.

Here are the specs... our actual configuration has only one CPU and 16G of
RAM.
The model of the 2 disks I will post later today, when I get to the server.
I was busy with many things, sorry.

http://h18000.www1.hp.com/products/quickspecs/12903_na/12903_na.HTML
http://h18004.www1.hp.com/products/quickspecs/DS_00126/DS_00126.pdf

*At A Glance
*The HP ProLiant DL180 G5 is a low cost high capacity storage optimized
2-way server that delivers on a history of design excellence and 2U density
for a variety of rack deployments and applications.

   - Processors:
  - Supports up to two Quad-Core Intel® Xeon® processors: 5400 sequence
  with 12MB Level 2 cache
  - Intel® 5100 Chipset
   - Memory:
  - Up to 32 GB of memory supported by six (6) PC2-5300 (667 MHz) DDR2
  memory slots
   - Internal Drive Support:
  - Supports up to twelve via CTO with controller or up to eight via BTO
  with the addition of a controller:
 - Hot Plug Serial ATA (SATA) 3.5"hard drives; or
 - Hot Plug Serial Attached SCSI (SAS) 3.5"hard drives
 *NOTE:* 4 hard drives are supported standard via BTO. 8 hard drive
 support requires the addition of a Smart Array or HBA
controller. Hot Plug
 and SAS functionality require the addition of a Smart Array or HBA
 controller. 12 hard drive support available via CTO only and
requires a SAS
 controller that supports expanders.
  - Internal storage capacity:
 - SATA Models: Up to 12.0TB (12 x 1TB Hot Plug 3.5" hard drives)
 - SAS Model: Up to 12.0TB (12 x 1TB Hot Plug 3.5" hard drives)
   - Network Controller:
  - One integrated NC105i PCI-e Gigabit NIC (embedded) (Wake on LAN and
  PXE capable)
   - Storage Controllers:
  - HP Embedded SATA RAID Controller (up to 4 hard drive support on
  standard BTO models)
  *NOTE:* Transfer rate 1.5 Gb/s SATA
   - Expansion Slots:
  - One available Low Profile x8 PCI-Express slot using a Low profile
  Riser.
  - Two Full Height/ Full Length Riser options
 - Option1: 2 full-length/full-height PCI-Express x8 connector slots
 (x4 electrical - Standard)
 - Option2: full-length/full-height riser with 2 PCI-X
 Slots(Optional)
   - Infrastructure Management:
  - Optional HP Lights Out 100c Remote Management card with Virtual KVM
  and Virtual Media support (includes IPMI2.0 and SMASH support)
   - USB Ports:
  - Seven USB ports (2) front, (4) rear, (1) internal
   - Optical Drive:
  - Support for one:
 - Optional Multi-bay DVD
 - Optional Floppy (USB only, USB key)
   - Power Supply:
  - 750W Power Supply (Optional Redundancy Hot Plug, Autoswitching) CSCI
  2007/8
  - 1200W High Efficiency Power Supply (Optional Redundancy Hot Plug,
  Autoswitching) (Optional) CSCI 2007/8
 - *NOTE:* Climate Savers Computing Initiative, 2007-2008 Compliant
   - Form Factor:
  - 2U rack models


Regarding the SATA RAID controller: the other spec pages say that the 8-disk
model (ours) comes with a Smart Array E200. I will try to check whether we are
using the original controller, since I recall hearing that our disks are SAS
(Serial Attached SCSI), and I don't know if it is possible to connect those
disks to the embedded Smart Array E200 controller. Would that be possible?

On Wed, Jan 13, 2010 at 4:13 PM, Eduardo Piombino  wrote:

> Greg, I will post more detailed data as soon as I'm able to gather it.
>
> I was trying out if the cancellation of the ALTER cmd worked ok, I might
> give the ALTER another try, and see how much CPU, RAM and IO usage gets
> involved. I will be doing this monitoring with the process explorer from
> sysinternals, but I don't know how I can make it to log the results. Do you
> know any tool that you have used that can help me generate this evidence? I
> will google a little as soon as possible.
>
>
>
> On Wed, Jan 13, 2010 at 3:46 PM, Greg Smith  wrote:
>
>> Robert Haas wrote:
>>
>>> I'm kind of surprised that there are disk I/O subsystems that are so
>>> bad that a single thread doing non-stop I/O can take down the whole
>>> server.  Is that normal?  Does it happen on non-Windows operating
>>> systems?  What kind of hardware should I not buy to make sure this
>>> doesn't happen to me?
>>>
>>>
>> You can kill any hardware on any OS with the right abusive client.  Create
>> a wide table and insert a few million records into it with generate_series
>> one day and watch what it does to queries trying to run in parallel with
>> that.
>>
>> I think the missing step here to nail down exactly what's happening on
>> Eduardo's system is that he should open up some of the Windows system
>> monitoring tools, look at both disk I/O and CPU usage, and then watch what
>> changes when the troublesome ALTER TABLE shows up.
>>
>>
>> --
>> Greg S

Re: [PERFORM] a heavy duty operation on an "unused" table kills my server

2010-01-14 Thread Eduardo Piombino
Regarding the E200 card, here are the specs.
It seems it has support for SAS disks, so it is most likely that we are
using the embedded/default controller.

http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.html
http://h18000.www1.hp.com/products/quickspecs/12460_div/12460_div.pdf

*Key Features *

   - Seamless upgrades from past generations and upgrades to next generation
   HP high performance and high capacity Serial Attached SCSI Smart Array
   controllers.
   - 3G SAS technology delivers high performance and data bandwidth up to
   300 MB\s per physical link and contains full compatibility with 1.5G SATA
   technology.
   - x4 2.5G PCI Express host interface technology delivers high performance
   and data bandwidth up to 2 GB/s maximum bandwidth.
   - Addition of the battery backed cache upgrade enables BBWC, RAID 5,
   Capacity Expansion, RAID migration, and Stripe Size Migration.
   - Mix-and-match SAS and SATA hard drives, lets you deploy drive
   technology as needed to fit your computing environment.
   - Support for up to 2 TB in a single logical drive.
   - Software consistency among all Smart Array family products: Array
   Configuration Utility (ACU), Option ROM Configuration for Arrays (ORCA),
   Systems Insight Manager, Array Diagnostic Utility (ADU) and SmartStart. Some
   of these features are not available with ProLiant 100 series platforms.
   - The SA-E200 controller supports up to 8 drives. The SA-E200i supports
   2-8 drives depending on the server implementation.


*Performance*

HP's High Performance Architecture sets new boundaries of industry
performance expectations!

   - 3Gb/s SAS (300MB/s bandwidth per physical link)
   - x8 3Gb/s SAS physical links (compatible with 1.5G SATA)
   - 64 MB or 128 MB DDR1-266 battery-backed cache provides up to 4.2 GB/s
   maximum bandwidth.
   - x4 2.5G PCI Express host interface provides 2 GB/s maximum bandwidth.
   - MIPS 32-bit Processor
   - Read ahead caching
   - Write-back caching (with battery-backed write cache upgrade)


*Capacity *

Given the increasing need for high performance and rapid capacity expansion,
the SA-E200 offers:

   - Up to 6TB of total storage with 6 x 1TB SATA MDL hard drives (3.5")
   *NOTE:* Support for greater than 2TB in a single logical drive.
   - Up to 2.4TB of total storage with 8 x 300GB SFF SAS hard drives


On Thu, Jan 14, 2010 at 5:49 PM, Eduardo Piombino  wrote:

> Regarding the hardware the system is running on:
>
> It's an HP Proliant DL-180 G5 server.
>
> Here are the specs... our actual configuration only has one CPU, and 16G of
> RAM.
> I will post the model of the 2 disks later today, when I get to the server.
> I have been busy with many things, sorry.
>
> http://h18000.www1.hp.com/products/quickspecs/12903_na/12903_na.HTML
> http://h18004.www1.hp.com/products/quickspecs/DS_00126/DS_00126.pdf
>
> *At A Glance
> *The HP ProLiant DL180 G5 is a low cost high capacity storage optimized
> 2-way server that delivers on a history of design excellence and 2U density
> for a variety of rack deployments and applications.
>
>- Processors:
>   - Supports up to two Quad-Core Intel® Xeon® processors: 5400
>   sequence with 12MB Level 2 cache
>   - Intel® 5100 Chipset
>- Memory:
>   - Up to 32 GB of memory supported by six (6) PC2-5300 (667 MHz) DDR2
>   memory slots
>- Internal Drive Support:
>   - Supports up to twelve via CTO with controller or up to eight via
>   BTO with the addition of a controller:
>  - Hot Plug Serial ATA (SATA) 3.5"hard drives; or
>  - Hot Plug Serial Attached SCSI (SAS) 3.5"hard drives
>  *NOTE:* 4 hard drives are supported standard via BTO. 8 hard
>  drive support requires the addition of a Smart Array or HBA 
> controller. Hot
>  Plug and SAS functionality require the addition of a Smart Array or 
> HBA
>  controller. 12 hard drive support available via CTO only and 
> requires a SAS
>  controller that supports expanders.
>   - Internal storage capacity:
>  - SATA Models: Up to 12.0TB (12 x 1TB Hot Plug 3.5" hard drives)
>  - SAS Model: Up to 12.0TB (12 x 1TB Hot Plug 3.5" hard drives)
>- Network Controller:
>   - One integrated NC105i PCI-e Gigabit NIC (embedded) (Wake on LAN
>   and PXE capable)
>- Storage Controllers:
>   - HP Embedded SATA RAID Controller (up to 4 hard drive support on
>   standard BTO models)
>   *NOTE:* Transfer rate 1.5 Gb/s SATA
>- Expansion Slots:
>   - One available Low Profile x8 PCI-Express slot using a Low profile
>   Riser.
>   - Two Full Height/ Full Length Riser options
>  - Option1: 2 full-length/full-height PCI-Express x8 connector
>  slots (x4 electrical - Standard)
>  - Option2: full-length/full-height riser with 2 PCI-X
>  Slots(Optional)
>- Infrastructure Management:
>   - Optional HP Lights Out 100c Remote Management card with V

Re: [PERFORM] new server I/O setup

2010-01-14 Thread Scott Marlowe
On Thu, Jan 14, 2010 at 1:03 PM, Fernando Hevia  wrote:
> Hi all,
>
> I've just received this new server:
> 1 x XEON 5520 Quad Core w/ HT
> 8 GB RAM 1066 MHz
> 16 x SATA II Seagate Barracuda 7200.12
> 3ware 9650SE w/ 256MB BBU
>
> It will run an Ubuntu 8.04 LTS Postgres 8.4 dedicated server. Its database
> will be getting between 100 and 1000 inserts per second (those are call
> detail records of ~300 bytes each) of around 20 clients (voip gateways).
> Other activity is mostly read-only and some non time-critical writes
> generally at off peak hours.
>
> So my first choice was:
>
> 2 discs in RAID 1 for OS + pg_xlog partitioned with ext2.
> 12 discs in RAID 10 for postgres data, sole partition with ext3.
> 2 spares
>
>
> My second choice is:
>
> 4 discs in RAID 10 for OS + pg_xlog partitioned with ext2
> 10 discs in RAID 10 for postgres, ext3
> 2 spares.
>
> The BBU cache will be enabled for both RAID volumes.
>
> I justified my first choice in that WAL writes are sequential and OS writes
> pretty much are too, so a RAID 1 would probably hold its ground against a 12
> disc RAID 10 doing random writes.

I think your first choice is right.  I use the same basic setup with
147G 15k5 SAS seagate drives and the pg_xlog / OS partition is almost
never close to the same level of utilization, according to iostat, as
the main 12 disk RAID-10 array is.  We may have to buy a 16 disk array
to keep up with load, and it would be all main data storage, and our
pg_xlog main drive pair would be just fine.

> I don't know in advance if I will manage to gather enough time to try out
> both setups so I wanted to know what you guys think of these 2
> alternatives. Do you think a single RAID 1 will become a bottleneck? Feel
> free to suggest a better setup I hadn't considered, it would be most
> welcome.

For 12 disks, most likely not.  Especially since your load is mostly
small randomish writes, not a bunch of big multi-megabyte records or
anything, so the random access performance on the 12 disk RAID-10
should be your limiting factor.

> PS: any clue whether hdparm works to deactivate the disks' write cache even if
> they are behind the 3ware controller?

Not sure, but I'm pretty sure the 3ware card already does the right
thing and turns off the write caching.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com
Aidan Van Dyk:

> So, your SATA disk can do 43MB/s of sequential writes, but you're example
> is doing:
> 1) Sequential writes to WAL
> 2) Random writes to your index
> 3) Sequential writes to table heap
> 4) Sequential writes to table' toast heap
> 5) Any other OS-based FS overhead

Ok, I see. Thanks a lot for the detailed answer! In particular,
writing to the WAL may eat up 50% of the bandwidth, as I've now
learned. So, 16MB/s x 2 would in fact be 32 MB/s, plus some extras...


However, does that mean: if I have a raw sequential
performance of 100%, a binary write (like in my example) will
reach about 33% of it, as a general rule of thumb?

Just to mention:

* The system has two hard disks, the first for
  WinXP, the second purely for the postgres data.

* I was doing nothing else simultaneously on the newly
  installed OS.

* Of the consumed time (50s, see my test case), 99.9 %
  was spent in the PQexecParams() function alone.

* No network connection to the postgres server (everything
  local).

* No complex SQL command; just inserting 100 times using
  PQexecParams(), within one transaction.

* The binary data was marked as such in PQexecParams
  (Format = 1).

* What I meant by 43 MB/s "worst case": I downloaded
  some hd benchmarks which showed a performance of
  43-70 MB/s. (Whereas repetitions of my postgres test
  never varied, but *constantly* performed at 16MB/s).

Hm.

Nevertheless: if your explanation covers all that can be
said about it, then replacing the hard disk with a faster one
should increase the performance here (I'll try to check that
out).

Thanks again!

 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com
Florian Weimer:

> > Do you have any further idea why 16MB/s seems to be the limit here?
> 
> BYTEA deserialization is very slow, and this could be a factor here.
> Have you checked that you are in fact I/O bound?

Could you elaborate on that a bit? It sounds interesting but I
do not get what you mean by:

"bytea deserialization": Do you mean from an escaped string
back to real binary data? Does that apply to my case (I use
PQexecParams and have the Format arg set to 1, binary)?

"I/O bound": What do you mean by that?


> You can speed things up by sending the data in binary, by passing
> appropriate parameters to PQexecParams().

Do you mean the Format arg = 1? If not, what is appropriate
here?

 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread fka...@googlemail.com
Pierre Frédéric Caillaud:

> > 4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.
> 
>   Big CPU and slow disk...
> 
>   You should add another disk just for the WAL -- disks are pretty cheap  
> these days.
>   Writing the WAL on a second disk is the first thing to do on a  
> configuration like yours, if you are limited by writes.
>   It also reduces the fsync lag a lot since the disk is only doing WAL.

Good idea -- where can I set the path to WAL?

 Felix



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Fernando Hevia
 

> -Original Message-
> From: fka...@googlemail.com

> Nevertheless: If your explanation covers all what can be said 
> about it then replacing the hard disk by a faster one should 
> increase the performance here (I'll try to check that out).
> 

Moving the pg_xlog directory to the OS drive should make a difference and it
will cost you zero.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-14 Thread Craig James

Carlo Stonebanks wrote:
Guys, I want to thank you for all of the advice - my client has just 
made a surprise announcement that he would like to start from 
scratch with a new server, so I am afraid that all of this great advice 
has to be seen in the context of whatever decision is made on that. I am 
out there, hat-in-hand, looking for advice under the PERFORM post: "New 
server to improve performance on our large and  busy DB - advice?"


You might start this as a new topic with a relevant title, and reiterate your 
database requirements.  Otherwise it will get submerged as just a footnote to 
your original question.  It's really nice to be able to quickly find the 
new-equipment discussions.

Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Scott Marlowe
On Thu, Jan 14, 2010 at 12:17 PM, Carlo Stonebanks
 wrote:
> My client just informed me that new hardware is available for our DB server.
>
> . Intel Core 2 Quads Quad
> . 48 GB RAM
> . 4 Disk RAID drive (RAID level TBD)
>
> I have put the ugly details of what we do with our DB below, as well as the
> postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB
> with very large tables and the server is always busy serving a constant
> stream of single-row UPDATEs and INSERTs from parallel automated processes.
>
> There are less than 10 users, as the server is devoted to the KB production
> system.
>
> My questions:
>
> 1) Which RAID level would you recommend

RAID-10 with a battery backed hardware caching controller.

> 2) Which Windows OS would you recommend? (currently 2008 x64 Server)

That's probably the most stable choice out there for Windows.

> 3) If we were to port to a *NIX flavour, which would you recommend? (which
> support trouble-free PG builds/makes please!)

I'd parrot what Joshua Drake said here.  Centos / RHEL / Debian / Ubuntu

> 4) Is this the right PG version for our needs?

8.3 is very stable.  Update to the latest.  8.4 seems good, but I've
had, and still am having, problems with it crashing in production.
Not often, maybe once every couple of months, but just enough that I'm
not ready to try and use it there yet.  And I can't force the same
failure in testing, at least not yet.

> The details of our use:
>
> . These ETL processes typically run 10 at-a-time (i.e. in parallel)
> . We would like to run more, but the server appears to be the bottleneck
> . The ETL write processes are 99% single row UPDATEs or INSERTs.

Can you run the ETL processes in such a way that they can do many
inserts and updates at once?  That would certainly speed things up a
bit.
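
For example, 8.3 supports multi-row VALUES lists, so instead of thousands of
separate single-row statements the ETL could do something roughly like this
(table and column names here are made up, just to show the shape):

  INSERT INTO audit_log (etl_run, source_id, detail)
  VALUES (17, 101, 'checked'),
         (17, 102, 'merged'),
         (17, 103, 'skipped');
  -- batch a few hundred rows per statement, and wrap a whole batch
  -- in one transaction to cut per-commit overhead as well.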

> . The size of the audit table makes it very difficult to manage, especially
> if we need to modify the design.

You might want to look into partitioning / inheritance if that would help.
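
On 8.3 that means inheritance plus CHECK constraints; a minimal sketch,
assuming a date column named "created" (the names are invented here):

  CREATE TABLE audit_2010_01 (
      CHECK (created >= DATE '2010-01-01' AND created < DATE '2010-02-01')
  ) INHERITS (audit);
  -- with constraint_exclusion = on, queries that filter on "created" can
  -- skip the child tables whose CHECK constraints rule them out.

You still need the ETL to insert into the right child (or a trigger to route
rows), but dropping an old month then becomes a cheap DROP TABLE.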

> . Because we query the audit table different ways to audit the ETL processes
> decisions, almost every column in the audit table is indexed.

This may or may not help.  If you're querying it and the part in the
where clause referencing this column isn't very selective, an index
won't be chosen anyway.  If you've got multiple columns in your where
clause, the more selective ones will use an index and the rest will
get filtered out instead of using an index.  Look in
pg_stat_user_indexes for indexes that don't get used and drop them
unless, of course, they're unique indexes.
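
Something along these lines will list the candidates (idx_scan = 0 means
the index has not been used since the statistics were last reset):

  SELECT schemaname, relname, indexrelname, idx_scan
  FROM pg_stat_user_indexes
  ORDER BY idx_scan, relname;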

> . The maximum number of physical users is 10 and these users RARELY perform
> any kind of write
> . By contrast, the 10+ ETL processes are writing constantly

You may be well served by having two servers, one to write to, and a
slave that is used by the actual users.  Our slony slaves have a much
easier time writing out their data than our master database does.

> . We find that internal stats drift, for whatever reason, causing row seq
> scans instead of index scans.

Yeah, this is a known problem on heavily updated tables and recent
entries.  Cranking up autovacuum a bit can help, but often it requires
special treatment, either by adjusting the autovac analyze threshold
values for just those tables, or running manual analyzes every couple
of minutes.
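
For example, on 8.4 the per-table knobs are plain storage parameters (the
table name and thresholds below are only illustrative):

  ALTER TABLE audit SET (autovacuum_analyze_scale_factor = 0.02,
                         autovacuum_analyze_threshold   = 1000);
  -- on 8.3 the equivalent is a row in the pg_autovacuum catalog, or
  -- simply a cron job that runs:  ANALYZE audit;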

> . So far, we have never seen a situation where a seq scan has improved
> performance, which I would attribute to the size of the tables

Not so much the size of the tables, as the size of the request.  If
you were running aggregates across whole large tables, a seq scan
would definitely be the way to go.  If you're asking for one row,
index scan should win.  Somewhere between those two, when you get up
to hitting some decent percentage of the rows, the switch from index
scan to seq scan makes sense, and it's likely happening too early for
you.  Look at random_page_cost and effective_cache_size for starters.
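
You can try new values per session before putting them in postgresql.conf;
the numbers below are only a starting point for a 48 GB box, not a
recommendation:

  SET effective_cache_size = '24GB';  -- roughly what the OS keeps cached
  SET random_page_cost = 2.0;         -- the default 4.0 assumes mostly uncached data
  -- then re-run EXPLAIN ANALYZE on the query that was picking a seq scan.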

> . We believe our requirements are exceptional, and we would benefit
> immensely from setting up the PG planner to always favour index-oriented
> decisions - which seems to contradict everything that PG advice suggests as
> best practice.

See previous comment I made up there ^^^  It's not about always using
indexes, it's about giving the planner the information it needs to
make the right choice.

> Current non-default conf settings are:
>
> autovacuum = on
> autovacuum_analyze_scale_factor = 0.1

You might wanna lower the analyze scale factor if you're having
problems with bad query plans on fresh data.

> autovacuum_analyze_threshold = 250
> autovacuum_naptime = 1min
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_vacuum_threshold = 500
> bgwriter_lru_maxpages = 100
> checkpoint_segments = 64
> checkpoint_warning = 290
> datestyle = 'iso, mdy'
> default_text_search_config = 'pg_catalog.english'
> lc_messages = 'C'
> lc_monetary = 'C'
> lc_numeric = 'C'
> lc_time = 'C'
> log_destination = 'stderr'

[PERFORM] New server to improve performance on our large and busy DB - advice? (v2)

2010-01-14 Thread Carlo Stonebanks

My client just informed me that new hardware is available for our DB server.

. Intel Core 2 Quads Quad
. 48 GB RAM
. 4 Disk RAID drive (RAID level TBD)

I have put the ugly details of what we do with our DB below, as well as the
postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB
with very large tables and the server is always busy serving a constant
stream of single-row UPDATEs and INSERTs from parallel automated processes.

There are less than 10 users, as the server is devoted to the KB production
system.

My questions:

1) Which RAID level would you recommend
2) Which Windows OS would you recommend? (currently 2008 x64 Server)
3) If we were to port to a *NIX flavour, which would you recommend? (which
support trouble-free PG builds/makes please!)
4) Is this the right PG version for our needs?

Thanks,

Carlo

The details of our use:

. The DB hosts a data warehouse and a knowledgebase (KB) tracking the
professional information of 1.3M individuals.
. The KB tables related to these 130M individuals are naturally also large
. The DB is in a perpetual state of serving TCL-scripted Extract, Transform
and Load (ETL) processes
. These ETL processes typically run 10 at-a-time (i.e. in parallel)
. We would like to run more, but the server appears to be the bottleneck
. The ETL write processes are 99% single row UPDATEs or INSERTs.
. There are few, if any DELETEs
. The ETL source data are "import tables"
. The import tables are permanently kept in the data warehouse so that we
can trace the original source of any information.
. There are 6000+ of these import tables and counting
. The import tables range from dozens to hundreds of thousands of rows.
They rarely require more than a pkey index.
. Linking the KB to the source import date requires an "audit table" of 500M
rows, and counting.
. The size of the audit table makes it very difficult to manage, especially
if we need to modify the design.
. Because we query the audit table different ways to audit the ETL processes
decisions, almost every column in the audit table is indexed.
. The maximum number of physical users is 10 and these users RARELY perform
any kind of write
. By contrast, the 10+ ETL processes are writing constantly
. We find that internal stats drift, for whatever reason, causing row seq
scans instead of index scans.
. So far, we have never seen a situation where a seq scan has improved
performance, which I would attribute to the size of the tables
. We believe our requirements are exceptional, and we would benefit
immensely from setting up the PG planner to always favour index-oriented
decisions - which seems to contradict everything that PG advice suggests as
best practice.

Current non-default conf settings are:

autovacuum = on
autovacuum_analyze_scale_factor = 0.1
autovacuum_analyze_threshold = 250
autovacuum_naptime = 1min
autovacuum_vacuum_scale_factor = 0.2
autovacuum_vacuum_threshold = 500
bgwriter_lru_maxpages = 100
checkpoint_segments = 64
checkpoint_warning = 290
datestyle = 'iso, mdy'
default_text_search_config = 'pg_catalog.english'
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
log_destination = 'stderr'
log_line_prefix = '%t '
logging_collector = on
maintenance_work_mem = 16MB
max_connections = 200
max_fsm_pages = 204800
max_locks_per_transaction = 128
port = 5432
shared_buffers = 500MB
vacuum_cost_delay = 100
work_mem = 512MB



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Massive table (500M rows) update nightmare

2010-01-14 Thread Carlo Stonebanks
My bad - I thought I had, so it has been re-posted with a (v2) disclaimer in 
the title... like THAT will stop the flaming! 


Thanks for your patience!

"Craig James"  wrote in message 
news:4b4f8a49.7010...@emolecules.com...

Carlo Stonebanks wrote:
Guys, I want to thank you for all of the advice - my client has just made 
a surprise announcement that he would like to start from scratch with 
a new server, so I am afraid that all of this great advice has to be seen 
in the context of whatever decision is made on that. I am out there, 
hat-in-hand, looking for advice under the PERFORM post: "New server to 
improve performance on our large and  busy DB - advice?"


You might start this as a new topic with a relevant title, and reiterate 
your database requirements.  Otherwise it will get submerged as just a 
footnote to your original question.  It's really nice to be able to 
quickly find the new-equipment discussions.


Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)

2010-01-14 Thread Dave Crooke
I'll bite 


1. In general, RAID-10 is the only suitable RAID configuration for a
database. The decision making comes in how many drives, and splitting stuff
up into LUNs (like putting pg_xlog on its own LUN).


2. None of the above - you're asking the wrong question really. PostgreSQL
is open source, and is developed on Unix. The Windows version is a pretty
good port, as Windows ports of OSS stuff go, but it's just that, a port.
Your server is being dedicated to running Postgres, so the right question to
ask is "What is the best OS for running Postgres?".

For any given database engine, regardless of the marketing and support
stance, there is only one true "primary" enterprise OS platform that most
big mission-critical sites use, and which is the best supported and most stable
platform for that RDBMS. For Oracle, that's HP-UX (but 10 years ago, it was
Solaris). For PostgreSQL, it's Linux.

The biggest problem with Postgres on Windows is that it only comes in
32-bit. RAM is the ultimate performance tweak for an RDBMS, and to make
proper use of modern amounts of RAM, you need a 64-bit executable.



3. The two choices I'd consider are both Linux:

- for the conservative / supported approach, get Red Hat and buy support
from them and (e.g.) Enterprise DB
- if you plan to keep pretty current and are happy actively managing
versions and running locally compiled builds, go with Ubuntu


4. The general wisdom is that there are a lot of improvements from 8.3 to
8.4, but how much benefit you'll see in your environment is another
question. If you're building a new system and have to migrate anyway, it
seems like a good opportunity to upgrade.


Cheers
Dave

On Thu, Jan 14, 2010 at 3:25 PM, Carlo Stonebanks <
stonec.regis...@sympatico.ca> wrote:

> My client just informed me that new hardware is available for our DB
> server.
>
> . Intel Core 2 Quads Quad
> . 48 GB RAM
> . 4 Disk RAID drive (RAID level TBD)
>
> I have put the ugly details of what we do with our DB below, as well as the
> postgres.conf settings. But, to summarize: we have a PostgreSQL 8.3.6 DB
> with very large tables and the server is always busy serving a constant
> stream of single-row UPDATEs and INSERTs from parallel automated processes.
>
> There are less than 10 users, as the server is devoted to the KB production
> system.
>
> My questions:
>
> 1) Which RAID level would you recommend
> 2) Which Windows OS would you recommend? (currently 2008 x64 Server)
> 3) If we were to port to a *NIX flavour, which would you recommend? (which
> support trouble-free PG builds/makes please!)
> 4) Is this the right PG version for our needs?
>
> Thanks,
>
> Carlo
>
> The details of our use:
>
> . The DB hosts is a data warehouse and a knowledgebase (KB) tracking the
> professional information of 1.3M individuals.
> . The KB tables related to these 130M individuals are naturally also large
> . The DB is in a perpetual state of serving TCL-scripted Extract, Transform
> and Load (ETL) processes
> . These ETL processes typically run 10 at-a-time (i.e. in parallel)
> . We would like to run more, but the server appears to be the bottleneck
> . The ETL write processes are 99% single row UPDATEs or INSERTs.
> . There are few, if any DELETEs
> . The ETL source data are "import tables"
> . The import tables are permanently kept in the data warehouse so that we
> can trace the original source of any information.
> . There are 6000+ and counting
> . The import tables number from dozens to hundreds of thousands of rows.
> They rarely require more than a pkey index.
> . Linking the KB to the source import date requires an "audit table" of
> 500M
> rows, and counting.
> . The size of the audit table makes it very difficult to manage, especially
> if we need to modify the design.
> . Because we query the audit table different ways to audit the ETL
> processes
> decisions, almost every column in the audit table is indexed.
> . The maximum number of physical users is 10 and these users RARELY perform
> any kind of write
> . By contrast, the 10+ ETL processes are writing constantly
> . We find that internal stats drift, for whatever reason, causing row seq
> scans instead of index scans.
> . So far, we have never seen a situation where a seq scan has improved
> performance, which I would attribute to the size of the tables
> . We believe our requirements are exceptional, and we would benefit
> immensely from setting up the PG planner to always favour index-oriented
> decisions - which seems to contradict everything that PG advice suggests as
> best practice.
>
> Current non-default conf settings are:
>
> autovacuum = on
> autovacuum_analyze_scale_factor = 0.1
> autovacuum_analyze_threshold = 250
> autovacuum_naptime = 1min
> autovacuum_vacuum_scale_factor = 0.2
> autovacuum_vacuum_threshold = 500
> bgwriter_lru_maxpages = 100
> checkpoint_segments = 64
> checkpoint_warning = 290
> datestyle = 'iso, mdy'
> default_text_search_config = 'pg_catalog.english'
> lc_messages = 'C'
> lc_m

Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Pierre Frédéric Caillaud
On Thu, 14 Jan 2010 22:28:07 +0100, fka...@googlemail.com wrote:



Pierre Frédéric Caillaud:


> 4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.

Big CPU and slow disk...

You should add another disk just for the WAL -- disks are pretty cheap
these days.
Writing the WAL on a second disk is the first thing to do on a
configuration like yours, if you are limited by writes.
It also reduces the fsync lag a lot since the disk is only doing WAL.


Good idea -- where can I set the path to WAL?


        At install time, or use a symlink (they exist on Windows too!...)

http://stackoverflow.com/questions/1901405/postgresql-wal-on-windows

        I've no idea about the other NTFS tweaks needed, like whether there is a
noatime/nodiratime equivalent?...




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server to improve performance on our large and busy DB - advice? (v2)

2010-01-14 Thread Craig Ringer

On 15/01/2010 6:35 AM, Dave Crooke wrote:

I'll bite 


1. In general, RAID-10 is the only suitable RAID configuration for a
database. The decision making comes in how many drives, and splitting
stuff up into LUNs (like putting pg_xlog on its own LUN).





The biggest problem with Postgres on Windows is that it only comes in
32-bit. RAM is the ultimate performance tweak for an RDBMS, and to make
proper use of modern amounts of RAM, you need a 64-bit executable.


 though that's much less important for Pg than for most other 
things, as Pg uses a one-process-per-connection model and lets the OS 
handle much of the caching. So long as the OS can use all that RAM for 
caching, Pg will benefit, and it's unlikely you need >2GB for any given 
client connection or for the postmaster.


It's nice to have the flexibility to push up shared_buffers, and it'd be 
good to avoid any overheads in running 32-bit code on win64. However, 
it's not that unreasonable to run a 32-bit Pg on a 64-bit OS and expect 
good performance.


You can always go 64-bit once 8.5/9.0 hits and has stabilized, anyway.

--
Craig Ringer

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Greg Smith

Carlo Stonebanks wrote:

1) Which RAID level would you recommend


It looks like you skipped over a critical step, which is "will the 
server have a good-performing RAID card?".  Your whole upgrade could 
underperform if you make a bad mistake on that part.  It's really 
important to nail that down, and to benchmark to prove you got what you 
expected from your hardware vendor.


3) If we were to port to a *NIX flavour, which would you recommend? 
(which support trouble-free PG builds/makes please!)


The only platform I consider close to trouble free as far as the PG 
builds working without issues are RHEL/CentOS, due to the maturity of 
the PGDG yum repository and how up to date it's kept.  Every time I 
wander onto another platform I find the lag and care taken in packaging 
PostgreSQL to be at least a small step down from there.



4) Is this the right PG version for our needs?


8.4 removes the FSM, which takes away a common source for unexpected 
performance issues when you overflow max_fsm_pages one day.  If you're 
going to deploy 8.3, you need to be more careful to monitor the whole 
VACUUM process; it's easier to ignore in 8.4 and still get by OK.  As 
far as general code stability goes, I think it's a wash at this point.  
You might discover a bug in 8.4 that causes a regression, but I think 
you're just as likely to run into a situation that 8.3 handles badly 
that's improved in 8.4.  Hard to say which will work out better in a 
really general way.
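
On 8.3 the simplest check is the summary that a database-wide VACUUM
VERBOSE prints at the end (the exact wording below is approximate):

  VACUUM VERBOSE;
  -- the closing lines report how many free space map page slots are
  -- needed; if that number exceeds max_fsm_pages, raise max_fsm_pages
  -- (it needs a restart), or you'll start accumulating dead space.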


. We believe our requirements are exceptional, and we would benefit 
immensely from setting up the PG planner to always favour 
index-oriented decisions - which seems to contradict everything that 
PG advice suggests as best practice.


Pretty much everyone thinks their requirements are exceptional.  It's 
funny how infrequently that's actually true.  The techniques that favor 
index-use aren't that unique:  collect better stats, set basic 
parameters correctly, adjust random_page_cost, investigate plans that 
don't do what you want to figure out why.  It's easy to say there's 
something special about your data rather than follow fundamentals here; 
I'd urge you to avoid doing that.  The odds that the real issue is that 
you're feeding the optimizer bad data is more likely than most people 
think, which brings us to:



Current non-default conf settings are:


I don't see effective_cache_size listed there.  If that's at the 
default, I wouldn't be surprised that you're seeing sequential scans 
instead of indexed ones far too often.
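
The mechanics for both points are simple; the column and table names below
are just placeholders:

  ALTER TABLE audit ALTER COLUMN source_id SET STATISTICS 200;
  ANALYZE audit;
  -- and set effective_cache_size in postgresql.conf to something close to
  -- the RAM the OS actually uses for caching (a large fraction of 48GB here).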




max_connections = 200
work_mem = 512MB


This is a frightening combination, by the way: work_mem is a per-sort (and 
per-hash) limit, not a per-server one, so a few hundred busy connections 
could in principle try to use far more memory than the machine has.

--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Inserting 8MB bytea: just 25% of disk perf used?

2010-01-14 Thread Magnus Hagander
2010/1/15 Pierre Frédéric Caillaud :
> On Thu, 14 Jan 2010 22:28:07 +0100, fka...@googlemail.com 
>  wrote:
>
>> Pierre Frédéric Caillaud:
>>
>>> > 4 Core CPU 3 Ghz, WinXP, 1 TB SATA disk.
>>>
>>>        Big CPU and slow disk...
>>>
>>>        You should add another disk just for the WAL -- disks are pretty 
>>> cheap
>>> these days.
>>>        Writing the WAL on a second disk is the first thing to do on a
>>> configuration like yours, if you are limited by writes.
>>>        It also reduces the fsync lag a lot since the disk is only doing WAL.
>>
>> Good idea -- where can I set the path to WAL?
>
>        At install, or use a symlink (they exist on windows too !...)
>
>        http://stackoverflow.com/questions/1901405/postgresql-wal-on-windows
>
>        I've no idea of the other needed NTFS tweaks, like if there is a 
> noatime/nodiratime ?...

It does. See 
http://www.hagander.net/talks/Advanced%20PostgreSQL%20on%20Windows.pdf


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] new server I/O setup

2010-01-14 Thread Greg Smith

Fernando Hevia wrote:
I justified my first choice in that WAL writes are sequentially and OS 
pretty much are too, so a RAID 1 probably would hold ground against a 
12 disc RAID 10 with random writes.


The problem with this theory is that when PostgreSQL does WAL writes and 
asks to sync the data, you'll probably discover all of the open OS 
writes that were sitting in the Linux write cache getting flushed before 
that happens.  And that could lead to horrible performance--good luck if 
the database tries to do something after cron kicks off updatedb each 
night for example.


I think there are two more viable configurations you should be considering 
that you haven't thought about, but neither is quite what you're looking at:

2 discs in RAID 1 for OS
2 discs in RAID 1 for pg_xlog
10 discs in RAID 10 for postgres, ext3
2 spares.

14 discs in RAID 10 for everything
2 spares.

Impossible to say which of the four possibilities here will work out 
better.  I tend to lean toward the first one I listed above because it 
makes it very easy to monitor the pg_xlog activity (and the non-database 
activity) separately from everything else, and having no other writes 
going on makes it very unlikely that the pg_xlog will ever become a 
bottleneck.  But if you've got 14 disks in there, it's unlikely to be a 
bottleneck anyway.  The second config above will get you slightly better 
random I/O though, so for workloads that are really limited on that 
there's a good reason to prefer it.


Also:  the whole "use ext2 for the pg_xlog" idea is overrated far as I'm 
concerned.  I start with ext3, and only if I get evidence that the drive 
is a bottleneck do I ever think of reverting to unjournaled writes just 
to get a little speed boost.  In practice I suspect you'll see no 
benchmark difference, and will instead curse the decision the first time 
your server is restarted badly and it gets stuck at fsck.


Pd: any clue if hdparm works to deactive the disks write cache even if 
they are behind the 3ware controller?


You don't use hdparm for that sort of thing; you need to use 3ware's 
tw_cli utility.  I believe that the individual drive caches are always 
disabled, but whether the controller cache is turned on or not depends 
on whether the card has a battery.  The behavior here is kind of weird 
though--it changes if you're in RAID mode vs. JBOD mode, so be careful 
to look at what all the settings are.  Some of these 3ware cards default 
to extremely aggressive background scanning for bad blocks too, you 
might have to tweak that downward too.


--
Greg Smith2ndQuadrant   Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com  www.2ndQuadrant.com



Re: [PERFORM] New server to improve performance on our large and busy DB - advice?

2010-01-14 Thread Carlo Stonebanks
Pretty much everyone thinks their requirements are exceptional.  It's 
funny how infrequently that's actually true.  The techniques that favor 
index-use aren't that unique:  collect better stats, set basic parameters 
correctly, adjust random_page_cost, investigate plans that don't do what 
you want to figure out why.  It's easy to say there's something special 
about your data rather than follow fundamentals here; I'd urge you to 
avoid doing that.  The odds that the real issue is that you're feeding the 
optimizer bad data is more likely than most people think, which brings us 
to:


I understand that. And the answer is usually to go and do an ANALYZE 
manually (if it isn't this, it will be some dependency on a set-returning 
stored function we wrote before we could specify the rows and cost). My 
question is really - why do I need this constant intervention? When we 
rarely do aggregates, when our queries are (nearly) always single row 
queries (and very rarely more than 50 rows) out of tables that have hundreds 
of thousands to millions of rows, what does it take to NOT have to 
intervene? Which brings me to your next point:


I don't see effective_cache_size listed there.  If that's at the default, 
I wouldn't be surprised that you're seeing sequential scans instead of 
indexed ones far too often.


Nice to know - I suspect someone has been messing around with stuff they 
don't understand. I do know that after some screwing around they got the 
server to the point that it wouldn't restart, and then had to back out 
changes until it would.



max_connections = 200
work_mem = 512MB



This is a frightening combination by the way.


Looks like it's connected to the above issue. The real max connection value 
is 1/10th of that.


Thanks Greg!

Carlo 



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance