Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Smith

Dave Crooke wrote:
There seems to be a wide range of opinion on this ... I am new to PG
and grew up on Oracle, where more SGA is always a good thing ... I
know people who run Oracle on 2TB Superdomes with titanic SGA sizes
to keep the whole DB in RAM. I'd be using a 40GB+ Oracle SGA on that
box of yours.


I wouldn't call it opinion so much as a series of anecdotes all 
suggesting the same thing:  that you cannot translate SGA practice into 
PostgreSQL and expect that to work the same way.  Some data points:


-An academic study at Duke suggested 40% of RAM was optimal for their 
mixed workload, but that was a fairly small amount of RAM.  
http://www.cs.duke.edu/~shivnath/papers/ituned.pdf


-Tests done by Jignesh Shah at Sun not too long ago put the point of
diminishing returns on a system with a bunch of RAM at 10GB, probably
due to buffer lock contention issues (details beyond that number aren't
in the slides; I'm recalling from the talk itself):
http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_best


-My warnings about downsides related to checkpoint issues with larger
buffer pools aren't opinion at all; they're facts based on limitations
in how Postgres does its checkpoints.  If we get something more like
Oracle's incremental checkpoint logic, this particular concern might go
away.


-Concerns about swapping, work_mem, etc. are all very real.  All of us
who have had the database server process killed by the Linux OOM killer
at least once know that's one OS where you absolutely cannot push this
too hard.  This is not unique to PostgreSQL; the same issue exists in
Oracle+SGA land as well:
http://lkml.indiana.edu/hypermail/linux/kernel/0103.3/0906.html
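
To put rough numbers on that, here is an illustrative sketch only, not a
recommendation; every value below is invented:

  # hypothetical 32GB dedicated Linux server
  shared_buffers = 8GB
  work_mem = 64MB
  max_connections = 200
  # worst case: 200 connections x 64MB = 12.8GB for sorts/hashes alone,
  # and a single complex query can use several work_mem allocations at
  # once; added to shared_buffers that commits ~21GB before the OS page
  # cache gets anything, which is how you end up meeting the OOM killer.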


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Is DBLINK transactional

2010-03-16 Thread Chris Browne
cr...@postnewspapers.com.au (Craig Ringer) writes:

 On 13/03/2010 5:54 AM, Jeff Davis wrote:
 On Fri, 2010-03-12 at 12:07 -0500, Merlin Moncure wrote:
 of course.  You can always explicitly open a transaction on the remote
 side over dblink, do work, and commit it at the last possible moment.
 Your transactions aren't perfectly synchronized... if you crash at the
 precise moment between committing the remote and the local, you can get
 in trouble.  The chances of this are extremely remote though.

 If you want a better guarantee than that, consider using 2PC.

 Translation in case you don't know: 2PC = two phase commit.

 Note that you have to monitor lost transactions that were prepared
 for commit then abandoned by the controlling app and periodically get
 rid of them or you'll start having issues.
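
A minimal sketch of the pattern described in the quoted text, using
dblink's stock functions (the connection name, table, and transaction
GID are invented; the 2PC variant needs max_prepared_transactions > 0
on the remote side):

  BEGIN;
  SELECT dblink_connect('rmt', 'dbname=foo host=remotehost');
  SELECT dblink_exec('rmt', 'BEGIN');
  SELECT dblink_exec('rmt', 'INSERT INTO remote_table VALUES (1)');
  -- ... local work in our own transaction ...
  SELECT dblink_exec('rmt', 'PREPARE TRANSACTION ''xfer42''');
  COMMIT;  -- commit locally at the last possible moment
  SELECT dblink_exec('rmt', 'COMMIT PREPARED ''xfer42''');
  -- a crash between the last two steps leaves a recoverable prepared
  -- transaction on the remote side rather than a silent inconsistency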

There can be issues with prepared transactions even if they're not
abandoned...

Note that prepared transactions establish, and maintain, until removed,
all the appropriate locks on the underlying tables and tuples.

As a consequence, maintenance-related activities may be somewhat
surprisingly affected.

foo=# begin; set transaction isolation level serializable;
BEGIN
SET
foo=# insert into my_table (date_time, hostname, duration, diag) values (now(), 
'foo', 1, 2);
INSERT 0 1
foo=# prepare transaction 'foo';
PREPARE TRANSACTION

[then, I quit the psql session...]

foo=# select * from pg_locks where relation = (select oid from pg_class where 
relname = 'my_table');
-[ RECORD 1 ]--+-
locktype   | relation
database   | 308021
relation   | 308380
page   |
tuple  |
virtualxid |
transactionid  |
classid|
objid  |
objsubid   |
virtualtransaction | -1/433653
pid|
mode   | RowExclusiveLock
granted| t

If I try to truncate the table...

foo=# truncate my_table;
[hangs, waiting on the lock...]

[looking at another session...]

foo=#  select * from pg_locks where relation = (select oid from pg_class where 
relname = 'my_table');
-[ RECORD 1 ]--+
locktype   | relation
database   | 308021
relation   | 308380
page   |
tuple  |
virtualxid |
transactionid  |
classid|
objid  |
objsubid   |
virtualtransaction | -1/433653
pid|
mode   | RowExclusiveLock
granted| t
-[ RECORD 2 ]--+
locktype   | relation
database   | 308021
relation   | 308380
page   |
tuple  |
virtualxid |
transactionid  |
classid|
objid  |
objsubid   |
virtualtransaction | 2/13
pid| 3749
mode   | AccessExclusiveLock
granted| f

Immediately upon submitting the commit prepared 'foo' command, both
locks are resolved quite quickly.
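
To spot prepared transactions that have been sitting around holding
locks like this, the pg_prepared_xacts system view is the place to look;
a query along these lines finds candidates (the one-hour threshold is
arbitrary):

  SELECT gid, prepared, owner, database
    FROM pg_prepared_xacts
   WHERE prepared < now() - interval '1 hour';

Anything stale can then be resolved, after checking with whatever
created it, via COMMIT PREPARED or ROLLBACK PREPARED on the reported gid.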

 The problem with things that are extremely remote possibilities is
 that they tend to be less remote than we expect ;)

 ... and they know just when they can happen despite all the odds to
 maximise the pain and chaos caused.

A lot of these kinds of things only come up as race conditions.  The
trouble is that a lot of races do wind up synchronizing themselves.

In sporting events, this is intended and desired; an official fires the
starter pistol or activates the horn, or what have you, with the
intended result that athletes begin very nearly simultaneously.  And at
the end of Olympic races, their times frequently differ only by
minuscule intervals.

In my example up above, there's a possibly unexpected synchronization
point; the interweaving of the PREPARE TRANSACTION and TRUNCATE requests
leads to a complete lock against the table.  Supposing 15 processes then
try accessing that table, they'll be blocked until the existing locks
get closed out, which takes place the very instant after the COMMIT
PREPARED request comes in.  At that moment, 15 racers are released
very nearly simultaneously.

If there is any further mischief to be had in the race, well, they're
set up to tickle it...
-- 
let name=cbbrowne and tld=gmail.com in name ^ @ ^ tld;;
http://linuxdatabases.info/info/nonrdbms.html
Barf, what is all   this  prissy pedantry?  Groups,  modules,  rings,
ufds, patent-office algebra.  Barf!  -- R. William Gosper

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Pierre C


-My warnings about downsides related to checkpoint issues with larger
buffer pools aren't opinion at all; they're facts based on limitations
in how Postgres does its checkpoints.  If we get something more like
Oracle's incremental checkpoint logic, this particular concern might go
away.


Does PG issue checkpoint writes in sorted order?

I wonder about something, too: if your DB size is smaller than RAM, you
could in theory set shared_buffers to a size larger than your DB, provided
you still have enough free RAM left for work_mem and OS write management.
How does this interact with the logic which prevents seq scans from
hogging shared_buffers?


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Nikolas Everett
On Tue, Mar 16, 2010 at 7:24 AM, Pierre C li...@peufeu.com wrote:

 I wonder about something, too: if your DB size is smaller than RAM, you
 could in theory set shared_buffers to a size larger than your DB, provided
 you still have enough free RAM left for work_mem and OS write management.
 How does this interact with the logic which prevents seq scans from
 hogging shared_buffers?


I think the logic you are referring to is the clock sweep buffer accounting
scheme.  That just makes sure that the most popular pages stay in the
buffers.  If your entire db fits in the buffer pool then it'll all get in
there real fast.

Two things to consider though:
1.  The checkpoint issue still stands.
2.  You should really mess around with your cost estimates if this is the
case.  If you make random IO cost the same as sequential IO, Postgres will
prefer index scans over bitmap index scans and table scans, which makes
sense if everything is in memory.
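
For example, a session-level sketch; the numbers are illustrative and
workload-dependent, not recommendations:

  -- default random_page_cost is 4.0; in a fully cached database the
  -- random vs. sequential distinction largely disappears
  SET seq_page_cost = 1.0;
  SET random_page_cost = 1.0;
  -- and tell the planner how much cache is actually available
  SET effective_cache_size = '20GB';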


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Pierre C
I think the logic you are referring to is the clock sweep buffer
accounting scheme.  That just makes sure that the most popular pages
stay in the buffers.  If your entire db fits in the buffer pool then
it'll all get in there real fast.

Actually, I meant that in the case of a seq scan, PG will try to use just
a few buffers (a ring) in shared_buffers instead of thrashing the whole
buffer cache. But if there was actually a lot of free space in
shared_buffers, do the pages stay, or do they not?


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Tom Lane
Pierre C li...@peufeu.com writes:
 Does PG issue checkpoint writes in sorted order?

No.  IIRC, a patch for that was submitted, and rejected because no
significant performance improvement could be demonstrated.  We don't
have enough information about the actual on-disk layout to be very
intelligent about this, so it's better to just issue the writes and
let the OS sort them.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Stark
On Tue, Mar 16, 2010 at 1:48 PM, Pierre C li...@peufeu.com wrote:
 Actually, I meant that in the case of a seq scan, PG will try to use just a
 few buffers (a ring) in shared_buffers instead of thrashing the whole
 buffer cache. But if there was actually a lot of free space in
 shared_buffers, do the pages stay, or do they not?

They don't. The logic only kicks in if the table is expected to be larger
than 1/4 of shared_buffers, though.

-- 
greg

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Stark
On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pierre C li...@peufeu.com writes:
 Does PG issue checkpoint writes in sorted order?

 No.  IIRC, a patch for that was submitted, and rejected because no
 significant performance improvement could be demonstrated.  We don't
 have enough information about the actual on-disk layout to be very
 intelligent about this, so it's better to just issue the writes and
 let the OS sort them.

Keep in mind that postgres is issuing writes to the OS buffer cache.
It defers fsyncing the files as late as it can in the hopes that most
of those buffers will be written out by the OS before then. That gives
the OS a long time window in which to flush them out in whatever order
and whatever schedule is most convenient.

If the OS filesystem buffer cache is really small then that might not
work so well. It might be worth rerunning those benchmarks on a
machine with shared buffers taking up all of RAM.


-- 
greg

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Postgres DB maintainenance - vacuum and reindex

2010-03-16 Thread Meena_Ramkumar

How can I run vacuumdb and reindex for a Postgres DB on a server that can
never be stopped? Can this be done without shutting down the server? If so,
what will the performance degradation percentage be?
-- 
View this message in context: 
http://old.nabble.com/Postgres-DB-maintainenance---vacuum-and-reindex-tp27913694p27913694.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] GiST index performance

2010-03-16 Thread Yeb Havinga

Matthew Wakeling wrote:

Matthew Wakeling wrote:
A second quite distinct issue is the general performance of GiST
indexes, which is also mentioned in the old thread linked from Open
Items. For that, we have a test case at
http://archives.postgresql.org/pgsql-performance/2009-04/msg00276.php
for btree_gist indexes. I have a similar example with the bioseg GiST
index. I have completely reimplemented the same algorithms in Java for
algorithm investigation and instrumentation purposes, and it runs about
a hundred times faster than in Postgres. I think this is a problem, and
I'm willing to do some investigation to try and solve it.

I have not made any progress on this issue. I think Oleg and Teodor
would be better placed working it out. All I can say is that I
implemented the exact same indexing algorithm in Java, and it
performed 100 times faster than Postgres. Now, Postgres has to do a
lot of additional work, like mapping the index onto disc, locking
pages, and abstracting to plugin user functions, so I would expect
some difference - I'm not sure 100 times is reasonable though. I tried
to do some profiling, but couldn't see any one section of code that
was taking too much time. Not sure what I can further do.

Hello Matthew and list,

A lot of time is spent in the gistget.c code, with a lot of
FunctionCall5 invocations of the GiST consistent function, which is out
of sight for gprof. Something different but related, since it also
concerns GiST: we noticed before that GiST indexes which use a
compressed form for index entries suffer from repeated compress calls
on query operands (see
http://archives.postgresql.org/pgsql-hackers/2009-05/msg00078.php).

The btree_gist int4 compress function calls the generic
gbt_num_compress, which does a palloc. Maybe this palloc is also hit a
lot when scanning the index, because the constants that are queried
with are repeatedly compressed and palloc'd.


regards,
Yeb Havinga



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex

2010-03-16 Thread Scott Marlowe
On Mon, Mar 15, 2010 at 11:30 PM, Meena_Ramkumar
winmeena_ramku...@yahoo.co.in wrote:

 How can I run vacuumdb and reindex for a Postgres DB on a server that can
 never be stopped? Can this be done without shutting down the server? If so,
 what will the performance degradation percentage be?

vacuum can be tuned by the various vacuum_* parameters in the
postgresql.conf file to have little or no impact on other processes
running.  Depending on your IO subsystem, you can tune it up or down
to fit your needs (speed versus impact on other processes).  REINDEX,
however, tends to be more intrusive to the system, and may cause some
performance degradation, which will be very dependent on your IO
subsystem (e.g. a single 7200RPM SATA drive system is more likely to
notice and be slowed down by reindexing than a 48-disk 15krpm SAS
RAID-10 array).
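
For example, a session-level sketch of throttling a manual vacuum (the
values are illustrative and the table name is made up):

  SET vacuum_cost_delay = 20;   -- sleep 20ms each time the cost budget
                                -- is used up (default 0 = no throttling)
  SET vacuum_cost_limit = 200;  -- cost budget per cycle (the default)
  VACUUM ANALYZE some_big_table;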

The more important question is what problem are you trying to solve,
and are there other, better approaches than the ones you're trying.
Without more info, no one can really say.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex

2010-03-16 Thread Ben Chobot
Autovacuum is your friend for minimal downtime. It is configurable to let you 
adjust how invasive it will be, and you can have different settings per table 
if you wish.
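
For example, on 8.4 or later the per-table settings are storage
parameters (the table name and values here are illustrative; older
releases used the pg_autovacuum catalog for the same purpose):

  ALTER TABLE busy_table SET (
      autovacuum_vacuum_scale_factor = 0.05, -- vacuum at ~5% dead rows
      autovacuum_vacuum_cost_delay = 10      -- throttle this table harder
  );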

As for the reindex, why do you think you will be reindexing regularly?

On Mar 15, 2010, at 10:30 PM, Meena_Ramkumar wrote:

 
 How can I run vacuumdb and reindex for a Postgres DB on a server that can
 never be stopped? Can this be done without shutting down the server? If so,
 what will the performance degradation percentage be?
 -- 
 View this message in context: 
 http://old.nabble.com/Postgres-DB-maintainenance---vacuum-and-reindex-tp27913694p27913694.html
 Sent from the PostgreSQL - performance mailing list archive at Nabble.com.
 
 
 -- 
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Postgres DB maintainenance - vacuum and reindex

2010-03-16 Thread Ing. Marcos Ortiz Valmaseda

Meena_Ramkumar wrote:

How can I run vacuumdb and reindex for a Postgres DB on a server that can
never be stopped? Can this be done without shutting down the server? If so,
what will the performance degradation percentage be?

You don't need to stop the server to run VACUUM; it is just another
server process. If you are using a recent version of PostgreSQL, you can
enable autovacuum, which takes care of this, or run VACUUM on a suitable
schedule. You should avoid VACUUM FULL: it is very slow, it takes an
exclusive lock on each table it processes, and while it reduces the
table size on disk it doesn't reduce the index size; it can actually
make indexes larger.

With autovacuum = on, you can avoid running VACUUM manually most of the
time.

The performance degradation depends on the number of tables and
databases you have on your server.

REINDEX is another task that you can execute periodically on your
server, but if you don't want to affect production work, the best thing
to do is to drop the index and re-create it with CREATE INDEX
CONCURRENTLY, as sketched below.
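
A minimal sketch of that pattern (the index and table names are
hypothetical, and CREATE INDEX CONCURRENTLY cannot run inside a
transaction block); building the new index before dropping the old one
avoids a window with no index at all:

  CREATE INDEX CONCURRENTLY my_table_foo_idx_new ON my_table (foo);
  DROP INDEX my_table_foo_idx;  -- takes only a brief exclusive lock
  ALTER INDEX my_table_foo_idx_new RENAME TO my_table_foo_idx;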


Regards


--
 
-- Ing. Marcos Luís Ortíz Valmaseda   --

-- Twitter: http://twitter.com/@marcosluis2186--
-- FreeBSD Fan/User   --
-- http://www.freebsd.org/es  --
-- Linux User # 418229--
-- Database Architect/Administrator   --
-- PostgreSQL RDBMS   --
-- http://www.postgresql.org  --
-- http://planetpostgresql.org--
-- http://www.postgresql-es.org   --

-- Data WareHouse -- Business Intelligence Apprentice --
-- http://www.tdwi.org--
 
-- Ruby on Rails Fan/Developer--

-- http://rubyonrails.org --


Comunidad Técnica Cubana de PostgreSQL
http://postgresql.uci.cu
http://personas.grm.uci.cu/+marcos

Centro de Tecnologías de Gestión de Datos (DATEC) 
Contacto: 
   Correo: centa...@uci.cu
   Telf: +53 07-837-3737
 +53 07-837-3714
Universidad de las Ciencias Informáticas
http://www.uci.cu 





--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Smith

Pierre C wrote:
Actually, I meant that in the case of a seq scan, PG will try to use
just a few buffers (a ring) in shared_buffers instead of thrashing the
whole buffer cache. But if there was actually a lot of free space in
shared_buffers, do the pages stay, or do they not?


Pages inserted into the ring buffer and later re-used for new data do 
not stay behind even if there is room for them.  There's a potential 
improvement possible in that code involving better management of the 
situation where the buffer cache hasn't actually reached full capacity 
yet, but as it's an unusual case it's hard to justify optimizing for.  
Besides, the hope is that in this case the OS cache will end up caching 
everything anyway until it has a reason to evict it.  So if you follow 
the rest of the data suggesting you should not give all the memory to 
PostgreSQL to manage, you end up with a reasonable solution to this 
problem anyway.  Those pages will just live in the OS cache instead of 
the database's, with only a few trickling in and staying behind each 
time you do a sequential scan.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Smith

Greg Stark wrote:
On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
Pierre C li...@peufeu.com writes:
Does PG issue checkpoint writes in sorted order?

No.  IIRC, a patch for that was submitted, and rejected because no
significant performance improvement could be demonstrated.

If the OS filesystem buffer cache is really small then that might not
work so well. It might be worth rerunning those benchmarks on a
machine with shared buffers taking up all of RAM.

Here's the original patch again:  
http://archives.postgresql.org/message-id/20080415181742.6c97.52131...@oss.ntt.co.jp


I was the person who tried to reproduce the suggested 10% pgbench
speedup on a similar system and couldn't replicate any improvement.  I
never was sure what was going on to show such a difference on the
reference system used to develop the patch versus mine, since they were
pretty similar.  Possibly some positive interaction with LVM in the test
case I didn't have.  Maybe the actual reason sorting helped was
limitations in the HP P400 controller used there that I wasn't running
into with the Areca card I used.  And the always popular "didn't account
fully for pgbench run-to-run variation" possibility crossed my mind too:
that the original observed speedup wasn't caused by the patch but by
something else.


I did not go out of my way to find test conditions where the patch would
be more likely to help, like the situation you describe where
shared_buffers is really large relative to the OS cache.  Since the
patch complicates the checkpoint code and requires some working memory
to operate, it would have to be an unquestionable win using standard
practices before it was worth applying.  If it only helps in a situation
people are unlikely to use in the field, and is a net negative for
everyone else, it's still going to end up on the "interesting but
rejected" idea scrapheap at the end of the day.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Alvaro Herrera
Greg Stark wrote:
 On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  Pierre C li...@peufeu.com writes:
  Does PG issue checkpoint writes in sorted order ?
 
  No.  IIRC, a patch for that was submitted, and rejected because no
  significant performance improvement could be demonstrated.  We don't
  have enough information about the actual on-disk layout to be very
  intelligent about this, so it's better to just issue the writes and
  let the OS sort them.
 
 Keep in mind that postgres is issuing writes to the OS buffer cache.
 It defers fsyncing the files as late as it can in the hopes that most
 of those buffers will be written out by the OS before then. That gives
 the OS a long time window in which to flush them out in whatever order
 and whatever schedule is most convenient.

Maybe it would make more sense to try to reorder the fsync calls
instead.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Maybe it would make more sense to try to reorder the fsync calls
 instead.

Reorder to what, though?  You still have the problem that we don't know
much about the physical layout on-disk.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Maybe it would make more sense to try to reorder the fsync calls
  instead.
 
 Reorder to what, though?  You still have the problem that we don't know
 much about the physical layout on-disk.

Well, to block numbers as a first step.

However, this reminds me that sometimes we take the block-at-a-time
extension policy too seriously.  We had a customer that had a
performance problem because they were inserting lots of data to TOAST
tables, causing very frequent extensions.  I kept wondering whether an
allocation policy that allocated several new blocks at a time could be
useful (but I didn't try it).  This would also alleviate fragmentation,
thus helping the physical layout be more similar to logical block
numbers.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 Reorder to what, though?  You still have the problem that we don't know
 much about the physical layout on-disk.

 Well, to block numbers as a first step.

fsync is a file-based operation, and we know exactly zip about the
relative positions of different files on the disk.

 However, this reminds me that sometimes we take the block-at-a-time
 extension policy too seriously.

Yeah, that's a huge performance penalty in some circumstances.

 We had a customer that had a
 performance problem because they were inserting lots of data to TOAST
 tables, causing very frequent extensions.  I kept wondering whether an
 allocation policy that allocated several new blocks at a time could be
 useful (but I didn't try it).  This would also alleviate fragmentation,
 thus helping the physical layout be more similar to logical block
 numbers.

That's not going to do anything towards reducing the actual I/O volume.
Although I suppose it might be useful if it just cuts the number of
seeks.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  Reorder to what, though?  You still have the problem that we don't know
  much about the physical layout on-disk.
 
  Well, to block numbers as a first step.
 
 fsync is a file-based operation, and we know exactly zip about the
 relative positions of different files on the disk.

Doh, right, I was thinking of the sync_file_range() kind of API.


  We had a customer that had a
  performance problem because they were inserting lots of data to TOAST
  tables, causing very frequent extensions.  I kept wondering whether an
  allocation policy that allocated several new blocks at a time could be
  useful (but I didn't try it).  This would also alleviate fragmentation,
  thus helping the physical layout be more similar to logical block
  numbers.
 
 That's not going to do anything towards reducing the actual I/O volume.
 Although I suppose it might be useful if it just cuts the number of
 seeks.

Oh, they had no problems with I/O volume.  It was relation extension
lock that was heavily contended for them.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Tom Lane wrote:
 That's not going to do anything towards reducing the actual I/O volume.
 Although I suppose it might be useful if it just cuts the number of
 seeks.

 Oh, they had no problems with I/O volume.  It was relation extension
 lock that was heavily contended for them.

Really?  I guess that serialized all the I/O ... I'll bet if we got rid
of that locking somehow, they *would* have a problem with I/O volume.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  Tom Lane wrote:
  That's not going to do anything towards reducing the actual I/O volume.
  Although I suppose it might be useful if it just cuts the number of
  seeks.
 
  Oh, they had no problems with I/O volume.  It was relation extension
  lock that was heavily contended for them.
 
 Really?  I guess that serialized all the I/O ... I'll bet if we got rid
 of that locking somehow, they *would* have a problem with I/O volume.

Well, that would solve the problem as far as I'm concerned and they'd
have to start talking to their storage provider ;-)

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] shared_buffers advice

2010-03-16 Thread Greg Smith

Alvaro Herrera wrote:

Maybe it would make more sense to try to reorder the fsync calls
instead.
  


The pretty obvious left-behind idea from the 8.3 spread checkpoint
development was to similarly spread the fsync calls around.  Given that
we know, for example, that Linux with ext3 is going to dump the whole
filesystem write cache out when the fsync call comes in, the way they're
currently scheduled has considerable potential for improvement.


Unfortunately, since the tuning on that is going to be very platform 
dependent and require a lot of benchmarking work, I think we need a 
performance farm up and running as a prerequisite to finishing that work 
off.  The spread checkpoint stuff was a much more obvious improvement, 
and that was hard enough to quantify usefully and test.


Returning to the idea of the sorted checkpoints patch as a simple
example, if it were possible to just push that patch to a test repo and
see how that changed typical throughput/latency against a
well-established history, it would be a lot easier to figure out if
something like that is sensible to consider or not.  I'm not sure how to
make progress on similar ideas about tuning closer to the filesystem
level without having something automated that takes over the actual
benchmark running and data recording steps; it's just way too time
consuming to do those right now with every tool that's available for
PostgreSQL so far.  That's the problem I work on; there are easily a
half dozen good ideas for improvements here floating around where coding
time is dwarfed by required performance validation time.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[PERFORM] Block at a time ...

2010-03-16 Thread Dave Crooke
I agree with Tom: any reordering attempt is at best second-guessing the
filesystem and underlying storage.

However, having the ability to control the extent size would be a worthwhile
improvement for systems that walk and chew gum (write to lots of tables)
concurrently.

I'm thinking of Oracle's AUTOEXTEND settings for tablespace datafiles ... I
think the ideal way to do it for PG would be to make the equivalent
configurable in postgresql.conf system wide, and allow specific per-table
settings in the SQL metadata, similar to auto-vacuum.

An awesomely simple alternative is to just specify the extension as e.g. 5%
of the existing table size ... it starts by adding one block at a time for
tiny tables, and once your table is over 20GB, it ends up adding a whole 1GB
file and pre-allocating it. Very little wastage.

Cheers
Dave

On Tue, Mar 16, 2010 at 4:49 PM, Alvaro Herrera
alvhe...@commandprompt.comwrote:

 Tom Lane wrote:
  Alvaro Herrera alvhe...@commandprompt.com writes:
   Maybe it would make more sense to try to reorder the fsync calls
   instead.
 
  Reorder to what, though?  You still have the problem that we don't know
  much about the physical layout on-disk.

 Well, to block numbers as a first step.

 However, this reminds me that sometimes we take the block-at-a-time
 extension policy too seriously.  We had a customer that had a
 performance problem because they were inserting lots of data to TOAST
 tables, causing very frequent extensions.  I kept wondering whether an
 allocation policy that allocated several new blocks at a time could be
 useful (but I didn't try it).  This would also alleviate fragmentation,
 thus helping the physical layout be more similar to logical block
 numbers.

 --
 Alvaro Herrera
 http://www.CommandPrompt.com/
 PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[PERFORM] Building multiple indexes concurrently

2010-03-16 Thread Rob Wultsch
Let's say I have a large table bigTable to which I would like to add
two btree indexes. Is there a more efficient way to create indexes
than:
CREATE INDEX idx_foo on bigTable (foo);
CREATE INDEX idx_baz on bigTable (baz);
Or
CREATE INDEX CONCURRENTLY idx_foo on bigTable (foo);
CREATE INDEX CONCURRENTLY idx_baz on bigTable (baz);

Are there any particular performance optimizations that would be in
play in such a scenario?

At a minimum I assume that if both of the commands were started at
about the same time they would each scan the table in the same
direction and whichever creation was slower would benefit from most of
the table data it needed being prepopulated in shared buffers. Is this
the case?

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


Re: [PERFORM] Building multiple indexes concurrently

2010-03-16 Thread Ben Chobot
On Mar 16, 2010, at 6:04 PM, Rob Wultsch wrote:

 Let's say I have a large table bigTable to which I would like to add
 two btree indexes. Is there a more efficient way to create indexes
 than:
 CREATE INDEX idx_foo on bigTable (foo);
 CREATE INDEX idx_baz on bigTable (baz);
 Or
 CREATE INDEX CONCURRENTLY idx_foo on bigTable (foo);
 CREATE INDEX CONCURRENTLY idx_baz on bigTable (baz);
 
 Are there any particular performance optimizations that would be in
 play in such a scenario?
 
 At a minimum I assume that if both of the commands were started at
 about the same time they would each scan the table in the same
 direction and whichever creation was slower would benefit from most of
 the table data it needed being prepopulated in shared buffers. Is this
 the case?

That sounds reasonable to me. You might also look at upping your
maintenance_work_mem for your session.
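
For example, a session-level sketch (the value is illustrative only):

  SET maintenance_work_mem = '1GB';  -- applies to each index build
                                     -- in this session
  CREATE INDEX CONCURRENTLY idx_foo ON bigTable (foo);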
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance