subject:"\[PERFORM\] PostgreSQL as a local in\-memory cache"

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Brad Nicholson

On Tue, 2010-06-29 at 21:39 -0400, Bruce Momjian wrote:
 Jignesh Shah wrote:
  On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote:
   Tom Lane wrote:
   Bruce Momjian br...@momjian.us writes:
I asked on IRC and was told it is true, and looking at the C code it
looks true. ?What synchronous_commit = false does is to delay writing
the wal buffers to disk and fsyncing them, not just fsync, which is
where the commit loss due to db process crash comes from.
  
Ah, I see. ?Thanks.
  
I am personally surprised it was designed that way; ?I thought we would
just delay fsync.
  
   That would require writing and syncing to be separable actions. ?If
   you're using O_SYNC or similar, they aren't.
  
   Ah, very good point. ?I have added a C comment to clarify why this is
   the current behavior; ?attached and applied.
  
   --
   ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us
   ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com
  
  
  Though has anybody seen a behaviour where synchronous_commit=off is
  slower than synchronous_commit=on  ? Again there are two cases here
  one with O_* flag and other with f*sync flags. But I had seen that
  behavior with PostgreSQL 9.0 beta(2 I think) though havent really
  investigated it much yet .. (though now I dont remember which
  wal_sync_method flag) . Just curious if anybody has seen that
  behavior..
 
 I have trouble believing how synchronous_commit=off could be slower than
 'on'.
 

I wonder if it could be contention on wal buffers?

Say I've turned synchronous_commit off, I drive enough traffic fill up
my wal_buffers.  I assume that we would have to start writing buffers
down to disk before allocating to the new process.

-- 
Brad Nicholson  416-673-4106
Database Administrator, Afilias Canada Corp.



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Bruce Momjian

Brad Nicholson wrote:
Ah, very good point. ?I have added a C comment to clarify why this is
the current behavior; ?attached and applied.
   
--
?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us
?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com
   
   
   Though has anybody seen a behaviour where synchronous_commit=off is
   slower than synchronous_commit=on  ? Again there are two cases here
   one with O_* flag and other with f*sync flags. But I had seen that
   behavior with PostgreSQL 9.0 beta(2 I think) though havent really
   investigated it much yet .. (though now I dont remember which
   wal_sync_method flag) . Just curious if anybody has seen that
   behavior..
  
  I have trouble believing how synchronous_commit=off could be slower than
  'on'.
  
 
 I wonder if it could be contention on wal buffers?
 
 Say I've turned synchronous_commit off, I drive enough traffic fill up
 my wal_buffers.  I assume that we would have to start writing buffers
 down to disk before allocating to the new process.

Uh, good question.  I know this report showed ynchronous_commit=off as
faster than 'on':

http://archives.postgresql.org/pgsql-performance/2010-06/msg00277.php

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Dave Crooke

I haven't jumped in yet on this thread, but here goes 

If you're really looking for query performance, then any database which is
designed with reliability and ACID consistency in mind is going to
inherently have some mis-fit features.

Some other ideas to consider, depending on your query mix:

1. MySQL with the MyISAM database (non-ACID)

2. Put an in-application generic query cache in front of the DB, that runs
in the app address space, e.g. Cache' if using Java

3. Using a DB is a good way to get generic querying capability, but if the
where clause in the querying is over a small set of meta-data, and SQL
syntax is not a big requirement, consider non-RDBMS alternatives, e.g. use
XPath over a W3C DOM object tree to get primary keys to in-memory hash
tables (possibly distributed with something like memcached)

On Mon, Jun 14, 2010 at 9:14 PM, jgard...@jonathangardner.net 
jgard...@jonathangardner.net wrote:

 We have a fairly unique need for a local, in-memory cache. This will
 store data aggregated from other sources. Generating the data only
 takes a few minutes, and it is updated often. There will be some
 fairly expensive queries of arbitrary complexity run at a fairly high
 rate. We're looking for high concurrency and reasonable performance
 throughout.

 The entire data set is roughly 20 MB in size. We've tried Carbonado in
 front of SleepycatJE only to discover that it chokes at a fairly low
 concurrency and that Carbonado's rule-based optimizer is wholly
 insufficient for our needs. We've also tried Carbonado's Map
 Repository which suffers the same problems.

 I've since moved the backend database to a local PostgreSQL instance
 hoping to take advantage of PostgreSQL's superior performance at high
 concurrency. Of course, at the default settings, it performs quite
 poorly compares to the Map Repository and Sleepycat JE.

 My question is how can I configure the database to run as quickly as
 possible if I don't care about data consistency or durability? That
 is, the data is updated so often and it can be reproduced fairly
 rapidly so that if there is a server crash or random particles from
 space mess up memory we'd just restart the machine and move on.

 I've never configured PostgreSQL to work like this and I thought maybe
 someone here had some ideas on a good approach to this.

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Craig James


On 6/30/10 9:42 AM, Dave Crooke wrote:

I haven't jumped in yet on this thread, but here goes 

If you're really looking for query performance, then any database which
is designed with reliability and ACID consistency in mind is going to
inherently have some mis-fit features.

Some other ideas to consider, depending on your query mix:

1. MySQL with the MyISAM database (non-ACID)

2. Put an in-application generic query cache in front of the DB, that
runs in the app address space, e.g. Cache' if using Java

3. Using a DB is a good way to get generic querying capability, but if
the where clause in the querying is over a small set of meta-data, and
SQL syntax is not a big requirement, consider non-RDBMS alternatives,
e.g. use XPath over a W3C DOM object tree to get primary keys to
in-memory hash tables (possibly distributed with something like memcached)


These would be good suggestions if the throwaway database was the only one.  
But in real life, these throwaway databases are built from other databases that are NOT 
throwaway, where the data matters and ACID is critical.  In other words, they'll probably 
need Postgres anyway.

Sure, you could use both Postgres and MySQL/ISAM, but that means installing and 
maintaining both, plus building all of the other application layers to work on 
both systems.

Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Jignesh Shah

On Tue, Jun 29, 2010 at 9:39 PM, Bruce Momjian br...@momjian.us wrote:
 Jignesh Shah wrote:
 On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote:
  Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   I asked on IRC and was told it is true, and looking at the C code it
   looks true. ?What synchronous_commit = false does is to delay writing
   the wal buffers to disk and fsyncing them, not just fsync, which is
   where the commit loss due to db process crash comes from.
 
   Ah, I see. ?Thanks.
 
   I am personally surprised it was designed that way; ?I thought we would
   just delay fsync.
 
  That would require writing and syncing to be separable actions. ?If
  you're using O_SYNC or similar, they aren't.
 
  Ah, very good point. ?I have added a C comment to clarify why this is
  the current behavior; ?attached and applied.
 
  --
  ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us
  ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com


 Though has anybody seen a behaviour where synchronous_commit=off is
 slower than synchronous_commit=on  ? Again there are two cases here
 one with O_* flag and other with f*sync flags. But I had seen that
 behavior with PostgreSQL 9.0 beta(2 I think) though havent really
 investigated it much yet .. (though now I dont remember which
 wal_sync_method flag) . Just curious if anybody has seen that
 behavior..

 I have trouble believing how synchronous_commit=off could be slower than
 'on'.

 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +


Hi Bruce,

Let me clarify  the problem a bit.. If the underlying WAL disk is SSD
then it seems I can get synchronous_commit=on to work faster than
synchronous_commit=off.. Yes sounds unintuitive to me. But the results
seems to point in that direction. It could be that it hit some other
bottleneck with synchronous_commit=off reaches that
synchronous_commit=on does not hit (or has not hit yet).

 Brads point of wal buffers could be valid. Though typically I havent
seen the need to increase it beyond 1024kB yet.

Hopefully I will retry it with the latest PostgreSQL 9.0 bits and see
it happens again.
More on that later.

Regards,
Jignesh

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-30 Thread Greg Smith


 On 6/30/2010 2:21 PM, Jignesh Shah wrote:
If the underlying WAL disk is SSD then it seems I can get 
synchronous_commit=on to work faster than

synchronous_commit=off..


The first explanation that pops to mind is that synchronous_commit is 
writing all the time, which doesn't have the same sort of penalty on 
SSD.  Whereas if you turn it off, then there are some idle periods where 
the SSD could be writing usefully, but instead it's buffering for the 
next burst instead.  The importance of that can be magnified on 
operating systems that do their own buffering and tend to lag behind 
writes until they see an fsync call, like is the case on Linux with ext3.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Robert Haas

On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote:
 The patch also documents that synchronous_commit = false has
 potential committed transaction loss from a database crash (as well as
 an OS crash).

Is this actually true?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Bruce Momjian

Robert Haas wrote:
 On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote:
  The patch also documents that synchronous_commit = false has
  potential committed transaction loss from a database crash (as well as
  an OS crash).
 
 Is this actually true?

I asked on IRC and was told it is true, and looking at the C code it
looks true.  What synchronous_commit = false does is to delay writing
the wal buffers to disk and fsyncing them, not just fsync, which is
where the commit loss due to db process crash comes from.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Kevin Grittner

Bruce Momjian br...@momjian.us wrote:
 
 What synchronous_commit = false does is to delay writing
 the wal buffers to disk and fsyncing them, not just fsync
 
Ah, that answers the question Josh Berkus asked here:
 
http://archives.postgresql.org/pgsql-performance/2010-06/msg00285.php
 
(which is something I was wondering about, too.)
 
-Kevin

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Robert Haas

On Tue, Jun 29, 2010 at 9:32 AM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote:
  The patch also documents that synchronous_commit = false has
  potential committed transaction loss from a database crash (as well as
  an OS crash).

 Is this actually true?

 I asked on IRC and was told it is true, and looking at the C code it
 looks true.  What synchronous_commit = false does is to delay writing
 the wal buffers to disk and fsyncing them, not just fsync, which is
 where the commit loss due to db process crash comes from.

Ah, I see.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Bruce Momjian

Robert Haas wrote:
 On Tue, Jun 29, 2010 at 9:32 AM, Bruce Momjian br...@momjian.us wrote:
  Robert Haas wrote:
  On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote:
   The patch also documents that synchronous_commit = false has
   potential committed transaction loss from a database crash (as well as
   an OS crash).
 
  Is this actually true?
 
  I asked on IRC and was told it is true, and looking at the C code it
  looks true. ?What synchronous_commit = false does is to delay writing
  the wal buffers to disk and fsyncing them, not just fsync, which is
  where the commit loss due to db process crash comes from.
 
 Ah, I see.  Thanks.

I am personally surprised it was designed that way;  I thought we would
just delay fsync.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Tom Lane

Bruce Momjian br...@momjian.us writes:
 I asked on IRC and was told it is true, and looking at the C code it
 looks true. ?What synchronous_commit = false does is to delay writing
 the wal buffers to disk and fsyncing them, not just fsync, which is
 where the commit loss due to db process crash comes from.

 Ah, I see.  Thanks.

 I am personally surprised it was designed that way;  I thought we would
 just delay fsync.

That would require writing and syncing to be separable actions.  If
you're using O_SYNC or similar, they aren't.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Bruce Momjian

Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I asked on IRC and was told it is true, and looking at the C code it
  looks true. ?What synchronous_commit = false does is to delay writing
  the wal buffers to disk and fsyncing them, not just fsync, which is
  where the commit loss due to db process crash comes from.
 
  Ah, I see.  Thanks.
 
  I am personally surprised it was designed that way;  I thought we would
  just delay fsync.
 
 That would require writing and syncing to be separable actions.  If
 you're using O_SYNC or similar, they aren't.

Ah, very good point.  I have added a C comment to clarify why this is
the current behavior;  attached and applied.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: src/backend/access/transam/xact.c
===
RCS file: /cvsroot/pgsql/src/backend/access/transam/xact.c,v
retrieving revision 1.291
diff -c -c -r1.291 xact.c
*** src/backend/access/transam/xact.c	13 May 2010 11:39:30 -	1.291
--- src/backend/access/transam/xact.c	29 Jun 2010 18:33:47 -
***
*** 1028,1034 
  	if (XactSyncCommit || forceSyncCommit || haveNonTemp)
  	{
  		/*
! 		 * Synchronous commit case.
  		 *
  		 * Sleep before flush! So we can flush more than one commit records
  		 * per single fsync.  (The idea is some other backend may do the
--- 1028,1034 
  	if (XactSyncCommit || forceSyncCommit || haveNonTemp)
  	{
  		/*
! 		 * Synchronous commit case:
  		 *
  		 * Sleep before flush! So we can flush more than one commit records
  		 * per single fsync.  (The idea is some other backend may do the
***
*** 1054,1060 
  	else
  	{
  		/*
! 		 * Asynchronous commit case.
  		 *
  		 * Report the latest async commit LSN, so that the WAL writer knows to
  		 * flush this commit.
--- 1054,1065 
  	else
  	{
  		/*
! 		 * Asynchronous commit case:
! 		 *
! 		 * This enables possible committed transaction loss in the case of a
! 		 * postmaster crash because WAL buffers are left unwritten.
! 		 * Ideally we could issue the WAL write without the fsync, but
! 		 * some wal_sync_methods do not allow separate write/fsync.
  		 *
  		 * Report the latest async commit LSN, so that the WAL writer knows to
  		 * flush this commit.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Jignesh Shah

On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I asked on IRC and was told it is true, and looking at the C code it
  looks true. ?What synchronous_commit = false does is to delay writing
  the wal buffers to disk and fsyncing them, not just fsync, which is
  where the commit loss due to db process crash comes from.

  Ah, I see.  Thanks.

  I am personally surprised it was designed that way;  I thought we would
  just delay fsync.

 That would require writing and syncing to be separable actions.  If
 you're using O_SYNC or similar, they aren't.

 Ah, very good point.  I have added a C comment to clarify why this is
 the current behavior;  attached and applied.

 --
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com


Though has anybody seen a behaviour where synchronous_commit=off is
slower than synchronous_commit=on  ? Again there are two cases here
one with O_* flag and other with f*sync flags. But I had seen that
behavior with PostgreSQL 9.0 beta(2 I think) though havent really
investigated it much yet .. (though now I dont remember which
wal_sync_method flag) . Just curious if anybody has seen that
behavior..

Regards,
Jignesh

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-29 Thread Bruce Momjian

Jignesh Shah wrote:
 On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote:
  Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   I asked on IRC and was told it is true, and looking at the C code it
   looks true. ?What synchronous_commit = false does is to delay writing
   the wal buffers to disk and fsyncing them, not just fsync, which is
   where the commit loss due to db process crash comes from.
 
   Ah, I see. ?Thanks.
 
   I am personally surprised it was designed that way; ?I thought we would
   just delay fsync.
 
  That would require writing and syncing to be separable actions. ?If
  you're using O_SYNC or similar, they aren't.
 
  Ah, very good point. ?I have added a C comment to clarify why this is
  the current behavior; ?attached and applied.
 
  --
  ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us
  ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com
 
 
 Though has anybody seen a behaviour where synchronous_commit=off is
 slower than synchronous_commit=on  ? Again there are two cases here
 one with O_* flag and other with f*sync flags. But I had seen that
 behavior with PostgreSQL 9.0 beta(2 I think) though havent really
 investigated it much yet .. (though now I dont remember which
 wal_sync_method flag) . Just curious if anybody has seen that
 behavior..

I have trouble believing how synchronous_commit=off could be slower than
'on'.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-28 Thread Bruce Momjian

Bruce Momjian wrote:
 Tom Lane wrote:
  Dimitri Fontaine dfonta...@hi-media.com writes:
   Josh Berkus j...@agliodbs.com writes:
   a) Eliminate WAL logging entirely
   b) Eliminate checkpointing
   c) Turn off the background writer
   d) Have PostgreSQL refuse to restart after a crash and instead call an
   exteral script (for reprovisioning)
  
   Well I guess I'd prefer a per-transaction setting, allowing to bypass
   WAL logging and checkpointing.
  
  Not going to happen; this is all or nothing.
  
   Forcing the backend to care itself for
   writing the data I'm not sure is a good thing, but if you say so.
  
  Yeah, I think proposal (c) is likely to be a net loss.
  
  (a) and (d) are probably simple, if by reprovisioning you mean
  rm -rf $PGDATA; initdb.  Point (b) will be a bit trickier because
  there are various housekeeping activities tied into checkpoints.
  I think you can't actually remove checkpoints altogether, just
  skip the flush-dirty-pages part.
 
 Based on this thread, I have developed the following documentation patch
 that outlines the performance enhancements possible if durability is not
 required.  The patch also documents that synchronous_commit = false has
 potential committed transaction loss from a database crash (as well as
 an OS crash).

Applied.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Dimitri Fontaine

Tom Lane t...@sss.pgh.pa.us writes:
 The problem with a system-wide no-WAL setting is it means you can't
 trust the system catalogs after a crash.  Which means you are forced to
 use initdb to recover from any crash, in return for not a lot of savings
 (for typical usages where there's not really much churn in the
 catalogs). 

What about having a catalog only WAL setting, userset ?

I'm not yet clear on the point but it well seems that the per
transaction WAL setting is impossible because of catalogs (meaning
mainly DDL support), but I can see us enforcing durability and crash
safety there.

That would probably mean that setting WAL level this low yet doing any
kind of DDL would need to be either an ERROR, or better yet, a WARNING
telling that the WAL level can not be that low so has been raised by the
system.

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Rob Wultsch

On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote:

 It must be a setting, not a version.

 For instance suppose you have a session table for your website and a
 users table.

 - Having ACID on the users table is of course a must ;
 - for the sessions table you can drop the D

 You're trying to solve a different use-case than the one I am.

 Your use-case will be solved by global temporary tables.  I suggest that
 you give Robert Haas some help  feedback on that.

 My use case is people using PostgreSQL as a cache, or relying entirely
 on replication for durability.

 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com



Is he? Wouldn't a global temporary table have content that is not
visible between db connections? A db session many not be the same as a
user session.

-- 
Rob Wultsch
wult...@gmail.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Robert Haas

On Thu, Jun 24, 2010 at 4:40 AM, Rob Wultsch wult...@gmail.com wrote:
 On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote:

 It must be a setting, not a version.

 For instance suppose you have a session table for your website and a
 users table.

 - Having ACID on the users table is of course a must ;
 - for the sessions table you can drop the D

 You're trying to solve a different use-case than the one I am.

 Your use-case will be solved by global temporary tables.  I suggest that
 you give Robert Haas some help  feedback on that.

 My use case is people using PostgreSQL as a cache, or relying entirely
 on replication for durability.

 Is he? Wouldn't a global temporary table have content that is not
 visible between db connections? A db session many not be the same as a
 user session.


I'm planning to implement global temporary tables, which can have
different contents for each user session.

And I'm also planning to implement unlogged tables, which have the
same contents for all sessions but are not WAL-logged (and are
truncated on startup).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Josh Berkus


 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

Yep.  And it's quite possible that this will be adequate for most users.

And it's also possible that the extra CPU which Robert isn't getting rid
of (bgwriter, checkpointing, etc.) does not have a measurable impact on
performance.  At this point, my idea (which I call
RunningWithScissorsDB) is only an idea for experimentation and
performance testing.  It's pretty far off from being a TODO.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule

2010/6/24 Josh Berkus j...@agliodbs.com:

 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

this is similar MySQL's memory tables. Personally, I don't see any
practical sense do same work on PostgreSQL now, when memcached exists.
Much more important is smarter cache controlling then we have now -
maybe with priorities for some tables and some operations
(applications) - sometimes we don't need use cache for extra large
scans.

Regards

Pavel Stehule



 Yep.  And it's quite possible that this will be adequate for most users.

 And it's also possible that the extra CPU which Robert isn't getting rid
 of (bgwriter, checkpointing, etc.) does not have a measurable impact on
 performance.  At this point, my idea (which I call
 RunningWithScissorsDB) is only an idea for experimentation and
 performance testing.  It's pretty far off from being a TODO.


 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Joshua D. Drake

On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
  And I'm also planning to implement unlogged tables, which have the
  same contents for all sessions but are not WAL-logged (and are
  truncated on startup).
 
 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

Because memcache is yet another layer and increases overhead to the
application developers by adding yet another layer to work with. Non
logged tables would rock.

SELECT * FROM foo;

:D

JD




-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule

2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
  And I'm also planning to implement unlogged tables, which have the
  same contents for all sessions but are not WAL-logged (and are
  truncated on startup).

 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

 Because memcache is yet another layer and increases overhead to the
 application developers by adding yet another layer to work with. Non
 logged tables would rock.

I see only one positive point - it can help to people with broken
design application with migration to PostgreSQL.

There are different interesting feature - cached procedure's results
like Oracle 11. - it's more general.

only idea.

For me memory tables are nonsens, but what about memory cached
materialised views (maybe periodically refreshed)?

Regards

Pavel


 SELECT * FROM foo;

 :D

:)

 JD




 --
 PostgreSQL.org Major Contributor
 Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
 Consulting, Training, Support, Custom Development, Engineering



-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread A.M.


On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote:

 2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:
 
 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).
 
 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.
 
 Because memcache is yet another layer and increases overhead to the
 application developers by adding yet another layer to work with. Non
 logged tables would rock.
 
 I see only one positive point - it can help to people with broken
 design application with migration to PostgreSQL.

The broken design is being required to work around PostgreSQL's lack of this 
optimization.

 
 There are different interesting feature - cached procedure's results
 like Oracle 11. - it's more general.
 
 only idea.
 
 For me memory tables are nonsens, but what about memory cached
 materialised views (maybe periodically refreshed)?

Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory tables. 
Such tables simply contain transient information. One can already make memory 
tables in PostgreSQL by making a tablespace in a tmpfs partition.

I have been eagerly waiting for this feature for six years so that I can write 
proper queries against ever-changing session data with transactional semantics 
(which memcached cannot offer). The only restriction I see for these transient 
data tables is that they cannot be referenced by standard tables using foreign 
key constraints. Otherwise, these tables behave like any other. That's the 
benefit.

Cheers,
M
-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule

2010/6/24 A.M. age...@themactionfaction.com:

 On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote:

 2010/6/24 Joshua D. Drake j...@commandprompt.com:
 On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote:
 2010/6/24 Josh Berkus j...@agliodbs.com:

 And I'm also planning to implement unlogged tables, which have the
 same contents for all sessions but are not WAL-logged (and are
 truncated on startup).

 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

 Because memcache is yet another layer and increases overhead to the
 application developers by adding yet another layer to work with. Non
 logged tables would rock.

 I see only one positive point - it can help to people with broken
 design application with migration to PostgreSQL.

 The broken design is being required to work around PostgreSQL's lack of this 
 optimization.


 There are different interesting feature - cached procedure's results
 like Oracle 11. - it's more general.

 only idea.

 For me memory tables are nonsens, but what about memory cached
 materialised views (maybe periodically refreshed)?

 Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory 
 tables. Such tables simply contain transient information. One can already 
 make memory tables in PostgreSQL by making a tablespace in a tmpfs 
 partition.

 I have been eagerly waiting for this feature for six years so that I can 
 write proper queries against ever-changing session data with transactional 
 semantics (which memcached cannot offer). The only restriction I see for 
 these transient data tables is that they cannot be referenced by standard 
 tables using foreign key constraints. Otherwise, these tables behave like any 
 other. That's the benefit.


if you remove WAL, then there are MVCC still - you have to do VACUUM,
you have to do ANALYZE, you have to thinking about indexes ...
Processing pipe for simple query is long too. The removing WAL doesn't
do memory database from Postgres. But You have to know best, what do
you do.

Regards

Pavel Stehule

p.s. maybe memcached is too simply for you - there are more NoSQL db

 Cheers,
 M
 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Josh Berkus


 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

Thing is, if you only have one table (say, a sessions table) which you
don't want logged, you don't necessarily want to fire up a 2nd software
application just for that.  Plus, recent testing seems to show that with
no logging, memcached isn't really faster than PG.

Also, like for asynch_commit, this is something where users are
currently turning off fsync.  Any option where we can present users with
controlled, predictable data loss instead of random corruption is a good
one.

 Much more important is smarter cache controlling then we have now -
 maybe with priorities for some tables and some operations
 (applications) - sometimes we don't need use cache for extra large
 scans.

Well, that would be good *too*.  You working on it?  ;-)

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-24 Thread Pavel Stehule

2010/6/24 Josh Berkus j...@agliodbs.com:

 this is similar MySQL's memory tables. Personally, I don't see any
 practical sense do same work on PostgreSQL now, when memcached exists.

 Thing is, if you only have one table (say, a sessions table) which you
 don't want logged, you don't necessarily want to fire up a 2nd software
 application just for that.  Plus, recent testing seems to show that with
 no logging, memcached isn't really faster than PG.

sorry, I thinking some else. Not only WAL does significant overhead.
You need litlle bit more memory, much more processing time. With very
fast operations, the bottle neck will be in interprocess communication
- but it doesn't mean so pg isn't slower than memcached. I repeating
it again - there are no any universal tool for all tasks.


 Also, like for asynch_commit, this is something where users are
 currently turning off fsync.  Any option where we can present users with
 controlled, predictable data loss instead of random corruption is a good
 one.


it isn't too simple. What about statistics? These are used in system table.

 Much more important is smarter cache controlling then we have now -
 maybe with priorities for some tables and some operations
 (applications) - sometimes we don't need use cache for extra large
 scans.

 Well, that would be good *too*.  You working on it?  ;-)


no - just I know about possible problems with memory control.

 --
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Bruce Momjian

Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Josh Berkus j...@agliodbs.com writes:
  a) Eliminate WAL logging entirely

If we elimiate WAL logging, that means a reinstall is required for even
a postmaster crash, which is a new non-durable behavior.

Also, we just added wal_level = minimal, which might end up being a poor
name choice of we want wal_level = off in PG 9.1.  Perhaps we should
have used wal_level = crash_safe in 9.0.

I have added the following TODO:

Consider a non-crash-safe wal_level that eliminates WAL activity

* 
http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php 

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Bruce Momjian

Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Josh Berkus j...@agliodbs.com writes:
  a) Eliminate WAL logging entirely
  b) Eliminate checkpointing
  c) Turn off the background writer
  d) Have PostgreSQL refuse to restart after a crash and instead call an
  exteral script (for reprovisioning)
 
  Well I guess I'd prefer a per-transaction setting, allowing to bypass
  WAL logging and checkpointing.
 
 Not going to happen; this is all or nothing.
 
  Forcing the backend to care itself for
  writing the data I'm not sure is a good thing, but if you say so.
 
 Yeah, I think proposal (c) is likely to be a net loss.
 
 (a) and (d) are probably simple, if by reprovisioning you mean
 rm -rf $PGDATA; initdb.  Point (b) will be a bit trickier because
 there are various housekeeping activities tied into checkpoints.
 I think you can't actually remove checkpoints altogether, just
 skip the flush-dirty-pages part.

Based on this thread, I have developed the following documentation patch
that outlines the performance enhancements possible if durability is not
required.  The patch also documents that synchronous_commit = false has
potential committed transaction loss from a database crash (as well as
an OS crash).

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.282
diff -c -c -r1.282 config.sgml
*** doc/src/sgml/config.sgml	22 Jun 2010 02:57:49 -	1.282
--- doc/src/sgml/config.sgml	23 Jun 2010 18:53:26 -
***
*** 1463,1469 
  really guaranteed to be safe against a server crash.  (The maximum
  delay is three times xref linkend=guc-wal-writer-delay.)  Unlike
  xref linkend=guc-fsync, setting this parameter to literaloff/
! does not create any risk of database inconsistency: a crash might
  result in some recent allegedly-committed transactions being lost, but
  the database state will be just the same as if those transactions had
  been aborted cleanly.  So, turning varnamesynchronous_commit/ off
--- 1463,1470 
  really guaranteed to be safe against a server crash.  (The maximum
  delay is three times xref linkend=guc-wal-writer-delay.)  Unlike
  xref linkend=guc-fsync, setting this parameter to literaloff/
! does not create any risk of database inconsistency: an operating
! system or database crash crash might
  result in some recent allegedly-committed transactions being lost, but
  the database state will be just the same as if those transactions had
  been aborted cleanly.  So, turning varnamesynchronous_commit/ off
Index: doc/src/sgml/perform.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v
retrieving revision 1.80
diff -c -c -r1.80 perform.sgml
*** doc/src/sgml/perform.sgml	29 May 2010 21:08:04 -	1.80
--- doc/src/sgml/perform.sgml	23 Jun 2010 18:53:26 -
***
*** 1104,1107 
--- 1104,1169 
/sect2
/sect1
  
+   sect1 id=non-durability
+titleNon-Durable Settings/title
+ 
+indexterm zone=non-durability
+ primarynon-durable/primary
+/indexterm
+ 
+para
+ Durability is a database feature that guarantees the recording of
+ committed transactions even if if the server crashes or loses
+ power.  However, durability adds significant database overhead,
+ so if your site does not require such a guarantee,
+ productnamePostgreSQL/productname can be configured to run
+ much faster.  The following are configuration changes you can make
+ to improve performance in such cases;  they do not invalidate
+ commit guarantees related to database crashes, only abrupt operating
+ system stoppage, except as mentioned below:
+ 
+ itemizedlist
+  listitem
+   para
+Place the database cluster's data directory in a memory-backed
+file system (i.e. acronymRAM/ disk).  This eliminates all
+database disk I/O, but limits data storage to the amount of
+available memory (and perhaps swap).
+   /para
+  /listitem
+ 
+  listitem
+   para
+Turn off xref linkend=guc-fsync;  there is no need to flush
+data to disk.
+   /para
+  /listitem
+ 
+  listitem
+   para
+Turn off xref linkend=guc-full-page-writes;  there is no need
+to guard against partial page writes.
+   /para
+  /listitem
+ 
+  listitem
+   para
+Increase xref linkend=guc-checkpoint-segments and xref
+linkend=guc-checkpoint-timeout ; this reduces the frequency
+of checkpoints, but increases the storage requirements of
+

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Pavel Stehule

2010/6/23 Bruce Momjian br...@momjian.us:
 Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Josh Berkus j...@agliodbs.com writes:
  a) Eliminate WAL logging entirely

 If we elimiate WAL logging, that means a reinstall is required for even
 a postmaster crash, which is a new non-durable behavior.

 Also, we just added wal_level = minimal, which might end up being a poor
 name choice of we want wal_level = off in PG 9.1.  Perhaps we should
 have used wal_level = crash_safe in 9.0.

 I have added the following TODO:

        Consider a non-crash-safe wal_level that eliminates WAL activity

            * 
 http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php

 --

isn't fsync to off enought?

Regards

Pavel

  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + None of us is going to be here forever. +

 --
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Bruce Momjian

Pavel Stehule wrote:
 2010/6/23 Bruce Momjian br...@momjian.us:
  Tom Lane wrote:
  Dimitri Fontaine dfonta...@hi-media.com writes:
   Josh Berkus j...@agliodbs.com writes:
   a) Eliminate WAL logging entirely
 
  If we elimiate WAL logging, that means a reinstall is required for even
  a postmaster crash, which is a new non-durable behavior.
 
  Also, we just added wal_level = minimal, which might end up being a poor
  name choice of we want wal_level = off in PG 9.1. ?Perhaps we should
  have used wal_level = crash_safe in 9.0.
 
  I have added the following TODO:
 
  ? ? ? ?Consider a non-crash-safe wal_level that eliminates WAL activity
 
  ? ? ? ? ? ?* 
  http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php
 
  --
 
 isn't fsync to off enought?

Well,  testing reported in the thread showed other settings also help,
though the checkpoint lengthening was not tested.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Robert Haas

On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Josh Berkus j...@agliodbs.com writes:
  a) Eliminate WAL logging entirely

 If we elimiate WAL logging, that means a reinstall is required for even
 a postmaster crash, which is a new non-durable behavior.

 Also, we just added wal_level = minimal, which might end up being a poor
 name choice of we want wal_level = off in PG 9.1.  Perhaps we should
 have used wal_level = crash_safe in 9.0.

 I have added the following TODO:

        Consider a non-crash-safe wal_level that eliminates WAL activity

            * 
 http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php

I don't think we need a system-wide setting for that.  I believe that
the unlogged tables I'm working on will handle that case.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Dave Page

On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote:
 Tom Lane wrote:
 Dimitri Fontaine dfonta...@hi-media.com writes:
  Josh Berkus j...@agliodbs.com writes:
  a) Eliminate WAL logging entirely

 If we elimiate WAL logging, that means a reinstall is required for even
 a postmaster crash, which is a new non-durable behavior.

 Also, we just added wal_level = minimal, which might end up being a poor
 name choice of we want wal_level = off in PG 9.1.  Perhaps we should
 have used wal_level = crash_safe in 9.0.

 I have added the following TODO:

        Consider a non-crash-safe wal_level that eliminates WAL activity

            * 
 http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php

 I don't think we need a system-wide setting for that.  I believe that
 the unlogged tables I'm working on will handle that case.

Aren't they going to be truncated at startup? If the entire system is
running without WAL, we would only need to do that in case of an
unclean shutdown wouldn't we?


-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Bruce Momjian

Robert Haas wrote:
 On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote:
  Tom Lane wrote:
  Dimitri Fontaine dfonta...@hi-media.com writes:
   Josh Berkus j...@agliodbs.com writes:
   a) Eliminate WAL logging entirely
 
  If we elimiate WAL logging, that means a reinstall is required for even
  a postmaster crash, which is a new non-durable behavior.
 
  Also, we just added wal_level = minimal, which might end up being a poor
  name choice of we want wal_level = off in PG 9.1. ?Perhaps we should
  have used wal_level = crash_safe in 9.0.
 
  I have added the following TODO:
 
  ? ? ? ?Consider a non-crash-safe wal_level that eliminates WAL activity
 
  ? ? ? ? ? ?* 
  http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php
 
 I don't think we need a system-wide setting for that.  I believe that
 the unlogged tables I'm working on will handle that case.

Uh, will we have some global unlogged setting, like for the system
tables and stuff?  It seems like an heavy burden to tell people they
have to create ever object as unlogged, and we would still generate log
for things like transaction commits.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Tom Lane

Dave Page dp...@pgadmin.org writes:
 On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote:
 I don't think we need a system-wide setting for that.  I believe that
 the unlogged tables I'm working on will handle that case.

 Aren't they going to be truncated at startup? If the entire system is
 running without WAL, we would only need to do that in case of an
 unclean shutdown wouldn't we?

The problem with a system-wide no-WAL setting is it means you can't
trust the system catalogs after a crash.  Which means you are forced to
use initdb to recover from any crash, in return for not a lot of savings
(for typical usages where there's not really much churn in the
catalogs).  I tend to agree with Robert that a way to not log content
updates for individual user tables is likely to be much more useful in
practice.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-23 Thread Bruce Momjian

Tom Lane wrote:
 Dave Page dp...@pgadmin.org writes:
  On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote:
  I don't think we need a system-wide setting for that. ?I believe that
  the unlogged tables I'm working on will handle that case.
 
  Aren't they going to be truncated at startup? If the entire system is
  running without WAL, we would only need to do that in case of an
  unclean shutdown wouldn't we?
 
 The problem with a system-wide no-WAL setting is it means you can't
 trust the system catalogs after a crash.  Which means you are forced to

True, and in fact any postmaster crash could lead to curruption.

 use initdb to recover from any crash, in return for not a lot of savings
 (for typical usages where there's not really much churn in the
 catalogs).  I tend to agree with Robert that a way to not log content
 updates for individual user tables is likely to be much more useful in
 practice.

OK, TODO removed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + None of us is going to be here forever. +

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-21 Thread Robert Haas

On Thu, Jun 17, 2010 at 1:29 PM, Josh Berkus j...@agliodbs.com wrote:
 a) Eliminate WAL logging entirely

In addition to global temporary tables, I am also planning to
implement unlogged tables, which are, precisely, tables for which no
WAL is written.  On restart, any such tables will be truncated.  That
should give you the ability to do this (by making all your tables
unlogged).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-18 Thread Matthew Wakeling


Dimitri Fontaine wrote:

Well I guess I'd prefer a per-transaction setting


Not possible, as many others have said. As soon as you make an unsafe 
transaction, all the other transactions have nothing to rely on.


On Thu, 17 Jun 2010, Pierre C wrote:
A per-table (or per-index) setting makes more sense IMHO. For instance on 
recovery, truncate this table (this was mentioned before).


That would be much more valuable.

I'd like to point out the costs involved in having a whole separate 
version of Postgres that has all this safety switched off. Package 
managers will not thank anyone for having to distribute another version of 
the system, and woe betide the user who installs the wrong version because 
it runs faster. No, this is much better as a configurable option.


Going back to the on recovery, truncate this table. We already have a 
mechanism for skipping the WAL writes on an entire table - we do that for 
tables that have been created in the current transaction. It would surely 
be a small step to allow this to be configurably permanent on a particular 
table.


Moreover, we already have a mechanism for taking a table that has had 
non-logged changes, and turning it into a fully logged table - we do that 
to the above mentioned tables when the transaction commits. I would 
strongly recommend providing an option to ALTER TABLE MAKE SAFE, which may 
involve some more acrobatics if the table is currently in use by multiple 
transactions, but would be valuable.


This would allow users to create temporary tables that can be shared by 
several connections. It would also allow bulk loading in parallel of a 
single large table.


With these suggestions, we would still need to WAL-log all the metadata 
changes, but I think in most circumstances that is not going to be a large 
burden on performance.


Matthew

--
Picard: I was just paid a visit from Q.
Riker:  Q! Any idea what he's up to?
Picard: No. He said he wanted to be nice to me.
Riker:  I'll alert the crew.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-18 Thread Pierre C



I'd like to point out the costs involved in having a whole separate  
version


It must be a setting, not a version.

For instance suppose you have a session table for your website and a users  
table.


- Having ACID on the users table is of course a must ;
- for the sessions table you can drop the D

Server crash would force all users to re-login on your website but if your  
server crashes enough that your users complain about that, you have  
another problem anyway. Having the sessions table not WAL-logged (ie  
faster) would not prevent you from having sessions.user_id REFERENCES  
users( user_id ) ... so mixing safe and unsafe tables would be much more  
powerful than just having unsafe tables.


And I really like the idea of non-WAL-logged indexes, too, since they can  
be rebuilt as needed, the DBA could decide between faster index updates  
but rebuild on crash, or normal updates and fast recovery.


Also materialized views etc, you can rebuild them on crash and the added  
update speed would be good.


Moreover, we already have a mechanism for taking a table that has had  
non-logged changes, and turning it into a fully logged table - we do  
that to the above mentioned tables when the transaction commits. I would  
strongly recommend providing an option to ALTER TABLE MAKE SAFE, which  
may involve some more acrobatics if the table is currently in use by  
multiple transactions, but would be valuable.


I believe the old discussions called this ALTER TABLE SET PERSISTENCE.

This would allow users to create temporary tables that can be shared  
by several connections. It would also allow bulk loading in parallel of  
a single large table.


This would need to WAL-log the entire table to send it to the slaves if  
replication is enabled, but it's a lot faster than replicating each record.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-18 Thread Josh Berkus


 It must be a setting, not a version.
 
 For instance suppose you have a session table for your website and a
 users table.
 
 - Having ACID on the users table is of course a must ;
 - for the sessions table you can drop the D

You're trying to solve a different use-case than the one I am.

Your use-case will be solved by global temporary tables.  I suggest that
you give Robert Haas some help  feedback on that.

My use case is people using PostgreSQL as a cache, or relying entirely
on replication for durability.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-18 Thread Josh Berkus

On 6/18/10 2:15 AM, Matthew Wakeling wrote:
 I'd like to point out the costs involved in having a whole separate
 version of Postgres that has all this safety switched off. Package
 managers will not thank anyone for having to distribute another version
 of the system, and woe betide the user who installs the wrong version
 because it runs faster. No, this is much better as a configurable option.

Agreed, although initial alphas of this concept are likely to in fact be
a separate source code tree.  Eventually when we have it working well it
could become an initdb-time option.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Josh Berkus


All,

So, I've been discussing this because using PostgreSQL on the caching 
layer has become more common that I think most people realize.  Jonathan 
is one of 4 companies I know of who are doing this, and with the growth 
of Hadoop and other large-scale data-processing technologies, I think 
demand will increase.


Especially as, in repeated tests, PostgreSQL with persistence turned off 
is just as fast as the fastest nondurable NoSQL database.  And it has a 
LOT more features.


Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for 
durability, they don't eliminate the CPU time.  Which means that a 
caching version of PostgreSQL could be even faster.   To do that, we'd 
need to:


a) Eliminate WAL logging entirely
b) Eliminate checkpointing
c) Turn off the background writer
d) Have PostgreSQL refuse to restart after a crash and instead call an 
exteral script (for reprovisioning)


Of the three above, (a) is the most difficult codewise.  (b)(c) and (d) 
should be relatively straightforwards, although I believe that we now 
have the bgwriter doing some other essential work besides syncing 
buffers.  There's also a narrower use-case in eliminating (a), since a 
non-fsync'd server which was recording WAL could be used as part of a 
replication chain.


This isn't on hackers because I'm not ready to start working on a patch, 
but I'd like some feedback on the complexities of doing (b) and (c) as 
well as how many people could use a non-persistant, in-memory postgres.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Pierre C



Especially as, in repeated tests, PostgreSQL with persistence turned off  
is just as fast as the fastest nondurable NoSQL database.  And it has a  
LOT more features.


An option to completely disable WAL for such use cases would make it a lot  
faster, especially in the case of heavy concurrent writes.


Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for  
durability, they don't eliminate the CPU time.


Actually the WAL overhead is some CPU and lots of locking.

Which means that a caching version of PostgreSQL could be even faster.
To do that, we'd need to:


a) Eliminate WAL logging entirely
b) Eliminate checkpointing
c) Turn off the background writer
d) Have PostgreSQL refuse to restart after a crash and instead call an  
exteral script (for reprovisioning)


Of the three above, (a) is the most difficult codewise.


Actually, it's pretty easy, look in xlog.c


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Dimitri Fontaine

Hi,

Josh Berkus j...@agliodbs.com writes:
 a) Eliminate WAL logging entirely
 b) Eliminate checkpointing
 c) Turn off the background writer
 d) Have PostgreSQL refuse to restart after a crash and instead call an
 exteral script (for reprovisioning)

Well I guess I'd prefer a per-transaction setting, allowing to bypass
WAL logging and checkpointing. Forcing the backend to care itself for
writing the data I'm not sure is a good thing, but if you say so.

Then you could have the GUC set for a whole cluster, only a database
etc. We already have synchronous_commit to trade durability against
performances, we could maybe support protect_data = off too.

The d) point I'm not sure still applies if you have per transaction
setting, which I think makes the most sense. The data you choose not to
protect is missing at restart, just add some way to register a hook
there. We already have one (shared_preload_libraries) but it requires
coding in C. 

Calling a user function at the end of recovery and before accepting
connection would be good I think. A user function (per database) is
better than a script because if you want to run it before accepting
connections and still cause changes in the database…

Regards,
-- 
dim

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Tom Lane

Dimitri Fontaine dfonta...@hi-media.com writes:
 Josh Berkus j...@agliodbs.com writes:
 a) Eliminate WAL logging entirely
 b) Eliminate checkpointing
 c) Turn off the background writer
 d) Have PostgreSQL refuse to restart after a crash and instead call an
 exteral script (for reprovisioning)

 Well I guess I'd prefer a per-transaction setting, allowing to bypass
 WAL logging and checkpointing.

Not going to happen; this is all or nothing.

 Forcing the backend to care itself for
 writing the data I'm not sure is a good thing, but if you say so.

Yeah, I think proposal (c) is likely to be a net loss.

(a) and (d) are probably simple, if by reprovisioning you mean
rm -rf $PGDATA; initdb.  Point (b) will be a bit trickier because
there are various housekeeping activities tied into checkpoints.
I think you can't actually remove checkpoints altogether, just
skip the flush-dirty-pages part.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Greg Smith


Josh Berkus wrote:

a) Eliminate WAL logging entirely
c) Turn off the background writer


Note that if you turn off full_page_writes and set 
bgwriter_lru_maxpages=0, you'd get a substantial move in both these 
directions without touching any code.  Would help prove those as useful 
directions to move toward or not.  The difference in WAL writes just 
after a checkpoint in particular, due to the full_page_writes behavior, 
is a significant portion of total WAL activity on most systems.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Pierre C




Well I guess I'd prefer a per-transaction setting, allowing to bypass
WAL logging and checkpointing. Forcing the backend to care itself for
writing the data I'm not sure is a good thing, but if you say so.


Well if the transaction touches a system catalog it better be WAL-logged...

A per-table (or per-index) setting makes more sense IMHO. For instance on  
recovery, truncate this table (this was mentioned before).
Another option would be make the table data safe, but on recovery,  
destroy and rebuild this index : because on a not so large, often updated  
table, with often updated indexes, it may not take long to rebuild the  
indexes, but all those wal-logged index updates do add some overhead.



--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Josh Berkus


 Well I guess I'd prefer a per-transaction setting, allowing to bypass
 WAL logging and checkpointing. 

Not even conceiveable.  For this to work, we're talking about the whole
database installation.  This is only a set of settings for a database
*server* which is considered disposable and replaceable, where if it
shuts down unexpectedly, you throw it away and replace it.

 Forcing the backend to care itself for
 writing the data I'm not sure is a good thing, but if you say so.

Oh, yeah, I guess we'd only be turning off the LRU cache operations of
the background writer.  Same with checkpoints.  Copying between
shared_buffers and the LRU cache would still happen.

 Calling a user function at the end of recovery and before accepting
 connection would be good I think. A user function (per database) is
 better than a script because if you want to run it before accepting
 connections and still cause changes in the database…

Hmmm, you're not quite following my idea.  There is no recovery.  If the
database shuts down unexpectedly, it's toast and you replace it from
another copy somewhere else.

 (a) and (d) are probably simple, if by reprovisioning you mean
 rm -rf $PGDATA; initdb.

Exactly.  Followed by scp database_image.  Or heck, just replacing the
whole VM.

 Point (b) will be a bit trickier because
 there are various housekeeping activities tied into checkpoints.
 I think you can't actually remove checkpoints altogether, just
 skip the flush-dirty-pages part.

Yes, and we'd want to flush dirty pages on an actual shutdown command.
We do want to be able to shut down the DB on purpose.

 Well if the transaction touches a system catalog it better be
 WAL-logged...

Given the above, why?


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-17 Thread Tom Lane

Josh Berkus j...@agliodbs.com writes:
 (a) and (d) are probably simple, if by reprovisioning you mean
 rm -rf $PGDATA; initdb.

 Exactly.  Followed by scp database_image.  Or heck, just replacing the
 whole VM.

Right, that would work.  I don't think you really need to implement that
inside Postgres.  I would envision having the startup script do it, ie

rm -rf $PGDATA
cp -pr prepared-database-image $PGDATA

# this loop exits when postmaster exits normally
while ! postmaster ...
do
rm -rf $PGDATA
cp -pr prepared-database-image $PGDATA
done

Then all you need is a tweak to make the postmaster exit(1) after
a crash instead of trying to launch recovery.

regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net

On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote:
 On 6/15/10 10:37 AM, Chris Browne wrote:

 I'd like to see some figures about WAL on RAMfs vs. simply turning off
 fsync and full_page_writes.  Per Gavin's tests, PostgreSQL is already
 close to TokyoCabinet/MongoDB performance just with those turned off; I
 wonder if actually having the WAL on a memory partition would make any
 real difference in throughput.

 I've seen a lot of call for this recently, especially since PostgreSQL
 seems to be increasingly in use as a reporting server for Hadoop.  Might
 be worth experimenting with just making wal writing a no-op.  We'd also
 want to disable checkpointing, of course.


My back-of-the-envelope experiment: Inserting single integers into a
table without indexes using a prepared query via psycopg2.

Python Script:
import psycopg2
from time import time
conn = psycopg2.connect(database='jgardner')
cursor = conn.cursor()
cursor.execute(CREATE TABLE test (data int not null))
conn.commit()
cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1))
conn.commit()
start = time()
tx = 0
while time() - start  1.0:
cursor.execute(EXECUTE ins(%s), (tx,));
conn.commit()
tx += 1
print tx
cursor.execute(DROP TABLE test);
conn.commit();

Local disk, WAL on same FS:
* Default config = 90
* full_page_writes=off = 90
* synchronous_commit=off = 4,500
* fsync=off = 5,100
* fsync=off and synchronous_commit=off = 5,500
* fsync=off and full_page_writes=off = 5,150
* fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

tmpfs, WAL on same tmpfs:
* Default config: 5,200
* full_page_writes=off = 5,200
* fsync=off = 5,250
* synchronous_commit=off = 5,200
* fsync=off and synchronous_commit=off = 5,450
* fsync=off and full_page_writes=off = 5,250
* fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

NOTE: If I do one giant commit instead of lots of littler ones, I get
much better speeds for the slower cases, but I never exceed 5,500
which appears to be some kind of wall I can't break through.

If there's anything else I should tinker with, I'm all ears.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Mark Kirkwood


On 16/06/10 18:30, jgard...@jonathangardner.net wrote:

On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote:
   

On 6/15/10 10:37 AM, Chris Browne wrote:

I'd like to see some figures about WAL on RAMfs vs. simply turning off
fsync and full_page_writes.  Per Gavin's tests, PostgreSQL is already
close to TokyoCabinet/MongoDB performance just with those turned off; I
wonder if actually having the WAL on a memory partition would make any
real difference in throughput.

I've seen a lot of call for this recently, especially since PostgreSQL
seems to be increasingly in use as a reporting server for Hadoop.  Might
be worth experimenting with just making wal writing a no-op.  We'd also
want to disable checkpointing, of course.

 

My back-of-the-envelope experiment: Inserting single integers into a
table without indexes using a prepared query via psycopg2.

Python Script:
import psycopg2
from time import time
conn = psycopg2.connect(database='jgardner')
cursor = conn.cursor()
cursor.execute(CREATE TABLE test (data int not null))
conn.commit()
cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1))
conn.commit()
start = time()
tx = 0
while time() - start  1.0:
 cursor.execute(EXECUTE ins(%s), (tx,));
 conn.commit()
 tx += 1
print tx
cursor.execute(DROP TABLE test);
conn.commit();

Local disk, WAL on same FS:
* Default config =  90
* full_page_writes=off =  90
* synchronous_commit=off =  4,500
* fsync=off =  5,100
* fsync=off and synchronous_commit=off =  5,500
* fsync=off and full_page_writes=off =  5,150
* fsync=off, synchronous_commit=off and full_page_writes=off =  5,500

tmpfs, WAL on same tmpfs:
* Default config: 5,200
* full_page_writes=off =  5,200
* fsync=off =  5,250
* synchronous_commit=off =  5,200
* fsync=off and synchronous_commit=off =  5,450
* fsync=off and full_page_writes=off =  5,250
* fsync=off, synchronous_commit=off and full_page_writes=off =  5,500

NOTE: If I do one giant commit instead of lots of littler ones, I get
much better speeds for the slower cases, but I never exceed 5,500
which appears to be some kind of wall I can't break through.

If there's anything else I should tinker with, I'm all ears.

   


Seeing some profiler output (e.g oprofile) for the fastest case (and 
maybe 'em all later) might be informative about what limit is being hit 
here.


regards

Mark

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C




Have you tried connecting using a UNIX socket instead of a TCP socket on
localhost ? On such very short queries, the TCP overhead is significant.


Actually UNIX sockets are the default for psycopg2, had forgotten that.

I get 7400 using UNIX sockets and 3000 using TCP (host=localhost)

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Greg Smith


jgard...@jonathangardner.net wrote:

NOTE: If I do one giant commit instead of lots of littler ones, I get
much better speeds for the slower cases, but I never exceed 5,500
which appears to be some kind of wall I can't break through.
  


That's usually about where I run into the upper limit on how many 
statements Python can execute against the database per second.  Between 
that and the GIL preventing better multi-core use, once you pull the 
disk out and get CPU bound it's hard to use Python for load testing of 
small statements and bottleneck anywhere except in Python itself.


I normally just write little performance test cases in the pgbench 
scripting language, then I get multiple clients and (in 9.0) multiple 
driver threads all for free.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C



FYI I've tweaked this program a bit :

import psycopg2
from time import time
conn = psycopg2.connect(database='peufeu')
cursor = conn.cursor()
cursor.execute(CREATE TEMPORARY TABLE test (data int not null))
conn.commit()
cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1))
cursor.execute(PREPARE sel AS SELECT 1)
conn.commit()
start = time()
tx = 0
N = 100
d = 0
while d  10:
for n in xrange( N ):
cursor.execute(EXECUTE ins(%s), (tx,));
#~ conn.commit()
#~ cursor.execute(EXECUTE sel );
conn.commit()
d = time() - start
tx += N
print result : %d tps % (tx / d)
cursor.execute(DROP TABLE test);
conn.commit();

Results (Core 2 quad, ubuntu 10.04 64 bits) :

SELECT 1 : 21000 queries/s (I'd say 50 us per query isn't bad !)
INSERT with commit every 100 inserts : 17800 insets/s
INSERT with commit every INSERT : 7650 tps

fsync is on but not synchronous_commit.


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Alvaro Herrera

Excerpts from jgard...@jonathangardner.net's message of mié jun 16 02:30:30 
-0400 2010:

 NOTE: If I do one giant commit instead of lots of littler ones, I get
 much better speeds for the slower cases, but I never exceed 5,500
 which appears to be some kind of wall I can't break through.
 
 If there's anything else I should tinker with, I'm all ears.

increase wal_buffers?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Josh Berkus


 * fsync=off = 5,100
 * fsync=off and synchronous_commit=off = 5,500

Now, this *is* interesting ... why should synch_commit make a difference
if fsync is off?

Anyone have any ideas?

 tmpfs, WAL on same tmpfs:
 * Default config: 5,200
 * full_page_writes=off = 5,200
 * fsync=off = 5,250
 * synchronous_commit=off = 5,200
 * fsync=off and synchronous_commit=off = 5,450
 * fsync=off and full_page_writes=off = 5,250
 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500

So, in this test, it seems like having WAL on tmpfs doesn't make a
significant difference for everything == off.

I'll try running some tests on Amazon when I have a chance.  It would be
worthwhile to get figures without Python's ceiling.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner

On Wed, Jun 16, 2010 at 1:27 AM, Greg Smith g...@2ndquadrant.com wrote:

 I normally just write little performance test cases in the pgbench scripting
 language, then I get multiple clients and (in 9.0) multiple driver threads
 all for free.


See, this is why I love these mailing lists. I totally forgot about
pgbench. I'm going to dump my cheesy python script and play with that
for a while.

-- 
Jonathan Gardner
jgard...@jonathangardner.net

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner

On Wed, Jun 16, 2010 at 4:22 AM, Pierre C li...@peufeu.com wrote:

 import psycopg2
 from time import time
 conn = psycopg2.connect(database='peufeu')
 cursor = conn.cursor()
 cursor.execute(CREATE TEMPORARY TABLE test (data int not null))
 conn.commit()
 cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1))
 cursor.execute(PREPARE sel AS SELECT 1)
 conn.commit()
 start = time()
 tx = 0
 N = 100
 d = 0
 while d  10:
        for n in xrange( N ):
                cursor.execute(EXECUTE ins(%s), (tx,));
                #~ conn.commit()
                #~ cursor.execute(EXECUTE sel );
        conn.commit()
        d = time() - start
        tx += N
 print result : %d tps % (tx / d)
 cursor.execute(DROP TABLE test);
 conn.commit();


I'm not surprised that Python add is so slow, but I am surprised that
I didn't remember it was... ;-)

-- 
Jonathan Gardner
jgard...@jonathangardner.net

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner

On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus j...@agliodbs.com wrote:

 * fsync=off = 5,100
 * fsync=off and synchronous_commit=off = 5,500

 Now, this *is* interesting ... why should synch_commit make a difference
 if fsync is off?

 Anyone have any ideas?


I may have stumbled upon this by my ignorance, but I thought I read
that synchronous_commit controlled whether it tries to line up commits
or has a more free-for-all that may cause some intermediate weirdness.

-- 
Jonathan Gardner
jgard...@jonathangardner.net

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Balkrishna Sharma

http://www.postgresql.org/docs/current/static/wal-async-commit.html
 the server waits for the transaction's WAL records to be flushed to permanent 
storage before returning a success indication to the client.
I think with fynch=off, whether WAL gets written to disk or not is still 
controlled by synchronous_commit parameter. guessing here...

 Date: Wed, 16 Jun 2010 12:19:20 -0700
 Subject: Re: [PERFORM] PostgreSQL as a local in-memory cache
 From: jgard...@jonathangardner.net
 To: j...@agliodbs.com
 CC: pgsql-performance@postgresql.org

 On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus j...@agliodbs.com wrote:

  * fsync=off = 5,100
  * fsync=off and synchronous_commit=off = 5,500

  Now, this *is* interesting ... why should synch_commit make a difference
  if fsync is off?

  Anyone have any ideas?

 I may have stumbled upon this by my ignorance, but I thought I read
 that synchronous_commit controlled whether it tries to line up commits
 or has a more free-for-all that may cause some intermediate weirdness.

 -- 
 Jonathan Gardner
 jgard...@jonathangardner.net

 -- 
 Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-performance

_
The New Busy is not the old busy. Search, chat and e-mail from your inbox.
http://www.windowslive.com/campaign/thenewbusy?ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_3

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Jonathan Gardner

On Wed, Jun 16, 2010 at 12:51 AM, Pierre C li...@peufeu.com wrote:

 Have you tried connecting using a UNIX socket instead of a TCP socket on
 localhost ? On such very short queries, the TCP overhead is significant.


Unfortunately, this isn't an option for my use case. Carbonado only
supports TCP connections.


-- 
Jonathan Gardner
jgard...@jonathangardner.net

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Craig James


On 6/16/10 12:00 PM, Josh Berkus wrote:



* fsync=off =  5,100
* fsync=off and synchronous_commit=off =  5,500


Now, this *is* interesting ... why should synch_commit make a difference
if fsync is off?

Anyone have any ideas?


I found that pgbench has noise of about 20% (I posted about this a couple 
days ago using data from 1000 identical pgbench runs).  Unless you make a bunch of runs 
and average them, a difference of 5,100 to 5,500 appears to be meaningless.

Craig




tmpfs, WAL on same tmpfs:
* Default config: 5,200
* full_page_writes=off =  5,200
* fsync=off =  5,250
* synchronous_commit=off =  5,200
* fsync=off and synchronous_commit=off =  5,450
* fsync=off and full_page_writes=off =  5,250
* fsync=off, synchronous_commit=off and full_page_writes=off =  5,500


So, in this test, it seems like having WAL on tmpfs doesn't make a
significant difference for everything == off.

I'll try running some tests on Amazon when I have a chance.  It would be
worthwhile to get figures without Python's ceiling.




--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread Pierre C




I'm not surprised that Python add is so slow, but I am surprised that
I didn't remember it was... ;-)


it's not the add(), it's the time.time()...


--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net

We have a fairly unique need for a local, in-memory cache. This will
store data aggregated from other sources. Generating the data only
takes a few minutes, and it is updated often. There will be some
fairly expensive queries of arbitrary complexity run at a fairly high
rate. We're looking for high concurrency and reasonable performance
throughout.

The entire data set is roughly 20 MB in size. We've tried Carbonado in
front of SleepycatJE only to discover that it chokes at a fairly low
concurrency and that Carbonado's rule-based optimizer is wholly
insufficient for our needs. We've also tried Carbonado's Map
Repository which suffers the same problems.

I've since moved the backend database to a local PostgreSQL instance
hoping to take advantage of PostgreSQL's superior performance at high
concurrency. Of course, at the default settings, it performs quite
poorly compares to the Map Repository and Sleepycat JE.

My question is how can I configure the database to run as quickly as
possible if I don't care about data consistency or durability? That
is, the data is updated so often and it can be reproduced fairly
rapidly so that if there is a server crash or random particles from
space mess up memory we'd just restart the machine and move on.

I've never configured PostgreSQL to work like this and I thought maybe
someone here had some ideas on a good approach to this.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-16 Thread jgard...@jonathangardner.net

On Jun 14, 7:14 pm, jgard...@jonathangardner.net
jgard...@jonathangardner.net wrote:
 We have a fairly unique need for a local, in-memory cache. This will
 store data aggregated from other sources. Generating the data only
 takes a few minutes, and it is updated often. There will be some
 fairly expensive queries of arbitrary complexity run at a fairly high
 rate. We're looking for high concurrency and reasonable performance
 throughout.

 The entire data set is roughly 20 MB in size. We've tried Carbonado in
 front of SleepycatJE only to discover that it chokes at a fairly low
 concurrency and that Carbonado's rule-based optimizer is wholly
 insufficient for our needs. We've also tried Carbonado's Map
 Repository which suffers the same problems.

 I've since moved the backend database to a local PostgreSQL instance
 hoping to take advantage of PostgreSQL's superior performance at high
 concurrency. Of course, at the default settings, it performs quite
 poorly compares to the Map Repository and Sleepycat JE.

 My question is how can I configure the database to run as quickly as
 possible if I don't care about data consistency or durability? That
 is, the data is updated so often and it can be reproduced fairly
 rapidly so that if there is a server crash or random particles from
 space mess up memory we'd just restart the machine and move on.

 I've never configured PostgreSQL to work like this and I thought maybe
 someone here had some ideas on a good approach to this.

Just to summarize what I've been able to accomplish so far. By turning
fsync and synchronize_commit off, and moving the data dir to tmpfs,
I've been able to run the expensive queries much faster than BDB or
the MapRepository that comes with Carbonado. This is because
PostgreSQL's planner is so much faster and better than whatever
Carbonado has. Tweaking indexes has only made things run faster.

Right now I'm wrapping up the project so that we can do some serious
performance benchmarks. I'll let you all know how it goes.

Also, just a note that setting up PostgreSQL for these weird scenarios
turned out to be just a tiny bit harder than setting up SQLite. I
remember several years ago when there was a push to simplify the
configuration and installation of PostgreSQL, and I believe that that
has born fruit.

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread Steve Wampler


Chris Browne wrote:

jgard...@jonathangardner.net jgard...@jonathangardner.net writes:

My question is how can I configure the database to run as quickly as
possible if I don't care about data consistency or durability? That
is, the data is updated so often and it can be reproduced fairly
rapidly so that if there is a server crash or random particles from
space mess up memory we'd just restart the machine and move on.


For such a scenario, I'd suggest you:

- Set up a filesystem that is memory-backed.  On Linux, RamFS or TmpFS
  are reasonable options for this.

- The complication would be that your restart the machine and move
  on needs to consist of quite a few steps:

  - recreating the filesystem
  - fixing permissions as needed
  - running initdb to set up new PG instance
  - automating any needful fiddling with postgresql.conf, pg_hba.conf
  - starting up that PG instance
  - creating users, databases, schemas, ...


Doesn't PG now support putting both WAL and user table files onto
file systems other than the one holding the PG config files and PG
'admin' tables?  Wouldn't doing so simplify the above considertably
by allowing just the WAL and user tables on the memory-backed file
systems?  I wouldn't think the performance impact of leaving
the rest of the stuff on disk would be that large.

Or does losing WAL files mandate a new initdb?

--
Steve Wampler -- swamp...@noao.edu
The gods that smiled on your birth are now laughing out loud.

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread Craig James


[oops, didn't hit reply to list first time, resending...]

On 6/15/10 9:02 AM, Steve Wampler wrote:

Chris Browne wrote:

jgard...@jonathangardner.net jgard...@jonathangardner.net writes:

My question is how can I configure the database to run as quickly as
possible if I don't care about data consistency or durability? That
is, the data is updated so often and it can be reproduced fairly
rapidly so that if there is a server crash or random particles from
space mess up memory we'd just restart the machine and move on.


For such a scenario, I'd suggest you:

- Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS
are reasonable options for this.

- The complication would be that your restart the machine and move
on needs to consist of quite a few steps:

- recreating the filesystem
- fixing permissions as needed
- running initdb to set up new PG instance
- automating any needful fiddling with postgresql.conf, pg_hba.conf
- starting up that PG instance
- creating users, databases, schemas, ...


How about this: Set up a database entirely on a RAM disk, then install a 
WAL-logging warm standby.  If the production computer goes down, you bring the 
warm standby online, shut it down, and use tar(1) to recreate the database on 
the production server when you bring it back online.  You have speed and you 
have near-100% backup.

Craig

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread jgard...@jonathangardner.net

On Jun 15, 8:47 am, Chris Browne cbbro...@acm.org wrote:
 jgard...@jonathangardner.net jgard...@jonathangardner.net writes:
  My question is how can I configure the database to run as quickly as
  possible if I don't care about data consistency or durability? That
  is, the data is updated so often and it can be reproduced fairly
  rapidly so that if there is a server crash or random particles from
  space mess up memory we'd just restart the machine and move on.

 For such a scenario, I'd suggest you:

 - Set up a filesystem that is memory-backed.  On Linux, RamFS or TmpFS
   are reasonable options for this.


I had forgotten about this. I will try this out.

 - The complication would be that your restart the machine and move
   on needs to consist of quite a few steps:

   - recreating the filesystem
   - fixing permissions as needed
   - running initdb to set up new PG instance
   - automating any needful fiddling with postgresql.conf, pg_hba.conf
   - starting up that PG instance
   - creating users, databases, schemas, ...


I'm going to have a system in place to create these databases when I
restart the service.

 ...

 I wonder if this kind of installation comes into its own for more
 realistic scenarios in the presence of streaming replication.  If you
 know the WAL files have gotten to disk on another server, that's a
 pretty good guarantee :-).


I have found that pre-computing and storing values in a general
relational-type database without durability is an ideal use case to
help improve services that need to return calculated results quickly.
A simple hash lookup is no longer sufficient. Perhaps PostgreSQL
running in this mode will be the ideal solution.

Nowadays, no one is really surprised that it takes 30 seconds or so to
replicate your data everywhere, but they do detest not getting answers
to their complicated queries immediately.


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread Chris Browne

swamp...@noao.edu (Steve Wampler) writes:
 Or does losing WAL files mandate a new initdb?

Losing WAL would mandate initdb, so I'd think this all fits into the
set of stuff worth putting onto ramfs/tmpfs.  Certainly it'll all be
significant to the performance focus.
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/internet.html
MS  apparently now  has a  team dedicated  to tracking  problems with
Linux  and publicizing them.   I guess  eventually they'll  figure out
this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread Jaime Casanova

On Tue, Jun 15, 2010 at 12:37 PM, Chris Browne cbbro...@acm.org wrote:
 swamp...@noao.edu (Steve Wampler) writes:
 Or does losing WAL files mandate a new initdb?

 Losing WAL would mandate initdb, so I'd think this all fits into the
 set of stuff worth putting onto ramfs/tmpfs.  Certainly it'll all be
 significant to the performance focus.

why is that? isn't simply execute pg_resetxlog enough? specially
'cause OP doesn't care about loosing some transactions

-- 
Jaime Casanova www.2ndQuadrant.com
Soporte y capacitación de PostgreSQL

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

Re: [PERFORM] PostgreSQL as a local in-memory cache

2010-06-15 Thread Josh Berkus

On 6/15/10 10:37 AM, Chris Browne wrote:
 swamp...@noao.edu (Steve Wampler) writes:
 Or does losing WAL files mandate a new initdb?
 
 Losing WAL would mandate initdb, so I'd think this all fits into the
 set of stuff worth putting onto ramfs/tmpfs.  Certainly it'll all be
 significant to the performance focus.

I'd like to see some figures about WAL on RAMfs vs. simply turning off
fsync and full_page_writes.  Per Gavin's tests, PostgreSQL is already
close to TokyoCabinet/MongoDB performance just with those turned off; I
wonder if actually having the WAL on a memory partition would make any
real difference in throughput.

I've seen a lot of call for this recently, especially since PostgreSQL
seems to be increasingly in use as a reporting server for Hadoop.  Might
be worth experimenting with just making wal writing a no-op.  We'd also
want to disable checkpointing, of course.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

71 matches

Mail list logo