Re: [PERFORM] PostgreSQL as a local in-memory cache
On Tue, 2010-06-29 at 21:39 -0400, Bruce Momjian wrote: Jignesh Shah wrote: On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. ?Thanks. I am personally surprised it was designed that way; ?I thought we would just delay fsync. That would require writing and syncing to be separable actions. ?If you're using O_SYNC or similar, they aren't. Ah, very good point. ?I have added a C comment to clarify why this is the current behavior; ?attached and applied. -- ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com Though has anybody seen a behaviour where synchronous_commit=off is slower than synchronous_commit=on ? Again there are two cases here one with O_* flag and other with f*sync flags. But I had seen that behavior with PostgreSQL 9.0 beta(2 I think) though havent really investigated it much yet .. (though now I dont remember which wal_sync_method flag) . Just curious if anybody has seen that behavior.. I have trouble believing how synchronous_commit=off could be slower than 'on'. I wonder if it could be contention on wal buffers? Say I've turned synchronous_commit off, I drive enough traffic fill up my wal_buffers. I assume that we would have to start writing buffers down to disk before allocating to the new process. -- Brad Nicholson 416-673-4106 Database Administrator, Afilias Canada Corp. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Brad Nicholson wrote: Ah, very good point. ?I have added a C comment to clarify why this is the current behavior; ?attached and applied. -- ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com Though has anybody seen a behaviour where synchronous_commit=off is slower than synchronous_commit=on ? Again there are two cases here one with O_* flag and other with f*sync flags. But I had seen that behavior with PostgreSQL 9.0 beta(2 I think) though havent really investigated it much yet .. (though now I dont remember which wal_sync_method flag) . Just curious if anybody has seen that behavior.. I have trouble believing how synchronous_commit=off could be slower than 'on'. I wonder if it could be contention on wal buffers? Say I've turned synchronous_commit off, I drive enough traffic fill up my wal_buffers. I assume that we would have to start writing buffers down to disk before allocating to the new process. Uh, good question. I know this report showed ynchronous_commit=off as faster than 'on': http://archives.postgresql.org/pgsql-performance/2010-06/msg00277.php -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
I haven't jumped in yet on this thread, but here goes If you're really looking for query performance, then any database which is designed with reliability and ACID consistency in mind is going to inherently have some mis-fit features. Some other ideas to consider, depending on your query mix: 1. MySQL with the MyISAM database (non-ACID) 2. Put an in-application generic query cache in front of the DB, that runs in the app address space, e.g. Cache' if using Java 3. Using a DB is a good way to get generic querying capability, but if the where clause in the querying is over a small set of meta-data, and SQL syntax is not a big requirement, consider non-RDBMS alternatives, e.g. use XPath over a W3C DOM object tree to get primary keys to in-memory hash tables (possibly distributed with something like memcached) On Mon, Jun 14, 2010 at 9:14 PM, jgard...@jonathangardner.net jgard...@jonathangardner.net wrote: We have a fairly unique need for a local, in-memory cache. This will store data aggregated from other sources. Generating the data only takes a few minutes, and it is updated often. There will be some fairly expensive queries of arbitrary complexity run at a fairly high rate. We're looking for high concurrency and reasonable performance throughout. The entire data set is roughly 20 MB in size. We've tried Carbonado in front of SleepycatJE only to discover that it chokes at a fairly low concurrency and that Carbonado's rule-based optimizer is wholly insufficient for our needs. We've also tried Carbonado's Map Repository which suffers the same problems. I've since moved the backend database to a local PostgreSQL instance hoping to take advantage of PostgreSQL's superior performance at high concurrency. Of course, at the default settings, it performs quite poorly compares to the Map Repository and Sleepycat JE. My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. I've never configured PostgreSQL to work like this and I thought maybe someone here had some ideas on a good approach to this. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 6/30/10 9:42 AM, Dave Crooke wrote: I haven't jumped in yet on this thread, but here goes If you're really looking for query performance, then any database which is designed with reliability and ACID consistency in mind is going to inherently have some mis-fit features. Some other ideas to consider, depending on your query mix: 1. MySQL with the MyISAM database (non-ACID) 2. Put an in-application generic query cache in front of the DB, that runs in the app address space, e.g. Cache' if using Java 3. Using a DB is a good way to get generic querying capability, but if the where clause in the querying is over a small set of meta-data, and SQL syntax is not a big requirement, consider non-RDBMS alternatives, e.g. use XPath over a W3C DOM object tree to get primary keys to in-memory hash tables (possibly distributed with something like memcached) These would be good suggestions if the throwaway database was the only one. But in real life, these throwaway databases are built from other databases that are NOT throwaway, where the data matters and ACID is critical. In other words, they'll probably need Postgres anyway. Sure, you could use both Postgres and MySQL/ISAM, but that means installing and maintaining both, plus building all of the other application layers to work on both systems. Craig -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Tue, Jun 29, 2010 at 9:39 PM, Bruce Momjian br...@momjian.us wrote: Jignesh Shah wrote: On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. ?Thanks. I am personally surprised it was designed that way; ?I thought we would just delay fsync. That would require writing and syncing to be separable actions. ?If you're using O_SYNC or similar, they aren't. Ah, very good point. ?I have added a C comment to clarify why this is the current behavior; ?attached and applied. -- ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com Though has anybody seen a behaviour where synchronous_commit=off is slower than synchronous_commit=on ? Again there are two cases here one with O_* flag and other with f*sync flags. But I had seen that behavior with PostgreSQL 9.0 beta(2 I think) though havent really investigated it much yet .. (though now I dont remember which wal_sync_method flag) . Just curious if anybody has seen that behavior.. I have trouble believing how synchronous_commit=off could be slower than 'on'. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + Hi Bruce, Let me clarify the problem a bit.. If the underlying WAL disk is SSD then it seems I can get synchronous_commit=on to work faster than synchronous_commit=off.. Yes sounds unintuitive to me. But the results seems to point in that direction. It could be that it hit some other bottleneck with synchronous_commit=off reaches that synchronous_commit=on does not hit (or has not hit yet). Brads point of wal buffers could be valid. Though typically I havent seen the need to increase it beyond 1024kB yet. Hopefully I will retry it with the latest PostgreSQL 9.0 bits and see it happens again. More on that later. Regards, Jignesh -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 6/30/2010 2:21 PM, Jignesh Shah wrote: If the underlying WAL disk is SSD then it seems I can get synchronous_commit=on to work faster than synchronous_commit=off.. The first explanation that pops to mind is that synchronous_commit is writing all the time, which doesn't have the same sort of penalty on SSD. Whereas if you turn it off, then there are some idle periods where the SSD could be writing usefully, but instead it's buffering for the next burst instead. The importance of that can be magnified on operating systems that do their own buffering and tend to lag behind writes until they see an fsync call, like is the case on Linux with ext3. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote: The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). Is this actually true? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Robert Haas wrote: On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote: The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). Is this actually true? I asked on IRC and was told it is true, and looking at the C code it looks true. What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Bruce Momjian br...@momjian.us wrote: What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync Ah, that answers the question Josh Berkus asked here: http://archives.postgresql.org/pgsql-performance/2010-06/msg00285.php (which is something I was wondering about, too.) -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Tue, Jun 29, 2010 at 9:32 AM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote: The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). Is this actually true? I asked on IRC and was told it is true, and looking at the C code it looks true. What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. Thanks. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Robert Haas wrote: On Tue, Jun 29, 2010 at 9:32 AM, Bruce Momjian br...@momjian.us wrote: Robert Haas wrote: On Mon, Jun 28, 2010 at 5:57 PM, Bruce Momjian br...@momjian.us wrote: The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). Is this actually true? I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. Thanks. I am personally surprised it was designed that way; I thought we would just delay fsync. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. Thanks. I am personally surprised it was designed that way; I thought we would just delay fsync. That would require writing and syncing to be separable actions. If you're using O_SYNC or similar, they aren't. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. Thanks. I am personally surprised it was designed that way; I thought we would just delay fsync. That would require writing and syncing to be separable actions. If you're using O_SYNC or similar, they aren't. Ah, very good point. I have added a C comment to clarify why this is the current behavior; attached and applied. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + Index: src/backend/access/transam/xact.c === RCS file: /cvsroot/pgsql/src/backend/access/transam/xact.c,v retrieving revision 1.291 diff -c -c -r1.291 xact.c *** src/backend/access/transam/xact.c 13 May 2010 11:39:30 - 1.291 --- src/backend/access/transam/xact.c 29 Jun 2010 18:33:47 - *** *** 1028,1034 if (XactSyncCommit || forceSyncCommit || haveNonTemp) { /* ! * Synchronous commit case. * * Sleep before flush! So we can flush more than one commit records * per single fsync. (The idea is some other backend may do the --- 1028,1034 if (XactSyncCommit || forceSyncCommit || haveNonTemp) { /* ! * Synchronous commit case: * * Sleep before flush! So we can flush more than one commit records * per single fsync. (The idea is some other backend may do the *** *** 1054,1060 else { /* ! * Asynchronous commit case. * * Report the latest async commit LSN, so that the WAL writer knows to * flush this commit. --- 1054,1065 else { /* ! * Asynchronous commit case: ! * ! * This enables possible committed transaction loss in the case of a ! * postmaster crash because WAL buffers are left unwritten. ! * Ideally we could issue the WAL write without the fsync, but ! * some wal_sync_methods do not allow separate write/fsync. * * Report the latest async commit LSN, so that the WAL writer knows to * flush this commit. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. Thanks. I am personally surprised it was designed that way; I thought we would just delay fsync. That would require writing and syncing to be separable actions. If you're using O_SYNC or similar, they aren't. Ah, very good point. I have added a C comment to clarify why this is the current behavior; attached and applied. -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com Though has anybody seen a behaviour where synchronous_commit=off is slower than synchronous_commit=on ? Again there are two cases here one with O_* flag and other with f*sync flags. But I had seen that behavior with PostgreSQL 9.0 beta(2 I think) though havent really investigated it much yet .. (though now I dont remember which wal_sync_method flag) . Just curious if anybody has seen that behavior.. Regards, Jignesh -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Jignesh Shah wrote: On Tue, Jun 29, 2010 at 2:45 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I asked on IRC and was told it is true, and looking at the C code it looks true. ?What synchronous_commit = false does is to delay writing the wal buffers to disk and fsyncing them, not just fsync, which is where the commit loss due to db process crash comes from. Ah, I see. ?Thanks. I am personally surprised it was designed that way; ?I thought we would just delay fsync. That would require writing and syncing to be separable actions. ?If you're using O_SYNC or similar, they aren't. Ah, very good point. ?I have added a C comment to clarify why this is the current behavior; ?attached and applied. -- ?Bruce Momjian ?br...@momjian.us ? ? ? ?http://momjian.us ?EnterpriseDB ? ? ? ? ? ? ? ? ? ? ? ? ? ? http://enterprisedb.com Though has anybody seen a behaviour where synchronous_commit=off is slower than synchronous_commit=on ? Again there are two cases here one with O_* flag and other with f*sync flags. But I had seen that behavior with PostgreSQL 9.0 beta(2 I think) though havent really investigated it much yet .. (though now I dont remember which wal_sync_method flag) . Just curious if anybody has seen that behavior.. I have trouble believing how synchronous_commit=off could be slower than 'on'. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Bruce Momjian wrote: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Not going to happen; this is all or nothing. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Yeah, I think proposal (c) is likely to be a net loss. (a) and (d) are probably simple, if by reprovisioning you mean rm -rf $PGDATA; initdb. Point (b) will be a bit trickier because there are various housekeeping activities tied into checkpoints. I think you can't actually remove checkpoints altogether, just skip the flush-dirty-pages part. Based on this thread, I have developed the following documentation patch that outlines the performance enhancements possible if durability is not required. The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). Applied. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Tom Lane t...@sss.pgh.pa.us writes: The problem with a system-wide no-WAL setting is it means you can't trust the system catalogs after a crash. Which means you are forced to use initdb to recover from any crash, in return for not a lot of savings (for typical usages where there's not really much churn in the catalogs). What about having a catalog only WAL setting, userset ? I'm not yet clear on the point but it well seems that the per transaction WAL setting is impossible because of catalogs (meaning mainly DDL support), but I can see us enforcing durability and crash safety there. That would probably mean that setting WAL level this low yet doing any kind of DDL would need to be either an ERROR, or better yet, a WARNING telling that the WAL level can not be that low so has been raised by the system. Regards, -- dim -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote: It must be a setting, not a version. For instance suppose you have a session table for your website and a users table. - Having ACID on the users table is of course a must ; - for the sessions table you can drop the D You're trying to solve a different use-case than the one I am. Your use-case will be solved by global temporary tables. I suggest that you give Robert Haas some help feedback on that. My use case is people using PostgreSQL as a cache, or relying entirely on replication for durability. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com Is he? Wouldn't a global temporary table have content that is not visible between db connections? A db session many not be the same as a user session. -- Rob Wultsch wult...@gmail.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Thu, Jun 24, 2010 at 4:40 AM, Rob Wultsch wult...@gmail.com wrote: On Fri, Jun 18, 2010 at 1:55 PM, Josh Berkus j...@agliodbs.com wrote: It must be a setting, not a version. For instance suppose you have a session table for your website and a users table. - Having ACID on the users table is of course a must ; - for the sessions table you can drop the D You're trying to solve a different use-case than the one I am. Your use-case will be solved by global temporary tables. I suggest that you give Robert Haas some help feedback on that. My use case is people using PostgreSQL as a cache, or relying entirely on replication for durability. Is he? Wouldn't a global temporary table have content that is not visible between db connections? A db session many not be the same as a user session. I'm planning to implement global temporary tables, which can have different contents for each user session. And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). Yep. And it's quite possible that this will be adequate for most users. And it's also possible that the extra CPU which Robert isn't getting rid of (bgwriter, checkpointing, etc.) does not have a measurable impact on performance. At this point, my idea (which I call RunningWithScissorsDB) is only an idea for experimentation and performance testing. It's pretty far off from being a TODO. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
2010/6/24 Josh Berkus j...@agliodbs.com: And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Much more important is smarter cache controlling then we have now - maybe with priorities for some tables and some operations (applications) - sometimes we don't need use cache for extra large scans. Regards Pavel Stehule Yep. And it's quite possible that this will be adequate for most users. And it's also possible that the extra CPU which Robert isn't getting rid of (bgwriter, checkpointing, etc.) does not have a measurable impact on performance. At this point, my idea (which I call RunningWithScissorsDB) is only an idea for experimentation and performance testing. It's pretty far off from being a TODO. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote: 2010/6/24 Josh Berkus j...@agliodbs.com: And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Because memcache is yet another layer and increases overhead to the application developers by adding yet another layer to work with. Non logged tables would rock. SELECT * FROM foo; :D JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
2010/6/24 Joshua D. Drake j...@commandprompt.com: On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote: 2010/6/24 Josh Berkus j...@agliodbs.com: And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Because memcache is yet another layer and increases overhead to the application developers by adding yet another layer to work with. Non logged tables would rock. I see only one positive point - it can help to people with broken design application with migration to PostgreSQL. There are different interesting feature - cached procedure's results like Oracle 11. - it's more general. only idea. For me memory tables are nonsens, but what about memory cached materialised views (maybe periodically refreshed)? Regards Pavel SELECT * FROM foo; :D :) JD -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote: 2010/6/24 Joshua D. Drake j...@commandprompt.com: On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote: 2010/6/24 Josh Berkus j...@agliodbs.com: And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Because memcache is yet another layer and increases overhead to the application developers by adding yet another layer to work with. Non logged tables would rock. I see only one positive point - it can help to people with broken design application with migration to PostgreSQL. The broken design is being required to work around PostgreSQL's lack of this optimization. There are different interesting feature - cached procedure's results like Oracle 11. - it's more general. only idea. For me memory tables are nonsens, but what about memory cached materialised views (maybe periodically refreshed)? Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory tables. Such tables simply contain transient information. One can already make memory tables in PostgreSQL by making a tablespace in a tmpfs partition. I have been eagerly waiting for this feature for six years so that I can write proper queries against ever-changing session data with transactional semantics (which memcached cannot offer). The only restriction I see for these transient data tables is that they cannot be referenced by standard tables using foreign key constraints. Otherwise, these tables behave like any other. That's the benefit. Cheers, M -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
2010/6/24 A.M. age...@themactionfaction.com: On Jun 24, 2010, at 4:01 PM, Pavel Stehule wrote: 2010/6/24 Joshua D. Drake j...@commandprompt.com: On Thu, 2010-06-24 at 21:14 +0200, Pavel Stehule wrote: 2010/6/24 Josh Berkus j...@agliodbs.com: And I'm also planning to implement unlogged tables, which have the same contents for all sessions but are not WAL-logged (and are truncated on startup). this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Because memcache is yet another layer and increases overhead to the application developers by adding yet another layer to work with. Non logged tables would rock. I see only one positive point - it can help to people with broken design application with migration to PostgreSQL. The broken design is being required to work around PostgreSQL's lack of this optimization. There are different interesting feature - cached procedure's results like Oracle 11. - it's more general. only idea. For me memory tables are nonsens, but what about memory cached materialised views (maybe periodically refreshed)? Non-WAL-logged, non-fsynced tables are not equivalent to MySQL memory tables. Such tables simply contain transient information. One can already make memory tables in PostgreSQL by making a tablespace in a tmpfs partition. I have been eagerly waiting for this feature for six years so that I can write proper queries against ever-changing session data with transactional semantics (which memcached cannot offer). The only restriction I see for these transient data tables is that they cannot be referenced by standard tables using foreign key constraints. Otherwise, these tables behave like any other. That's the benefit. if you remove WAL, then there are MVCC still - you have to do VACUUM, you have to do ANALYZE, you have to thinking about indexes ... Processing pipe for simple query is long too. The removing WAL doesn't do memory database from Postgres. But You have to know best, what do you do. Regards Pavel Stehule p.s. maybe memcached is too simply for you - there are more NoSQL db Cheers, M -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Thing is, if you only have one table (say, a sessions table) which you don't want logged, you don't necessarily want to fire up a 2nd software application just for that. Plus, recent testing seems to show that with no logging, memcached isn't really faster than PG. Also, like for asynch_commit, this is something where users are currently turning off fsync. Any option where we can present users with controlled, predictable data loss instead of random corruption is a good one. Much more important is smarter cache controlling then we have now - maybe with priorities for some tables and some operations (applications) - sometimes we don't need use cache for extra large scans. Well, that would be good *too*. You working on it? ;-) -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
2010/6/24 Josh Berkus j...@agliodbs.com: this is similar MySQL's memory tables. Personally, I don't see any practical sense do same work on PostgreSQL now, when memcached exists. Thing is, if you only have one table (say, a sessions table) which you don't want logged, you don't necessarily want to fire up a 2nd software application just for that. Plus, recent testing seems to show that with no logging, memcached isn't really faster than PG. sorry, I thinking some else. Not only WAL does significant overhead. You need litlle bit more memory, much more processing time. With very fast operations, the bottle neck will be in interprocess communication - but it doesn't mean so pg isn't slower than memcached. I repeating it again - there are no any universal tool for all tasks. Also, like for asynch_commit, this is something where users are currently turning off fsync. Any option where we can present users with controlled, predictable data loss instead of random corruption is a good one. it isn't too simple. What about statistics? These are used in system table. Much more important is smarter cache controlling then we have now - maybe with priorities for some tables and some operations (applications) - sometimes we don't need use cache for extra large scans. Well, that would be good *too*. You working on it? ;-) no - just I know about possible problems with memory control. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: Consider a non-crash-safe wal_level that eliminates WAL activity * http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Not going to happen; this is all or nothing. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Yeah, I think proposal (c) is likely to be a net loss. (a) and (d) are probably simple, if by reprovisioning you mean rm -rf $PGDATA; initdb. Point (b) will be a bit trickier because there are various housekeeping activities tied into checkpoints. I think you can't actually remove checkpoints altogether, just skip the flush-dirty-pages part. Based on this thread, I have developed the following documentation patch that outlines the performance enhancements possible if durability is not required. The patch also documents that synchronous_commit = false has potential committed transaction loss from a database crash (as well as an OS crash). -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.282 diff -c -c -r1.282 config.sgml *** doc/src/sgml/config.sgml 22 Jun 2010 02:57:49 - 1.282 --- doc/src/sgml/config.sgml 23 Jun 2010 18:53:26 - *** *** 1463,1469 really guaranteed to be safe against a server crash. (The maximum delay is three times xref linkend=guc-wal-writer-delay.) Unlike xref linkend=guc-fsync, setting this parameter to literaloff/ ! does not create any risk of database inconsistency: a crash might result in some recent allegedly-committed transactions being lost, but the database state will be just the same as if those transactions had been aborted cleanly. So, turning varnamesynchronous_commit/ off --- 1463,1470 really guaranteed to be safe against a server crash. (The maximum delay is three times xref linkend=guc-wal-writer-delay.) Unlike xref linkend=guc-fsync, setting this parameter to literaloff/ ! does not create any risk of database inconsistency: an operating ! system or database crash crash might result in some recent allegedly-committed transactions being lost, but the database state will be just the same as if those transactions had been aborted cleanly. So, turning varnamesynchronous_commit/ off Index: doc/src/sgml/perform.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/perform.sgml,v retrieving revision 1.80 diff -c -c -r1.80 perform.sgml *** doc/src/sgml/perform.sgml 29 May 2010 21:08:04 - 1.80 --- doc/src/sgml/perform.sgml 23 Jun 2010 18:53:26 - *** *** 1104,1107 --- 1104,1169 /sect2 /sect1 + sect1 id=non-durability +titleNon-Durable Settings/title + +indexterm zone=non-durability + primarynon-durable/primary +/indexterm + +para + Durability is a database feature that guarantees the recording of + committed transactions even if if the server crashes or loses + power. However, durability adds significant database overhead, + so if your site does not require such a guarantee, + productnamePostgreSQL/productname can be configured to run + much faster. The following are configuration changes you can make + to improve performance in such cases; they do not invalidate + commit guarantees related to database crashes, only abrupt operating + system stoppage, except as mentioned below: + + itemizedlist + listitem + para +Place the database cluster's data directory in a memory-backed +file system (i.e. acronymRAM/ disk). This eliminates all +database disk I/O, but limits data storage to the amount of +available memory (and perhaps swap). + /para + /listitem + + listitem + para +Turn off xref linkend=guc-fsync; there is no need to flush +data to disk. + /para + /listitem + + listitem + para +Turn off xref linkend=guc-full-page-writes; there is no need +to guard against partial page writes. + /para + /listitem + + listitem + para +Increase xref linkend=guc-checkpoint-segments and xref +linkend=guc-checkpoint-timeout ; this reduces the frequency +of checkpoints, but increases the storage requirements of +
Re: [PERFORM] PostgreSQL as a local in-memory cache
2010/6/23 Bruce Momjian br...@momjian.us: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: Consider a non-crash-safe wal_level that eliminates WAL activity * http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php -- isn't fsync to off enought? Regards Pavel Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Pavel Stehule wrote: 2010/6/23 Bruce Momjian br...@momjian.us: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. ?Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: ? ? ? ?Consider a non-crash-safe wal_level that eliminates WAL activity ? ? ? ? ? ?* http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php -- isn't fsync to off enought? Well, testing reported in the thread showed other settings also help, though the checkpoint lengthening was not tested. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: Consider a non-crash-safe wal_level that eliminates WAL activity * http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php I don't think we need a system-wide setting for that. I believe that the unlogged tables I'm working on will handle that case. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote: On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: Consider a non-crash-safe wal_level that eliminates WAL activity * http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php I don't think we need a system-wide setting for that. I believe that the unlogged tables I'm working on will handle that case. Aren't they going to be truncated at startup? If the entire system is running without WAL, we would only need to do that in case of an unclean shutdown wouldn't we? -- Dave Page EnterpriseDB UK: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Robert Haas wrote: On Wed, Jun 23, 2010 at 3:37 PM, Bruce Momjian br...@momjian.us wrote: Tom Lane wrote: Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely If we elimiate WAL logging, that means a reinstall is required for even a postmaster crash, which is a new non-durable behavior. Also, we just added wal_level = minimal, which might end up being a poor name choice of we want wal_level = off in PG 9.1. ?Perhaps we should have used wal_level = crash_safe in 9.0. I have added the following TODO: ? ? ? ?Consider a non-crash-safe wal_level that eliminates WAL activity ? ? ? ? ? ?* http://archives.postgresql.org/pgsql-performance/2010-06/msg00300.php I don't think we need a system-wide setting for that. I believe that the unlogged tables I'm working on will handle that case. Uh, will we have some global unlogged setting, like for the system tables and stuff? It seems like an heavy burden to tell people they have to create ever object as unlogged, and we would still generate log for things like transaction commits. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Dave Page dp...@pgadmin.org writes: On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote: I don't think we need a system-wide setting for that. I believe that the unlogged tables I'm working on will handle that case. Aren't they going to be truncated at startup? If the entire system is running without WAL, we would only need to do that in case of an unclean shutdown wouldn't we? The problem with a system-wide no-WAL setting is it means you can't trust the system catalogs after a crash. Which means you are forced to use initdb to recover from any crash, in return for not a lot of savings (for typical usages where there's not really much churn in the catalogs). I tend to agree with Robert that a way to not log content updates for individual user tables is likely to be much more useful in practice. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Tom Lane wrote: Dave Page dp...@pgadmin.org writes: On Wed, Jun 23, 2010 at 9:25 PM, Robert Haas robertmh...@gmail.com wrote: I don't think we need a system-wide setting for that. ?I believe that the unlogged tables I'm working on will handle that case. Aren't they going to be truncated at startup? If the entire system is running without WAL, we would only need to do that in case of an unclean shutdown wouldn't we? The problem with a system-wide no-WAL setting is it means you can't trust the system catalogs after a crash. Which means you are forced to True, and in fact any postmaster crash could lead to curruption. use initdb to recover from any crash, in return for not a lot of savings (for typical usages where there's not really much churn in the catalogs). I tend to agree with Robert that a way to not log content updates for individual user tables is likely to be much more useful in practice. OK, TODO removed. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + None of us is going to be here forever. + -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Thu, Jun 17, 2010 at 1:29 PM, Josh Berkus j...@agliodbs.com wrote: a) Eliminate WAL logging entirely In addition to global temporary tables, I am also planning to implement unlogged tables, which are, precisely, tables for which no WAL is written. On restart, any such tables will be truncated. That should give you the ability to do this (by making all your tables unlogged). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Dimitri Fontaine wrote: Well I guess I'd prefer a per-transaction setting Not possible, as many others have said. As soon as you make an unsafe transaction, all the other transactions have nothing to rely on. On Thu, 17 Jun 2010, Pierre C wrote: A per-table (or per-index) setting makes more sense IMHO. For instance on recovery, truncate this table (this was mentioned before). That would be much more valuable. I'd like to point out the costs involved in having a whole separate version of Postgres that has all this safety switched off. Package managers will not thank anyone for having to distribute another version of the system, and woe betide the user who installs the wrong version because it runs faster. No, this is much better as a configurable option. Going back to the on recovery, truncate this table. We already have a mechanism for skipping the WAL writes on an entire table - we do that for tables that have been created in the current transaction. It would surely be a small step to allow this to be configurably permanent on a particular table. Moreover, we already have a mechanism for taking a table that has had non-logged changes, and turning it into a fully logged table - we do that to the above mentioned tables when the transaction commits. I would strongly recommend providing an option to ALTER TABLE MAKE SAFE, which may involve some more acrobatics if the table is currently in use by multiple transactions, but would be valuable. This would allow users to create temporary tables that can be shared by several connections. It would also allow bulk loading in parallel of a single large table. With these suggestions, we would still need to WAL-log all the metadata changes, but I think in most circumstances that is not going to be a large burden on performance. Matthew -- Picard: I was just paid a visit from Q. Riker: Q! Any idea what he's up to? Picard: No. He said he wanted to be nice to me. Riker: I'll alert the crew. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
I'd like to point out the costs involved in having a whole separate version It must be a setting, not a version. For instance suppose you have a session table for your website and a users table. - Having ACID on the users table is of course a must ; - for the sessions table you can drop the D Server crash would force all users to re-login on your website but if your server crashes enough that your users complain about that, you have another problem anyway. Having the sessions table not WAL-logged (ie faster) would not prevent you from having sessions.user_id REFERENCES users( user_id ) ... so mixing safe and unsafe tables would be much more powerful than just having unsafe tables. And I really like the idea of non-WAL-logged indexes, too, since they can be rebuilt as needed, the DBA could decide between faster index updates but rebuild on crash, or normal updates and fast recovery. Also materialized views etc, you can rebuild them on crash and the added update speed would be good. Moreover, we already have a mechanism for taking a table that has had non-logged changes, and turning it into a fully logged table - we do that to the above mentioned tables when the transaction commits. I would strongly recommend providing an option to ALTER TABLE MAKE SAFE, which may involve some more acrobatics if the table is currently in use by multiple transactions, but would be valuable. I believe the old discussions called this ALTER TABLE SET PERSISTENCE. This would allow users to create temporary tables that can be shared by several connections. It would also allow bulk loading in parallel of a single large table. This would need to WAL-log the entire table to send it to the slaves if replication is enabled, but it's a lot faster than replicating each record. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
It must be a setting, not a version. For instance suppose you have a session table for your website and a users table. - Having ACID on the users table is of course a must ; - for the sessions table you can drop the D You're trying to solve a different use-case than the one I am. Your use-case will be solved by global temporary tables. I suggest that you give Robert Haas some help feedback on that. My use case is people using PostgreSQL as a cache, or relying entirely on replication for durability. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 6/18/10 2:15 AM, Matthew Wakeling wrote: I'd like to point out the costs involved in having a whole separate version of Postgres that has all this safety switched off. Package managers will not thank anyone for having to distribute another version of the system, and woe betide the user who installs the wrong version because it runs faster. No, this is much better as a configurable option. Agreed, although initial alphas of this concept are likely to in fact be a separate source code tree. Eventually when we have it working well it could become an initdb-time option. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
All, So, I've been discussing this because using PostgreSQL on the caching layer has become more common that I think most people realize. Jonathan is one of 4 companies I know of who are doing this, and with the growth of Hadoop and other large-scale data-processing technologies, I think demand will increase. Especially as, in repeated tests, PostgreSQL with persistence turned off is just as fast as the fastest nondurable NoSQL database. And it has a LOT more features. Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for durability, they don't eliminate the CPU time. Which means that a caching version of PostgreSQL could be even faster. To do that, we'd need to: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Of the three above, (a) is the most difficult codewise. (b)(c) and (d) should be relatively straightforwards, although I believe that we now have the bgwriter doing some other essential work besides syncing buffers. There's also a narrower use-case in eliminating (a), since a non-fsync'd server which was recording WAL could be used as part of a replication chain. This isn't on hackers because I'm not ready to start working on a patch, but I'd like some feedback on the complexities of doing (b) and (c) as well as how many people could use a non-persistant, in-memory postgres. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Especially as, in repeated tests, PostgreSQL with persistence turned off is just as fast as the fastest nondurable NoSQL database. And it has a LOT more features. An option to completely disable WAL for such use cases would make it a lot faster, especially in the case of heavy concurrent writes. Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for durability, they don't eliminate the CPU time. Actually the WAL overhead is some CPU and lots of locking. Which means that a caching version of PostgreSQL could be even faster. To do that, we'd need to: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Of the three above, (a) is the most difficult codewise. Actually, it's pretty easy, look in xlog.c -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Hi, Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Then you could have the GUC set for a whole cluster, only a database etc. We already have synchronous_commit to trade durability against performances, we could maybe support protect_data = off too. The d) point I'm not sure still applies if you have per transaction setting, which I think makes the most sense. The data you choose not to protect is missing at restart, just add some way to register a hook there. We already have one (shared_preload_libraries) but it requires coding in C. Calling a user function at the end of recovery and before accepting connection would be good I think. A user function (per database) is better than a script because if you want to run it before accepting connections and still cause changes in the database… Regards, -- dim -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Dimitri Fontaine dfonta...@hi-media.com writes: Josh Berkus j...@agliodbs.com writes: a) Eliminate WAL logging entirely b) Eliminate checkpointing c) Turn off the background writer d) Have PostgreSQL refuse to restart after a crash and instead call an exteral script (for reprovisioning) Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Not going to happen; this is all or nothing. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Yeah, I think proposal (c) is likely to be a net loss. (a) and (d) are probably simple, if by reprovisioning you mean rm -rf $PGDATA; initdb. Point (b) will be a bit trickier because there are various housekeeping activities tied into checkpoints. I think you can't actually remove checkpoints altogether, just skip the flush-dirty-pages part. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Josh Berkus wrote: a) Eliminate WAL logging entirely c) Turn off the background writer Note that if you turn off full_page_writes and set bgwriter_lru_maxpages=0, you'd get a substantial move in both these directions without touching any code. Would help prove those as useful directions to move toward or not. The difference in WAL writes just after a checkpoint in particular, due to the full_page_writes behavior, is a significant portion of total WAL activity on most systems. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Well if the transaction touches a system catalog it better be WAL-logged... A per-table (or per-index) setting makes more sense IMHO. For instance on recovery, truncate this table (this was mentioned before). Another option would be make the table data safe, but on recovery, destroy and rebuild this index : because on a not so large, often updated table, with often updated indexes, it may not take long to rebuild the indexes, but all those wal-logged index updates do add some overhead. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Well I guess I'd prefer a per-transaction setting, allowing to bypass WAL logging and checkpointing. Not even conceiveable. For this to work, we're talking about the whole database installation. This is only a set of settings for a database *server* which is considered disposable and replaceable, where if it shuts down unexpectedly, you throw it away and replace it. Forcing the backend to care itself for writing the data I'm not sure is a good thing, but if you say so. Oh, yeah, I guess we'd only be turning off the LRU cache operations of the background writer. Same with checkpoints. Copying between shared_buffers and the LRU cache would still happen. Calling a user function at the end of recovery and before accepting connection would be good I think. A user function (per database) is better than a script because if you want to run it before accepting connections and still cause changes in the database… Hmmm, you're not quite following my idea. There is no recovery. If the database shuts down unexpectedly, it's toast and you replace it from another copy somewhere else. (a) and (d) are probably simple, if by reprovisioning you mean rm -rf $PGDATA; initdb. Exactly. Followed by scp database_image. Or heck, just replacing the whole VM. Point (b) will be a bit trickier because there are various housekeeping activities tied into checkpoints. I think you can't actually remove checkpoints altogether, just skip the flush-dirty-pages part. Yes, and we'd want to flush dirty pages on an actual shutdown command. We do want to be able to shut down the DB on purpose. Well if the transaction touches a system catalog it better be WAL-logged... Given the above, why? -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Josh Berkus j...@agliodbs.com writes: (a) and (d) are probably simple, if by reprovisioning you mean rm -rf $PGDATA; initdb. Exactly. Followed by scp database_image. Or heck, just replacing the whole VM. Right, that would work. I don't think you really need to implement that inside Postgres. I would envision having the startup script do it, ie rm -rf $PGDATA cp -pr prepared-database-image $PGDATA # this loop exits when postmaster exits normally while ! postmaster ... do rm -rf $PGDATA cp -pr prepared-database-image $PGDATA done Then all you need is a tweak to make the postmaster exit(1) after a crash instead of trying to launch recovery. regards, tom lane -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote: On 6/15/10 10:37 AM, Chris Browne wrote: I'd like to see some figures about WAL on RAMfs vs. simply turning off fsync and full_page_writes. Per Gavin's tests, PostgreSQL is already close to TokyoCabinet/MongoDB performance just with those turned off; I wonder if actually having the WAL on a memory partition would make any real difference in throughput. I've seen a lot of call for this recently, especially since PostgreSQL seems to be increasingly in use as a reporting server for Hadoop. Might be worth experimenting with just making wal writing a no-op. We'd also want to disable checkpointing, of course. My back-of-the-envelope experiment: Inserting single integers into a table without indexes using a prepared query via psycopg2. Python Script: import psycopg2 from time import time conn = psycopg2.connect(database='jgardner') cursor = conn.cursor() cursor.execute(CREATE TABLE test (data int not null)) conn.commit() cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1)) conn.commit() start = time() tx = 0 while time() - start 1.0: cursor.execute(EXECUTE ins(%s), (tx,)); conn.commit() tx += 1 print tx cursor.execute(DROP TABLE test); conn.commit(); Local disk, WAL on same FS: * Default config = 90 * full_page_writes=off = 90 * synchronous_commit=off = 4,500 * fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 * fsync=off and full_page_writes=off = 5,150 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 tmpfs, WAL on same tmpfs: * Default config: 5,200 * full_page_writes=off = 5,200 * fsync=off = 5,250 * synchronous_commit=off = 5,200 * fsync=off and synchronous_commit=off = 5,450 * fsync=off and full_page_writes=off = 5,250 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 NOTE: If I do one giant commit instead of lots of littler ones, I get much better speeds for the slower cases, but I never exceed 5,500 which appears to be some kind of wall I can't break through. If there's anything else I should tinker with, I'm all ears. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 16/06/10 18:30, jgard...@jonathangardner.net wrote: On Jun 15, 4:18 pm, j...@agliodbs.com (Josh Berkus) wrote: On 6/15/10 10:37 AM, Chris Browne wrote: I'd like to see some figures about WAL on RAMfs vs. simply turning off fsync and full_page_writes. Per Gavin's tests, PostgreSQL is already close to TokyoCabinet/MongoDB performance just with those turned off; I wonder if actually having the WAL on a memory partition would make any real difference in throughput. I've seen a lot of call for this recently, especially since PostgreSQL seems to be increasingly in use as a reporting server for Hadoop. Might be worth experimenting with just making wal writing a no-op. We'd also want to disable checkpointing, of course. My back-of-the-envelope experiment: Inserting single integers into a table without indexes using a prepared query via psycopg2. Python Script: import psycopg2 from time import time conn = psycopg2.connect(database='jgardner') cursor = conn.cursor() cursor.execute(CREATE TABLE test (data int not null)) conn.commit() cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1)) conn.commit() start = time() tx = 0 while time() - start 1.0: cursor.execute(EXECUTE ins(%s), (tx,)); conn.commit() tx += 1 print tx cursor.execute(DROP TABLE test); conn.commit(); Local disk, WAL on same FS: * Default config = 90 * full_page_writes=off = 90 * synchronous_commit=off = 4,500 * fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 * fsync=off and full_page_writes=off = 5,150 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 tmpfs, WAL on same tmpfs: * Default config: 5,200 * full_page_writes=off = 5,200 * fsync=off = 5,250 * synchronous_commit=off = 5,200 * fsync=off and synchronous_commit=off = 5,450 * fsync=off and full_page_writes=off = 5,250 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 NOTE: If I do one giant commit instead of lots of littler ones, I get much better speeds for the slower cases, but I never exceed 5,500 which appears to be some kind of wall I can't break through. If there's anything else I should tinker with, I'm all ears. Seeing some profiler output (e.g oprofile) for the fastest case (and maybe 'em all later) might be informative about what limit is being hit here. regards Mark -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Have you tried connecting using a UNIX socket instead of a TCP socket on localhost ? On such very short queries, the TCP overhead is significant. Actually UNIX sockets are the default for psycopg2, had forgotten that. I get 7400 using UNIX sockets and 3000 using TCP (host=localhost) -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
jgard...@jonathangardner.net wrote: NOTE: If I do one giant commit instead of lots of littler ones, I get much better speeds for the slower cases, but I never exceed 5,500 which appears to be some kind of wall I can't break through. That's usually about where I run into the upper limit on how many statements Python can execute against the database per second. Between that and the GIL preventing better multi-core use, once you pull the disk out and get CPU bound it's hard to use Python for load testing of small statements and bottleneck anywhere except in Python itself. I normally just write little performance test cases in the pgbench scripting language, then I get multiple clients and (in 9.0) multiple driver threads all for free. -- Greg Smith 2ndQuadrant US Baltimore, MD PostgreSQL Training, Services and Support g...@2ndquadrant.com www.2ndQuadrant.us -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
FYI I've tweaked this program a bit : import psycopg2 from time import time conn = psycopg2.connect(database='peufeu') cursor = conn.cursor() cursor.execute(CREATE TEMPORARY TABLE test (data int not null)) conn.commit() cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1)) cursor.execute(PREPARE sel AS SELECT 1) conn.commit() start = time() tx = 0 N = 100 d = 0 while d 10: for n in xrange( N ): cursor.execute(EXECUTE ins(%s), (tx,)); #~ conn.commit() #~ cursor.execute(EXECUTE sel ); conn.commit() d = time() - start tx += N print result : %d tps % (tx / d) cursor.execute(DROP TABLE test); conn.commit(); Results (Core 2 quad, ubuntu 10.04 64 bits) : SELECT 1 : 21000 queries/s (I'd say 50 us per query isn't bad !) INSERT with commit every 100 inserts : 17800 insets/s INSERT with commit every INSERT : 7650 tps fsync is on but not synchronous_commit. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Excerpts from jgard...@jonathangardner.net's message of mié jun 16 02:30:30 -0400 2010: NOTE: If I do one giant commit instead of lots of littler ones, I get much better speeds for the slower cases, but I never exceed 5,500 which appears to be some kind of wall I can't break through. If there's anything else I should tinker with, I'm all ears. increase wal_buffers? -- Álvaro Herrera alvhe...@commandprompt.com The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
* fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 Now, this *is* interesting ... why should synch_commit make a difference if fsync is off? Anyone have any ideas? tmpfs, WAL on same tmpfs: * Default config: 5,200 * full_page_writes=off = 5,200 * fsync=off = 5,250 * synchronous_commit=off = 5,200 * fsync=off and synchronous_commit=off = 5,450 * fsync=off and full_page_writes=off = 5,250 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 So, in this test, it seems like having WAL on tmpfs doesn't make a significant difference for everything == off. I'll try running some tests on Amazon when I have a chance. It would be worthwhile to get figures without Python's ceiling. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 16, 2010 at 1:27 AM, Greg Smith g...@2ndquadrant.com wrote: I normally just write little performance test cases in the pgbench scripting language, then I get multiple clients and (in 9.0) multiple driver threads all for free. See, this is why I love these mailing lists. I totally forgot about pgbench. I'm going to dump my cheesy python script and play with that for a while. -- Jonathan Gardner jgard...@jonathangardner.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 16, 2010 at 4:22 AM, Pierre C li...@peufeu.com wrote: import psycopg2 from time import time conn = psycopg2.connect(database='peufeu') cursor = conn.cursor() cursor.execute(CREATE TEMPORARY TABLE test (data int not null)) conn.commit() cursor.execute(PREPARE ins AS INSERT INTO test VALUES ($1)) cursor.execute(PREPARE sel AS SELECT 1) conn.commit() start = time() tx = 0 N = 100 d = 0 while d 10: for n in xrange( N ): cursor.execute(EXECUTE ins(%s), (tx,)); #~ conn.commit() #~ cursor.execute(EXECUTE sel ); conn.commit() d = time() - start tx += N print result : %d tps % (tx / d) cursor.execute(DROP TABLE test); conn.commit(); I'm not surprised that Python add is so slow, but I am surprised that I didn't remember it was... ;-) -- Jonathan Gardner jgard...@jonathangardner.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus j...@agliodbs.com wrote: * fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 Now, this *is* interesting ... why should synch_commit make a difference if fsync is off? Anyone have any ideas? I may have stumbled upon this by my ignorance, but I thought I read that synchronous_commit controlled whether it tries to line up commits or has a more free-for-all that may cause some intermediate weirdness. -- Jonathan Gardner jgard...@jonathangardner.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
http://www.postgresql.org/docs/current/static/wal-async-commit.html the server waits for the transaction's WAL records to be flushed to permanent storage before returning a success indication to the client. I think with fynch=off, whether WAL gets written to disk or not is still controlled by synchronous_commit parameter. guessing here... Date: Wed, 16 Jun 2010 12:19:20 -0700 Subject: Re: [PERFORM] PostgreSQL as a local in-memory cache From: jgard...@jonathangardner.net To: j...@agliodbs.com CC: pgsql-performance@postgresql.org On Wed, Jun 16, 2010 at 12:00 PM, Josh Berkus j...@agliodbs.com wrote: * fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 Now, this *is* interesting ... why should synch_commit make a difference if fsync is off? Anyone have any ideas? I may have stumbled upon this by my ignorance, but I thought I read that synchronous_commit controlled whether it tries to line up commits or has a more free-for-all that may cause some intermediate weirdness. -- Jonathan Gardner jgard...@jonathangardner.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance _ The New Busy is not the old busy. Search, chat and e-mail from your inbox. http://www.windowslive.com/campaign/thenewbusy?ocid=PID28326::T:WLMTAGL:ON:WL:en-US:WM_HMP:042010_3
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Wed, Jun 16, 2010 at 12:51 AM, Pierre C li...@peufeu.com wrote: Have you tried connecting using a UNIX socket instead of a TCP socket on localhost ? On such very short queries, the TCP overhead is significant. Unfortunately, this isn't an option for my use case. Carbonado only supports TCP connections. -- Jonathan Gardner jgard...@jonathangardner.net -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 6/16/10 12:00 PM, Josh Berkus wrote: * fsync=off = 5,100 * fsync=off and synchronous_commit=off = 5,500 Now, this *is* interesting ... why should synch_commit make a difference if fsync is off? Anyone have any ideas? I found that pgbench has noise of about 20% (I posted about this a couple days ago using data from 1000 identical pgbench runs). Unless you make a bunch of runs and average them, a difference of 5,100 to 5,500 appears to be meaningless. Craig tmpfs, WAL on same tmpfs: * Default config: 5,200 * full_page_writes=off = 5,200 * fsync=off = 5,250 * synchronous_commit=off = 5,200 * fsync=off and synchronous_commit=off = 5,450 * fsync=off and full_page_writes=off = 5,250 * fsync=off, synchronous_commit=off and full_page_writes=off = 5,500 So, in this test, it seems like having WAL on tmpfs doesn't make a significant difference for everything == off. I'll try running some tests on Amazon when I have a chance. It would be worthwhile to get figures without Python's ceiling. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
I'm not surprised that Python add is so slow, but I am surprised that I didn't remember it was... ;-) it's not the add(), it's the time.time()... -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
[PERFORM] PostgreSQL as a local in-memory cache
We have a fairly unique need for a local, in-memory cache. This will store data aggregated from other sources. Generating the data only takes a few minutes, and it is updated often. There will be some fairly expensive queries of arbitrary complexity run at a fairly high rate. We're looking for high concurrency and reasonable performance throughout. The entire data set is roughly 20 MB in size. We've tried Carbonado in front of SleepycatJE only to discover that it chokes at a fairly low concurrency and that Carbonado's rule-based optimizer is wholly insufficient for our needs. We've also tried Carbonado's Map Repository which suffers the same problems. I've since moved the backend database to a local PostgreSQL instance hoping to take advantage of PostgreSQL's superior performance at high concurrency. Of course, at the default settings, it performs quite poorly compares to the Map Repository and Sleepycat JE. My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. I've never configured PostgreSQL to work like this and I thought maybe someone here had some ideas on a good approach to this. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Jun 14, 7:14 pm, jgard...@jonathangardner.net jgard...@jonathangardner.net wrote: We have a fairly unique need for a local, in-memory cache. This will store data aggregated from other sources. Generating the data only takes a few minutes, and it is updated often. There will be some fairly expensive queries of arbitrary complexity run at a fairly high rate. We're looking for high concurrency and reasonable performance throughout. The entire data set is roughly 20 MB in size. We've tried Carbonado in front of SleepycatJE only to discover that it chokes at a fairly low concurrency and that Carbonado's rule-based optimizer is wholly insufficient for our needs. We've also tried Carbonado's Map Repository which suffers the same problems. I've since moved the backend database to a local PostgreSQL instance hoping to take advantage of PostgreSQL's superior performance at high concurrency. Of course, at the default settings, it performs quite poorly compares to the Map Repository and Sleepycat JE. My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. I've never configured PostgreSQL to work like this and I thought maybe someone here had some ideas on a good approach to this. Just to summarize what I've been able to accomplish so far. By turning fsync and synchronize_commit off, and moving the data dir to tmpfs, I've been able to run the expensive queries much faster than BDB or the MapRepository that comes with Carbonado. This is because PostgreSQL's planner is so much faster and better than whatever Carbonado has. Tweaking indexes has only made things run faster. Right now I'm wrapping up the project so that we can do some serious performance benchmarks. I'll let you all know how it goes. Also, just a note that setting up PostgreSQL for these weird scenarios turned out to be just a tiny bit harder than setting up SQLite. I remember several years ago when there was a push to simplify the configuration and installation of PostgreSQL, and I believe that that has born fruit. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
Chris Browne wrote: jgard...@jonathangardner.net jgard...@jonathangardner.net writes: My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. For such a scenario, I'd suggest you: - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS are reasonable options for this. - The complication would be that your restart the machine and move on needs to consist of quite a few steps: - recreating the filesystem - fixing permissions as needed - running initdb to set up new PG instance - automating any needful fiddling with postgresql.conf, pg_hba.conf - starting up that PG instance - creating users, databases, schemas, ... Doesn't PG now support putting both WAL and user table files onto file systems other than the one holding the PG config files and PG 'admin' tables? Wouldn't doing so simplify the above considertably by allowing just the WAL and user tables on the memory-backed file systems? I wouldn't think the performance impact of leaving the rest of the stuff on disk would be that large. Or does losing WAL files mandate a new initdb? -- Steve Wampler -- swamp...@noao.edu The gods that smiled on your birth are now laughing out loud. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
[oops, didn't hit reply to list first time, resending...] On 6/15/10 9:02 AM, Steve Wampler wrote: Chris Browne wrote: jgard...@jonathangardner.net jgard...@jonathangardner.net writes: My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. For such a scenario, I'd suggest you: - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS are reasonable options for this. - The complication would be that your restart the machine and move on needs to consist of quite a few steps: - recreating the filesystem - fixing permissions as needed - running initdb to set up new PG instance - automating any needful fiddling with postgresql.conf, pg_hba.conf - starting up that PG instance - creating users, databases, schemas, ... How about this: Set up a database entirely on a RAM disk, then install a WAL-logging warm standby. If the production computer goes down, you bring the warm standby online, shut it down, and use tar(1) to recreate the database on the production server when you bring it back online. You have speed and you have near-100% backup. Craig -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Jun 15, 8:47 am, Chris Browne cbbro...@acm.org wrote: jgard...@jonathangardner.net jgard...@jonathangardner.net writes: My question is how can I configure the database to run as quickly as possible if I don't care about data consistency or durability? That is, the data is updated so often and it can be reproduced fairly rapidly so that if there is a server crash or random particles from space mess up memory we'd just restart the machine and move on. For such a scenario, I'd suggest you: - Set up a filesystem that is memory-backed. On Linux, RamFS or TmpFS are reasonable options for this. I had forgotten about this. I will try this out. - The complication would be that your restart the machine and move on needs to consist of quite a few steps: - recreating the filesystem - fixing permissions as needed - running initdb to set up new PG instance - automating any needful fiddling with postgresql.conf, pg_hba.conf - starting up that PG instance - creating users, databases, schemas, ... I'm going to have a system in place to create these databases when I restart the service. ... I wonder if this kind of installation comes into its own for more realistic scenarios in the presence of streaming replication. If you know the WAL files have gotten to disk on another server, that's a pretty good guarantee :-). I have found that pre-computing and storing values in a general relational-type database without durability is an ideal use case to help improve services that need to return calculated results quickly. A simple hash lookup is no longer sufficient. Perhaps PostgreSQL running in this mode will be the ideal solution. Nowadays, no one is really surprised that it takes 30 seconds or so to replicate your data everywhere, but they do detest not getting answers to their complicated queries immediately. -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
swamp...@noao.edu (Steve Wampler) writes: Or does losing WAL files mandate a new initdb? Losing WAL would mandate initdb, so I'd think this all fits into the set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be significant to the performance focus. -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/internet.html MS apparently now has a team dedicated to tracking problems with Linux and publicizing them. I guess eventually they'll figure out this back fires... ;) -- William Burrow aa...@delete.fan.nb.ca -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On Tue, Jun 15, 2010 at 12:37 PM, Chris Browne cbbro...@acm.org wrote: swamp...@noao.edu (Steve Wampler) writes: Or does losing WAL files mandate a new initdb? Losing WAL would mandate initdb, so I'd think this all fits into the set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be significant to the performance focus. why is that? isn't simply execute pg_resetxlog enough? specially 'cause OP doesn't care about loosing some transactions -- Jaime Casanova www.2ndQuadrant.com Soporte y capacitación de PostgreSQL -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance
Re: [PERFORM] PostgreSQL as a local in-memory cache
On 6/15/10 10:37 AM, Chris Browne wrote: swamp...@noao.edu (Steve Wampler) writes: Or does losing WAL files mandate a new initdb? Losing WAL would mandate initdb, so I'd think this all fits into the set of stuff worth putting onto ramfs/tmpfs. Certainly it'll all be significant to the performance focus. I'd like to see some figures about WAL on RAMfs vs. simply turning off fsync and full_page_writes. Per Gavin's tests, PostgreSQL is already close to TokyoCabinet/MongoDB performance just with those turned off; I wonder if actually having the WAL on a memory partition would make any real difference in throughput. I've seen a lot of call for this recently, especially since PostgreSQL seems to be increasingly in use as a reporting server for Hadoop. Might be worth experimenting with just making wal writing a no-op. We'd also want to disable checkpointing, of course. -- -- Josh Berkus PostgreSQL Experts Inc. http://www.pgexperts.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance