All,

So, I've been discussing this because using PostgreSQL on the caching layer has become more common than I think most people realize. Jonathan's is one of 4 companies I know of which are doing this, and with the growth of Hadoop and other large-scale data-processing technologies, I think demand will increase.

That's especially true since, in repeated tests, PostgreSQL with persistence turned off is just as fast as the fastest nondurable NoSQL database. And it has a LOT more features.

Now, while fsync=off and tmpfs for WAL more-or-less eliminate the IO for durability, they don't eliminate the CPU time. Which means that a caching version of PostgreSQL could be even faster. To do that, we'd need to:

a) Eliminate WAL logging entirely
b) Eliminate checkpointing
c) Turn off the background writer
d) Have PostgreSQL refuse to restart after a crash and instead call an external script (for reprovisioning)

Of the four above, (a) is the most difficult codewise. (b), (c) and (d) should be relatively straightforward, although I believe that we now have the bgwriter doing some other essential work besides syncing buffers. There's also a narrower use-case for (a), since a non-fsync'd server which was still recording WAL could be used as part of a replication chain.
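
For comparison, here is a rough sketch of how close stock settings can get to (b), (c) and (d) without any patch. It is only an approximation under stated assumptions, not the proposed feature: WAL is still generated (so the CPU cost stays), checkpoints are only spaced out rather than eliminated, and restart_after_crash is assumed to exist in the release being used.

```conf
# postgresql.conf sketch for a throwaway cache instance (not the proposed patch)
fsync = off                  # never force WAL or data files to disk
synchronous_commit = off     # commits don't wait for the WAL flush
full_page_writes = off       # skip full-page images in WAL
checkpoint_timeout = 1h      # space timed checkpoints far apart; they still happen
bgwriter_lru_maxpages = 0    # disable the background writer's LRU buffer cleaning
restart_after_crash = off    # assumption: setting is available; stay down after a backend crash
```

Note that restart_after_crash = off only stops the automatic reinitialization. Calling the external reprovisioning script in (d) would still be up to whatever supervises the postmaster.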

This isn't on hackers because I'm not ready to start working on a patch, but I'd like some feedback on the complexities of doing (b) and (c) as well as how many people could use a non-persistent, in-memory postgres.

--
                                  -- Josh Berkus
                                     PostgreSQL Experts Inc.
                                     http://www.pgexperts.com

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance
