On Thu, 22 Jun 2006, Agent M wrote: > > On Jun 22, 2006, at 9:56 PM, Christopher Kings-Lynne wrote: > > >> The example is a very active web site, the flow is this: > >> query for session information > >> process HTTP request > >> update session information > >> This happens for EVERY http request. Chances are that you won't have > >> concurrent requests for the same row, but you may have well over 100 > >> HTTP > >> server processes/threads answering queries in your web server farm. > > > > You're crazy :) Use memcache, not the DB :) > > Still, the database is the one central location that the apaches can > connect too- postgres already has a lot of application platform > features- locking synchronization, asynchronous notifications, > arbitrary pl code. > > Personally, I think that a special non-MVCC table type could be > created- the catalogs are similarly flat. What I envision is a table > type that can only be accessed "outside" transactions (like AutoCommit > mode)- this is already possible to implement in plperl for a single > session. It would be more efficient to have something like a global > temp table hanging around... > > Just some random ideas...
Unfortunately, it's not so simple. What if a user enters a transaction block, modifies a normal table, modifies this 'special table'... then rolls back? This is the problem MySQL has with innodb and myisam tables. Well... we could just document that. If only. What happens if, as a part the update to the special table, we encounter and error? MVCC currently guarantees that this modification will be invisible. Without MVCC, we have no such capability. There seems to be a bit of confusion about what MVCC is. PostgreSQL is not the only MVCC database. InnoDB is MVCC. Oracle is MVCC. As far as I know, PostgreSQL is the only MVCC database with a 'non-overwriting storage manager'. The other MVCC databases maintain UNDO logs outside of the table's data files. When an update occurs, the existing row version is copied to te UNDO file, the new data replaces the old and a backward pointer from the table row to the UNDO log is created. Concurrent reads must go into the UNDO log and find the version visible to them. This implements the principle of MVCC and uses snapshot isolation, like we do, to isolate read/write concurrency. Overwriting MVCC comes with its own baggage. Ask any Oracle user about error ORA-01555[1]. There's also the added cost of managing the UNDO logs, the cost of jumping around between files to get row versions and so on. Also, it leads to inefficiency with variable size data types. The new version of a row might be longer or shorter than the previous version and this has to be causing them a headaches and performance penalties. As for single version databases -- they have a tonne of problems themselves. They need to maintain UNDO functionality so as to deal with the error handling I detailed above but they also cannot implement MVCC rules for all cases: readers to not block writes, writers do not block readers. Instead, they implement a large lock matrix and certain types of queries use certain types of granular locks. Some potentially interesting reading material on these ideas: J. Gray & A Reuter, Transaction Processing: Concepts and Techniques US Patent Number 5,870,758 -- (one of?) Oracle's snapshot isolation patent Tom Lane's MVCC talk: http://www.postgresql.org/files/developer/transactions.pdf Thanks, Gavin --- [1] Basically, the UNDO logs are a circular buffer and, remarkably, Oracle doesnt seem to let the buffer expand if there is a long running transaction. (We do this for WAL). Basically, in this case Oracle finds itself in a compromising position because if any more data is written to the UNDO log the query affected will not be able to roll back and/or concurrent queries will not be able to find the correct row version for their snapshot. UNDO log size can be adjusted but this situation bites Oracle users constantly. ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq