On Apr 17, 2008, at 12:20 PM, Mark Waser wrote:
It has always been possible to tweak any of the databases to the other's transactional model.


Eh? Choices in concurrency control and scheduling run very deep in a database engine, with ramifications that cascade through every other part of the system. Nominally equivalent transaction isolation levels can behave very differently in practice depending on the internal transaction representation and management model. You cannot turn off these side effects, and you cannot "tweak" a non-MVCC-ish model to behave like an MVCC-ish model at runtime in any way that matters.
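To make the distinction concrete, here is a toy sketch (not any real engine's internals) of the MVCC idea: readers see a consistent snapshot while a writer commits concurrently, whereas a lock-based model would instead block one side on the same row. All names here are illustrative.

```python
import itertools

class MVCCStore:
    """Minimal append-only versioned store illustrating snapshot reads."""

    def __init__(self):
        self._counter = itertools.count(1)
        # key -> list of (commit_txid, value), append-only
        self._versions = {}

    def begin(self):
        # A snapshot is just the highest txid handed out so far.
        return next(self._counter)

    def write(self, key, value):
        # Each committed write gets a fresh txid and a new version.
        txid = next(self._counter)
        self._versions.setdefault(key, []).append((txid, value))
        return txid

    def read(self, snapshot, key):
        # Return the newest version visible to this snapshot;
        # versions committed after the snapshot are invisible.
        visible = [v for txid, v in self._versions.get(key, [])
                   if txid <= snapshot]
        return visible[-1] if visible else None

store = MVCCStore()
store.write("row", "v1")        # committed before the reader starts
snap = store.begin()            # reader takes a snapshot
store.write("row", "v2")        # a writer commits afterwards
print(store.read(snap, "row"))  # prints "v1": the old snapshot is intact
```

The point of the sketch is that visibility is decided per version at read time; in a lock-based engine that decision is made by blocking, and no amount of configuration turns one mechanism into the other.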


Second of all, it was not a weakness -- it was a deliberate choice of optimization -- it was a choice of OLAP over OLTP (and, let's be honest, for most databases on limited memory machines with low OLTP requirements, this was the correct choice until ballooning memories made the reverse true).


The rise of the Internet, with its characteristically massive OLTP loads, pretty much settled the issue. It is true, though, that Oracle-like OLTP monsters have significantly higher resource overhead for storing the same set of records. These days it is concurrency bottlenecks that will kill you.


So, is your claim that Oracle distributes better than Microsoft? If so, why?


A very mature implementation of the concepts, with almost every conceivable mechanism and model for doing it hidden under the hood. Remember, they started introducing the relevant concepts ages ago in Oracle 7, though in practice it was mostly unusable until relatively recently. Consequently, their implementation is easily the most general in that it works moderately well across the broadest range of use cases, because they have been tweaking that aspect for years. Other commercial implementations tend to work only for a much narrower set of use cases. In short, Oracle has a long head start.


There are new transactional architectures in academia that should work better in a modern distributed environment than any of the current commercial adaptations of classical architectures to distributed environments.

And PostgreSQL will probably implement them long before Oracle or MS.


Ironically, a specific design decision that has generated a fair amount of argument over the years leaves PostgreSQL starting from the closest design point. PostgreSQL does not support threading and uses only a single process per query execution, originally for portability and data-safety reasons -- the extreme hackability would be difficult to achieve otherwise. This made certain types of trivial parallelism for OLAP difficult. On the other hand, it has had distributed lock functionality for a number of versions now.

If you look at newer models explicitly designed to make transactional databases scale better across distributed systems, you find that they are built on a design requirement of single processes per resource, strict access serialization, no local parallelism, and distributed locks. That is not far removed from where PostgreSQL is today, once you remove the massive local concurrency support and its high overhead. There are a number of outfits (see www.greenplum.com for a very advanced implementation) that have hacked PostgreSQL to scale across very large clusters for OLAP by essentially making the tweaks necessary to approximate these types of models. The next step would be to rip out a lot of expensive bits based on classical design assumptions that make distributed write loads scale poorly.
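The execution model those newer academic designs rely on can be caricatured in a few lines: each partition is owned by exactly one worker, transactions are routed to their partition's queue, and each worker runs its queue strictly serially, with no latches or local concurrency control at all. This is only an illustrative sketch (worker threads here stand in for the single processes the real designs use); all names are made up.

```python
from queue import Queue
from threading import Thread

class Partition:
    """One worker owns the data; transactions execute strictly serially."""

    def __init__(self):
        self.data = {}
        self.queue = Queue()
        Thread(target=self._run, daemon=True).start()

    def _run(self):
        # Strict serialization: one transaction at a time, in arrival order.
        while True:
            txn, done = self.queue.get()
            txn(self.data)   # no locks needed -- nobody else touches self.data
            done.put(True)

    def submit(self, txn):
        done = Queue()
        self.queue.put((txn, done))
        done.get()           # wait for the "commit"

class PartitionedStore:
    """Route each transaction to the single partition that owns its key."""

    def __init__(self, n):
        self.partitions = [Partition() for _ in range(n)]

    def partition_for(self, key):
        return self.partitions[hash(key) % len(self.partitions)]

    def submit(self, key, txn):
        # Single-partition transactions need no distributed coordination;
        # cross-partition ones would need the distributed locks mentioned above.
        self.partition_for(key).submit(txn)

store = PartitionedStore(4)
for _ in range(100):
    store.submit("counter",
                 lambda d: d.update(counter=d.get("counter", 0) + 1))
print(store.partition_for("counter").data)  # prints {'counter': 100}
```

Note how close this is to PostgreSQL's one-process-per-execution model with the local concurrency machinery deleted, which is the point being made above.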

In a sense, a design choice that has traditionally put some limits on scaling PostgreSQL for OLAP put it in exactly the right place to make implementing next-generation architectures as natural an evolution as can be expected in this case.


J. Andrew Rogers

-------------------------------------------
agi
Archives: http://www.listbox.com/member/archive/303/=now
RSS Feed: http://www.listbox.com/member/archive/rss/303/
Modify Your Subscription: 
http://www.listbox.com/member/?member_id=8660244&id_secret=101455710-f059c4
Powered by Listbox: http://www.listbox.com
