Traditional database journalling logged page before-images (to allow a database to be rolled back to a point in time), page after-images (to roll forward from a backup for disaster recovery), or both.  Ether technique eliminates a single point of failure (the disk).

Interbase originally offered both before and after image journalling to  separate journal server that handled both before and after images and supported multiple databases.

A write ahead log is a totally different animal.  A write ahead log writes page changes to a single serial file so when a transaction commits, only the serial log gets flushed to storage rather than all dirty pages in the cache.  It doesn't, however, solve the problem of a single point of failure without RAID.

During the Borland years, Interbase tried to do a write ahead log and flushed the journal;omg code and server.  When it was pointed out that this introduced a single point of failure, they abandoned the write ahead log on concentrated on shadowing.

The Falcon storage engine used a write ahead log so a transaction could be committed with a single non-buffered write.

I also put a write ahead log for replication messages for NuoDB storage managers to meet some customer's insistence that every piece of data be stored on at least two devices before a transaction could be reported as committed.  And one of these days I gotta write one for Amorphous for the same reason.

Write ahead logs are implemented in almost all commercial database systems.

I've forgotten the details of the InnoDB logs, but they implemented MVCC with a pointer in their lock manager to a prior version of a record in their log.  It does (or used to) have some crock where it stops working when the lock space is exhausted.

I haven't a clue as to how contemporary Interbase works.

Almost everyone lies about serializability.  Everyone should know the formal definition: A database is serializable if for any set of concurrent transactions there exists a transaction order such execute transaction in that order yields the same database state.

Here's a test case:  Given a database with variables a and b initialized to 1 and variable c and d initialized to zero, consider two concurrent transaction A and B.  Transaction A copies b to c and bumps a.  Transaction B copies a to d and bumps b.

A serializable database will either deadlock or have c and d with values of either 1 and 2 or 2 and 1.  An MVCC database will have both c and d with values of 1.

CockroachDB, which implements MVCC with record timestamps, claims to be both MVCC and serializable.  Some of their literature say they are "virtually serializable" which when translated from marketing to English means "not serializable."  I haven't been able to find anything that says they can handle the above test case.  It is possible that they retain the full record read set and re-read and verify every record before commit, but they don't say they do with and the cost would be prohibitive.  If anyone knows, I'd like to hear about it.

Two-phase locking without phantom control, however, isn't serializable but the concurrency cost for phantom control is too expensive for most database systems.  Many systems implement a truly serializable mode to get a marketing check mark that they expect nobody to ever use in practice (Interbase implemented a two phase locking scheme for tables, which was both serializable and unusable.

Personally, I believe what while serializable is a sufficient condition for consistency, it isn't a necessary condition.  In my book, consistency means:

1. A transaction sees a consistent view of the database plus its own
   updates.
2. A transaction sees only committed data
3. A transaction can't overwrite any data it couldn't see
4. The database enforces any additional declare consistency constraints.

Work for MVCC.

On 6/3/2022 4:05 AM, Pól Ua L. via Firebird-devel wrote:

Hi again Jim, and thanks for your replies - it's interesting reading about the 
history of MVCC inter alia - see below.

Your answers bring up a couple of questions though.


One day I was driving down Route 3 in Manchester, New Hampshire, that
rather than keeping multiple page images, I could keep multiple record
versions, hopefully on the same page, and with clever bookkeeping have
individual transactions keep track of which of server record versions it
should see. So it solved concurrency control, transaction backout,
garbage collection, and database restart without journalling.
Q.1) If MVCC doesn't require jounalling, then why does Interbase now tout the 
fact that it has a Write Ahead Log (WAL - which I assume is a synonym for 
journalling)?

 From the page  (https://en.wikipedia.org/wiki/InterBase):

RESILIENT
Live Backups
Distinguished Data Dumps
Write-Ahead Logging       <<--------**
Point-in-Time Recovery
Oracle and MySQL (InnoDB engine) use MVCC and have Redo logs - which (at least 
AFAICS) are a WAL by another name.

Refs:

https://docs.oracle.com/cd/E18283_01/server.112/e17120/onlineredo001.htm

https://dev.mysql.com/blog-archive/mysql-8-0-new-lock-free-scalable-wal-design/


========================================================

And, in another reply, there's this:


For what it's worth, David Reed's dissertation was on a
non-transactional distributed directory system.  Bernstein and Goodman's
book "proved" that MVCC was serializable, which it most definitely was not.
Q.2) How then do the various MVCC systems implement SERIALIZABLE?

It's quite a confusing topic - there's an excellent article (which I haven't 
fully digested yet) here:

https://medium.com/paypal-tech/think-twice-before-dropping-acid-and-throw-your-cap-away-dbe0d6171dc0https://medium.com/paypal-tech/think-twice-before-dropping-acid-and-throw-your-cap-away-dbe0d6171dc0

which appears to imply that none of the major systems have a true SERIALIZABLE 
transaction isolation level?


Thanks to anyone for any input.


Best and regards,


Pól...





Firebird-Devel mailing list, web interface 
athttps://lists.sourceforge.net/lists/listinfo/firebird-devel
--
Jim Starkey, AmorphousDB, LLC
Firebird-Devel mailing list, web interface at 
https://lists.sourceforge.net/lists/listinfo/firebird-devel

Reply via email to