Kathey Marsden wrote:
I have always told users they have to have their databases on a local disk to ensure data integrity, and that a system crash with an NFS-mounted database could cause fatal corruption. This morning a user took me to task on this and asked me to explain exactly why. I gave my general response about not being able to guarantee a sync to disk over the network, but I would like a more authoritative reference for why you cannot count on an NFS-mounted disk. I did find several places that say the sync option "favors data integrity", which certainly doesn't sound like a guarantee. Does anyone know a good general reference I can use on this topic to support my "you gotta use a local disk" mantra?

The problem is one of documentation and implementation of NFS.  I don't
think there is just one "NFS" out there, and there are definitely all sorts of other remote mounting options.

Some of the problems that can arise on remote file systems but are avoided
on local disk, and the reasons we have documented that we can't guarantee support, include:

1) We may not be able to prevent dual booting, and thus the db may get corrupted.
All of our algorithms for preventing dual booting rely on the JVMs that
are accessing the database being on the same machine. Once 2 machines can access the same file we have no way to prevent corruption.

2) Derby depends on synchronous write behavior when requested. Basically, at certain times Derby asks the JVM to guarantee that data for a table or recovery log file has been written and forced to disk before returning.
If this syncing is not done correctly, a number of database problems can happen,
such as:
   a) We tell the user a transaction was committed because we believe the log
      was forced, but NFS was caching the write and then crashes.  Now
      the committed xact is not there.
   b) We want to remove some recovery log, so we force data to disk, wait for it to hit disk, and then delete the log file for those disk updates.
      But the data was actually cached and lost, and now we have old data in the
      db and no log files to recover it from.
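
To make point 2 concrete, here is a minimal sketch of what "force the log before reporting the commit" looks like from Java. It is not Derby's actual code; the class and method names (ForcedLogAppend, appendAndForce) are made up for illustration. It uses the standard FileChannel.force call, and the Javadoc for that method itself says the guarantee only holds when the file resides on a local storage device, which is exactly the gap described above.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not taken from the Derby source.
final class ForcedLogAppend {

    /**
     * Appends one record to the log file and returns only after force()
     * has completed. A caller would report "committed" after this returns.
     */
    static void appendAndForce(Path logFile, byte[] record) throws IOException {
        try (FileChannel ch = FileChannel.open(logFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND)) {
            ByteBuffer buf = ByteBuffer.wrap(record);
            while (buf.hasRemaining()) {
                ch.write(buf); // may write fewer bytes than requested, so loop
            }
            // Ask the OS to push file content and metadata to the device.
            // On a remote mount this may only reach a cache somewhere
            // between the JVM and the physical disk.
            ch.force(true);
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("demo-log", ".dat");
        appendAndForce(log, "COMMIT xact 1\n".getBytes(StandardCharsets.UTF_8));
        System.out.println("bytes in log: " + Files.size(log));
        Files.delete(log);
    }
}
```

The call returns normally in both the local and the remote case, so the program cannot tell from the return alone whether the bytes are really on stable storage.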

When this was first documented I don't believe any JVM implementation on top of NFS could guarantee a completed synchronous write. It may be the case that certain remote file system implementations can now guarantee this, and that the JVM implementations make the right calls to the NFS file system to do it, but I believe it would be a support nightmare to try to support this.

A quick Google search on NFS topics seems to indicate that there may be some versions of NFS that do support write sync. I believe this because most of the hits that I got were descriptions of how to disable the syncing to get better performance, indicating that many NFS installations that might support write sync actually have it disabled. I did not see any way that a Java program could find out whether the required syncing was being enforced.
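
The closest thing I know of is a heuristic: a Java program can ask what kind of file store a directory lives on and warn if it looks remote. The sketch below uses the standard Files.getFileStore and FileStore.type calls; the class and method names (MountTypeCheck, looksLikeNfs) are made up, and the assumption that an NFS mount reports a type string starting with "nfs" is platform-specific, since the format of that string is implementation-dependent. It says nothing about whether syncing is actually enforced on that mount.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

// Hypothetical illustration only; a heuristic, not a sync guarantee check.
final class MountTypeCheck {

    /** Assumption: NFS mounts report a type such as "nfs" or "nfs4". */
    static boolean looksLikeNfs(String fileStoreType) {
        return fileStoreType != null
                && fileStoreType.toLowerCase(Locale.ROOT).startsWith("nfs");
    }

    /** Returns the implementation-specific type string for the store holding dir. */
    static String fileStoreType(Path dir) throws IOException {
        FileStore store = Files.getFileStore(dir);
        return store.type();
    }

    public static void main(String[] args) throws IOException {
        // Pass the database directory as the first argument; defaults to ".".
        Path dbDir = Paths.get(args.length > 0 ? args[0] : ".");
        String type = fileStoreType(dbDir);
        System.out.println(dbDir.toAbsolutePath() + " is on a file store of type " + type);
        if (looksLikeNfs(type)) {
            System.out.println("Warning: this looks like an NFS mount; forced writes may not be durable.");
        }
    }
}
```

A check like this can catch the obvious case of a database placed on an NFS mount, but a "local" answer still does not rule out a write cache sitting underneath.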

Note that we also cannot guarantee recovery on disks with write cache
enabled, which I believe many users have set.  Many may not even know it,
as I believe it is the default for some disk installations.



Also, I think our documentation on this topic should be a bit stronger. Currently we just say it may not work; we should probably be clearer that data corruption could occur. I will file an issue to beef up the language based on the conversation in this thread.

http://db.apache.org/derby/docs/10.5/devguide/cdevdvlp40350.html

Thanks

Kathey


