Hi,

When Lily tested my prototype for the automatic index statistics feature, she got errors in OSReadOnlyTest. I didn't, and since the failures came when trying to delete files on Windows, I immediately suspected open file handles. Once I had investigated enough to be confident that my resource management (i.e. closing transactions, conglomerates, etc.) wasn't the culprit, I ended up studying the cache management. It turned out that the container cache failed to shut down properly.

When writing out a dirty container, a NonWritableChannelException was thrown. It was indeed written to derby.log, but I failed to spot it amongst the other stack traces (db not found, db shutdown, insert failed, etc.). The exception isn't handled down in store, so it bubbles all the way up into ContextManager. For reasons I don't fully understand, it is logged there, but it isn't considered severe enough to replace the database shutdown exception.
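To illustrate the exception itself, here is a minimal stand-alone sketch. It is not Derby code: the class and method names are made up, and it assumes the channel was opened read-only (mode "r"), which is my guess at how the container file ends up non-writable. Note that NonWritableChannelException is unchecked, so nothing forces the caller to handle it.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.NonWritableChannelException;

// Hypothetical demo class, not part of Derby.
final class ReadOnlyChannelDemo {

    /** Returns true if a write through a channel opened in "r" mode is rejected. */
    static boolean writeIsRejected(File file) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(file, "r");
             FileChannel channel = raf.getChannel()) {
            // Assumption: this mimics writing a dirty container to a read-only channel.
            channel.write(ByteBuffer.wrap(new byte[] {1, 2, 3}), 0L);
            return false;
        } catch (NonWritableChannelException e) {
            // Unchecked (an IllegalStateException), so it can escape unnoticed.
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("container", ".dat");
        f.deleteOnExit();
        System.out.println("write rejected: " + writeIsRejected(f));
    }
}
```

The try-with-resources block closes the file handle even when the write fails, which is exactly what does not happen when the cache cleaning is aborted halfway.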

So, when this exception is thrown, the cleaning of the container cache (and the shutdown) is aborted, resulting in open file handles. But why does this happen in a database that is supposed to be read-only? What made the container, but not any pages, dirty? The answer is FileContainer.setEstimatedRowCount, which my prototype indeed calls. It can be called as a result of other actions as well, for instance table scans. However, the method has code to avoid setting the dirty flag when the database is read-only. Further investigation revealed that while the files in the database directory are made read-only, the directories aren't. This fools Derby into believing that the media is read-write, since it is able to write the database lock file.
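The effect is easy to reproduce outside Derby. The sketch below uses made-up class and method names and only approximates the check: it marks the files in a directory read-only, leaves the directory itself alone, and then probes for writability by creating a file named db.lck. I am assuming here that a successful lock-file creation is what makes the media look writable.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical demo class, not part of Derby or its tests.
final class WritableDirectoryDemo {

    /** Marks every regular file directly in dir read-only; the directory is untouched. */
    static void makeFilesReadOnly(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return; // not a directory, or it could not be listed
        }
        for (File child : children) {
            if (child.isFile()) {
                child.setReadOnly();
            }
        }
    }

    /** Probes writability by trying to create (and then remove) a lock file. */
    static boolean canCreateLockFile(File dir) throws IOException {
        File lock = new File(dir, "db.lck");
        boolean created = lock.createNewFile();
        if (created) {
            lock.delete();
        }
        return created;
    }
}
```

With only the files made read-only, canCreateLockFile still returns true, so a check of this kind would conclude that the database is writable.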

Since the comment in DERBY-3837 says the purpose of the test is to test operation on read-only media, do you agree that we should make the database directory truly read-only?

Doing that will require some modifications to the test, but nothing major. For instance, the exception thrown when trying to insert data will change. And while I'm at it, I might want to move some of the file-system helper methods into PrivilegedFileOpsForTests. The motivating factor is that when the removal of a directory fails (recursive delete), it may be useful to see which files couldn't be deleted. The question is whether it is okay to add a public static persistent recursive delete method, running in a privileged block, to the test code.
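To make the question concrete, here is a rough sketch of what I have in mind. The class and method names are placeholders rather than the existing PrivilegedFileOpsForTests API, and the retry logic that would make the delete "persistent" is left out to keep it short.

```java
import java.io.File;
import java.security.AccessController;
import java.security.PrivilegedAction;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are placeholders, not existing test utilities.
final class RecursiveDeleteSketch {

    /**
     * Deletes root and everything below it inside a privileged block.
     * Returns the files and directories that could not be deleted.
     */
    @SuppressWarnings({"removal", "deprecation"}) // AccessController is deprecated in newer JDKs
    static List<File> deleteRecursively(final File root) {
        return AccessController.doPrivileged(new PrivilegedAction<List<File>>() {
            public List<File> run() {
                List<File> failed = new ArrayList<File>();
                delete(root, failed);
                return failed;
            }
        });
    }

    private static void delete(File f, List<File> failed) {
        File[] children = f.listFiles(); // null for plain files
        if (children != null) {
            for (File child : children) {
                delete(child, failed);
            }
        }
        // Children go first; a directory holding an undeletable file fails too.
        if (f.exists() && !f.delete()) {
            failed.add(f);
        }
    }
}
```

A test could then fail with the returned list in the message, which would have pointed straight at the files with open handles in this case.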


--
Kristian
