We've been trying to figure out a database corruption issue for a week, and think we've got it figured out. It involves the "zombie database handle" state induced by sqlite3_close_v2:
> If sqlite3_close_v2() is called on a database connection that still has > outstanding prepared statements, BLOB handles, and/orsqlite3_backup objects > then it returns SQLITE_OK and the deallocation of resources is deferred until > all prepared statements, BLOB handles, and sqlite3_backup objects are also > destroyed. In general this is useful behavior for us, as using Java bindings to our API can result in statement handles hanging around longer than they ought to until the JVM runs finalizers. With sqlite3_close_v2, closing the database doesn't fail in these circumstances. (Which is, I believe, why this function was added to SQLite in the first place.) The situation where things go wrong is: 1. Database handle A is opened 2. A compiled statement is created and run. There's a Java object abstracting this sqlite3_stmt. 3. The database is closed with sqlite3_close_v2. The statement is still alive, so the database handle secretly remains open in "zombie mode". 4. We delete the database's parent directory and all files in it. 5. We create a new directory (with the same path as the old one) and call sqlite3_open_v2 to create a new empty database with handle B. 6. The JVM gets around to running finalizers, causing the sqlite3_stmt to be freed, which makes the zombie handle closable. 7. The internal function sqlite3LeaveMutexAndCloseZombie() closes the file handle … and also deletes the -wal and -shm files. Now we're in trouble: the -wal and -shm files that just got deleted happen to belong to the *newly created* database, not the old one, so SQLite just nuked a live WAL. At this point the next SQLite call tends to return error 522, "disk I/O error", and the new db is unreadable. This scenario isn't covered in "How To Corrupt An SQLite Database File", although what sqlite3LeaveMutexAndCloseZombie() does is similar to sections 1.3 or 2.4. This sequence of events may look weird at this low level, but at a higher level they're not uncommon, especially in our own unit tests. The test suite is using our API to create a database (which we bundle in a directory), run a test on it, close it, then delete it. After that, the _next_ unit test does exactly the same thing, just with a different set of test functions. So the same database path ends up getting opened, closed, deleted, reopened over and over. (This pattern happens in real use too, not just in testing.) At the moment I'm not sure what to do about this. The two options seem to be (a) Stop using sqlite3_close_v2. This means we have to implement our own mechanism to forcibly close all open statements when requested to close a database, and clear the statement handles in the objects holding them. (b) Avoid ever reusing the same path. This means we'd have to give the database file inside the wrapper directory a different name every time — instead of just "db.sqlite3" it'd include a UUID or timestamp or something. That of course complicates opening a database, necessitating a directory traversal to discover the file in the directory. Anyone else run into this? Anyone agree it should be added to "How To Corrupt"? I don't know if it can be fixed inside SQLite... —Jens _______________________________________________ sqlite-users mailing list [email protected] http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

