On Wed, Oct 10, 2012 at 7:52 AM, Clemens Ladisch <clem...@ladisch.de> wrote:
> (CC'd sqlite-users ML) > Theodore Ts'o wrote: > > > > What bothers me is no one is asking the question, *why* is Android > > (and more specifically SQLite and the applications which call SQLite) > > using fsync's so often? These aren't transaction processing systems, > > after all. > SQLite (and every other transactional storage system) needs a write-barrier operation in order to prevent database corruption on an power-loss or hard reset. By "write-barrier" I mean some method of ensuring that all write operations (on a particular pair of files, collectively) that occur before the write-barrier must persist to flash prior to any write operations that occur after the write-barrier. In other words, no write operations are allowed to be re-ordered across the write-barrier. Without a write-barrier of some kind, it is not possible to ensure the integrity of a transaction across a power-loss. The only write-barrier operation available to us on unix is fsync(). In the default rollback-journal modes of SQLite, a write-barrier is required for every SQL-level transaction. This means that with SQLite in rollback mode, lots of fsync() operations are occurring. SQLite also support a write-ahead log (WAL) mode also exists that works fine on android, and WAL mode only requires a write-barrier on a checkpoint. Checkpoints (normally) occur far less often than transactions, and hence far fewer fsyncs() are required. However WAL mode requires a shared-memory segment accessible to all processes. The shared memory is used to coordinate access to the WAL file. On unix, this shared-memory is obtained using mmap() of a temporary file that is created in the same directory as the original database. Unprivileged processes running in sandboxes without write access to the directory containing the database file cannot create this temporary file used to implement shared memory, and thus cannot use WAL mode. And due to security concerns, a lot of processes on phones tend to run in unprivileged sandboxes, meaning that they have difficulty with WAL mode. There are ways to work around this limitation of sandboxes, and iOS does make use of those work-arounds. But Android never has tried to do so. The shared-memory temporary files do not have to be in the same directory as the database. You can recompile SQLite with the SQLITE_SHM_DIRECTORY=/dev/shm compile-time option to cause all shared-memory temporary files to be put in some common place (like /dev/shm) which is accessible to all processes. This works fine for many processes, but fails utterly for processes that try to open SQLite databases following a chroot(). So it is not the default mode of operation. Are there any chroot() processes on Android that use SQLite? If not, then the SQLITE_SHM_DIRECTORY compile-time option might be a good idea there. We would really, really love to have some kind of write-barrier that is lighter than fsync(). If there is some method other than fsync() for forcing a write-barrier on Linux that we don't know about, please enlighten us. We would also love to have guidance on alternative techniques for obtaining memory shared across multiple processes that does not involve mmap() of temporary files. -- D. Richard Hipp d...@sqlite.org _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users