Tom Lane wrote:
> Kevin Brown <[EMAIL PROTECTED]> writes:
> > This is why I sometimes wonder whether or not it would be a win to use
> > mmap() to access the data and index files --
>
> mmap() is Right Out because it does not afford us sufficient control
> over when changes to the in-memory data will propagate to disk.  The
> address-space-management problems you describe are also a nasty
> headache, but that one is the showstopper.

Huh?  Surely fsync() or fdatasync() of the file descriptor associated with
the mmap()ed region at the appropriate times would accomplish much of this?

I'm particularly confused since PG's entire approach to disk I/O is
predicated on the notion that the OS, and not PG, is the best arbiter of
when data hits the disk.  Otherwise it would be using raw partitions for
the highest-speed data store, yes?

Also, there isn't any particular requirement to use mmap() for everything --
you can use traditional open/write/close calls for the WAL and mmap() for
the data/index files (but it wouldn't surprise me if this would require
some extensive code changes).

That said, if it's typical for many changes to be made to a page internally
before PG needs to commit that page to disk, then your argument makes sense,
and that's especially true if we simply cannot have the page written to disk
in a partially-modified state (something I can easily see being an issue for
the WAL -- would the same hold true of the index/data files?).

-- 
Kevin Brown                                           [EMAIL PROTECTED]
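
To make the fsync()-on-an-mmap()ed-region idea concrete, here is a minimal sketch in C. It is not PostgreSQL code: the helper name update_page_via_mmap and its parameters are hypothetical, and it uses msync(MS_SYNC) followed by fsync() as one plausible way to force a mapped page out. It illustrates the asymmetry under discussion: the caller can choose the latest moment at which the change is on disk, but once the store into the mapping has happened the kernel is free to write the page back earlier than that.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): overwrite the first `len` bytes
 * of page number `page_no` of the file at `path` through a shared mapping,
 * then force the change to disk.  The file must already be large enough to
 * contain that page.  Returns 0 on success, -1 on failure.
 */
int update_page_via_mmap(const char *path, off_t page_no,
                         const void *data, size_t len)
{
    assert(path != NULL && data != NULL);

    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || len > (size_t) page_size)
        return -1;

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    /* The file offset passed to mmap() must be a multiple of the page size. */
    void *page = mmap(NULL, (size_t) page_size, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, page_no * (off_t) page_size);
    if (page == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /*
     * From this store onward the page is dirty, and the kernel may write it
     * back at any time, including partway through a multi-step modification.
     * Nothing in this sketch prevents that.
     */
    memcpy(page, data, len);

    /* Block until the dirty page in this range has been written out. */
    int rc = msync(page, (size_t) page_size, MS_SYNC);
    if (rc == 0)
        rc = fsync(fd);         /* also flush the file's metadata */

    munmap(page, (size_t) page_size);
    close(fd);
    return rc == 0 ? 0 : -1;
}
```

Calling update_page_via_mmap("somefile", 0, "hello", 5) on an existing file of at least one page leaves "hello" at the start of the file once the call returns. With write() on a private buffer, by contrast, the modified page is not visible to the kernel at all until the program chooses to hand it over.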