On 2014-01-13 12:34:35 -0800, James Bottomley wrote:
> On Mon, 2014-01-13 at 14:32 -0600, Jim Nasby wrote:
> > Well, if we were to collaborate with the kernel community on this then
> > presumably we can do better than that for eviction... even to the
> > extent of "here's some data from this range in this file. It's (clean|
> > dirty). Put it in your cache. Just trust me on this."
>
> This should be the madvise() interface (with MADV_WILLNEED and
> MADV_DONTNEED) is there something in that interface that is
> insufficient?
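To make the question concrete, here is a minimal sketch (not taken from the thread) of what applying madvise() to a single 8kB block of a file involves. The helper name advise_block and the BLOCK_SIZE constant are hypothetical, chosen only for illustration; this is not how postgres does I/O. Note that the block has to be mapped before it can be advised on and unmapped afterwards, so each call costs three system calls.

```c
#define _DEFAULT_SOURCE   /* expose madvise() under strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>

/* Hypothetical block size: 8kB, matching the chunk size discussed above. */
#define BLOCK_SIZE 8192

/*
 * Map one block of the file behind fd, pass `advice` (for example
 * MADV_WILLNEED or MADV_DONTNEED) to madvise(), then unmap it again.
 * `offset` must be a multiple of the system page size.
 * Returns 0 on success, -1 if any of the three calls fails.
 */
static int advise_block(int fd, off_t offset, int advice)
{
    /* madvise() works on address ranges, so the block must be mapped first. */
    void *addr = mmap(NULL, BLOCK_SIZE, PROT_READ, MAP_SHARED, fd, offset);
    if (addr == MAP_FAILED)
        return -1;

    int rc = madvise(addr, BLOCK_SIZE, advice);

    /* Tear the mapping down again; report failure from either call. */
    if (munmap(addr, BLOCK_SIZE) != 0)
        rc = -1;

    return rc == 0 ? 0 : -1;
}
```

Calling something like advise_block(fd, 0, MADV_WILLNEED) once per block is the mmap()/madvise()/munmap() cycle that the reply below refers to.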
For one, postgres doesn't use mmap for files (and can't without major
new interfaces). Frequently mmap()/madvise()/munmap()ing 8kB chunks has
horrible consequences for performance/scalability: very quickly you
contend on locks in the kernel.

Also, that will mark that page dirty, which isn't what we want in this
case. One major use case is transplanting a page coming from postgres'
buffers into the kernel's buffer cache, because the latter has a much
better chance of properly allocating system resources across independent
applications running.

Oh, and the kernel's page-cache management, while far from perfect,
actually scales much better than postgres'.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services