Tom Lane wrote:
> Neil Conway <[EMAIL PROTECTED]> writes:
> > On Sun, 2003-10-05 at 19:50, Neil Conway wrote:
> > I was hoping you'd reply to this, Tom -- you were referring to O_DIRECT,
> > right?
> Not necessarily --- as you point out, it's not clear that O_DIRECT would
> help us. What would be way cool is something similar to what James
> Rogers was talking about: a way to tell the kernel not to promote this
> page all the way to the top of its LRU list. I'm not sure that *any*
> Unixen have such an API, let alone one that's common across more than
> one platform :-(
Solaris has "free-behind", which prevents a large kernel sequential scan
from blowing out the cache.
I only read about it in the Mauro Solaris Internals book, and it seems
to be done automatically. I guess most OS's don't do this optimization
because they usually don't read files larger than their cache.
I see BSD/OS madvise() has:
#define MADV_NORMAL 0 /* no further special treatment */
#define MADV_RANDOM 1 /* expect random page references */
#define MADV_SEQUENTIAL 2 /* expect sequential references */
#define MADV_WILLNEED 3 /* will need these pages */
--> #define MADV_DONTNEED 4 /* don't need these pages */
#define MADV_SPACEAVAIL 5 /* insure that resources are reserved */
The marked one seems to have the control we need. Of course, the kernel
madvise() code has:
/* Not yet implemented */
Looks like NetBSD implements it, but it also unmaps the page from the
address space, which might be more than we want. NetBSD alao has:
#define MADV_FREE 6 /* pages are empty, free them */
which frees the page. I am unclear on its us.
FreeBSD has this comment:
* Cache, deactivate, or do nothing as appropriate. This routine
* is typically used by madvise() MADV_DONTNEED.
* Generally speaking we want to move the page into the cache so
* it gets reused quickly. However, this can result in a silly syndrome
* due to the page recycling too quickly. Small objects will not be
* fully cached. On the otherhand, if we move the page to the inactive
* queue we wind up with a problem whereby very large objects
* unnecessarily blow away our inactive and cache queues.
* The solution is to move the pages based on a fixed weighting. We
* either leave them alone, deactivate them, or move them to the cache,
* where moving them to the cache has the highest weighting.
* By forcing some pages into other queues we eventually force the
* system to balance the queues, potentially recovering other unrelated
* space from active. The idea is to not force this to happen too
The Linux comment is:
* Application no longer needs these pages. If the pages are dirty,
* it's OK to just throw them away. The app will be more careful about
* data it wants to keep. Be sure to free swap resources too. The
* zap_page_range call sets things up for refill_inactive to actually free
* these pages later if no one else has touched them in the meantime,
* although we could add these pages to a global reuse list for
* refill_inactive to pick up before reclaiming other pages.
* NB: This interface discards data rather than pushes it out to swap,
* as some implementations do. This has performance implications for
* applications like large transactional databases which want to discard
* pages in anonymous maps after committing to backing store the data
* that was kept in them. There is no reason to write this data out to
* the swap area if the application is discarding it.
* An interface that causes the system to free clean pages and flush
* dirty pages is already available as msync(MS_INVALIDATE).
It seems mmap is more for controlling the memory mapping of files rather
than controlling the cache itself.
Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings