> > > > The reason I mention it is that Postgres already supports
> > > > O_DIRECT I think on some other platforms (for whatever
> > > > reason).
> > >
> > > [ sounds of grepping... ] No. The only occurrence of O_DIRECT in the
> > > source tree is in TODO:
> > >
> > > * Consider use of open/fcntl(O_DIRECT) to minimize OS caching
> > >
> > > I personally disagree with this TODO item for the same reason
> > > Sean cited: Postgres is designed and tuned to rely on OS-level
> > > disk caching, and bypassing that cache is far more likely to
> > > hurt our performance than help it.
> > DB2 and Oracle, from memory, allow users to pass hints to the
> > planner to use/not use file system caching. This could be useful
> > if you had an application retrieving a large amount of data on an
> > adhoc basis. The large retrieval would empty out the disk cache
> > there by negatively impacting upon other applications operating on
> > data which should be cached.
> I've recently been bitten by this. On DB2, I could change what
> bufferpool the large tables were using and set it fairly small, but
> obviously not an option with PGSQL. But, if pgsql could stop caching
> from occuring on user-specified queries, large table or index scans,
> etc., it would be very helpful.
Actually, now that I think about this, if the planner is going to read
more than X number of bytes as specified in a GUC, it would be useful
to have the fd marked as O_DIRECT to avoid polluting the disk
cache... I have a few tables with about 300M rows (~9GB on disk) that
I have to perform nightly seq scans over for reports and it does wipe
out some of the other fast movers that come through and depend on the
disk cache to be there for their speed. Because they're performed in
the middle of the night, I don't care that much, but my avg query
times during that period of time are slower... whether it's load or
the disk buffer being emptied and having to be refilled, I'm not sure,
but thinking about it, use of a GUC threshold to have an FD marked as
O_DIRECT does make sense (0 == disabled and the default, but tunable
in Kbytes as an admin sees fit) and could be nice for big queries that
have lots of smaller queries running around at the same time.
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?