"Decibel!" <[EMAIL PROTECTED]> writes:

> When a read() call returns, surely the kernel knows whether it  actually 
> issued
> a physical read request to satisfy that. I don't see  any reason why you
> couldn't have a version of read() that returns  that information. I also 
> rather
> doubt that we're the only userland  software that would make use of that.

I'm told this exists on Windows for the async interface. But AFAIK it doesn't
on Unix. The visibility into things like this is what makes DTrace so
remarkable.

I think there aren't many userland software interested in this. The only two
cases I can think of are databases -- which use direct I/O partly because of
this issue -- and real-time software like multimedia software -- which use
aren't so much interested in measuring it as forcing things to be preloaded
with stuff like posix_fadvise() or mlock().

I don't think DTrace is overkill either. The programmatic interface is
undocumented (but I've gotten Sun people to admit it exists -- I just have to
reverse engineer it from the existing code samples) but should be more or less
exactly what we need.

But the lowest-common-denominator of just timing read() and seeing if it took
long enough to involve either a context switch or sleeping on physical i/o
should be a pretty close approximation. The case where it would be least
accurate is when most or all of the data is actually in the cache. Then even
with a low false-positive rate detecting cache misses it'll still dominate the
true near-zero rate of cache misses.

We could mitigate that somewhat by describing it in the plan as something like

................... (... I/O fast=nnn slow=nnn)

instead of the more descriptive "physical" and "logical"

-- 
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

---------------------------(end of broadcast)---------------------------
TIP 2: Don't 'kill -9' the postmaster

Reply via email to