(For those not knowing - it's ReadFile/WriteFile where you pass an array
of "this many bytes to this address" as parameters)
Isn't that like the BSD writev()/readv() that Linux supports also? Is
that something we should be using on Unix if it is supported by the OS?
Nope, readv()/writev() read/w
Merlin Moncure wrote:
readv and writev are in the single unix spec...and yes ...
On some systems they might just be implemented as a loop inside the
library, or even as a macro.
You sure?
Requirements like this:
http://www.opengroup.org/onlinepubs/007908799/xsh/write.html
"Write requests of {PIPE_BUF} bytes or less will not be interleaved with
data from other processes doing writes on the same pipe."
"Merlin Moncure" <[EMAIL PROTECTED]> writes:
> Is there a reason why readv/writev have not been considered in the past?
Lack of portability, and lack of obvious usefulness that would justify
dealing with the lack of portability.
I don't think there's any value in trying to write ordinary buffers
> Magnus Hagander wrote:
> > I don't think that's correct either. Scatter/Gather I/O is used so SQL
> > Server can issue reads for several blocks from disks into its own
> > buffer cache with a single syscall even if these buffers are not
> > sequential. It did make significant performance improve
Magnus Hagander wrote:
> I don't think that's correct either. Scatter/Gather I/O is used so SQL
> Server can issue reads for several blocks from disks into its own
> buffer cache with a single syscall even if these buffers are not
> sequential. It did make significant performance improvements when
Josh Berkus wrote:
Now you can see why other DBMSs don't use the OS disk cache. ...
...as long as we use the OS disk cache, we can't
eliminate checkpoint spikes, at least on Linux.
Wouldn't the VM settings like the ones under /proc/sys/vm
and/or the commit=XXX mount option if using ext3 be a go
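For reference, the knobs in question look roughly like this (names are from 2.6-era Linux; the values and mount point are only examples, not recommendations):

```
# /etc/sysctl.conf -- illustrative values only
vm.dirty_background_ratio = 1   # start background writeback early
vm.dirty_ratio = 10             # throttle writers past this much dirty memory

# /etc/fstab -- ext3 with a 1-second journal commit interval
/dev/sda1  /var/lib/pgsql  ext3  defaults,commit=1  0 2
```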
PFC <[EMAIL PROTECTED]> writes:
> You can also read 'internal not yet developed postgres cache manager'
> instead of OS if you don't feel like talking kernel developers into
> implementing this thing.
It exists already, it's called aio.
But there are a *lot* of details you skipped over.
In the 'wishful hand waving' department:
read index -> determine (tuple id,page) to hit in table -> for each of
these, tell the OS 'I'm gonna need these' via a NON BLOCKING call. Non
blocking because you feed the information to the OS as you read the index,
streaming it.
Meanwhile, th
In the last exciting episode, [EMAIL PROTECTED] ("Merlin Moncure") wrote:
>> It seems inevitable that Postgres will eventually eliminate that
>> redundant layer of buffering. Since mmap is not workable, that
>> means using O_DIRECT to read table and index data.
>
> What about going the other way an
>Josh Berkus wrote:
>> Now you can see why other DBMSs don't use the OS disk cache. There's
>> other issues as well; for example, as long as we use the OS disk cache,
>> we can't eliminate checkpoint spikes, at least on Linux. No matter what
>> we do with the bgwriter, fsyncing the OS disk c
Josh Berkus writes:
> Why is mmap not workable? It would require far-reaching changes to our
> code -- certainly -- but I don't think it can be eliminated from
> consideration.
Fundamentally because there is no facility for being notified by the OS before
a page is written to disk. And the
Josh Berkus wrote:
> Now you can see why other DBMSs don't use the OS disk cache. There's
> other issues as well; for example, as long as we use the OS disk cache,
> we can't eliminate checkpoint spikes, at least on Linux. No matter what
> we do with the bgwriter, fsyncing the OS disk cache cau
Josh Berkus writes:
> Why is mmap not workable?
We can't control write order. There are other equally bad problems,
but that one alone eliminates it from consideration. See past discussions.
regards, tom lane
Tom, Greg, Merlin,
> But for example,
> if our buffer management algorithm recognizes an index page as being
> heavily hit and therefore keeps it in cache for a long time, then when
> it does fall out of cache you can be sure it's going to need to be read
> from disk when it's next used, because
> It seems inevitable that Postgres will eventually eliminate that
> redundant layer of buffering. Since mmap is not workable, that means
> using O_DIRECT to read table and index data.
What about going the other way and simply letting the o/s do all the
caching? How bad (or good) would the perf
Tom Lane <[EMAIL PROTECTED]> writes:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > 8.0, on the other hand, has a new algorithm that specifically tries to
> > protect against the shared buffers being blown out by a sequential
> > scan. But that will only help if it's the shared buffers being
> > thr
Greg Stark <[EMAIL PROTECTED]> writes:
> 8.0, on the other hand, has a new algorithm that specifically tries to
> protect against the shared buffers being blown out by a sequential
> scan. But that will only help if it's the shared buffers being
> thrashed that's hurting you, not the entire OS file
Mark Aufflick <[EMAIL PROTECTED]> writes:
> Obviously Q2 is faster than Q1,
That's not really obvious at all. If there are lots of records being returned
the index might not be faster than a sequential scan.
> My assumption is that the sequential scan is blowing the index from any cache
> it mi
Hi Rod,
> Any solution fixing buffers should probably not take into consideration
> the method being performed (do you really want to skip caching a
> sequential scan of a 2 tuple table because it didn't use an index) but
> the volume of data involved as compared to the size of the
The world rejoiced as [EMAIL PROTECTED] (Mark Aufflick) wrote:
> Hi All,
>
> I have boiled my situation down to the following simple case:
> (postgres version 7.3)
>
> * Query 1 is doing a sequential scan over a table (courtesy of field
> ILIKE 'foo%') and index joins to a few others
> * Query 2 is
> My concern is that this kind of testing has very little relevance to the
> real world of multiuser processing where contention for the cache becomes an
> issue. It may be that, at least in the current situation, postgres is
> giving too much weight to seq scans based on single user, straight
Hi,
I think there was some discussion about seq scans messing up the cache, and
talk about doing something about it but I don't think it has been addressed
yet. Maybe worth a troll through the archives.
It is certainly true that in many situations, a seq scan is preferable to
using an index. I