Re: [HACKERS] posix_fadvise v22
Tom Lane <t...@sss.pgh.pa.us> writes:
> What I intend to do over the next day or so is commit the prefetch
> infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
> indexscan part back for rework.  I think that it should be implemented
> in or near index_getnext() and pay attention to effective_io_concurrency.

The biggest question I have here is about doing it at the index_*
abstraction level.  I've looked pretty hard at how to do this and run into
some pretty major problems.  I don't see any way to do it without changing
the access method API.  The index_* methods cannot start fiddling with the
current scan position without messing up things like CURRENT OF and
mark/restore irrecoverably.  But if we're going to change the index AM API
then we lose all of the advantages of putting the logic in indexam.c in
the first place: it won't help any other index AM without special code in
each one.

The best plan I came up with at this level is to add an AM method

    amgetprefetchbitmap(IndexScanDesc scan, ScanDirection direction,
                        TIDBitmap *bitmap, int nblocks)

which returns up to nblocks worth of bitmap starting from the current scan
position in the specified direction, based on whatever's convenient for
the internal representation.  I think it would be best if it stopped at
the end of the page, at least if the next index page isn't in shared
buffers.

Then nodeIndexscan.c would keep track of the value of target_prefetch,
just like nodeBitmapHeapScan does, incrementing it whenever the caller
continues the scan and resetting it to zero if the direction changes.

However, the amgetprefetchbitmap() call would only remove duplicates from
the upcoming blocks; it wouldn't know which blocks have already been
prefetched by previous calls.  So nodeIndexscan would also have to remove
duplicates itself.  Splitting the work between three layers of abstraction
like this is pretty messy, creates a lot of duplicated work, and doesn't
seem to buy us anything.
It *still* wouldn't help any non-btree index types until they add the new
method -- and if they add the new method they might as well just add the
USE_PREFETCH code anyway.  I don't see how the new method is useful for
anything else.

I have another plan which would be a lot simpler, but it's a much more
brute-force approach.  We add a new scan pointer, in addition to the
current position and the mark, and a slew of new methods like
getnext_prefetch() and reset_prefetch() to reset it to the current
position.  Also, add a hash table internal to PrefetchBuffer and have it
return a boolean indicating whether it actually did a prefetch.  Then
index_prefetch would reset the prefetch pointer and scan forward using it,
calling PrefetchBuffer on *every* pointer and counting how many true
returns it sees from PrefetchBuffer.

[Hm, not quite: we want to count recent prefetches that are probably still
queued separately from old ones that are probably in cache.  And we also
have to think about how long to treat prefetches as probably still being
in cache.  But with some additional thought I think this could be made to
work.]

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] posix_fadvise v22
Gregory Stark <st...@enterprisedb.com> writes:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> 2. I fixed it so that setting effective_io_concurrency to zero disables
>> prefetching altogether; there was no way to do that in the patch as
>> submitted.
>
> Hm.  The original intent was that effective_io_concurrency = 1 meant no
> prefetching, because there was only one drive.

Well, no prefetch is an entirely different behavior from prefetch one
block ahead.  Given the way you've defined the GUC, a setting of one has
to mean the latter.  My complaint was basically that with the patch
applied, the code was physically incapable of providing the former --
which you'd surely want, if only for testing/comparison purposes.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
Robert Haas <robertmh...@gmail.com> writes:
> OK, here's an update of Greg's patch with the runtime configure test
> ripped out, some minor documentation tweaks, and a few unnecessary
> whitespace diff hunks quashed.  I think this is about ready for
> committer review.

I've started to look through this, and the only part I seriously don't
like is the nbtsearch.c changes.  I've got three complaints about that:

* Doing it inside the index AMs is wrong, or at the very least forces us
to do it over for each index AM (which the patch fails to do).

* As coded, it generates prefetch bursts that are much too large and too
widely spaced to be effective, not to mention that they entirely ignore
the effective_io_concurrency control knob as well as the order in which
the pages will actually be needed.  I wonder now whether Robert's
inability to see any benefit came because he was testing indexscans and
not bitmap scans.

* It's only accidental that it's not kicking in during a bitmap indexscan
and bollixing up the much-more-carefully-written nodeBitmapHeapscan
prefetch logic.

What I intend to do over the next day or so is commit the prefetch
infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
indexscan part back for rework.  I think that it should be implemented in
or near index_getnext() and pay attention to effective_io_concurrency.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
> * As coded, it generates prefetch bursts that are much too large and too
> widely spaced to be effective, not to mention that they entirely ignore
> the effective_io_concurrency control knob as well as the order in which
> the pages will actually be needed.  I wonder now whether Robert's
> inability to see any benefit came because he was testing indexscans and
> not bitmap scans.

Nope, bitmap index scans.  For some reason I was having trouble generating
a plan that included an index scan, so I took the easy way out and tested
what was in front of me.  :-)

> What I intend to do over the next day or so is commit the prefetch
> infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
> indexscan part back for rework.  I think that it should be implemented in
> or near index_getnext() and pay attention to effective_io_concurrency.

FWIW, following your first set of commits, I extracted, cleaned up, and
re-posted the uncommitted portion of the patch last night.  Based on this
it doesn't sound like there's much point in continuing to do that, but I'm
happy to do so if anyone thinks otherwise.

...Robert
Re: [HACKERS] posix_fadvise v22
Tom Lane <t...@sss.pgh.pa.us> writes:
> Robert Haas <robertmh...@gmail.com> writes:
>> OK, here's an update of Greg's patch with the runtime configure test
>> ripped out, some minor documentation tweaks, and a few unnecessary
>> whitespace diff hunks quashed.  I think this is about ready for
>> committer review.
>
> I've started to look through this, and the only part I seriously don't
> like is the nbtsearch.c changes.  I've got three complaints about that:
>
> * Doing it inside the index AMs is wrong, or at the very least forces us
> to do it over for each index AM (which the patch fails to do).

OK.

> * As coded, it generates prefetch bursts that are much too large and too
> widely spaced to be effective, not to mention that they entirely ignore
> the effective_io_concurrency control knob as well as the order in which
> the pages will actually be needed.  I wonder now whether Robert's
> inability to see any benefit came because he was testing indexscans and
> not bitmap scans.

Well, experiments showed that it was very effective, even more so than for
bitmap scans -- so much so that it nearly obliterated bitmap scans'
advantage over index scans.  I originally thought like you that all that
logic was integral to the thing, but eventually came around to think the
opposite.  That logic is there to overcome a fundamental problem with
bitmap scans: there's no logical group to prefetch, just a potentially
unbounded stream of pages.  Index scans just don't have that problem, so
they don't need that extra logic.

> * It's only accidental that it's not kicking in during a bitmap indexscan
> and bollixing up the much-more-carefully-written nodeBitmapHeapscan
> prefetch logic.

OK.

> What I intend to do over the next day or so is commit the prefetch
> infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
> indexscan part back for rework.  I think that it should be implemented in
> or near index_getnext() and pay attention to effective_io_concurrency.

To do that I would have a few questions.
a) It seems necessary to keep a dynahash of previously prefetched pointers
around to avoid repeatedly prefetching the same pages.  That could get
quite large, though.  Or do you think it would be fine to visit the buffer
cache, essentially using that as the hash, for every index pointer?

b) How would index_getnext keep two read pointers, like bitmap heap scans
do?  I think this would require adding a new index AM API, similar to your
tuplestore API, where the caller can maintain multiple read pointers in
the scan.  And I'm not sure how that would work with mark/restore.

c) I would be afraid that pushing the index scan to reach for the next
index leaf page prematurely (and not just an async prefetch, but a
blocking read) would cause extra random I/O, which would slow down the
critical path of reading the current index tuples.  So I think we would
still want to pause when we hit the end of the current leaf page.  That
would require some form of feedback in the index AM API as well.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [HACKERS] posix_fadvise v22
Robert Haas <robertmh...@gmail.com> writes:
> FWIW, following your first set of commits, I extracted, cleaned up, and
> re-posted the uncommitted portion of the patch last night.  Based on this
> it doesn't sound like there's much point in continuing to do that, but
> I'm happy to do so if anyone thinks otherwise.

Probably not; I'm busy hacking on the code anyway, fixing minor issues
like its failure to do anything sane with temp tables.

A question about that part of the code: I see that PrefetchBuffer simply
does nothing if it finds the target page already in a buffer.  I wonder
whether that's sufficient, or whether we ought to do something to try to
ensure that the buffer doesn't get evicted in the near future (ie, before
we actually get to use it).  Simply advancing the usage_count seems too
strong, because when you add in the upcoming ReadBuffer, we'd be advancing
usage_count twice for a single use of the page whenever it happened to get
prefetched.  We could alleviate that by only advancing usage_count if it's
0, but that seems like a band-aid, not a real fix.  I also thought about
setting a buffer flag bit showing that prefetch has occurred, but then you
have to worry about when/how to clear it if the actual read never occurs.
Thoughts?

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
Robert Haas <robertmh...@gmail.com> writes:
> OK, here's an update of Greg's patch with the runtime configure test
> ripped out, some minor documentation tweaks, and a few unnecessary
> whitespace diff hunks quashed.  I think this is about ready for
> committer review.

I've applied most of this, with a couple of notable revisions:

1. The runtime check on whether posix_fadvise works is really gone.  We'll
see whether we need to put anything back, but I think our first try should
be under the assumption that it works.  (BTW, I was wrong in my earlier
claim that the buildfarm wouldn't exercise posix_fadvise by default -- the
patch defaults to io_concurrency = 1 if fadvise is available, so it will
try to prefetch one page ahead.)

2. I fixed it so that setting effective_io_concurrency to zero disables
prefetching altogether; there was no way to do that in the patch as
submitted.

3. I did not apply the changes in nbtsearch.c, since as I mentioned
earlier I wasn't convinced it's a reasonable approach.  We can argue about
that some more, but even if 8.4 ships with concurrency only for bitmap
scans, it's still useful.

The two main loose ends that could stand further work are the
plain-index-scan case and the question of whether PrefetchBuffer should
try to protect an already-existing buffer from being evicted.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
Tom Lane <t...@sss.pgh.pa.us> writes:
> Robert Haas <robertmh...@gmail.com> writes:
>> OK, here's an update of Greg's patch with the runtime configure test
>> ripped out, some minor documentation tweaks, and a few unnecessary
>> whitespace diff hunks quashed.  I think this is about ready for
>> committer review.
>
> I've applied most of this, with a couple of notable revisions:
>
> 1. The runtime check on whether posix_fadvise works is really gone.
> We'll see whether we need to put anything back, but I think our first try
> should be under the assumption that it works.  (BTW, I was wrong in my
> earlier claim that the buildfarm wouldn't exercise posix_fadvise by
> default -- the patch defaults to io_concurrency = 1 if fadvise is
> available, so it will try to prefetch one page ahead.)
>
> 2. I fixed it so that setting effective_io_concurrency to zero disables
> prefetching altogether; there was no way to do that in the patch as
> submitted.

Hm.  The original intent was that effective_io_concurrency = 1 meant no
prefetching, because there was only one drive.  I wonder if that worked
earlier and got lost along the way, or if I always had this wrong.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [SPAM] Re: [HACKERS] posix_fadvise v22
Peter Eisentraut <pete...@gmx.net> writes:
> The way I read this is that this was a temporary kernel/libc mismatch in
> a development version of Debian 3 years ago that was fixed within 2
> months of being reported and was never released to the general public.
> So it would be on the same level as any of a million temporary breakages
> in Linux distributions under development.

This is incorrect, as the problem was in fact present on Red Hat and
presumably all other distros as well.

> Unless there are other reports of this problem, I wouldn't bother testing
> or working around this at all.  If people are running PostgreSQL 8.4+ on
> Debian unstable June 2005 with kernel 2.4, they cannot be helped.

While I would like to agree with that position, I can't help noticing
lines 2438-2461 of xlog.c, which represent the still-smoking wreckage of
our last attempt to do something with posix_fadvise.  It's not that old
either --- we gave up on it less than three years ago:
http://archives.postgresql.org/pgsql-hackers/2006-06/msg01481.php

I think at a minimum there should be a manual configuration switch (ie,
something in pg_config_manual.h) to allow the builder to disable use of
posix_fadvise, even if configure thinks it's there.  Depending on
buildfarm results we may have to do more than that.

BTW, I intend to un-disable the xlog change when committing the fadvise
patch.  In for a penny, in for a pound ...

			regards, tom lane
Re: [SPAM] Re: [HACKERS] posix_fadvise v22
Tom Lane wrote:
> Peter Eisentraut <pete...@gmx.net> writes:
>> The way I read this is that this was a temporary kernel/libc mismatch in
>> a development version of Debian 3 years ago that was fixed within 2
>> months of being reported and was never released to the general public.
>> So it would be on the same level as any of a million temporary breakages
>> in Linux distributions under development.
>
> This is incorrect, as the problem was in fact present on Red Hat and
> presumably all other distros as well.
>
>> Unless there are other reports of this problem, I wouldn't bother
>> testing or working around this at all.  If people are running PostgreSQL
>> 8.4+ on Debian unstable June 2005 with kernel 2.4, they cannot be
>> helped.
>
> While I would like to agree with that position, I can't help noticing
> lines 2438-2461 of xlog.c, which represent the still-smoking wreckage of
> our last attempt to do something with posix_fadvise.  It's not that old
> either --- we gave up on it less than three years ago:
> http://archives.postgresql.org/pgsql-hackers/2006-06/msg01481.php
>
> I think at a minimum there should be a manual configuration switch (ie
> something in pg_config_manual.h) to allow the builder to disable use of
> posix_fadvise, even if configure thinks it's there.  Depending on
> buildfarm results we may have to do more than that.
>
> BTW, I intend to un-disable the xlog change when committing the fadvise
> patch.  In for a penny, in for a pound ...

I assumed that if effective_io_concurrency < 2, no posix_fadvise() calls
would be made.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Re: [SPAM] Re: [HACKERS] posix_fadvise v22
> I think at a minimum there should be a manual configuration switch (ie
> something in pg_config_manual.h) to allow the builder to disable use of
> posix_fadvise, even if configure thinks it's there.  Depending on
> buildfarm results we may have to do more than that.

This seems pretty pointless, since there is already a GUC to control the
behavior, and the default is off.  The only value here would be if you
could demonstrate that even with effective_io_concurrency set to 1 there
is a significant performance dropoff.  But that would probably be an
argument for rejecting the patch outright, not adding a compile-time
switch.

...Robert
Re: [SPAM] Re: [HACKERS] posix_fadvise v22
Gregory Stark wrote:
> Peter Eisentraut <pete...@gmx.net> writes:
>> On Friday 02 January 2009 06:49:57 Greg Stark wrote:
>>> The guc run-time check is checking for known-buggy versions of glibc,
>>> using sysconf to check what version of glibc you have.
>>
>> Could you please mention the bug number in the relevant source code
>> comments?
>
> It's Debian bug #312406, which was fixed in Debian release 2.3.5-3.  So
> it's probably one of these, but searching for posix_fadvise doesn't find
> anything in their bug tracker:

The way I read this is that this was a temporary kernel/libc mismatch in a
development version of Debian 3 years ago that was fixed within 2 months
of being reported and was never released to the general public.  So it
would be on the same level as any of a million temporary breakages in
Linux distributions under development.

Unless there are other reports of this problem, I wouldn't bother testing
or working around this at all.  If people are running PostgreSQL 8.4+ on
Debian unstable June 2005 with kernel 2.4, they cannot be helped.
Re: [HACKERS] posix_fadvise v22
On Friday 02 January 2009 06:49:57 Greg Stark wrote:
> The guc run-time check is checking for known-buggy versions of glibc,
> using sysconf to check what version of glibc you have.

Could you please mention the bug number in the relevant source code
comments?
Re: [SPAM] Re: [HACKERS] posix_fadvise v22
Peter Eisentraut <pete...@gmx.net> writes:
> On Friday 02 January 2009 06:49:57 Greg Stark wrote:
>> The guc run-time check is checking for known-buggy versions of glibc,
>> using sysconf to check what version of glibc you have.
>
> Could you please mention the bug number in the relevant source code
> comments?

It's Debian bug #312406, which was fixed in Debian release 2.3.5-3.  So
it's probably one of these, but searching for posix_fadvise doesn't find
anything in their bug tracker:

    Version 2.3.6

    * The following bugs are resolved with this release:

      38, 253, 549, 622, 653, 721, 758, 851, 877, 915, 934, 955, 961,
      1016, 1037, 1076, 1079, 1080, 1081, 1082, 1083, 1084, 1085, 1086,
      1087, 1088, 1090, 1091, 1092, 1093, 1094, 1095, 1096, 1097, 1098,
      1099, 1100, 1101, 1102, 1103, 1104, 1105, 1106, 1107, 1108, 1109,
      1110, 1111, 1112, 1113, 1125, 1137, 1138, 1249, 1250, 1251, 1252,
      1253, 1254, 1350, 1358, 1394, 1438, 1498, 1534

    Visit http://sources.redhat.com/bugzilla/ for the details of each bug.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] posix_fadvise v22
On Thu, 1 Jan 2009, Robert Haas wrote:
> The only thing I haven't been able to do is demonstrate that this change
> actually produces a performance improvement.  Either I'm testing the
> wrong thing, or it just doesn't provide any benefit on a single-spindle
> system.

When I did a round of testing on the earlier prefetch test program Greg
Stark put together, one of my single-spindle Linux systems didn't show any
real benefit.  So as long as you didn't see performance degrade, your not
seeing any improvement isn't bad news.

I've got a stack of hardware I can do performance testing of this patch
on; what I haven't been able to find time for is setting up any sort of
test harness right now.  If you or Greg have any benchmark or test program
you could suggest that should show off the improvements here, I'd be glad
to run it on a bunch of systems and report back -- I've already got a
stack of candidate ones I ran the earlier tests on to compare results
against.

--
* Greg Smith gsm...@gregsmith.com http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] posix_fadvise v22
Greg Smith <gsm...@gregsmith.com> writes:
> On Thu, 1 Jan 2009, Robert Haas wrote:
>> The only thing I haven't been able to do is demonstrate that this change
>> actually produces a performance improvement.  Either I'm testing the
>> wrong thing, or it just doesn't provide any benefit on a single-spindle
>> system.
>
> When I did a round of testing on the earlier prefetch test program Greg
> Stark put together, one of my single-spindle Linux systems didn't show
> any real benefit.  So as long as you didn't see performance degrade, your
> not seeing any improvement isn't bad news.

ISTM that you *should* be able to see an improvement on even
single-spindle systems, due to better overlapping of CPU and I/O effort.
If the test case is either 100% CPU-bound or 100% I/O-bound then no, but
for anything in between there ought to be improvement.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
On Fri, 2 Jan 2009, Tom Lane wrote:
> ISTM that you *should* be able to see an improvement on even
> single-spindle systems, due to better overlapping of CPU and I/O effort.

The earlier synthetic tests I did:
http://archives.postgresql.org/pgsql-hackers/2008-09/msg01401.php
showed a substantial speedup even in the single-spindle case on a couple
of systems, but one didn't really seem to benefit.  So we could theorize
that Robert's test system is more like that one.  If someone can help out
with making a more formal test case showing this in action, I'll dig into
the details of what's different between that system and the others.
Re: [HACKERS] posix_fadvise v22
Greg Smith <gsm...@gregsmith.com> writes:
> On Fri, 2 Jan 2009, Tom Lane wrote:
>> ISTM that you *should* be able to see an improvement on even
>> single-spindle systems, due to better overlapping of CPU and I/O effort.
>
> The earlier synthetic tests I did:
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg01401.php
> showed a substantial speedup even in the single-spindle case on a couple
> of systems, but one didn't really seem to benefit.  So we could theorize
> that Robert's test system is more like that one.  If someone can help out
> with making a more formal test case showing this in action, I'll dig into
> the details of what's different between that system and the others.

Well, I claim that if you start with a query that's about 50% CPU and 50%
I/O effort, you ought to be able to get something approaching a 2X speedup
if this patch really works.  Consider something like

    create function waste_time(int) returns int as $$
    begin
      for i in 1 .. $1 loop
        null;
      end loop;
      return 1;
    end
    $$ language plpgsql;

    select count(waste_time(42)) from very_large_table;

In principle you should be able to adjust the constant so that vmstat
shows about 50% CPU busy, and then enabling fadvise should improve matters
significantly.

Now the above proposed test case is too simple, because it will generate a
seqscan, and if the kernel is not completely brain-dead it will not need
any fadvise hinting to do read-ahead.  But you should be able to adapt the
idea for whatever indexscan-based test case you are really using.

Note: on a multi-CPU system you need to take vmstat or top numbers with a
grain of salt, since they might report one CPU that is 50% busy as the
system being only 50/N % busy.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
Tom Lane <t...@sss.pgh.pa.us> writes:
> In principle you should be able to adjust the constant so that vmstat
> shows about 50% CPU busy, and then enabling fadvise should improve
> matters significantly.

I think in practice individual queries don't interleave much CPU with I/O
work.  A single random page fetch is 5ms, which is an awful lot of CPU
cycles to be sinking somewhere.  In practice I think this is going to be
single-digit percentages.

Aside from big sorts and such, which tend not to be interleaved with I/O,
the main time I've seen queries have a significant CPU load is when the
data is mostly in cache -- in which case prefetching would be hard pressed
to help at all.

We could construct a synthetic case, but the main point of this feature is
to make use of RAID arrays that are currently going idle, not to pick up a
few percentage points for single-spindle systems.

-- 
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Ask me about EnterpriseDB's On-Demand Production Tuning
Re: [HACKERS] posix_fadvise v22
Greg Smith wrote:
> On Fri, 2 Jan 2009, Tom Lane wrote:
>> ISTM that you *should* be able to see an improvement on even
>> single-spindle systems, due to better overlapping of CPU and I/O effort.
>
> The earlier synthetic tests I did:
> http://archives.postgresql.org/pgsql-hackers/2008-09/msg01401.php
> showed a substantial speedup even in the single-spindle case on a couple
> of systems, but one didn't really seem to benefit.  So we could theorize
> that Robert's test system is more like that one.  If someone can help out
> with making a more formal test case showing this in action, I'll dig into
> the details of what's different between that system and the others.

I think for an I/O-bound workload on a single-drive system you would need
a drive that does some kind of tagged queuing (reordering/grouping of
requests) to see a benefit from the patch.
Re: [HACKERS] posix_fadvise v22
> I've got a stack of hardware I can do performance testing of this patch
> on; what I haven't been able to find time for is setting up any sort of
> test harness right now.  If you or Greg have any benchmark or test
> program you could suggest that should show off the improvements here, I'd
> be glad to run it on a bunch of systems and report back -- I've already
> got a stack of candidate ones I ran the earlier tests on to compare
> results against.

Unfortunately I don't have anything useful.  I took the skewed TPC-H data
that Ramon Lawrence posted and created a table based on lineitem using
something along the lines of:

    SELECT *, g FROM lineitem, generate_series(1,8) AS g;

Then I made an index on one of the columns and ran some queries that ended
up being planned as bitmap index scans.  The disk seemed to be doing its
thing, but it didn't do its thing any faster when I changed
effective_io_concurrency to 4.  (In fact there might have been a small
slowdown, but I didn't take the time to measure carefully, so I can't
refute the null hypothesis.)

...Robert
Re: [HACKERS] posix_fadvise v22
On Fri, Jan 2, 2009 at 5:36 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> I've got a stack of hardware I can do performance testing of this patch
>> on; what I haven't been able to find time for is setting up any sort of
>> test harness right now.  If you or Greg have any benchmark or test
>> program you could suggest that should show off the improvements here,
>> I'd be glad to run it on a bunch of systems and report back -- I've
>> already got a stack of candidate ones I ran the earlier tests on to
>> compare results against.
>
> Then I made an index on one of the columns and ran some queries that
> ended up being planned as bitmap index scans.

Hm, what were those plans?  You might want to put the old code back in
explain.c to print the prefetching target to see how well it's doing.

The queries I ran to test it tended to look like this kind of thing, where
r was a column filled with random values.  If it's clustered there will be
hardly any benefit, as even random I/O would be prefetched by the OS.

    select * from h
     where r = any (array(select (1+random()*200)::integer
                            from generate_series(1,1000)));

-- 
greg
Re: [HACKERS] posix_fadvise v22
Gregory Stark <st...@enterprisedb.com> writes:
> Tom Lane <t...@sss.pgh.pa.us> writes:
>> In principle you should be able to adjust the constant so that vmstat
>> shows about 50% CPU busy, and then enabling fadvise should improve
>> matters significantly.
>
> I think in practice individual queries don't interleave much CPU with I/O
> work.  A single random page fetch is 5ms, which is an awful lot of CPU
> cycles to be sinking somewhere.  In practice I think this is going to be
> single-digit percentages.

The point of the suggestion is to prove that the patch works as
advertised.  How wide the sweet spot is for this test isn't nearly as
interesting as proving that there *is* a sweet spot.  If you can't find
one, it suggests that either the patch or the local posix_fadvise doesn't
work.

			regards, tom lane
Re: [HACKERS] posix_fadvise v22
Hm, what were those plans? You might want to put the old code back in explain.c to print the prefetching target to see how well it's doing.

Well, bad news. Here's one where prefetching seems to make it WORSE.

rhaas=# explain select sum(1) from enormous where l_shipdate in ('1992-01-01', '1993-01-01', '1994-01-01', '1995-01-01', '1996-01-01', '1997-01-01', '1998-01-01', '1999-01-01', '2000-01-01', '2001-01-01');
                                 QUERY PLAN
--------------------------------------------------------------------------
 Aggregate  (cost=455072.75..455072.76 rows=1 width=0)
   ->  Bitmap Heap Scan on enormous  (cost=3327.59..454634.09 rows=175464 width=0)
         Recheck Cond: (l_shipdate = ANY ('{1992-01-01,1993-01-01,1994-01-01,1995-01-01,1996-01-01,1997-01-01,1998-01-01,1999-01-01,2000-01-01,2001-01-01}'::date[]))
         ->  Bitmap Index Scan on enormous_l_shipdate  (cost=0.00..3283.72 rows=175464 width=0)
               Index Cond: (l_shipdate = ANY ('{1992-01-01,1993-01-01,1994-01-01,1995-01-01,1996-01-01,1997-01-01,1998-01-01,1999-01-01,2000-01-01,2001-01-01}'::date[]))
(5 rows)

With effective_io_concurrency set to 1, this took 32 s. With effective_io_concurrency set to 4, it took 50 s. The table was created like this:

create table enormous as select l.*, l_instance from lineitem l, generate_series(1, 8) l_instance;
create index enormous_l_shipdate on enormous (l_shipdate);
vacuum analyze enormous;

...where lineitem is from the skewed TPC-H data for the histojoin patch.

...Robert
Re: [HACKERS] posix_fadvise v22
On Fri, Jan 2, 2009 at 8:42 PM, Robert Haas robertmh...@gmail.com wrote: Hm, what were those plans? You might want to put the old code back in explain.c to print the prefetching target to see how well it's doing. Well, bad news. Here's one where prefetching seems to make it WORSE. rhaas=# explain select sum(1) from enormous where l_shipdate in ('1992-01-01', '1993-01-01', '1994-01-01', '1995-01-01', '1996-01-01', '1997-01-01', '1998-01-01', '1999-01-01', '2000-01-01', '2001-01-01'); QUERY PLAN -- Aggregate (cost=455072.75..455072.76 rows=1 width=0) -> Bitmap Heap Scan on enormous (cost=3327.59..454634.09 rows=175464 width=0) Any chance you could put back the code in explain.c which showed whether posix_fadvise is actually getting used? Another thing I did when testing was attaching with strace to see if posix_fadvise (the syscall on linux was actually fadvise64 iirc) is actually getting called. And is this on a system with multiple spindles? How many? And how much of the data is in shared buffers or in filesystem cache? Is this consistent for repeated queries? Is it only when you're repeating a query for dates that you've already selected? And how fast is it with effective_io_concurrency set to 1,2,3,5,6,7,8,...? Do you see the same effect if you use a self-contained test case instead of the TPC-H data so I can try it? -- greg
Re: [HACKERS] posix_fadvise v22
Any chance you could put back the code in explain.c which showed whether posix_fadvise is actually getting used? Another thing I did when testing was attaching with strace to see if posix_fadvise (the syscall on linux was actually fadvise64 iirc) is actually getting called.

I tried changing this:

    returnCode = posix_fadvise(VfdCache[file].fd, offset, amount,
                               POSIX_FADV_WILLNEED);

to this:

    returnCode = 0;

When I did that, it went back from 50 s to 33 s, which I think means that posix_fadvise is getting called and that that is what is making it slower.

And is this on a system with multiple spindles? How many?

Latitude D830 laptop. Single disk. Fedora 9. kernel-2.6.27.9-73.fc9.x86_64.

And how much of the data is in shared buffers or in filesystem cache? Is this consistent for repeated queries? Is it only when you're repeating a query for dates that you've already selected?

I stopped the cluster, dropped the page cache, and restarted the cluster just before testing. Repeated tests are fast due to caching effects. shared_buffers is 240MB. System has 2GB RAM, steady state is about 1GB of page cache.

And how fast is it with effective_io_concurrency set to 1,2,3,5,6,7,8,...?

I do not currently have this information. :-) I will try to run some more tests over the weekend, but I'm too tired now and am starting to make mistakes.

Do you see the same effect if you use a self-contained test case instead of the TPC-H data so I can try it?

Not sure exactly what you have in mind here. If you send a script or something to reproduce I will try it.

...Robert
Re: [HACKERS] posix_fadvise v22
Tom Lane t...@sss.pgh.pa.us writes: The point of the suggestion is to prove that the patch works as advertised. How wide the sweet spot is for this test isn't nearly as interesting as proving that there *is* a sweet spot. If you can't find one it suggests that either the patch or the local posix_fadvise doesn't work. I posted tons of reproducible test cases with graphs of results for various raid stripe widths a while back. There was a very slight benefit on a single spindle at some prefetch depths but it wasn't very consistent and it varied heavily depending on the prefetch depth. I don't know what to make of this test. I don't know how to reproduce the same data distribution, I have no idea what raid configuration it's been run on, what version of what OS it's on, etc. It's quite possible posix_fadvise isn't working on it, I don't know. It's also possible the overhead of the extra buffer lookups and syscalls outweighs any benefit of overlapping i/o and cpu on a single spindle. Trying to contrive a situation where a single spindle sees a significant benefit is going to be very tricky. Avoiding caching effects and the confounding effect of any overhead will make it hard to see a consistent benefit without a raid array. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] posix_fadvise v22
On Fri, Jan 2, 2009 at 11:13 PM, Robert Haas robertmh...@gmail.com wrote: When I did that, it went back from 50 s to 33 s, which I think means that posix_fadvise is getting called and that that is what is making it slower. And is this on a system with multiple spindles? How many? Latitude D830 laptop. Single disk. Fedora 9. kernel-2.6.27.9-73.fc9.x86_64. And how much of the data is in shared buffers or in filesystem cache? Is this consistent for repeated queries? Is it only when you're repeating a query for dates that you've already selected? I stopped the cluster, dropped the page cache, and restarted the cluster just before testing. Repeated tests are fast due to caching effects. shared_buffers is 240MB. System has 2GB RAM, steady state is about 1GB of page cache. Ahhh. So this is a test of how much impact the extra syscalls and buffer lookups have on a system where they're not really helping. I'm still surprised, a 50% performance penalty is a lot worse than I would have expected, especially when the buffers aren't in cache. I did one quick test and saw about 10% performance penalty in that test. -- greg
Re: [HACKERS] posix_fadvise v22
I tried this on my laptop running FC9, and because I forgot to run autoconf, I got this error message when I tried to turn on posix_fadvise.

rhaas=# set effective_io_concurrency to 3;
ERROR: could not determine if this system has a working posix_fadvise
DETAIL: Check configure.log produced by configure for more information

Am I correct in thinking that the only thing we're really checking for here is whether a trivial posix_fadvise() call returns success? If so, is this test really worth doing? It seems to me that since users can always switch off the feature by leaving effective_io_concurrency set to the default value of 1, there is not much value in having a configure test that forcibly disables it. If the user has a broken posix_fadvise() and later fixes it, they shouldn't have to recompile PostgreSQL to use this feature, especially in this day when the build system and the run system are often different. A user who somehow ends up with RPMs that generate this error message will be utterly at a loss as to what to do about it. One minor nit: If we're going to keep this test, we should change the detail string to say config.log rather than configure.log, as that is the actual file name. ...Robert

On Thu, Dec 11, 2008 at 4:35 PM, Gregory Stark st...@enterprisedb.com wrote: Here's the update I also skimmed through and cleaned a couple other things. There's *still* a function prototype which I don't see what header file to put it in, that's the one in port/posix_fadvise.c which contains one function with one caller, guc.c. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] posix_fadvise v22
Robert Haas robertmh...@gmail.com writes: Am I correct in thinking that the only thing we're really checking for here is whether a trivial posix_fadvise() call returns success? If so, is this test really worth doing? Runtime tests performed during configure are generally a bad idea to start with --- it's impossible to do any such thing in a cross-compilation scenario, for example. regards, tom lane
Re: [HACKERS] posix_fadvise v22
On Thu, Jan 1, 2009 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Am I correct in thinking that the only thing we're really checking for here is whether a trivial posix_fadvise() call returns success? If so, is this test really worth doing? Runtime tests performed during configure are generally a bad idea to start with --- it's impossible to do any such thing in a cross-compilation scenario, for example. OK, here's an update of Greg's patch with the runtime configure test ripped out, some minor documentation tweaks, and a few unnecessary whitespace diff hunks quashed. I think this is about ready for committer review. The only thing I haven't been able to do is demonstrate that this change actually produces a performance improvement. Either I'm testing the wrong thing, or it just doesn't provide any benefit on a single-spindle system. However, I believe that Greg has previously posted some fairly impressive performance results, so I'm not sure that my shortcomings in this area should be a bar to having a committer pick this one up. If more testing is needed, it would at least be helpful to have a committer specify what areas they are concerned about. ...Robert

posix_fadvise_v23_rh1.diff.gz
Re: [HACKERS] posix_fadvise v22
In theory there should be no benefit on a single spindle system. There could be a slight benefit due to reordering of I/O, but only on a raid array would you see a significant speedup -- which should be about equal to the number of spindles. What would be interesting is whether you see a noticeable speed *decrease* from having prefetching enabled when it isn't helping, either due to having everything fit in shared buffers or everything fit in the filesystem cache (the latter should be more of a hit). Even if there is, it doesn't really worry me. By default the feature is disabled and you should only really turn it on if you do have a raid array and want an individual query to make use of it. Now that there's an actual run-time sysconf check for the buggy glibc called by the guc function we arguably don't need the autoconf check_run check anymore anyways. -- Greg On 1 Jan 2009, at 21:43, Robert Haas robertmh...@gmail.com wrote: On Thu, Jan 1, 2009 at 3:55 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Am I correct in thinking that the only thing we're really checking for here is whether a trivial posix_fadvise() call returns success? If so, is this test really worth doing? Runtime tests performed during configure are generally a bad idea to start with --- it's impossible to do any such thing in a cross-compilation scenario, for example. OK, here's an update of Greg's patch with the runtime configure test ripped out, some minor documentation tweaks, and a few unnecessary whitespace diff hunks quashed. I think this is about ready for committer review. The only thing I haven't been able to do is demonstrate that this change actually produces a performance improvement. Either I'm testing the wrong thing, or it just doesn't provide any benefit on a single-spindle system.
However, I believe that Greg has previously posted some fairly impressive performance results, so I'm not sure that my shortcomings in this area should be a bar to having a committer pick this one up. If more testing is needed, it would at least be helpful to have a committer specify what areas they are concerned about. ...Robert posix_fadvise_v23_rh1.diff.gz
Re: [HACKERS] posix_fadvise v22
Now that there's an actual run-time sysconf check for the buggy glibc called by the guc function we arguably don't need the autoconf check_run check anymore anyways. Isn't that the check I just removed for you, or are you talking about some other check that can also be removed? ...Robert
Re: [HACKERS] posix_fadvise v22
Sorry for top-posting -- phone mail client sucks. I thought the autoconf ac_run_check was the test that people were questioning. That calls posix_fadvise to see if it crashes at configure time. The guc run-time check is checking for known-buggy versions of glibc using sysconf to check what version of glibc you have. -- Greg On 1 Jan 2009, at 23:11, Robert Haas robertmh...@gmail.com wrote: Now that there's an actual run-time sysconf check for the buggy glibc called by the guc function we arguably don't need the autoconf check_run check anymore anyways. Isn't that the check I just removed for you, or are you talking about some other check that can also be removed? ...Robert
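For what it's worth, the two checks have very different shapes. A run-time glibc-version gate presumably looks something like the sketch below; the confstr() call and the cutoff handling are my guesses at the mechanism, not code from the patch, and the cutoff numbers are placeholders rather than the patch's actual known-buggy boundary:

```c
#include <stdio.h>
#include <unistd.h>

/* Parse a version string of the form "glibc 2.9" (the format returned by
 * confstr(_CS_GNU_LIBC_VERSION) on GNU systems) and report whether it is
 * at least the given cutoff. */
int glibc_at_least(const char *vers, int want_major, int want_minor)
{
    int major = 0, minor = 0;
    if (sscanf(vers, "glibc %d.%d", &major, &minor) != 2)
        return 0;               /* unparseable: assume not usable */
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}

/* At run time the live C library's version would be queried, e.g.: */
int running_glibc_at_least(int want_major, int want_minor)
{
#ifdef _CS_GNU_LIBC_VERSION
    char buf[64];
    if (confstr(_CS_GNU_LIBC_VERSION, buf, sizeof(buf)) > 0)
        return glibc_at_least(buf, want_major, want_minor);
#endif
    return 0;
}
```

The configure-time ac_run_check, by contrast, actually executes posix_fadvise() on the build machine, which says nothing about the machine the binaries eventually run on.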
Re: [HACKERS] posix_fadvise v22
On Thu, Jan 1, 2009 at 11:49 PM, Greg Stark greg.st...@enterprisedb.com wrote: Sorry for top-posting -- phone mail client sucks. I thought the autoconf ac_run_check was the test that people were questioning. That calls posix_fadvise to see if it crashes at configure time. Yes, that's what I removed. The guc run-time check is checking for known-buggy versions of glibc using sysconf to check what version of glibc you have. Right - that check is still in my updated patch. I think the confusion may stem from the fact that Tom and I used the word runtime to refer to the ac_run_check thing, because it is checking something about the runtime environment (namely, whether posix_fadvise works or not) at configure-time. In any event, it seems as though we are all on the same page. ...Robert
Re: [HACKERS] posix_fadvise v22
I'll send another patch with at least 1 and 3 fixed and hunt around again for a header file to put this guc into. On 10 Dec 2008, at 04:22, ITAGAKI Takahiro [EMAIL PROTECTED] wrote: Hello, Gregory Stark [EMAIL PROTECTED] wrote: Here's an update to eliminate two small bitrot conflicts. I read your patch with interest, but found some trivial bad manners. * LET_OS_MANAGE_FILESIZE is already obsoleted. You don't have to cope with the option. Huh, I didn't realize that. I guess the idea is that users just configure a very large segment size to get the old behaviour. * Type mismatch in prefetch_pages A variable prefetch_pages is defined as unsigned or int in some places. Why don't you define it only once in a header and include the header in source files? Just... Which header? * Assignment to prefetch_pages What does +0.99 mean here? [assign_io_concurrency()] +prefetch_pages = new_prefetch_pages+0.99; You want to do as follows, right? +prefetch_pages = (int) ceil(new_prefetch_pages); Sure
Re: [HACKERS] posix_fadvise v22
Greg Stark [EMAIL PROTECTED] writes: A variable prefetch_pages is defined as unsigned or int in some places. Why don't you define it only once in a header and include the header in source files? Just... Which header? MHO: the header that goes with the source file that is most concerned with implementing the variable's behavior (which is also the file that should have the actual variable definition). regards, tom lane
Re: [HACKERS] posix_fadvise v22
On Thu, Dec 11, 2008 at 4:29 PM, Tom Lane t...@sss.pgh.pa.us wrote: Greg Stark greg.st...@enterprisedb.com writes: A variable prefetch_pages is defined as unsigned or int in some places. Why don't you define it only once in a header and include the header in source files? Just... Which header? MHO: the header that goes with the source file that is most concerned with implementing the variable's behavior (which is also the file that should have the actual variable definition). Well the trick here is that the variable actually affects how many PrefetchBuffer() calls *callers* should make. The callers are various places which are doing lots of ReadBuffer calls and know what buffers they'll need in the future. The main places are in nodeBitmapHeapScan.c and nbtsearch.c. Neither of those are remotely relevant. I think I'm settling on the idea that it should be in the same place as the PrefetchBuffer() prototype, since anyone who needs prefetch_pages will need that as well (except for guc.c). So I'll put it in bufmgr.h for now. -- greg
Re: [HACKERS] posix_fadvise v22
Here's the update. I also skimmed through and cleaned up a couple of other things. There's *still* one function prototype for which I don't see what header file to put it in: the one in port/posix_fadvise.c, which contains one function with one caller, guc.c. posix_fadvise_v23.diff.gz -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: [HACKERS] posix_fadvise v22
Hello,

Gregory Stark [EMAIL PROTECTED] wrote: Here's an update to eliminate two small bitrot conflicts.

I read your patch with interest, but found some trivial bad manners.

* LET_OS_MANAGE_FILESIZE is already obsoleted. You don't have to cope with the option.

* Type mismatch in prefetch_pages
  A variable prefetch_pages is defined as unsigned or int in some places. Why don't you define it only once in a header and include the header in source files?

* Assignment to prefetch_pages
  What does +0.99 mean here? [assign_io_concurrency()]
      + prefetch_pages = new_prefetch_pages+0.99;
  You want to do as follows, right?
      + prefetch_pages = (int) ceil(new_prefetch_pages);

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center
[HACKERS] posix_fadvise v22
Here's an update to eliminate two small bitrot conflicts. posix_fadvise_v22.diff.gz -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!