Re: [HACKERS] Commitfest: patches Ready for Committer
On Mon, Oct 6, 2014 at 10:53 PM, Tom Lane wrote: > The levenshtein-distance thingy seems to still be a topic of debate > as well, both as to how we're going to refactor the code and as to > what the exact hinting rules ought to be. If some committer wants > to take charge of it and resolve those issues, fine; but I don't want > to see it done by just blindly committing whatever the last submitted > version was. My 2c: I imagine that in this case the consensus is going to be, first: - Move a minimum of the core functions out of fuzzystrmatch and rename them (str_distance?) - Keep fuzzystrmatch, and have its levenshtein functions call the moved code. In any case the levenshtein code needs careful refactoring and some additional thought before the hint code is touched. Regards. -- Michael
Re: [HACKERS] Commitfest: patches Ready for Committer
Heikki Linnakangas writes: > Committers: Could you please pick a patch, and commit if appropriate? Or > if there's a patch there that you think should *not* be committed, > please speak up. The "custom plan API" thing may be marked ready for committer, but that doesn't mean it's committable, or even that there is any consensus about whether we want it or what it should look like. The levenshtein-distance thingy seems to still be a topic of debate as well, both as to how we're going to refactor the code and as to what the exact hinting rules ought to be. If some committer wants to take charge of it and resolve those issues, fine; but I don't want to see it done by just blindly committing whatever the last submitted version was. I've not paid much of any attention to the other four. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Commitfest patches mostly assigned ... status
Josh Berkus wrote: Hackers, At this point, almost all patches have been assigned to reviewers. If you submitted a patch and don't get feedback by Saturday, take a look at who's in the "reviewers" column and send them a query. Since I have no way to track when patches are assigned to reviewers, I have no idea if some of them are sitting on their hands. Note that patches need not have only one reviewer! If you have time, please take on testing some of the more complex patches, especially: Column-level Permissions Common Table Expressions SE-PostgreSQL patches I updated my patches: http://archives.postgresql.org/pgsql-hackers/2008-09/msg00859.php It contains a rebase to the latest CVS HEAD, a new hook to prevent bypassing access controls, and a small fix. In addition, I attached several short explanations of the key points needed to understand the big patch. I wrote the documentation for SE-PostgreSQL, but it is not a description of *the patch* itself. If anything is unclear, reviewers should feel free to ask me. Thanks, -- OSS Platform Development Division, NEC KaiGai Kohei <[EMAIL PROTECTED]>
Re: [HACKERS] Commitfest patches mostly assigned ... status
Josh Berkus wrote: Josh is probably basing that on the long history of discussion/revision cycles. Yep, and that *I* don't understand what the patch does, so I'm not going to turn a newbie reviewer loose on it. Here is a quick overview; there are two parts to the patch: 1. event system This allows one to register (via "PQregisterEventProc") a per-conn callback function with libpq that gets called when particular events occur. Currently, the events tell you when a conn or result is created, reset, destroyed or copied. It is generic enough to add more events in the future. By receiving events about libpq objects, you can properly allocate and free userland memory (PQfinish, PQexec, PQclear, etc... trigger events) and associate it with the libpq object: thus PQsetInstanceData(conn...) or PQsetResultInstanceData(res). This is basically adding members to the opaque PGconn or PGresult at runtime. This "instance data" can be retrieved via PQinstanceData and PQresultInstanceData. To shine a different light on it: apps normally wrap a PGconn or PGresult within their own structures, but now you can wrap the app structures inside a PGconn or PGresult. This may help; it's the libpqtypes PGEventProc implementation. http://libpqtypes.esilo.com/browse_source.html?file=events.c Also check out the patch's sgml docs if you get a chance. 2. result management There are also four functions that provide more control over PGresult; if you need to create a result from scratch, these expand on the PQmakeEmptyPGResult idea. PQcopyResult - copies a given source result; a flags argument determines what portions of the result are copied. PQsetResultAttrs - sets the attributes of the result (its columns). PQsetvalue - sets a tuple value in a result. PQresultAlloc - thin wrapper around the internal pqResultAlloc. Uses the result's block allocator, which allows PQclear to properly free all memory associated with a PGresult.
-- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Commitfest patches mostly assigned ... status
> Josh is probably basing that on the long history of > discussion/revision cycles. Yep, and that *I* don't understand what the patch does, so I'm not going to turn a newbie reviewer loose on it. -- --Josh Josh Berkus PostgreSQL San Francisco
Re: [HACKERS] Commitfest patches mostly assigned ... status
On Thu, Sep 11, 2008 at 5:06 PM, Tom Lane <[EMAIL PROTECTED]> wrote: >> The patch looks OK to me as it was the last time I looked at it; maybe >> Tom has more comments, so I decided against just committing it. > > I haven't got round to looking at it in this fest. If anyone else wants > to look it over, feel free. I think Josh is overrating the patch's > review difficulty --- anyone who uses libpq a lot could review it, > IMHO. You certainly wouldn't need any backend-internals knowledge. > Josh is probably basing that on the long history of discussion/revision cycles. There is very little change from the design that was hammered out except for the late-breaking tweak of how the passthru pointer works. I think the subtext here is that we are waiting on you to sign off on the patch (or not). merlin
Re: [HACKERS] Commitfest patches mostly assigned ... status
Alvaro Herrera <[EMAIL PROTECTED]> writes: > Merlin Moncure escribió: >> Alvaro actually performed a review on libpq events. We are waiting on >> his feedback on our changes (based on his review) and the newly >> submitted documentation. I'll update the wiki accordingly. I wasn't >> sure if Alvaro was claiming reviewer status or not. It may be ready >> to go in as-is unless we draw any new objections. > The patch looks OK to me as it was the last time I looked at it; maybe > Tom has more comments, so I decided against just committing it. I haven't got round to looking at it in this fest. If anyone else wants to look it over, feel free. I think Josh is overrating the patch's review difficulty --- anyone who uses libpq a lot could review it, IMHO. You certainly wouldn't need any backend-internals knowledge. regards, tom lane
Re: [HACKERS] Commitfest patches mostly assigned ... status
Alvaro Herrera wrote: Actually a minor gripe ... should PQsetvalue be PQsetValue? :-) We were mimicking PQgetvalue, which uses a lowercase 'v'. Is there a preference? None on this end. -- Andrew Chernow eSilo, LLC every bit counts http://www.esilo.com/
Re: [HACKERS] Commitfest patches mostly assigned ... status
Merlin Moncure escribió: > On Thu, Sep 11, 2008 at 1:53 AM, Josh Berkus <[EMAIL PROTECTED]> wrote: > > Some patches have not been assigned to reviewers for various reasons. The > > following weren't assigned because they are complex and really need a > > high-end hacker or committer to take them on: > > > > libpq events > > Alvaro actually performed a review on libpq events. We are waiting on > his feedback on our changes (based on his review) and the newly > submitted documentation. I'll update the wiki accordingly. I wasn't > sure if Alvaro was claiming reviewer status or not. It may be ready > to go in as-is unless we draw any new objections. The patch looks OK to me as it was the last time I looked at it; maybe Tom has more comments, so I decided against just committing it. I admit I haven't checked the docs. Actually a minor gripe ... should PQsetvalue be PQsetValue? :-) -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Commitfest patches mostly assigned ... status
On Thu, Sep 11, 2008 at 1:53 AM, Josh Berkus <[EMAIL PROTECTED]> wrote: > Some patches have not been assigned to reviewers for various reasons. The > following weren't assigned because they are complex and really need a > high-end hacker or committer to take them on: > > libpq events Alvaro actually performed a review on libpq events. We are waiting on his feedback on our changes (based on his review) and the newly submitted documentation. I'll update the wiki accordingly. I wasn't sure if Alvaro was claiming reviewer status or not. It may be ready to go in as-is unless we draw any new objections. Anybody who wants to look should ping Alvaro and/or Tom. merlin
Re: [HACKERS] Commitfest patches mostly assigned ... status
Josh Berkus <[EMAIL PROTECTED]> writes: > Also, note that the following patches need performance testing on a > variety of platforms. Everyone should help with this. > GSoC Improved Hash Indexing > posix fadvises > operator restrictivity function for text search > CLUSTER using sort instead of index scan The last two of those are not code-complete so I'm not sure that performance testing is all that meaningful for them. But by all means, test the first two ... regards, tom lane
Re: [HACKERS] Commitfest patches
"Greg Smith" <[EMAIL PROTECTED]> writes: > On Fri, 28 Mar 2008, Gregory Stark wrote: > >> I described which interfaces worked on Linux and Solaris based on empirical >> tests. I posted source code for synthetic benchmarks so we could test it on a >> wide range of hardware. I posted graphs based on empirical results. > > Is it possible to post whatever script that generates the graph (gnuplot?) so > people can compare the results they get to yours? Unfortunately I couldn't figure out how to get gnuplot to do this. I ended up doing it in gnumeric. I'll look into fixing the script up to be more automatic. Maybe look at gnuplot again. > I can run some tests on smaller Linux/Solaris systems to see if they don't > show a regression, that was my main concern about this experiment. Heikki suggested some things to test for regressions. I'll look at that. What I'm more curious about, and hoped I could get data from others on, is whether posix_fadvise(POSIX_FADV_WILLNEED) does anything useful on various versions of FreeBSD, OSX, etc. Afaict the syscall doesn't exist at all on Solaris though, which surprises me :( -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services!
Re: [HACKERS] Commitfest patches
"Martijn van Oosterhout" <[EMAIL PROTECTED]> writes: > - I think normal index scans could benefit from this (it was measurable > when I was playing with AIO in postgres a few years back). I don't want to torture any existing code paths to get prefetching to work. Heikki suggested I take advantage of the page-at-a-time index scanning though which does sound like it could be convenient. > - I think the number of preread_count is far too small, given you get a > benefit even if you only have one spindle. > - I don't understand the ramp-up method either. So the idea is that I was deathly afraid of being accused of doing unnecessary additional I/O. So I was worried about the user doing something like SELECT ... LIMIT 1. Or worse, just starting a select and discarding it after only selecting a handful of records. I also didn't want to start the bitmap scan by doing hundreds of syscalls before we even return the first record. So I figured it's safer to read only the first block at first. Only if the user goes on to the next record do we try prefetching the next block. Each record the user reads we bump up the prefetch amount by 1 until we hit the goal value for the size raid array we're using. That also nicely spreads out the syscalls so we get one prefetch between each record returned. We also don't know how densely packed the records are on the pages. If they're densely packed then we'll have prefetched a whole bunch of records in time for the second page. But if there's only one record on a page and we're already on the second page I figured that indicated we would be reading many pages with sparsely distributed records. So we ramp up the prefetch exponentially by multiplying by two each time we move to the next page. The net result is that if you do a bitmap scan which is processing lots of pointers it'll quickly reach the point where it's prefetching the number of blocks based on effective_spindle_count. 
If you're just processing the first few tuples it'll only read a small number of extra pages. And if you're only processing one record it won't prefetch at all. > People spend a lot of time worrying about hundreds of posix_fadvise() > calls but you don't need anywhere near that much to be effective. With > AIO I limited the number of outstanding requests to a dozen and it was > still useful. You lose nothing by capping the number of requests at any > point. Well, you leave money on the table. But yes, I'm trying to be conservative about how much to prefetch when we don't know that it's in our favour. >> I want to know if we're interested in the more invasive patch restructuring >> the buffer manager. My feeling is that we probably are eventually. But I >> wonder if people wouldn't feel more comfortable taking baby steps at first >> which will have less impact in cases where it's not being heavily used. > > I think the way it is now is neat and simple and enough for now. Thanks. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!
Re: [HACKERS] Commitfest patches
On Fri, 28 Mar 2008, Gregory Stark wrote: I described which interfaces worked on Linux and Solaris based on empirical tests. I posted source code for synthetic benchmarks so we could test it on a wide range of hardware. I posted graphs based on empirical results. Is it possible to post whatever script that generates the graph (gnuplot?) so people can compare the results they get to yours? It's when I realized I didn't have that and would have to recreate one myself that my intention to just run some quick results and ship them to you lost momentum and I never circled back to it. It would be nice if you made it easier for people to generate fancy results here as immediate gratification for them (while of course just asking them to send the raw data). I can run some tests on smaller Linux/Solaris systems to see if they don't show a regression, that was my main concern about this experiment. Some of the discussion that followed your original request for tests was kind of confusing as far as how to interpret the results as well; I think I know what to look for but certainly wouldn't mind some more guidance there, too. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD
Re: [HACKERS] Commitfest patches
On Fri, Mar 28, 2008 at 05:34:30PM +, Gregory Stark wrote: > But what I really need is someone to read the patch and say "looks good" or > point out things they don't like. In particular, what I really, really want is > some guidance on the singular key question I asked. I was going to write all sorts of stuff, till I noticed Heikki said basically everything I was going to say: - I think normal index scans could benefit from this (it was measurable when I was playing with AIO in postgres a few years back). - The integration with the bitmap scan is good, neat even - I think the number of preread_count is far too small, given you get a benefit even if you only have one spindle. - I don't understand the ramp-up method either. People spend a lot of time worrying about hundreds of posix_fadvise() calls but you don't need anywhere near that much to be effective. With AIO I limited the number of outstanding requests to a dozen and it was still useful. You lose nothing by capping the number of requests at any point. > I want to know if we're interested in the more invasive patch restructuring > the buffer manager. My feeling is that we probably are eventually. But I > wonder if people wouldn't feel more comfortable taking baby steps at first > which will have less impact in cases where it's not being heavily used. I think the way it is now is neat and simple and enough for now. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Re: [HACKERS] Commitfest patches
Gregory Stark wrote: I want to know if we're interested in the more invasive patch restructuring the buffer manager. My feeling is that we probably are eventually. But I wonder if people wouldn't feel more comfortable taking baby steps at first which will have less impact in cases where it's not being heavily used. It's a lot more work to do the invasive patch and there's not much point in doing it if we're not interested in taking it when it's ready. I like the simplicity of this patch. My biggest worry is the impact on performance when the posix_fadvise calls are not helping. I'd like to see some testing of that. How expensive are the extra bufmgr hash table lookups, compared to all the other stuff that's involved in a bitmap heap scan? Another thing I wonder is whether this approach can easily be extended to other types of scans than bitmap heap scans. Normal index scans seem most interesting; they can generate a lot of random I/O. I don't see any big problems there, except again the worst-case performance: If you issue AdviseBuffer calls for all the heap tuples in an index page, in the naive way, you can issue hundreds of posix_fadvise calls. But what if they're all actually on the same page? Surely you only want to go through the buffer manager once (like we do in the scan, thanks to ReleaseAndReadBuffer()) in that case, and only call posix_fadvise once. But again, maybe you can convince me that it's a non-issue by some benchmarking. Here's an updated patch. It's mainly just a CVS update but it also includes some window dressing: an autoconf test, some documentation, and some fancy arithmetic to auto-tune the amount of prefetching based on a more real-world parameter "effective_spindle_count". I like the "effective_spindle_count". That's very intuitive. The formula to turn that into preread_pages looks complex, though. I can see the reasoning behind it, but I doubt things behave that beautifully in the real world.
(That's not important right now, we have plenty of time to tweak that after more testing.) It also includes printing out how much the prefetching target got ramped up to in the explain output -- though I'm not sure we really want that in the tree, but it's nice for performance testing. I don't understand the ramp up logic. Can you explain what that's for and how it works? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Commitfest patches
"Bruce Momjian" <[EMAIL PROTECTED]> writes: > Gregory Stark wrote: >> >> Bruce, you seem to have removed one of my three patches from the queue. I >> would actually prefer you remove the other two and put back that one. It's >> the >> one I most urgently need feedback on to continue. > > I talked to Greg on IM. The complaint was that his posix_fadvise() > patch was added to the TODO list as a URL, rather than getting him > feedback. He is getting feedback now in another thread. Sort of. The feedback I'm getting is from people who want to discuss the "easy" parameters of the patch like how much prefetching to do and when without actually reading to see what's already been done. I'm happy to discuss them as I find these things interesting too. But what I really need is someone to read the patch and say "looks good" or point out things they don't like. In particular, what I really, really want is some guidance on the singular key question I asked. I want to know if we're interested in the more invasive patch restructuring the buffer manager. My feeling is that we probably are eventually. But I wonder if people wouldn't feel more comfortable taking baby steps at first which will have less impact in cases where it's not being heavily used. It's a lot more work to do the invasive patch and there's not much point in doing it if we're not interested in taking it when it's ready. Here's an updated patch. It's mainly just a CVS update but it also includes some window dressing: an autoconf test, some documentation, and some fancy arithmetic to auto-tune the amount of prefetching based on a more real-world parameter "effective_spindle_count". It also includes printing out how much the prefetching target got ramped up to in the explain output -- though I'm not sure we really want that in the tree, but it's nice for performance testing. 
bitmap-preread-v9.diff.gz Description: Binary data -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support!
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Heikki Linnakangas wrote: > > Should we consider only telling the kernel X pages ahead, meaning when > > we are on page 10 we tell it about page 16? > > Yes. You don't want to fire off thousands of posix_fadvise calls > upfront. That'll just flood the kernel, and it will most likely ignore > any advice after the first few hundred or so. I'm not sure what the > appropriate amount of read ahead would be, though. Probably depends a > lot on the OS and hardware, and needs to be adjustable. > > In some cases we can't easily read ahead more than a certain number of > pages. For example, in a regular index scan, we can easily fire off > posix_fadvise calls for all the heap pages referenced by a single index > page, but reading ahead more than that becomes much more complex. And if you read ahead too far the pages might get pushed out of the kernel cache before you ask to read them. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Bruce Momjian wrote: Heikki Linnakangas wrote: So it has nothing to do with table size. The fadvise calls need to be (and are) limited by what can be used in the near future, and not for the whole statement. Right, I was sloppy. Instead of table size, what matters is the amount of data the scan needs to access. The point remains that if the data is already in OS cache, the posix_fadvise calls are a waste of time, regardless of how many pages ahead you advise. I now understand what posix_fadvise() is allowing us to do. posix_fadvise(POSIX_FADV_WILLNEED) allows us to tell the kernel we will need a certain block in the future --- this seems much cheaper than a background reader. Yep. We know we will need the blocks, and telling the kernel can't hurt, except that there is overhead in telling the kernel. Has anyone measured how much overhead? I would be interested in a test program that read the same page over and over again from the kernel, with and without a posix_fadvise() call. Agreed, that needs to be benchmarked next. There's also some overhead in doing the buffer manager hash table lookup to check whether the page is in shared_buffers. We could reduce that by the more complex approach Greg mentioned of allocating a buffer in shared_buffers when we do posix_fadvise. Should we consider only telling the kernel X pages ahead, meaning when we are on page 10 we tell it about page 16? Yes. You don't want to fire off thousands of posix_fadvise calls upfront. That'll just flood the kernel, and it will most likely ignore any advice after the first few hundred or so. I'm not sure what the appropriate amount of read ahead would be, though. Probably depends a lot on the OS and hardware, and needs to be adjustable. In some cases we can't easily read ahead more than a certain number of pages.
For example, in a regular index scan, we can easily fire off posix_fadvise calls for all the heap pages referenced by a single index page, but reading ahead more than that becomes much more complex. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
On Fri, Mar 28, 2008 at 11:41:58AM -0400, Bruce Momjian wrote: > Should we consider only telling the kernel X pages ahead, meaning when > we are on page 10 we tell it about page 16? It's not so interesting for sequential reads, the kernel can work that out for itself. Disk reads are usually in blocks of at least 128K anyway, so there's a real good chance you have block 16 already. The interesting case is index scan, where you do a posix_fadvise() on the next block *before* returning the items in the current block. Then by the time you've processed these tuples, the next block will hopefully have been read in and we can proceed without delay. Or fadvising all the tuples referred to from an index page at once so the kernel can determine the optimal order to fetch them. The read() calls will still be issued in tuple order, but the delay will (hopefully) be less. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > Please line up in a tree and maintain the heap invariant while > boarding. Thank you for flying nlogn airlines.
Re: [HACKERS] Commitfest patches
Gregory Stark wrote: > > Bruce, you seem to have removed one of my three patches from the queue. I > would actually prefer you remove the other two and put back that one. It's the > one I most urgently need feedback on to continue. I talked to Greg on IM. The complaint was that his posix_fadvise() patch was added to the TODO list as a URL, rather than getting him feedback. He is getting feedback now in another thread. -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Heikki Linnakangas wrote: > > So it has nothing to do with table size. The fadvise calls need to be > > (and are) limited by what can be used in the near future, and not for the whole > > statement. > Right, I was sloppy. Instead of table size, what matters is the amount > of data the scan needs to access. The point remains that if the data is > already in OS cache, the posix_fadvise calls are a waste of time, > regardless of how many pages ahead you advise. I now understand what posix_fadvise() is allowing us to do. posix_fadvise(POSIX_FADV_WILLNEED) allows us to tell the kernel we will need a certain block in the future --- this seems much cheaper than a background reader. We know we will need the blocks, and telling the kernel can't hurt, except that there is overhead in telling the kernel. Has anyone measured how much overhead? I would be interested in a test program that read the same page over and over again from the kernel, with and without a posix_fadvise() call. Should we consider only telling the kernel X pages ahead, meaning when we are on page 10 we tell it about page 16? -- Bruce Momjian <[EMAIL PROTECTED]> http://momjian.us EnterpriseDB http://enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Zeugswetter Andreas OSB SD wrote: Heikki wrote: It seems that the worst case for this patch is a scan on a table that doesn't fit in shared_buffers, but is fully cached in the OS cache. In that case, the posix_fadvise calls would be a certain waste of time. I think this is a misunderstanding, the fadvise is not issued to read the whole table and is not issued for table scans at all (and if it were it would only advise for the next N pages). So it has nothing to do with table size. The fadvise calls need to be (and are) limited by what can be used in the near future, and not for the whole statement. Right, I was sloppy. Instead of table size, what matters is the amount of data the scan needs to access. The point remains that if the data is already in OS cache, the posix_fadvise calls are a waste of time, regardless of how many pages ahead you advise. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Heikki wrote: > It seems that the worst case for this patch is a scan on a table that > doesn't fit in shared_buffers, but is fully cached in the OS cache. In > that case, the posix_fadvise calls would be a certain waste of time. I think this is a misunderstanding, the fadvise is not issued to read the whole table and is not issued for table scans at all (and if it were it would only advise for the next N pages). So it has nothing to do with table size. The fadvise calls need to be (and are) limited by what can be used in the near future, and not for the whole statement. e.g. N next-level index pages that are relevant, or N relevant heap pages one index leaf page points at. Maybe in the index case N does not need to be limited, since we have a natural limit on how many pointers fit on one page. In general I think separate reader processes (or threads :-) that directly read into the bufferpool would be a more portable and efficient implementation. E.g. it could use ScatterGather IO. So I think we should check that the fadvise solution does not obstruct that path, but I think it does not. Gregory wrote: >> A more invasive form of this patch would be to assign and pin a buffer when >> the preread is done. That would mean subsequently we would have a pinned buffer >> ready to go and not need to go back to the buffer manager a second time. We >> would instead just "complete" the i/o by issuing a normal read call. I guess you would rather need to mark the buffer for use for this page, but let whichever backend needs it first pin it and issue the read. I think the fadviser should not pin it in advance, since it cannot guarantee to actually read the page [soon]. Rather, remember the buffer and later check and pin it for the read. Else you might be blocking the buffer. But I think doing something like this might be good since it avoids issuing duplicate fadvises.
Andreas
Prereading using posix_fadvise (was Re: [HACKERS] Commitfest patches)
Gregory Stark wrote:
> I described which interfaces worked on Linux and Solaris based on
> empirical tests. I posted source code for synthetic benchmarks so we
> could test it on a wide range of hardware. I posted graphs based on
> empirical results. I posted mathematical formulas analysing just how much
> preread would be expected to exercise a raid array fully. I'm not sure
> what else I can do to effect a more thorough examination.

I'm sure posix_fadvise is a win in the case where it's supposed to help: a scan that does a lot of random reads on a RAID array. And you've posted results demonstrating that. What we need to make sure is that there's no significant loss when it's not helping.

It seems that the worst case for this patch is a scan on a table that doesn't fit in shared_buffers, but is fully cached in the OS cache. In that case, the posix_fadvise calls would be a certain waste of time. That could be alleviated by deciding at plan time whether to preread or not, based on effective_cache_size.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
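The plan-time decision suggested above might, very roughly, look like the following. This is a hypothetical heuristic for illustration only; the function name and the bare page-count comparison are mine, and real planner logic would have to weigh much more than this.

```c
#include <stdbool.h>

/* Hypothetical plan-time check: only bother issuing posix_fadvise
 * hints when the data the scan will touch is unlikely to already be
 * sitting in the OS cache.  Both arguments are in pages. */
bool should_preread(double pages_to_access, double effective_cache_size)
{
    /* If everything we plan to touch plausibly fits in cache, the
     * advisory calls are likely just wasted syscalls. */
    return pages_to_access > effective_cache_size;
}
```

The point of gating at plan time is that the per-page fadvise overhead then never arises at all in the fully-cached worst case.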
Re: [HACKERS] Commitfest patches
On Fri, 2008-03-28 at 11:26 +, Gregory Stark wrote:
> "Simon Riggs" <[EMAIL PROTECTED]> writes:
>
> For heaven's sake. I've been posting about this for months.

Any chance of getting all that together on a single wiki page, so we can review everything? We'll need those docs after it's committed anyway.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
Re: [HACKERS] Commitfest patches
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> On Fri, 2008-03-28 at 09:08 +, Gregory Stark wrote:
>
>> A more invasive form of this patch would be to assign and pin a buffer
>> when the preread is done. That would mean subsequently we would have a
>> pinned buffer ready to go and not need to go back to the buffer manager
>> a second time. We would instead just "complete" the I/O by issuing a
>> normal read call.
>
> So if posix_fadvise did nothing or there was a longer than optimal
> delay, this would be a net loss.
>
> You'd need reasonable evidence that the posix_fadvise facility was a win
> on all platforms and recent release levels before we could agree with
> that.

I posted a test program and asked for others to test it on various platforms and various-sized RAID arrays. [EMAIL PROTECTED] was the only person who bothered to run it, and he only found a 16x speedup on his hardware.

http://archives.postgresql.org/pgsql-performance/2008-01/msg00285.php
(unfortunately our mail archives seem to have failed to archive his message; this is my followup to it)

> I think we need a more thorough examination of this area before we
> commit anything. Maybe you've done this, but I haven't seen the
> analysis. Can you say more, please? Or at least say what you don't know,
> so other experts listening can fill in the blanks.

For heaven's sake. I've been posting about this for months. Search the archives for "RAID arrays and performance", "There's random access and then there's random access", and "Bitmap index scan preread using posix_fadvise".

I described which interfaces worked on Linux and Solaris based on empirical tests. I posted source code for synthetic benchmarks so we could test it on a wide range of hardware. I posted graphs based on empirical results. I posted mathematical formulas analysing just how much preread would be expected to exercise a RAID array fully. I'm not sure what else I can do to effect a more thorough examination.
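The kind of synthetic test described above can be sketched like this: batch the WILLNEED hints for a known block list (as a bitmap heap scan could), then read the blocks. This harness is mine, not the posted test program; a real benchmark would time the two loops on uncached data, since on a fully cached file the two modes are expected to be indistinguishable.

```c
/* Skeleton of a fadvise-vs-plain random-read comparison over a temp
 * file.  Only correctness (total bytes read) is checked here; timing
 * results depend entirely on hardware and cache state. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif
#include <fcntl.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h>

#define BLCKSZ  8192
#define NBLOCKS 256
#define NREADS  64

/* Read NREADS random blocks; optionally batch WILLNEED hints first,
 * the way a scan that knows its block list up front could. */
long random_read_bytes(int fd, const off_t *blocks, int use_fadvise)
{
#ifdef POSIX_FADV_WILLNEED
    if (use_fadvise)
        for (int i = 0; i < NREADS; i++)
            (void) posix_fadvise(fd, blocks[i] * (off_t) BLCKSZ,
                                 BLCKSZ, POSIX_FADV_WILLNEED);
#endif
    char buf[BLCKSZ];
    long total = 0;
    for (int i = 0; i < NREADS; i++)
        total += pread(fd, buf, BLCKSZ, blocks[i] * (off_t) BLCKSZ);
    return total;
}

/* Build a temp file, run both modes, verify every read completed. */
int bench_demo(void)
{
    char path[] = "/tmp/fadvbenchXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return 0;
    unlink(path);

    char block[BLCKSZ];
    memset(block, 'y', sizeof block);
    for (int i = 0; i < NBLOCKS; i++)
        if (pwrite(fd, block, BLCKSZ, (off_t) i * BLCKSZ) != BLCKSZ)
        {
            close(fd);
            return 0;
        }

    off_t blocks[NREADS];
    srand(42);                          /* fixed seed: repeatable pattern */
    for (int i = 0; i < NREADS; i++)
        blocks[i] = rand() % NBLOCKS;

    long with_hints = random_read_bytes(fd, blocks, 1);
    long without    = random_read_bytes(fd, blocks, 0);
    close(fd);
    return with_hints == (long) NREADS * BLCKSZ &&
           without    == (long) NREADS * BLCKSZ;
}
```

Since the hints are advisory, both modes must return identical data; any difference a real run measures is purely in elapsed time.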
--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's PostGIS support!
Re: [HACKERS] Commitfest patches
On Fri, 2008-03-28 at 09:08 +, Gregory Stark wrote:
> A more invasive form of this patch would be to assign and pin a buffer
> when the preread is done. That would mean subsequently we would have a
> pinned buffer ready to go and not need to go back to the buffer manager
> a second time. We would instead just "complete" the I/O by issuing a
> normal read call.

So if posix_fadvise did nothing, or there was a longer-than-optimal delay, this would be a net loss.

You'd need reasonable evidence that the posix_fadvise facility was a win on all platforms and recent release levels before we could agree with that.

I think we need a more thorough examination of this area before we commit anything. Maybe you've done this, but I haven't seen the analysis. Can you say more, please? Or at least say what you don't know, so other experts listening can fill in the blanks.

--
Simon Riggs
2ndQuadrant http://www.2ndQuadrant.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk