Re: [HACKERS] WAL Rate Limiting

2014-02-21 Thread Robert Haas
On Thu, Feb 20, 2014 at 12:07 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I think bulk (maintenance) operations should legitimately be configured
 separately from regular UPDATE etc operations.  For the latter I think
 it's pretty reasonable to assume that users are going to want to tweak
 the GUC for each individual callsite in the application, rather than
 wholesale in postgresql.conf.

There's an argument to be made for that design, but the discussion
upthread doesn't support the contention that the patch makes a clean
and well-defined division between those things.  The initial patch
left VACUUM out on the dubious theory that since we have an existing
system to limit its throughput by pages dirtied we don't need a way to
limit the rate at which it generates WAL, and left as an open question
whether this ought to apply to COPY and CREATE TABLE AS SELECT (but
apparently not UPDATE or DELETE, or for that matter just plain SELECT,
when it does a lot of pruning).  A subsequent version of the patch
changed the list of commands affected, but it's not at all clear to me
that we have something as tidy as "only commands where X and never
commands where Y."

More broadly, three committers (myself, Tom, Heikki) expressed the
desire for this to be made into a more generic mechanism, and two of
those people (myself and Tom) said that this was too late for 9.4 and
should be postponed to 9.5.  Then a month went by and Greg Stark
showed up and made noises about pushing this forward as-is.   So I
don't want it to be forgotten that those objections were made and have
not been withdrawn.  I'm not dead set against changing my position on
this patch, but that should happen because of work to improve the
patch - which was last posted on January 17th and did not at that time
even include the requested and agreed fix to measure the limit in MB/s
rather than some odd unit - not because of the simple passage of time.
 If anything, the passage of 5 weeks without meaningful progress ought
to strengthen rather than weaken the argument that this is far too
late for this release.

Of course, if we're talking about 9.5, then disregard the previous
paragraph.  But in that case perhaps we could postpone this until we
don't have 40+ patches needing review and a dozen marked ready for
committer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-02-21 Thread Alvaro Herrera
It's clear now that if what Greg wants is an uncontroversial patch to
commit, this is not it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-02-21 Thread Jim Nasby

On 1/16/14, 9:19 PM, Craig Ringer wrote:

On 01/16/2014 11:52 PM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

On 01/16/2014 05:39 PM, Andres Freund wrote:

Do you see a reasonable way to implement this generically for all
commands? I don't.



In suitable safe places, check if you've written too much WAL, and sleep
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
vacuum_delay_point() is a better analogy. Hopefully you don't need to
sprinkle them in too many places to be useful.


We probably don't really need to implement it for all commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter.  My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.


Then you land up needing a configuration mechanism to control *which*
commands get affected, too, to handle the original use case of "I want
to rate limit vacuum and index rebuilds, while everything else runs
full tilt."

Or do you propose to just do this per-session? If so, what about autovacuum?


Tom was proposing per-session.

For autovac, don't we already have some method of changing the GUCs it uses? If 
not, we should...
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Simon Riggs
On 19 February 2014 16:04, Robert Haas robertmh...@gmail.com wrote:

 Well, *I* don't think this is ready to go.  A WAL rate limit that only
 limits WAL sometimes still doesn't impress me.

Could you be specific in your criticism? "Sometimes" wouldn't impress
anybody, but we need to understand whether you are referring to a bug
or just making a personal assessment of the patch features. Thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Greg Stark
On Thu, Feb 20, 2014 at 1:45 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 19 February 2014 16:04, Robert Haas robertmh...@gmail.com wrote:

 Well, *I* don't think this is ready to go.  A WAL rate limit that only
 limits WAL sometimes still doesn't impress me.

 Could you be specific in your criticism? "Sometimes" wouldn't impress
 anybody, but we need to understand whether you are referring to a bug
 or just making a personal assessment of the patch features. Thanks.


I think it's clear what the question here is. The patch represents a
quick easy win for certain cases but isn't the nice broad solution we
would prefer to have. We face this decision a lot and it's a tough
one. On the one hand we want to set a high standard for ourselves and
not load up the Postgres source with lots of half-measures; on the
other hand we don't want to let perfect be the enemy of the good and
have development stall while we refuse easy wins.

I don't think it would be a terrible thing if this patch went in. And
I also don't think it would be a huge loss if it didn't since I think
we will need IO resource management down the road anyways.

I was looking for a few non-controversial patches to work on at least
at the moment to establish some history before I take on more
controversial patches so I don't think committing this is really a
great idea for me right now. If you wanted to make this a more general
system I have some ideas and I think we could do it fairly easily. We
could make CHECK_FOR_INTERRUPTS have a single flag but if it's set
check the interrupt type and know that some types can be handled by
simply sleeping and not throwing an error. There are already some
cases that are handled specially so it wouldn't really even change the
code much. I would be happy to help get something like that committed.




-- 
greg




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Simon Riggs
On 19 February 2014 13:28, Greg Stark st...@mit.edu wrote:
 On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Agreed; that was the original plan, but implementation delays
 prevented the whole vision/discussion/implementation. Requirements
 from various areas include WAL rate limiting for replication, I/O rate
 limiting, hard CPU and I/O limits for security and mixed workload
 coexistence.

 I'd still like to get something on this in 9.4 that alleviates the
 replication issues, leaving wider changes for later releases.

 My first reaction was that we should just have a generic I/O resource
 throttling. I was only convinced this was a reasonable idea by the
 replication use case. It would help me to understand the specific
 situations where replication breaks down due to WAL bandwidth
 starvation.

The various bulk operations mentioned in the OP can dump GBs of data
into WAL. That causes replication to immediately fall behind, which
slows everything down if you are using synchronous replication. This
has been discussed before on-list over the last few years (and Heroku
staff have been involved in that, fyi).

General I/O limiting is a nice-to-have but what is suggested here is
an essential aspect of running bulk commands when using replication,
especially synchronous replication.

The use cases for I/O limiting and replication limiting are different
in both their effects and their criticality.

The design choice of making the limit only apply to bulk ops is
because that is where the main problem lies. Rate limiting will cause
a loss of performance in the main line for non-bulk operations, so
adding tests there will not be valuable.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Greg Stark
On Thu, Feb 20, 2014 at 2:43 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The design choice of making the limit only apply to bulk ops is
 because that is where the main problem lies. Rate limiting will cause
 a loss of performance in the main line for non-bulk operations, so
 adding tests there will not be valuable.

This isn't a bad point. If you do a massive UPDATE or INSERT you
probably do want the ability to limit the damage to your replication
lag but if you're an OLTP worker and you just happen to handle a
disproportionate amount of the incoming requests you wouldn't. But
then presumably you would just set that GUC that way where your OLTP
workers would run without any limits and your batch jobs would set a
limit.

As I said it wouldn't be terrible if the patch went in and could only
handle cluster/vacuum/create index. They're by far the most common
causes of problems and most people who do massive UPDATES or INSERTS
quickly discover they want to do it in batches of a few thousand rows
anyways so they can limit the speed themselves.

There's a decent argument to be made for incremental changes too. If we
add a feature like this for these operations then we'll find out how
well it works in practice and whether users need finer control or what
other sources become the next problem before we've invested a whole
lot in a general solution.


-- 
greg




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Robert Haas
On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The design choice of making the limit only apply to bulk ops is
 because that is where the main problem lies. Rate limiting will cause
 a loss of performance in the main line for non-bulk operations, so
 adding tests there will not be valuable.

That's pure sophistry.  Of course rate limiting will cause a loss of
performance in the main line for non-bulk operations.  Rate limiting
will also cause a loss of performance for bulk operations.  The whole
point of rate limiting is to cause a loss of performance.  That's not
a bug; that's what the feature is explicitly designed to do.

So-called "non-bulk" operations, which seem to be defined as more or
less anything where actually making this feature work seemed hard,
can include huge inserts, updates, or deletes that greatly depress
performance for other parts of the system, either individually or in
aggregate.  And it's just as legitimate to want to tamp down that
activity as it is to want to slow down CLUSTER.  Remember, this is a
GUC, so it need not be set identically for every user session.  It is
not hard at all to imagine a situation where the user wishes to limit
the rate at which some sessions can write WAL so as to avoid
interfering with other, higher-priority tasks happening in other
sessions.  That is hardly a niche use case; I think I've seen it
reported, if anything, even more frequently than problems with what
you're calling bulk operations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Robert Haas
On Thu, Feb 20, 2014 at 10:10 AM, Greg Stark st...@mit.edu wrote:
 On Thu, Feb 20, 2014 at 2:43 PM, Simon Riggs si...@2ndquadrant.com wrote:
 The design choice of making the limit only apply to bulk ops is
 because that is where the main problem lies. Rate limiting will cause
 a loss of performance in the main line for non-bulk operations, so
 adding tests there will not be valuable.

 This isn't a bad point. If you do a massive UPDATE or INSERT you
 probably do want the ability to limit the damage to your replication
 lag but if you're an OLTP worker and you just happen to handle a
 disproportionate amount of the incoming requests you wouldn't. But
 then presumably you would just set that GUC that way where your OLTP
 workers would run without any limits and your batch jobs would set a
 limit.

 As I said it wouldn't be terrible if the patch went in and could only
 handle cluster/vacuum/create index. They're by far the most common
 causes of problems and most people who do massive UPDATES or INSERTS
 quickly discover they want to do it in batches of a few thousand rows
 anyways so they can limit the speed themselves.

 There's a decent argument to be made for incremental changes too. If we
 add a feature like this for these operations then we'll find out how
 well it works in practice and whether users need finer control or what
 other sources become the next problem before we've invested a whole
 lot in a general solution.

What will more likely happen is that we won't be able to change the
definition in future releases because it breaks backward
compatibility.  If we implement this now for only a subset of
commands, and then later someone proposes a patch to expand it to more
commands, then there's going to be someone who shows up and complains
about that definitional change.  And for good reason.

I'm heartily in favor of incremental development of features, but each
incremental addition needs to stand on its own two feet, not be
something that we can already foresee needing to break or throw out
later.  There are way, way too many PostgreSQL users now for us to be
able to get away with that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Hannu Krosing
On 02/20/2014 04:16 PM, Robert Haas wrote:
 On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
 The design choice of making the limit only apply to bulk ops is
 because that is where the main problem lies. Rate limiting will cause
 a loss of performance in the main line for non-bulk operations, so
 adding tests there will not be valuable.
 That's pure sophistry.  Of course rate limiting will cause a loss of
 performance in the main line for non-bulk operations. 
I think he meant that *adding code for* rate limiting there will cause
loss of performance even if not used.
  Rate limiting
 will also cause a loss of performance for bulk operations.  The whole
 point of rate limiting is to cause a loss of performance.
It should not when it is not enabled.
   That's not
 a bug; that's what the feature is explicitly designed to do.

 So-called non-bulk operations, which seems to be defined as more or
 less anything where actually making this feature work seemed hard,
NO, it is anything that a decent DBA could not script to run in
chunks, which he would do anyway for other reasons, like not
to lock out OLTP.
 can include huge inserts, updates, or deletes that greatly depress
 performance for other parts of the system, either individually or in
 aggregate.  And it's just as legitimate to want to tamp down that
 activity as it is to want to slow down CLUSTER.  Remember, this is a
 GUC, so it need not be set identically for every user session.  It is
 not hard at all to imagine a situation where the user wishes to limit
 the rate at which some sessions can write WAL so as to avoid
 interfering with other, higher-priority tasks happening in other
 sessions. 
It is hard to imagine, but it is still much more infrequent than needing
to do concurrent ops like VACUUM and CREATE INDEX CONCURRENTLY.
  That is hardly a niche use case; I think I've seen it
 reported, if anything, even more frequently than problems with what
 you're calling bulk operations.
Could be, but they are two separate cases. One helps the DBA get stuff
done without stepping on users' toes; the other is about protecting
one user from another.

It is arguable that the first is a sub-case of the other, but I don't think
these should be controlled by the same GUC, though you may want to
have some checking for sane values between the two.

Maybe call this one `maintenance_wal_rate_limit_delay`, the same way as we
have `maintenance_work_mem`.


Cheers

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] WAL Rate Limiting

2014-02-20 Thread Alvaro Herrera
Robert Haas wrote:
 On Thu, Feb 20, 2014 at 9:43 AM, Simon Riggs si...@2ndquadrant.com wrote:
  The design choice of making the limit only apply to bulk ops is
  because that is where the main problem lies. Rate limiting will cause
  a loss of performance in the main line for non-bulk operations, so
  adding tests there will not be valuable.
 
 That's pure sophistry.  Of course rate limiting will cause a loss of
 performance in the main line for non-bulk operations.  Rate limiting
 will also cause a loss of performance for bulk operations.  The whole
 point of rate limiting is to cause a loss of performance.

I think he's saying that it will degrade performance even in the case
where the feature is disabled, which would be undesirable.  I don't
necessarily agree with that assertion, but at least it's something to
consider rather than to laugh at.

 And it's just as legitimate to want to tamp down that
 activity as it is to want to slow down CLUSTER.  Remember, this is a
 GUC, so it need not be set identically for every user session.  It is
 not hard at all to imagine a situation where the user wishes to limit
 the rate at which some sessions can write WAL so as to avoid
 interfering with other, higher-priority tasks happening in other
 sessions.

I don't disagree with this either, but I don't think it necessarily has
to be the same feature.  Most likely we don't want to have users setting
the GUC each time they want to run vacuum; rather they will want to be
able to set one line in postgresql.conf and have it affect all vacuuming
activity.  However, if that same config tweak affects all their UPDATE
statements regardless of them being single-row updates or bulk updates,
I'm pretty sure it's not going to be welcome.

I think bulk (maintenance) operations should legitimately be configured
separately from regular UPDATE etc operations.  For the latter I think
it's pretty reasonable to assume that users are going to want to tweak
the GUC for each individual callsite in the application, rather than
wholesale in postgresql.conf.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Greg Stark
On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Agreed; that was the original plan, but implementation delays
 prevented the whole vision/discussion/implementation. Requirements
 from various areas include WAL rate limiting for replication, I/O rate
 limiting, hard CPU and I/O limits for security and mixed workload
 coexistence.

 I'd still like to get something on this in 9.4 that alleviates the
 replication issues, leaving wider changes for later releases.

My first reaction was that we should just have a generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its disk,
which we could solve more directly by switching to archive mode rather
than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

 The vacuum_* parameters don't allow any control over WAL production,
 which is often the limiting factor. I could, for example, introduce a
 new parameter for vacuum_cost_delay that provides a weighting for each
 new BLCKSZ chunk of WAL, then rename all parameters to a more general
 form. Or I could forget that and just press ahead with the patch as
 is, providing a cleaner interface in next release.

 It's also interesting to wonder about the relationship to
 CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
 that that's *cheap* (1 test and branch) as long as nothing is pending.
 I don't want to see a bunch of arithmetic added to it.

 Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.

In fact you could do all the arithmetic when you do the wal write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.
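
For illustration, a minimal standalone sketch of the flag scheme described
above (the flag name, budget value, and sleep call are assumptions for the
example, not from the patch or the PostgreSQL tree):

/*
 * The WAL-write path does the arithmetic and raises a flag only when the
 * session is over budget, so the CHECK_FOR_INTERRUPTS-style test stays a
 * single test-and-branch in the common case.
 */
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

static volatile bool wal_over_budget = false;        /* checked very often */
static uint64_t wal_bytes_written = 0;               /* bytes since last sleep */
static uint64_t wal_budget_bytes = 8 * 1024 * 1024;  /* example budget */

/* Called from the WAL-write path: do the accounting here. */
static void
account_wal_write(uint64_t nbytes)
{
    wal_bytes_written += nbytes;
    if (wal_bytes_written > wal_budget_bytes)
        wal_over_budget = true;       /* only set when a sleep is actually due */
}

/* The cheap check: one test and branch in the normal case. */
static void
check_for_wal_budget(void)
{
    if (!wal_over_budget)
        return;                       /* common case: nothing to do */

    usleep(10 * 1000);                /* throttle; duration is illustrative */
    wal_bytes_written = 0;
    wal_over_budget = false;
}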

I would dearly love to see generic I/O bandwidth limits, so it would
be nice to see a nicely general pattern here that could be extended
even if we only target wal this release.

I'm going to read the existing patch now, do you think it's ready to
go or did you want to do more work based on the feedback?




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark st...@mit.edu wrote:
 On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Agreed; that was the original plan, but implementation delays
 prevented the whole vision/discussion/implementation. Requirements
 from various areas include WAL rate limiting for replication, I/O rate
 limiting, hard CPU and I/O limits for security and mixed workload
 coexistence.

 I'd still like to get something on this in 9.4 that alleviates the
 replication issues, leaving wider changes for later releases.

 My first reaction was that we should just have a generic I/O resource
 throttling. I was only convinced this was a reasonable idea by the
 replication use case. It would help me to understand the specific
 situations where replication breaks down due to WAL bandwidth
 starvation. Heroku has had some problems with slaves falling behind,
 though the immediate problem that causes is the slave filling up its disk,
 which we could solve more directly by switching to archive mode rather
 than slowing down the master.

 But I would suggest you focus on a specific use case that's
 problematic so we can judge better if the implementation is really
 fixing it.

 The vacuum_* parameters don't allow any control over WAL production,
 which is often the limiting factor. I could, for example, introduce a
 new parameter for vacuum_cost_delay that provides a weighting for each
 new BLCKSZ chunk of WAL, then rename all parameters to a more general
 form. Or I could forget that and just press ahead with the patch as
 is, providing a cleaner interface in next release.

 It's also interesting to wonder about the relationship to
 CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
 that that's *cheap* (1 test and branch) as long as nothing is pending.
 I don't want to see a bunch of arithmetic added to it.

 Good point.

 I think it should be possible to actually merge it into
 CHECK_FOR_INTERRUPTS. Have a single global flag
 io_done_since_check_for_interrupts which is set to 0 after each
 CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
 CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
 in the normal case.

 In fact you could do all the arithmetic when you do the wal write.
 Only set the flag if the bandwidth consumed is above the budget. Then
 the flag should only ever be set when you're about to sleep.

 I would dearly love to see generic I/O bandwidth limits, so it would
 be nice to see a nicely general pattern here that could be extended
 even if we only target wal this release.

 I'm going to read the existing patch now, do you think it's ready to
 go or did you want to do more work based on the feedback?

Well, *I* don't think this is ready to go.  A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-01-20 Thread Simon Riggs
On 17 January 2014 16:34, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 On 01/17/2014 05:20 PM, Simon Riggs wrote:

 +   if (RelationNeedsWAL(indexrel))
 +   CHECK_FOR_WAL_BUDGET();


 I don't think we need the RelationNeedsWAL tests. If the relation is not
 WAL-logged, you won't write much WAL, so you should easily stay under the
 limit. And CHECK_FOR_WAL_BUDGET() better be cheap when you're below the
 limit.

OK, I'll remove them, thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-20 Thread Simon Riggs
On 17 January 2014 16:10, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-17 09:04:54 -0500, Robert Haas wrote:
 That having been said, I bet it could be done at the tail of
 XLogInsert().

 I don't think there are many locations where this would be ok. Sleeping
 while holding exclusive buffer locks? Quite possibly inside a critical
 section?

 More or less by definition, you're always doing both when you call
 XLogInsert.

 Surely not.

 I agree.  It's got to be somewhere further up the call stack.

Definitely.

 I'm inclined to think that what we ought to do is reconceptualize
 vacuum_delay_point() as something a bit more generic, and sprinkle
 calls to it in a few more places than now.

Agreed; that was the original plan, but implementation delays
prevented the whole vision/discussion/implementation. Requirements
from various areas include WAL rate limiting for replication, I/O rate
limiting, hard CPU and I/O limits for security and mixed workload
coexistence.

I'd still like to get something on this in 9.4 that alleviates the
replication issues, leaving wider changes for later releases.

The vacuum_* parameters don't allow any control over WAL production,
which is often the limiting factor. I could, for example, introduce a
new parameter for vacuum_cost_delay that provides a weighting for each
new BLCKSZ chunk of WAL, then rename all parameters to a more general
form. Or I could forget that and just press ahead with the patch as
is, providing a cleaner interface in next release.

 It's also interesting to wonder about the relationship to
 CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
 that that's *cheap* (1 test and branch) as long as nothing is pending.
 I don't want to see a bunch of arithmetic added to it.

Good point.

I'll call these new calls CHECK_FOR_RESOURCES() to allow us to
implement various kinds of resource checking in future.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-20 Thread Simon Riggs
On 17 January 2014 16:30, Heikki Linnakangas hlinnakan...@vmware.com wrote:
 It occurs to me that this is very similar to the method I proposed in June
 to enforce a hard limit on WAL usage, to avoid PANIC caused by running out
 of disk space when writing WAL:

 http://www.postgresql.org/message-id/51b095fe.6050...@vmware.com

 Enforcing a global limit needs more book-keeping than rate limiting
 individual sessions. But I'm hoping that the CHECK_WAL_BUDGET() calls could
 be used for both purposes.

We can't set a useful hard limit on WAL by default without that being
a bigger foot gun than what we have now.

If we did have a limit, I would look to make it less exact and take it
out of the main path, for example by doing it in the checkpointer - or
perhaps only making the check when we switch xlog files.

So I don't want to include that requirement into this current work.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Robert Haas
On Thu, Jan 16, 2014 at 10:56 AM, Andres Freund and...@2ndquadrant.com wrote:
 That'd be most places doing XLogInsert() if you want to be
 consistent. Each needing to be analyzed whether we're blocking important
 resources, determining where in the callstack we can do the
 CHECK_FOR_WAL_BUDGET().

And doing that probably wouldn't be a problem except for the fact that
this patch is showing up at the absolute very last minute with
multiple unresolved design considerations.  I'm with Tom: this should
be postponed to 9.5.

That having been said, I bet it could be done at the tail of
XLogInsert().  If there are cases where that's not desirable, then add
macros HOLDOFF_WAL_THROTTLING() and RESUME_WAL_THROTTLING() that bump
a counter up and down.  When the counter is 0, XLogInsert() doesn't
sleep; when RESUME_WAL_THROTTLING() drops the counter to 0, it also
considers sleeping.  I suspect only a few places would need to do
this, like where we're holding one of the SLRU locks.  Some testing
would be needed to figure out exactly which places cause problems, but
that's why it's a good idea to start discussion of your proposed
feature before January 14th.
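
For illustration, a standalone sketch of how that holdoff counter could
behave (only the two macro names come from the paragraph above; the variable
names, the helper, and the sleep are assumptions for the example):

#include <stdbool.h>
#include <unistd.h>

static int  wal_throttle_holdoff = 0;    /* >0 means: do not sleep here */
static bool wal_sleep_pending = false;   /* set when the budget is exceeded */

/* The accounting path would call this once the session is over budget. */
static void
note_wal_budget_exceeded(void)
{
    wal_sleep_pending = true;
}

/* Sleep only if a sleep is due and no caller has asked us to hold off. */
static void
consider_wal_sleep(void)
{
    if (wal_sleep_pending && wal_throttle_holdoff == 0)
    {
        usleep(10 * 1000);               /* illustrative sleep duration */
        wal_sleep_pending = false;
    }
}

#define HOLDOFF_WAL_THROTTLING()  (wal_throttle_holdoff++)
#define RESUME_WAL_THROTTLING() \
    do { \
        if (--wal_throttle_holdoff == 0) \
            consider_wal_sleep(); \
    } while (0)

/* The tail of XLogInsert() would then call consider_wal_sleep() directly. */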

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Andres Freund
On 2014-01-17 09:04:54 -0500, Robert Haas wrote:
 That having been said, I bet it could be done at the tail of
 XLogInsert().  If there are cases where that's not desirable, then add
 macros HOLDOFF_WAL_THROTTLING() and RESUME_WAL_THROTTLING() that bump
 a counter up and down.  When the counter is 0, XLogInsert() doesn't
 sleep; when RESUME_WAL_THROTTLING() drops the counter to 0, it also
 considers sleeping.  I suspect only a few places would need to do
 this, like where we're holding one of the SLRU locks.

I don't think there are many locations where this would be ok. Sleeping
while holding exclusive buffer locks? Quite possibly inside a critical
section?
Surely not.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-17 09:04:54 -0500, Robert Haas wrote:
 That having been said, I bet it could be done at the tail of
 XLogInsert().

 I don't think there are many locations where this would be ok. Sleeping
 while holding exclusive buffer locks? Quite possibly inside a critical
 section?

More or less by definition, you're always doing both when you call
XLogInsert.

 Surely not.

I agree.  It's got to be somewhere further up the call stack.

I'm inclined to think that what we ought to do is reconceptualize
vacuum_delay_point() as something a bit more generic, and sprinkle
calls to it in a few more places than now.

It's also interesting to wonder about the relationship to
CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
that that's *cheap* (1 test and branch) as long as nothing is pending.
I don't want to see a bunch of arithmetic added to it.

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Simon Riggs
On 16 January 2014 17:20, Simon Riggs si...@2ndquadrant.com wrote:

 Thank you everyone for responding so positively to this proposal. It
 is something many users have complained about.

 In time, I think we might want both WAL Rate Limiting and I/O Rate
 Limiting. IMHO I/O rate limiting is harder and so this proposal is
 restricted solely to WAL rate limiting.

 I'm comfortable with a session level parameter. I was mulling over a
 wal_rate_limit_scope parameter but I think that is too much.
 I will follow Craig's proposal to define this in terms of MB/s, adding
 that I would calc from beginning of statement to now, so it is
 averaged rate. Setting of zero means unlimited. The rate would be
 per-session (in this release) rather than a globally calculated
 setting.

Parameter renamed to wal_rate_limit.

Not yet modified to produce this in MB/s

 I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

Used CHECK_WAL_BUDGET() as proposed

 That seems like a cheap enough call to allow it to be added in more
 places, so my expanded list of commands touched are
  CLUSTER/VACUUM FULL
  ALTER TABLE (only in phase 3 table rewrite)
  ALTER TABLE SET TABLESPACE
  CREATE INDEX
 which are already there, plus new ones suggested/implied
  COPY
  CREATE TABLE AS SELECT
  INSERT/UPDATE/DELETE

Added, no problems or technical difficulties encountered.

 Please let me know if I missed something (rather than debating what is
 or is not a maintenance command).

 Any other thoughts before I cut v2 ?

v2 attached

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


wal_rate_limiting.v2.patch
Description: Binary data



Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Heikki Linnakangas

On 01/17/2014 05:20 PM, Simon Riggs wrote:

+   if (RelationNeedsWAL(indexrel))
+   CHECK_FOR_WAL_BUDGET();


I don't think we need the RelationNeedsWAL tests. If the relation is not 
WAL-logged, you won't write much WAL, so you should easily stay under 
the limit. And CHECK_FOR_WAL_BUDGET() better be cheap when you're below 
the limit.


- Heikki




Re: [HACKERS] WAL Rate Limiting

2014-01-17 Thread Heikki Linnakangas
It occurs to me that this is very similar to the method I proposed in 
June to enforce a hard limit on WAL usage, to avoid PANIC caused by 
running out of disk space when writing WAL:


http://www.postgresql.org/message-id/51b095fe.6050...@vmware.com

Enforcing a global limit needs more book-keeping than rate limiting 
individual sessions. But I'm hoping that the CHECK_WAL_BUDGET() calls
could be used for both purposes.


- Heikki




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Robert Haas
On Wed, Jan 15, 2014 at 7:54 AM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Jan 15, 2014 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:
 We've discussed previously the negative impact of large bulk
 operations, especially wrt WAL writes. Patch here allows maintenance
 operations to have their WAL generation slowed down as a replication
 lag prevention feature.

 I believe there was originally intended to be some work on I/O rate
 limiting, but that hasn't happened and is in some ways orthogonal to
 this patch and we will likely eventually want both.

 Single new parameter works very similarly to vacuum_cost_delay

 wal_rate_limit_delay = Xms


 Seems like a really bad name if we are only slowing down some commands -
 that seems to indicate we're slowing down all of them. I think it should be
 something that indicates that it only affects the maintenance commands.

And why should it only affect the maintenance commands anyway, and who
decides what's a maintenance command?

I thought Heroku suggested something like this previously, and their
use case was something along the lines of "we need to slow the system
down enough to do a backup so we can delete some stuff before the disk
fills."  For that, it seems likely to me that you would just want to
slow everything down.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 09:06:30 -0500, Robert Haas wrote:
  Seems like a really bad name if we are only slowing down some commands -
  that seems to indicate we're slowing down all of them. I think it should be
  something that indicates that it only affects the maintenance commands.
 
 And why should it only affect the maintenance commands anyway, and who
 decides what's a maintenance command?

I think implementing it for everything might have some use, but it's a
much, much more complex task. You can't simply do rate limiting in
XLogInsert() or somesuch - we're holding page locks, buffer pins, other
locks... I don't see why that needs to be done in the same feature.

I don't really see much difficulty in determining what's a utility
command and what not for the purpose of this? All utility commands which
create WAL in O(table_size) or worse.

 I thought Heroku suggested something like this previously, and their
 use case was something along the lines of "we need to slow the system
 down enough to do a backup so we can delete some stuff before the disk
 fills."  For that, it seems likely to me that you would just want to
 slow everything down.

I think the usecase for this is more along the lines of not slowing normal
operations or causing replication delays to standbys unduly, while
performing maintenance operations on relations.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I don't really see much difficulty in determining what's a utility
 command and what not for the purpose of this? All utility commands which
 create WAL in O(table_size) or worse.

By that definition, INSERT, UPDATE, and DELETE can all be utility
commands.  So would a full-table-scan SELECT, if it's unfortunate
enough to run into a lot of hint-setting or HOT-pruning work.

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 Do you see a reasonable way to implement this generically for all
 commands? I don't.

 In suitable safe places, check if you've written too much WAL, and sleep 
 if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of 
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe 
 vacuum_delay_point() is a better analogy. Hopefully you don't need to 
 sprinkle them in too many places to be useful.

We probably don't really need to implement it for all commands; I think
everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
emit enough WAL to really matter.  My point was that we should try to hit
everything that potentially *could* generate lots of WAL, rather than
arbitrarily deciding that some are utility commands and some are not.

For INSERT/UPDATE/DELETE, likely you could do this with a single 
CHECK_FOR_WAL_BUDGET() added at a strategic spot in nodeModifyTable.c.
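
As a rough illustration of that placement, a schematic per-row loop with one
budget check per modified row; every name below is a made-up stub for the
example, not the real executor API:

#include <stdbool.h>

static int  rows_left = 3;               /* stand-in for the input plan */
static bool get_next_tuple(void)       { return rows_left-- > 0; }
static void apply_modification(void)   { /* the INSERT/UPDATE/DELETE work */ }
static void CHECK_FOR_WAL_BUDGET(void) { /* sleep here if over budget */ }

static void
modify_table_loop(void)
{
    for (;;)
    {
        if (!get_next_tuple())           /* no more input rows */
            break;

        apply_modification();

        /*
         * One throttle point per row, after the row's WAL would have been
         * written and while no buffer locks are held.
         */
        CHECK_FOR_WAL_BUDGET();
    }
}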

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I don't really see much difficulty in determining what's a utility
  command and what not for the purpose of this? All utility commands which
  create WAL in O(table_size) or worse.
 
 By that definition, INSERT, UPDATE, and DELETE can all be utility
 commands.  So would a full-table-scan SELECT, if it's unfortunate
 enough to run into a lot of hint-setting or HOT-pruning work.

Well, I said *utility* command. So everything processed by
ProcessUtility() generating WAL like that.

 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.

Do you see a reasonable way to implement this generically for all
commands? I don't.
We shouldn't let the need for generic resource control stop us from
providing some form of resource control for a consistent subset of
commands.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 16:56, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.
 
 Do you see a reasonable way to implement this generically for all
 commands? I don't.

 In suitable safe places, check if you've written too much WAL, and sleep if
 so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
 vacuum_delay_point() is a better analogy. Hopefully you don't need to
 sprinkle them in too many places to be useful.

 That'd be most places doing XLogInsert() if you want to be
 consistent. Each needing to be analyzed whether we're blocking important
 resources, determining where in the callstack we can do the
 CHECK_FOR_WAL_BUDGET().

 I don't think the added power of allowing bulk DML to be metered is
 worth the increased implementation cost.

 I think the usecases that would want this for DML probably also want this
 to work for unlogged, temp tables. That'd require a much more generic
 resource control framework.

Thank you everyone for responding so positively to this proposal. It
is something many users have complained about.

In time, I think we might want both WAL Rate Limiting and I/O Rate
Limiting. IMHO I/O rate limiting is harder and so this proposal is
restricted solely to WAL rate limiting.

I'm comfortable with a session level parameter. I was mulling over a
wal_rate_limit_scope parameter but I think that is too much.
I will follow Craig's proposal to define this in terms of MB/s, adding
that I would calc from beginning of statement to now, so it is
averaged rate. Setting of zero means unlimited. The rate would be
per-session (in this release) rather than a globally calculated
setting.
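
For illustration, a standalone model of such an averaged MB/s limit (the
parameter name, units, and helpers are assumptions for the example, not the
patch):

#include <stdint.h>
#include <time.h>
#include <unistd.h>

static double   wal_rate_limit_mbps = 5.0;   /* 0 would mean "unlimited" */
static uint64_t stmt_wal_bytes = 0;          /* WAL written by this statement */
static struct timespec stmt_start;           /* set at statement start */

static void
statement_start(void)
{
    clock_gettime(CLOCK_MONOTONIC, &stmt_start);
    stmt_wal_bytes = 0;
}

static double
elapsed_seconds(void)
{
    struct timespec now;

    clock_gettime(CLOCK_MONOTONIC, &now);
    return (now.tv_sec - stmt_start.tv_sec) +
           (now.tv_nsec - stmt_start.tv_nsec) / 1e9;
}

/* Called after writing nbytes of WAL for the current statement. */
static void
throttle_wal(uint64_t nbytes)
{
    double  allowed_seconds;
    double  elapsed;

    if (wal_rate_limit_mbps <= 0.0)
        return;                              /* limiting disabled */

    stmt_wal_bytes += nbytes;
    elapsed = elapsed_seconds();

    /* how long the statement should have taken to emit this much WAL */
    allowed_seconds = stmt_wal_bytes / (wal_rate_limit_mbps * 1024.0 * 1024.0);

    if (allowed_seconds > elapsed)
        usleep((useconds_t) ((allowed_seconds - elapsed) * 1e6));
}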

I would like to call it CHECK_FOR_WAL_RATE_LIMIT()

That seems like a cheap enough call to allow it to be added in more
places, so my expanded list of commands touched are
 CLUSTER/VACUUM FULL
 ALTER TABLE (only in phase 3 table rewrite)
 ALTER TABLE SET TABLESPACE
 CREATE INDEX
which are already there, plus new ones suggested/implied
 COPY
 CREATE TABLE AS SELECT
 INSERT/UPDATE/DELETE

Please let me know if I missed something (rather than debating what is
or is not a maintenance command).

Any other thoughts before I cut v2 ?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 In suitable safe places, check if you've written too much WAL, and sleep if
 so.

 That'd be most places doing XLogInsert() if you want to be
 consistent.

See my other response.  There's no need for consistency, only to be sure
that code paths that might generate a *lot* of WAL include such checks.
We've gotten along fine without any formal methodology for where to put
CHECK_FOR_INTERRUPTS() or vacuum_delay_point() calls, and this seems like
just more of the same.

  I think the usecases that would want this for DML probably also want this
 to work for unlogged, temp tables.

Huh?  Unlogged tables generate *zero* WAL, by definition.

If your point is that WAL creation isn't a particularly good resource
consumption metric, that's an argument worth considering, but it seems
quite orthogonal to whether we can implement such throttling reasonably.
And in any case, you didn't provide such an argument.

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 17:47:51 +0200, Heikki Linnakangas wrote:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 On 2014-01-16 10:35:20 -0500, Tom Lane wrote:
 I think possibly a more productive approach to this would be to treat
 it as a session-level GUC setting, rather than hard-wiring it to affect
 certain commands and not others.
 
 Do you see a reasonable way to implement this generically for all
 commands? I don't.
 
 In suitable safe places, check if you've written too much WAL, and sleep if
 so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe
 vacuum_delay_point() is a better analogy. Hopefully you don't need to
 sprinkle them in too many places to be useful.

That'd be most places doing XLogInsert() if you want to be
consistent. Each needing to be analyzed whether we're blocking important
resources, determining where in the callstack we can do the
CHECK_FOR_WAL_BUDGET().

I don't think the added power of allowing bulk DML to be metered is
worth the increased implementation cost.

I think the usecases that would want this for DML probably also want this
to work for unlogged, temp tables. That'd require a much more generic
resource control framework.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 11:19:28 -0500, Tom Lane wrote:
  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.
 
 Huh?  Unlogged tables generate *zero* WAL, by definition.

Yes. That's my point. If we provide it as a generic resource control -
which is what the discussion here sounds like to me - it should be generic.

If we provide it as a measure to prevent standbys from getting out of date
due to maintenance commands, then it only needs to cover those.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:22, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-01-16 11:19:28 -0500, Tom Lane wrote:
  I think the usecases that would want this for DML probably also want this
  to work for unlogged, temp tables.

 Huh?  Unlogged tables generate *zero* WAL, by definition.

 Yes. That's my point. If we provide it as a generic resource control -
 which is what the discussion here sounds like to me - it should be generic.

 If we provide it as a measure to prevent standbys from getting out of date
 due to maintenance commands, then it only needs to cover those.

Agreed, but it won't happen in this release. I/O resource control to
follow in later releases.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Andres Freund
On 2014-01-16 17:20:19 +0100, Simon Riggs wrote:
 I'm comfortable with a session level parameter. I was mulling over a
 wal_rate_limit_scope parameter but I think that is too much.
 I will follow Craig's proposal to define this in terms of MB/s, adding
 that I would calc from beginning of statement to now, so it is
 averaged rate. Setting of zero means unlimited. The rate would be
 per-session (in this release) rather than a globally calculated
 setting.
 
 I would like to call it CHECK_FOR_WAL_RATE_LIMIT()
 
 That seems like a cheap enough call to allow it to be added in more
 places, so my expanded list of commands touched are
  CLUSTER/VACUUM FULL
  ALTER TABLE (only in phase 3 table rewrite)
  ALTER TABLE SET TABLESPACE
  CREATE INDEX
 which are already there, plus new ones suggested/implied
  COPY
  CREATE TABLE AS SELECT
  INSERT/UPDATE/DELETE
 
 Please let me know if I missed something (rather than debating what is
 or is not a maintenance command).

If we're going to do this for DML - which I am far from convinced of -
we also should do it for SELECT, since that can generate significant
amounts of WAL with checksums turned on.
Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
very confusingly since suddenly they will not only block the WAL
generated by the INSERT but also the SELECT.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:29, Andres Freund and...@2ndquadrant.com wrote:

 Please let me know if I missed something (rather than debating what is
 or is not a maintenance command).

 If we're going to do this for DML - which I am far from convinced of -
 we also should do it for SELECT, since that can generate significant
 amounts of WAL with checksums turned on.
 Otherwise stuff like INSERT ... SELECT, UPDATE FROM et al. will behave
 very confusingly since suddenly they will not only block the WAL
 generated by the INSERT but also the SELECT.

Good point, but rather happily I can say "thought of that" and refer
you to the other patch, which limits SELECT's ability to dirty pages
and thus, with checksums enabled, will limit the generation of WAL.
(And no, they aren't the same thing.)

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Agreed, but it won't happen in this release. I/O resource control to
 follow in later releases.

This release?  TBH, I think this patch has arrived too late to be
considered for 9.4.  It's a fairly major feature and clearly not
without controversy, so I don't think that posting it for the first
time the day before CF4 starts meets the guidelines we all agreed
to back in Ottawa.

regards, tom lane




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Simon Riggs
On 16 January 2014 17:49, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Agreed, but it won't happen in this release. I/O resource control to
 follow in later releases.

 This release?  TBH, I think this patch has arrived too late to be
 considered for 9.4.  It's a fairly major feature and clearly not
 without controversy, so I don't think that posting it for the first
 time the day before CF4 starts meets the guidelines we all agreed
 to back in Ottawa.

I think it's majorly useful, but there really isn't much to the
implementation, and I expect it's beneath the level of being defined as
a major feature. (I think it may have taken less time to write than
this current conversation has lasted.)

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Heikki Linnakangas

On 01/16/2014 05:39 PM, Andres Freund wrote:

On 2014-01-16 10:35:20 -0500, Tom Lane wrote:

I think possibly a more productive approach to this would be to treat
it as a session-level GUC setting, rather than hard-wiring it to affect
certain commands and not others.


Do you see a reasonable way to implement this generically for all
commands? I don't.


In suitable safe places, check if you've written too much WAL, and sleep 
if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of 
CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe 
vacuum_delay_point() is a better analogy. Hopefully you don't need to 
sprinkle them in too many places to be useful.
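
To make that concrete, here is a minimal sketch of what such a delay
point might look like, assuming a per-session budget GUC and a
per-backend counter of WAL bytes written. wal_budget_per_sec,
pgWalBytesWritten and WALBudgetDelayPoint() are illustrative names
only, not symbols that exist today:

/*
 * Sketch only: wal_budget_per_sec, pgWalBytesWritten and
 * WALBudgetDelayPoint() are illustrative names, not existing symbols.
 */
#include "postgres.h"
#include "miscadmin.h"
#include "utils/timestamp.h"

extern int      wal_budget_per_sec;    /* WAL bytes allowed per second, 0 = off */
extern uint64   pgWalBytesWritten;     /* WAL bytes this backend has inserted */
extern void     WALBudgetDelayPoint(void);

#define CHECK_FOR_WAL_BUDGET() \
    do { if (wal_budget_per_sec > 0) WALBudgetDelayPoint(); } while (0)

void
WALBudgetDelayPoint(void)
{
    static TimestampTz window_start = 0;
    static uint64   window_base = 0;
    long        secs;
    int         usecs;

    if (window_start == 0)
    {
        window_start = GetCurrentTimestamp();
        window_base = pgWalBytesWritten;
        return;
    }

    /* Sleep only once this one-second window's budget has been used up */
    if (pgWalBytesWritten - window_base >= (uint64) wal_budget_per_sec)
    {
        TimestampDifference(window_start, GetCurrentTimestamp(), &secs, &usecs);
        if (secs == 0)
            pg_usleep(1000000L - usecs);   /* wait out the rest of the window */

        window_start = GetCurrentTimestamp();
        window_base = pgWalBytesWritten;
    }
}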


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Jeff Janes
On Thu, Jan 16, 2014 at 8:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:


  I think the use cases that would want this for DML probably also want this
  to work for unlogged and temp tables.

 Huh?  Unlogged tables generate *zero* WAL, by definition.


Transactions that only change unlogged tables still generate commit records
to WAL.

I don't think that amount of WAL is particularly relevant to this
discussion, but I was recently surprised by it, so I wanted to publicize it.
(It was causing a lot of churn in my WAL due to its interaction with
archive_timeout.)

Cheers,

Jeff


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Peter Eisentraut
The documentation doesn't build:

openjade:config.sgml:1416:11:E: end tag for VARIABLELIST omitted, but OMITTAG 
NO was specified
openjade:config.sgml:1390:5: start tag was here



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-16 Thread Craig Ringer
On 01/16/2014 11:52 PM, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 01/16/2014 05:39 PM, Andres Freund wrote:
 Do you see a reasonable way to implement this generically for all
 commands? I don't.
 
 In suitable safe places, check if you've written too much WAL, and sleep 
 if so. Call it CHECK_FOR_WAL_BUDGET(), along the lines of 
 CHECK_FOR_INTERRUPTS(), but called less frequently. Or maybe 
 vacuum_delay_point() is a better analogy. Hopefully you don't need to 
 sprinkle them in too many places to be useful.
 
 We probably don't really need to implement it for all commands; I think
 everyone can agree that, say, ALTER TABLE RENAME COLUMN isn't going to
 emit enough WAL to really matter.  My point was that we should try to hit
 everything that potentially *could* generate lots of WAL, rather than
 arbitrarily deciding that some are utility commands and some are not.

Then you land up needing a configuration mechanism to control *which*
commands get affected, too, to handle the original use case of "I want
to rate limit vacuum and index rebuilds, while everything else runs
full tilt".

Or do you propose to just do this per-session? If so, what about autovacuum?


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-15 Thread Magnus Hagander
On Wed, Jan 15, 2014 at 3:20 AM, Simon Riggs si...@2ndquadrant.com wrote:

 We've discussed previously the negative impact of large bulk
 operations, especially wrt WAL writes. Patch here allows maintenance
 operations to have their WAL generation slowed down as a replication
 lag prevention feature.

 I believe there was originally intended to be some work on I/O rate
 limiting, but that hasn't happened and is in some ways orthogonal to
 this patch and we will likely eventually want both.

 Single new parameter works very similarly to vacuum_cost_delay

 wal_rate_limit_delay = Xms


Seems like a really bad name if we are only slowing down some commands -
that seems to indicate we're slowing down all of them. I think it should be
something that indicates that it only affects the maintenance commands.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


[HACKERS] WAL Rate Limiting

2014-01-14 Thread Simon Riggs
We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms

slows down CLUSTER, VACUUM FULL, ALTER TABLE (rewrite & set
tablespace), CREATE INDEX,
so basically the same things we optimise WAL for and the same places where
we honour maintenance_work_mem.
(discuss: should we add COPY, CTAS etc. also?)
(discuss: do we need another parameter to specify cost? Currently the
patch sleeps every 64kB of WAL)

VACUUM is not included, since we already have controls for that -
honouring two controls would be complex and weird.

Uses the GetCurrentTransactionWALVolume() patch, which is included within
this patch to make it easier to review as a whole.

Technically, we can't simply wait before/after WAL inserts because
these typically occur while holding buffer locks. So we need to put
the waits at a higher level, notably in safe places that currently do
CHECK_FOR_INTERRUPTS(). Doing that during query execution might make
locking of blocks for nested loop joins much worse.
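
For illustration, the delay point itself is small. A rough sketch,
assuming GetCurrentTransactionWALVolume() (from the included patch)
returns the WAL bytes written so far by the current transaction; the
other names are invented for this example:

/*
 * Illustrative sketch of the delay point.  wal_rate_limit_delay is the GUC
 * described above (ms per sleep); GetCurrentTransactionWALVolume() is from
 * the included patch and assumed here to return WAL bytes written so far by
 * the current transaction.
 */
#include "postgres.h"
#include "miscadmin.h"

#define WAL_RATE_LIMIT_CHUNK    (64 * 1024)    /* sleep once per 64kB of WAL */

extern int      wal_rate_limit_delay;          /* milliseconds; 0 disables */
extern uint64   GetCurrentTransactionWALVolume(void);

void
WALRateLimitDelayPoint(void)
{
    /* would need resetting at transaction start in a real implementation */
    static uint64   last_throttled = 0;
    uint64          cur;

    if (wal_rate_limit_delay <= 0)
        return;

    cur = GetCurrentTransactionWALVolume();
    if (cur - last_throttled >= WAL_RATE_LIMIT_CHUNK)
    {
        pg_usleep(wal_rate_limit_delay * 1000L);
        last_throttled = cur;
    }
}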

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


wal_rate_limiting.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-14 Thread Craig Ringer
On 01/15/2014 10:20 AM, Simon Riggs wrote:
 (discuss: do we need another parameter to specify cost? Currently
 patch uses sleep every 64kB of WAL)

It might be nicer to express this as "up to n MB of WAL per second", but
otherwise it seems reasonable, and it's easy enough to convert WAL MB/s
into a delay by hand.
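
For instance, with the 64kB-per-sleep granularity from the original
post, wal_rate_limit_delay = 10ms works out to at most 64kB every
10ms, i.e. roughly 6.4 MB/s of WAL, and that's an upper bound since
the sleeps come on top of the time spent doing the work itself.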

Initial tests of this show that it does limit index creation rates as
expected, while continuing to allow WAL generation at normal rates for
other concurrent work.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-14 Thread Jim Nasby

On 1/14/14, 8:20 PM, Simon Riggs wrote:

We've discussed previously the negative impact of large bulk
operations, especially wrt WAL writes. Patch here allows maintenance
operations to have their WAL generation slowed down as a replication
lag prevention feature.

I believe there was originally intended to be some work on I/O rate
limiting, but that hasn't happened and is in some ways orthogonal to
this patch and we will likely eventually want both.

Single new parameter works very similarly to vacuum_cost_delay

wal_rate_limit_delay = Xms


From a replication standpoint, I think it would be a lot nicer if this also (or 
instead) considered the replication delay we were currently experiencing.

Related to that... it would be extremely useful to us if we could similarly 
rate limit certain backend operations, for the purpose of not overloading the 
database or replication. That's fairly unrelated except that sometimes you 
might want to WAL rate limit a backend operation, besides just COPY.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL Rate Limiting

2014-01-14 Thread Pavan Deolasee
On Wed, Jan 15, 2014 at 7:50 AM, Simon Riggs si...@2ndquadrant.com wrote:

 We've discussed previously the negative impact of large bulk
 operations, especially wrt WAL writes. Patch here allows maintenance
 operations to have their WAL generation slowed down as a replication
 lag prevention feature.

 I believe there was originally intended to be some work on I/O rate
 limiting, but that hasn't happened and is in some ways orthogonal to
 this patch and we will likely eventually want both.

 Single new parameter works very similarly to vacuum_cost_delay

 wal_rate_limit_delay = Xms

 slows down CLUSTER, VACUUM FULL, ALTER TABLE (rewrite & set
 tablespace), CREATE INDEX,
 so basically the same things we optimise WAL for and the same places where
 we honour maintenance_work_mem.
 (discuss: should we add COPY, CTAS etc. also?)
 (discuss: do we need another parameter to specify cost? Currently the
 patch sleeps every 64kB of WAL)

 VACUUM is not included, since we already have controls for that -
 honouring two controls would be complex and weird.


I wonder if we should replace vacuum_cost/delay with, say,
maintenance_cost/delay and use it in all maintenance activities, including
what you just listed above. While the exact semantics of vacuum and
other maintenance activities may differ, ISTM the final goal is just the
same, i.e. to reduce load on the master.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee