Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Tue, May  6, 2014 at 11:15:17PM +0100, Simon Riggs wrote:
  Well, for what it's worth, I've encountered systems where setting
  effective_cache_size too low resulted in bad query plans, but I've
  never encountered the reverse situation.
 
 I agree with that.
 
 Though that misses my point, which is that you can't know that all of
 that memory is truly available on a server with many concurrent users.
 Choosing settings that undercost memory-intensive plans is not the
 best choice for a default strategy in a mixed workload when cache may
 be better used elsewhere, even if such settings make sense for some
 individual users.

This is the same problem we had with auto-tuning work_mem, in that we
didn't know what other concurrent activity was happening.  Seems we need
concurrent activity detection before auto-tuning work_mem and
effective_cache_size.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Amit Langote
On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:

 This is the same problem we had with auto-tuning work_mem, in that we
 didn't know what other concurrent activity was happening.  Seems we need
 concurrent activity detection before auto-tuning work_mem and
 effective_cache_size.


Perhaps I am missing something obvious here, but would mmgr have any
useful numbers on this? Like any book-keeping info maintained by
mcxt.c/aset.c? Would extending that interface help?

--
Amit




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote:
 On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:
 
  This is the same problem we had with auto-tuning work_mem, in that we
  didn't know what other concurrent activity was happening.  Seems we need
  concurrent activity detection before auto-tuning work_mem and
  effective_cache_size.
 
 
 Perhaps I am missing something obvious here, but would mmgr have any
 useful numbers on this? Like any book-keeping info maintained by
 mcxt.c/aset.c? Would extending that interface help?

No, all memory allocation is per-process, except for shared memory.  We
probably need a way to record our large local memory allocations in
PGPROC that other backends can see;  same for effective cache size
assumptions we make.
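
To make that concrete, here is a minimal sketch of the advertising
side; the field name, placement, and granularity are all invented for
illustration, not an actual patch:

    /* Hypothetical addition to struct PGPROC (src/include/storage/proc.h):
     *
     *    Size  largeLocalMem;   -- advisory count of bytes held in big
     *                              local allocations (sorts, hashes);
     *                              written only by the owning backend
     *
     * Because only the owner writes it, other backends can read it
     * without taking a lock; a stale value just skews an estimate. */

    /* In the owning backend, bracketing a large allocation: */
    MyProc->largeLocalMem += request_size;   /* before the big palloc */
    ...
    MyProc->largeLocalMem -= request_size;   /* when it is freed */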

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Amit Langote
On Thu, May 15, 2014 at 11:24 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, May 15, 2014 at 10:23:19PM +0900, Amit Langote wrote:
 On Thu, May 15, 2014 at 9:06 PM, Bruce Momjian br...@momjian.us wrote:
 
  This is the same problem we had with auto-tuning work_mem, in that we
  didn't know what other concurrent activity was happening.  Seems we need
  concurrent activity detection before auto-tuning work_mem and
  effective_cache_size.
 

 Perhaps I am missing something obvious here, but would mmgr have any
 useful numbers on this? Like any book-keeping info maintained by
 mcxt.c/aset.c? Would extending that interface help?

 No, all memory allocation is per-process, except for shared memory.  We
 probably need a way to record our large local memory allocations in
 PGPROC that other backends can see;  same for effective cache size
 assumptions we make.


I see. I thought there would be some centralised way to traverse, say,
a linked list of contexts that individual backends create or something
like that. But, I suppose it would not be straightforward to make any
of that work for what we are after here.

--
Amit




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Bruce Momjian
On Thu, May 15, 2014 at 11:36:51PM +0900, Amit Langote wrote:
  No, all memory allocation is per-process, except for shared memory.  We
  probably need a way to record our large local memory allocations in
  PGPROC that other backends can see;  same for effective cache size
  assumptions we make.
 
 
 I see. I thought there would be some centralised way to traverse, say,
 a linked list of contexts that individual backends create or something
 like that. But, I suppose it would not be straightforward to make any
 of that work for what we are after here.

The problem is locking overhead between sessions.  Right now we avoid
all of that, and I think if we just put the value in PGPROC, it will be
good enough with limited locking required.
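
Assuming a per-backend advisory field like the largeLocalMem sketched
earlier, the reader side really can stay nearly lock-free, since a
stale read only skews an estimate:

    #include "postgres.h"
    #include "storage/proc.h"

    /* Hypothetical reader: sum what all backends have advertised.
     * ProcGlobal->allProcs / allProcCount are the real shared PGPROC
     * array; largeLocalMem is the invented advisory field.  No lock is
     * taken because the numbers are only planning hints. */
    static Size
    total_advertised_local_mem(void)
    {
        Size    total = 0;
        int     i;

        for (i = 0; i < ProcGlobal->allProcCount; i++)
            total += ProcGlobal->allProcs[i].largeLocalMem;

        return total;
    }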

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-15 Thread Robert Haas
On Thu, May 15, 2014 at 8:06 AM, Bruce Momjian br...@momjian.us wrote:
 On Tue, May  6, 2014 at 11:15:17PM +0100, Simon Riggs wrote:
  Well, for what it's worth, I've encountered systems where setting
  effective_cache_size too low resulted in bad query plans, but I've
  never encountered the reverse situation.

 I agree with that.

 Though that misses my point, which is that you can't know that all of
 that memory is truly available on a server with many concurrent users.
 Choosing settings that undercost memory-intensive plans is not the
 best choice for a default strategy in a mixed workload when cache may
 be better used elsewhere, even if such settings make sense for some
 individual users.

 This is the same problem we had with auto-tuning work_mem, in that we
 didn't know what other concurrent activity was happening.  Seems we need
 concurrent activity detection before auto-tuning work_mem and
 effective_cache_size.

I think it's worse than that: we don't even know what else is
happening *in the same query*.  For example, look at this:

http://www.postgresql.org/message-id/16161.1324414...@sss.pgh.pa.us

That's pretty awful, and it's just one example of a broader class of
problems that we haven't even tried to solve.  We really need a way to
limit memory usage on a per-query basis rather than a per-node basis.
For example, consider a query plan that needs to do four sorts.  If
work_mem = 64MB, we'll happily use 256MB total, 64MB for each sort.
Now, that might cause the system to swap: since there are four sorts,
maybe we ought to have used only 16MB per sort, and switched to a heap
sort if that wasn't enough.  But it's even subtler than that: if we
had known when building the query plan that we only had 16MB per sort
rather than 64MB per sort, we would potentially have estimated higher
costs for those sorts in the first place, which might have led to a
different plan that needed fewer sorts to begin with.
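
The arithmetic of per-node versus per-query budgeting is worth spelling
out; this toy program just restates the example above, and nothing like
it exists in the planner today:

    /* Toy contrast of per-node vs. per-query memory budgets. */
    #include <stdio.h>

    int
    main(void)
    {
        int     work_mem_mb = 64;   /* work_mem = 64MB         */
        int     nsorts = 4;         /* sorts in one query plan */

        /* per-node (current behavior): each sort gets all of work_mem */
        printf("per-node:  %d MB total\n", nsorts * work_mem_mb);    /* 256 */

        /* per-query (the alternative): one budget split among the sorts */
        printf("per-query: %d MB per sort\n", work_mem_mb / nsorts); /* 16 */

        return 0;
    }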

When you start to try to balance memory usage across multiple
backends, things get even more complicated.  If the first query that
starts up is allowed to use all the available memory, and we respond
to that by lowering the effective value of work_mem to something very
small, a second query that shows up a bit later might choose a very
inefficient plan as a result.  That in turn might cause heavy I/O load
on the system for a long time, making the first query run very slowly.
 We might have been better off just letting the first query finish,
and then running the second one (with a much better plan) after it was
done.  Or, maybe we should have only let the first query take a
certain fraction (half? 10%?) of the available memory, so that there
was more left for the second guy.  But that could be wrong too - it
might cause the first plan to be unnecessarily inefficient when nobody
was planning to run any other queries anyway.  Plus, DBAs hate it when
plans change on them unexpectedly, so anything that involves a
feedback loop between current utilization and query plans will be
unpopular with some people for that reason.

These are hard problems.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Simon Riggs
On 6 May 2014 17:55, Andres Freund and...@2ndquadrant.com wrote:

 All this changes is the cost of
 IndexScans that would use more than 25% of shared_buffers worth of
 data. Hopefully not many of those in your workload. Changing the cost
 doesn't necessarily prevent index scans either. And if there are many
 of those in your workload AND you run more than one at same time, then
 the larger setting will work against you. So the benefit window for
 such a high setting is slim, at best.

 Why? There's many workloads where indexes are larger than shared buffers
 but fit into the operating system's cache. And that's precisely what
 effective_cache_size is about.
 Especially on bigger machines shared_buffers can't be set big enough to
 actually use all the machine's memory. It's not uncommon to have 4GB
 shared buffers on a machine with 512GB RAM... It'd be absolutely
 disastrous to set effective_cache_size to 1GB for an analytics workload.

In this case, a setting of effective_cache_size > (4 * shared_buffers)
could be appropriate, as long as we are certain we have the memory.

We don't have any stats on peak memory usage to be certain - although
in that case it's pretty clear.

If we had stats on how effective the indexscan was at multiple-hitting
earlier read blocks, we'd be able to autotune, but I accept that
without that we do still need the parameter.

 I specifically picked 25% of shared_buffers because that is the point
 at which sequential scans become more efficient and use the cache more
 efficiently. If our cost models are correct, then switching away from
 index scans shouldn't hurt at all.

 More often than not indexes are smaller than the table size, so this
 argument doesn't seem to make much sense.

If we believe that 25% of shared_buffers worth of heap blocks would
flush the cache doing a SeqScan, why should we allow 400% of
shared_buffers worth of index blocks? In your example, that would be
1GB of heap blocks, or 16GB of index blocks.  If our table is 100GB
with a 32GB index, then yes, that is 1% of the heap and 50% of the
index. But that doesn't matter, since I am discussing the point at
which we prevent the cache being churned. Given your example we do not
allow a SeqScan of a table larger than 1GB to flush cache, since we
use BAS_BULKREAD. But if we allow an indexscan plan that touches 16GB
of an index, that will very clearly flush out our 4GB of
shared_buffers, increasing time for later queries even if they only
have to read from OS buffers back into shared_buffers. That will still
show itself as a CPU spike, which is what people say they are seeing.

I think I'm arguing myself towards using a BufferAccessStrategy of
BAS_BULKREAD for large IndexScans, BitMapIndexScans and
BitMapHeapScans. Yes, we can make plans assuming we can use OS cache,
but we shouldn't be churning shared_buffers when we execute those
plans. "Large" here means the same thing as it does for SeqScans,
which is a scan that seems likely to touch more than 25% of shared
buffers. I'll work up a patch.
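
For what it's worth, the shape of such a patch might be no more than
this; the threshold mirrors the seqscan test in heapam.c's initscan(),
and estimated_pages stands in for whatever page-count estimate the
executor could obtain:

    #include "postgres.h"
    #include "storage/bufmgr.h"

    /* Sketch only: use a ring buffer for index scans expected to touch
     * more than a quarter of shared_buffers, by analogy with seqscans.
     * Where this decision would actually live is the open question. */
    static BufferAccessStrategy
    strategy_for_indexscan(BlockNumber estimated_pages)
    {
        if (estimated_pages > (BlockNumber) (NBuffers / 4))
            return GetAccessStrategy(BAS_BULKREAD);
        return NULL;            /* normal buffer replacement */
    }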

Perhaps it would also be useful to consider using a sequential scan of
the index relation for less selective BitmapIndexScans, just as we do
very effectively during VACUUM. Maybe that is a better idea than
bitmap indexes.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Robert Haas
On Wed, May 7, 2014 at 3:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 If we believe that 25% of shared_buffers worth of heap blocks would
 flush the cache doing a SeqScan, why should we allow 400% of
 shared_buffers worth of index blocks?

I think you're comparing apples and oranges.  The 25% threshold is
answering the question "How big does a sequential scan have to be
before it's likely to flush so much unrelated data out of
shared_buffers that it hurts the performance of other things running
on the system?".  So it's not really about whether or not things will
*fit* in the cache, but rather a judgement about at what point caching
that stuff is going to be of less value than continuing to cache other
things.  Also, it's specifically a judgement about shared_buffers, not
system memory.

But effective_cache_size is used to estimate the likelihood that an
index scan which accesses the same heap or index block twice will
still be in cache on the second hit, and thus need to be faulted in
only once.  So this *is* a judgment about what will fit - generally
over a very short time scale.  And, since bringing a page into
shared_buffers from the OS cache is much less expensive than bringing
a page into memory from disk, it's really about what will fit in
overall system memory, not just shared_buffers.
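
Concretely, that estimate comes from index_pages_fetched() in
costsize.c, which applies the Mackert-Lohman approximation; stripped
of the pro-rating and rounding the real function does, it is roughly:

    #include <stdio.h>

    /* N = tuple fetches, T = table pages, b = assumed cache size in
     * pages (derived from effective_cache_size). */
    static double
    pages_fetched(double N, double T, double b)
    {
        if (T <= b)
        {
            double  pf = (2.0 * T * N) / (2.0 * T + N);

            return (pf > T) ? T : pf;   /* whole table may stay cached */
        }
        else
        {
            double  lim = (2.0 * T * b) / (2.0 * T - b);

            if (N <= lim)
                return (2.0 * T * N) / (2.0 * T + N);
            return b + (N - lim) * (T - b) / T;     /* cache overflows */
        }
    }

    int
    main(void)
    {
        /* 50000 fetches into a 100000-page table: a larger assumed
         * cache predicts fewer faults, hence cheaper index scans. */
        printf("b=1000:   %.0f pages\n", pages_fetched(50000, 100000, 1000));
        printf("b=200000: %.0f pages\n", pages_fetched(50000, 100000, 200000));
        return 0;
    }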

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Simon Riggs
On 7 May 2014 13:31, Robert Haas robertmh...@gmail.com wrote:
 On Wed, May 7, 2014 at 3:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 If we believe that 25% of shared_buffers worth of heap blocks would
 flush the cache doing a SeqScan, why should we allow 400% of
 shared_buffers worth of index blocks?

 I think you're comparing apples and oranges.

I understood the distinction, which is why I changed the direction of
my thinking to say

 Yes, we can make plans assuming we can use OS cache,
 but we shouldn't be churning shared_buffers when we execute those
 plans.

and hence why I proposed

 I think I'm arguing myself towards using a BufferAccessStrategy of
 BAS_BULKREAD for large IndexScans, BitMapIndexScans and
 BitMapHeapScans.

which I hope will be effective in avoiding churn in shared_buffers
even though we may use much larger memory from the OS.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 I think I'm arguing myself towards using a BufferAccessStrategy of
 BAS_BULKREAD for large IndexScans, BitMapIndexScans and
 BitMapHeapScans.

As soon as you've got some hard evidence to present in favor of such
changes, we can discuss it.  I've got other things to do besides
hypothesize.

In the meantime, it seems like there is an emerging consensus that nobody
much likes the existing auto-tuning behavior for effective_cache_size,
and that we should revert that in favor of just increasing the fixed
default value significantly.  I see no problem with a value of say 4GB;
that's very unlikely to be worse than the pre-9.4 default (128MB) on any
modern machine.

Votes for or against?

regards, tom lane




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Merlin Moncure
On Wed, May 7, 2014 at 9:07 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 I think I'm arguing myself towards using a BufferAccessStrategy of
 BAS_BULKREAD for large IndexScans, BitMapIndexScans and
 BitMapHeapScans.

 As soon as you've got some hard evidence to present in favor of such
 changes, we can discuss it.  I've got other things to do besides
 hypothesize.

Let me throw out one last point:

It's pretty likely that s_b is going to be raised higher as a
percentage of RAM.   I never really bought into the conventional
wisdom of 25% and have had to set it lower many times.  Nevertheless,
it was a documented suggestion.

The core issues are:
1) There is no place to enter total system memory available to the
database in postgresql.conf
2) Memory settings (except for the above) are given as absolute
amounts, not percentages.

It would be a lot easier to standardize configurations, particularly if
there was a way to electronically support #1 with auto-detection.
Then, e_c_s, s_b, work_mem, and various other settings could be given
as standard (and perhaps somewhat conservative) percentages based on
the best and hopefully factually supported recommendations.  I
oversee dozens of servers in a virtualized environment (as most
enterprise shops are these days).  Everything is 'right sized', often
on demand, and often nobody bothers to adjust the various settings.
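
To illustrate the mechanism (the knob name and the percentages below
are placeholders, not recommendations):

    /* Toy derivation of settings from one total-memory input. */
    #include <stdio.h>

    int
    main(void)
    {
        long    total_mb = 64 * 1024;   /* hypothetical system_memory = 64GB */

        printf("shared_buffers = %ldMB\n", total_mb / 4);           /* 25% */
        printf("effective_cache_size = %ldMB\n", total_mb * 3 / 4); /* 75% */
        printf("maintenance_work_mem = %ldMB\n", total_mb / 32);
        return 0;
    }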

 In the meantime, it seems like there is an emerging consensus that nobody
 much likes the existing auto-tuning behavior for effective_cache_size,
 and that we should revert that in favor of just increasing the fixed
 default value significantly.  I see no problem with a value of say 4GB;
 that's very unlikely to be worse than the pre-9.4 default (128MB) on any
 modern machine.

In lieu of something fancy like the above, adjusting the defaults
seems a better way to go (so I vote to revert).

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andres Freund
On 2014-05-07 10:07:07 -0400, Tom Lane wrote:
 In the meantime, it seems like there is an emerging consensus that nobody
 much likes the existing auto-tuning behavior for effective_cache_size,
 and that we should revert that in favor of just increasing the fixed
 default value significantly.  I see no problem with a value of say 4GB;
 that's very unlikely to be worse than the pre-9.4 default (128MB) on any
 modern machine.
 
 Votes for or against?

+1 for increasing it to 4GB and removing the autotuning. I don't like the
current integration into guc.c much and a new static default doesn't
seem to be worse than the current autotuning. 

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Robert Haas
On Wed, May 7, 2014 at 10:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-07 10:07:07 -0400, Tom Lane wrote:
 In the meantime, it seems like there is an emerging consensus that nobody
 much likes the existing auto-tuning behavior for effective_cache_size,
 and that we should revert that in favor of just increasing the fixed
 default value significantly.  I see no problem with a value of say 4GB;
 that's very unlikely to be worse than the pre-9.4 default (128MB) on any
 modern machine.

 Votes for or against?

 +1 for increasing it to 4GB and removing the autotuning. I don't like the
 current integration into guc.c much and a new static default doesn't
 seem to be worse than the current autotuning.

It was my proposal originally, so I assume I'd be counted as in favor,
but for the sake of clarity: +1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Magnus Hagander
On Wed, May 7, 2014 at 4:12 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-05-07 10:07:07 -0400, Tom Lane wrote:
  In the meantime, it seems like there is an emerging consensus that nobody
  much likes the existing auto-tuning behavior for effective_cache_size,
  and that we should revert that in favor of just increasing the fixed
  default value significantly.  I see no problem with a value of say 4GB;
  that's very unlikely to be worse than the pre-9.4 default (128MB) on any
  modern machine.
 
  Votes for or against?

 +1 for increasing it to 4GB and removing the autotuning. I don't like the
 current integration into guc.c much and a new static default doesn't
 seem to be worse than the current autotuning.


+1.

If we can't make the autotuning better than that, we're better off holding
off on that one until we can actually figure out something better. (At
which point perhaps we can reach the level where we can just remove it...
But that's all handwaving about the future of course).


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, May 7, 2014 at 3:18 AM, Simon Riggs si...@2ndquadrant.com wrote:
 If we believe that 25% of shared_buffers worth of heap blocks would
 flush the cache doing a SeqScan, why should we allow 400% of
 shared_buffers worth of index blocks?

 I think you're comparing apples and oranges.  The 25% threshold is
 answering the question "How big does a sequential scan have to be
 before it's likely to flush so much unrelated data out of
 shared_buffers that it hurts the performance of other things running
 on the system?".  So it's not really about whether or not things will
 *fit* in the cache, but rather a judgement about at what point caching
 that stuff is going to be of less value than continuing to cache other
 things.  Also, it's specifically a judgement about shared_buffers, not
 system memory.

 But effective_cache_size is used to estimate the likelihood that an
 index scan which accesses the same heap or index block twice will
 still be in cache on the second hit, and thus need to be faulted in
 only once.  So this *is* a judgment about what will fit - generally
 over a very short time scale.  And, since bringing a page into
 shared_buffers from the OS cache is much less expensive than bringing
 a page into memory from disk, it's really about what will fit in
 overall system memory, not just shared_buffers.

Another point is that the 25% seqscan threshold actually controls some
specific caching decisions, which effective_cache_size does not.  Raising
effective_cache_size too high is unlikely to result in cache thrashing;
in fact I'd guess the opposite.  What that would do is cause the planner
to prefer indexscans over seqscans in more cases involving large tables.
But if you've got a table+index that's bigger than RAM, seqscans are
probably going to be worse for the OS cache than indexscans, because
they're going to require bringing in more data.

So I still think this whole argument is founded on shaky hypotheses
with a complete lack of hard data showing that a smaller default for
effective_cache_size would be better.  The evidence we have points
in the other direction.

regards, tom lane




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andrew Dunstan


On 05/07/2014 10:12 AM, Andres Freund wrote:

 On 2014-05-07 10:07:07 -0400, Tom Lane wrote:

  In the meantime, it seems like there is an emerging consensus that nobody
  much likes the existing auto-tuning behavior for effective_cache_size,
  and that we should revert that in favor of just increasing the fixed
  default value significantly.  I see no problem with a value of say 4GB;
  that's very unlikely to be worse than the pre-9.4 default (128MB) on any
  modern machine.

  Votes for or against?

 +1 for increasing it to 4GB and removing the autotuning. I don't like the
 current integration into guc.c much and a new static default doesn't
 seem to be worse than the current autotuning.

+1. If we ever want to implement an auto-tuning heuristic it seems we're 
going to need some hard empirical evidence to support it, and that 
doesn't seem likely to appear any time soon.


cheers

andrew




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Simon Riggs
On 7 May 2014 15:07, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 I think I'm arguing myself towards using a BufferAccessStrategy of
 BAS_BULKREAD for large IndexScans, BitMapIndexScans and
 BitMapHeapScans.

 As soon as you've got some hard evidence to present in favor of such
 changes, we can discuss it.  I've got other things to do besides
 hypothesize.

Now that we have a theory to test, I'll write a patch and we can
collect evidence for or against.

 In the meantime, it seems like there is an emerging consensus that nobody
 much likes the existing auto-tuning behavior for effective_cache_size,
 and that we should revert that in favor of just increasing the fixed
 default value significantly.  I see no problem with a value of say 4GB;
 that's very unlikely to be worse than the pre-9.4 default (128MB) on any
 modern machine.

 Votes for or against?

+1 for a fixed 4GB and removing the auto-tuning code.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Simon Riggs
On 7 May 2014 15:10, Merlin Moncure mmonc...@gmail.com wrote:

 The core issues are:
 1) There is no place to enter total system memory available to the
 database in postgresql.conf
 2) Memory settings (except for the above) are given as absolute
 amounts, not percentages.

Those sound like useful starting points.

The key issue for me is that effective_cache_size is a USERSET. It
applies per-query, just like work_mem (though work_mem is per query
node).

If we had total system memory we wouldn't know how to divide it up
amongst users since we have no functionality for workload
management.

It would be very nice to be able to tell Postgres "I have 64GB of
RAM; use it wisely". At present, any and all users can set
effective_cache_size and work_mem to any value they please, any time
they wish and thus overuse available memory. Which is why I've had to
write plugins to manage the memory allocations better in userspace.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Josh Berkus
On 05/07/2014 07:31 AM, Andrew Dunstan wrote:

 +1. If we ever want to implement an auto-tuning heuristic it seems we're
 going to need some hard empirical evidence to support it, and that
 doesn't seem likely to appear any time soon.

4GB default it is, then.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Josh Berkus
On 05/06/2014 10:35 PM, Peter Geoghegan wrote:
 +1. In my view, we probably should have set it to a much higher
 absolute default value. The main problem with setting it to any
 multiple of shared_buffers that I can see is that shared_buffers is a
 very poor proxy for what effective_cache_size is supposed to
 represent. In general, the folk wisdom around sizing shared_buffers
 has passed its sell-by date.

Unfortunately nobody has the time/resources to do the kind of testing
required for a new recommendation for shared_buffers.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 11:04 AM, Josh Berkus j...@agliodbs.com wrote:
 Unfortunately nobody has the time/resources to do the kind of testing
 required for a new recommendation for shared_buffers.

I meant to suggest that the buffer manager could be improved to the
point that the old advice becomes obsolete. Right now, it's much
harder to analyze shared_buffers than it should be, presumably because
of the problems with the buffer manager. I think that if we could
formulate better *actionable* advice around what we have right now,
that would have already happened.

We ought to be realistic about the fact that the current
recommendations around sizing shared_buffers are nothing more than
folk wisdom. That's the best we have right now, but that seems quite
unsatisfactory to me.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Jeff Janes
On Tue, May 6, 2014 at 9:55 AM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-05-06 17:43:45 +0100, Simon Riggs wrote:



  All this changes is the cost of
  IndexScans that would use more than 25% of shared_buffers worth of
  data. Hopefully not many of those in your workload. Changing the cost
  doesn't necessarily prevent index scans either. And if there are many
  of those in your workload AND you run more than one at same time, then
  the larger setting will work against you. So the benefit window for
  such a high setting is slim, at best.


Not only do you need to run more than one at a time, but they also must use
mostly disjoint sets of data, in order for the larger estimate to be bad.



 Why? There's many workloads where indexes are larger than shared buffers
 but fit into the operating system's cache. And that's precisely what
 effective_cache_size is about.


It is more about the size of the table referenced by the index than the
size of the index.  The point is that doing a large index scan might
lead you to visit the same table blocks repeatedly within quick succession.
(If a small index scan is on the inner side of a nested loop, then you
might access the same index leaf blocks and the same table blocks
repeatedly--that is why it is only mostly about the table size, rather than
exclusively).

Cheers,

Jeff


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Merlin Moncure
On Wed, May 7, 2014 at 1:13 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, May 7, 2014 at 11:04 AM, Josh Berkus j...@agliodbs.com wrote:
 Unfortunately nobody has the time/resources to do the kind of testing
 required for a new recommendation for shared_buffers.

 I meant to suggest that the buffer manager could be improved to the
 point that the old advice becomes obsolete. Right now, it's much
 harder to analyze shared_buffers than it should be, presumably because
 of the problems with the buffer manager. I think that if we could
 formulate better *actionable* advice around what we have right now,
 that would have already happened.

 We ought to be realistic about the fact that the current
 recommendations around sizing shared_buffers are nothing more than
 folk wisdom. That's the best we have right now, but that seems quite
 unsatisfactory to me.

I think the stock advice is worse than nothing because it is A. based
on obsolete assumptions and B. doesn't indicate what the tradeoffs are
or what kinds of symptoms adjusting the setting could alleviate.  The
documentation should be reduced to things that are known, for example:

*) raising shared buffers does not 'give more memory to postgres for
caching' -- it can only reduce it via double paging
*) reads from shared buffers are generally somewhat faster than
faulting pages in from o/s buffers
*) s_b larger than the working dataset size can be a good
configuration, especially for read-only loads
*) large settings can interplay badly with the o/s in some configurations
*) shared buffers can reduce write i/o in certain workloads
*) interplay with checkpoints
*) shared buffers have different mechanisms for managing contention
than o/s buffers

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andres Freund
On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
 On Wed, May 7, 2014 at 1:13 PM, Peter Geoghegan p...@heroku.com wrote:
  On Wed, May 7, 2014 at 11:04 AM, Josh Berkus j...@agliodbs.com wrote:
  Unfortunately nobody has the time/resources to do the kind of testing
  required for a new recommendation for shared_buffers.
 
  I meant to suggest that the buffer manager could be improved to the
  point that the old advice becomes obsolete. Right now, it's much
  harder to analyze shared_buffers than it should be, presumably because
  of the problems with the buffer manager. I think that if we could
  formulate better *actionable* advice around what we have right now,
  that would have already happened.
 
  We ought to be realistic about the fact that the current
  recommendations around sizing shared_buffers are nothing more than
  folk wisdom. That's the best we have right now, but that seems quite
  unsatisfactory to me.
 
 I think the stock advice is worse than nothing because it is A. based
 on obsolete assumptions and B. doesn't indicate what the tradeoffs are
 or what kinds of symptoms adjusting the setting could alleviate.  The
 documentation should be reduced to things that are known, for example:
 
 *) raising shared buffers does not 'give more memory to postgres for
 caching' -- it can only reduce it via double paging

That's absolutely not a necessary consequence. If pages are in s_b for a
while the OS will be perfectly happy to throw them away.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Josh Berkus
On 05/07/2014 11:13 AM, Peter Geoghegan wrote:
 We ought to be realistic about the fact that the current
 recommendations around sizing shared_buffers are nothing more than
 folk wisdom. That's the best we have right now, but that seems quite
 unsatisfactory to me.

So, as one of several people who put literally hundreds of hours into
the original benchmarking which established the sizing recommendations
for shared_buffers (and other settings), I find the phrase "folk wisdom"
personally offensive.  So, can we stop with this?

Otherwise, I don't think I can usefully participate in this discussion.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com wrote:
 *) raising shared buffers does not 'give more memory to postgres for
 caching' -- it can only reduce it via double paging

 That's absolutely not a necessary consequence. If pages are in s_b for a
 while the OS will be perfectly happy to throw them away.

The biggest problem with double buffering is not that it wastes
memory. Rather, it's that it wastes memory bandwidth. I think that
lessening that problem will be the major benefit of making larger
shared_buffers settings practical.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Robert Haas
On Wed, May 7, 2014 at 2:40 PM, Josh Berkus j...@agliodbs.com wrote:
 On 05/07/2014 11:13 AM, Peter Geoghegan wrote:
 We ought to be realistic about the fact that the current
 recommendations around sizing shared_buffers are nothing more than
 folk wisdom. That's the best we have right now, but that seems quite
 unsatisfactory to me.

 So, as one of several people who put literally hundreds of hours into
 the original benchmarking which established the sizing recommendations
 for shared_buffers (and other settings), I find the phrase "folk wisdom"
 personally offensive.  So, can we stop with this?

 Otherwise, I don't think I can usefully participate in this discussion.

+1.

I think it is quite accurate to say that we can't predict precisely
what value of shared_buffers will perform best for a particular
workload and on a particular system.  There are people out there using
very large values and very small ones, according to what they have
found most effective.  But that does not mean, as the phrase "folk
wisdom" might be taken to imply, that we don't know anything at all
about what actually works well in practice.  Because we do know quite
a bit about that.  I and people I work with have been able to improve
performance greatly on many systems by providing guidance based on
what this community has been able to understand on this topic, and
dismissing it as rubbish is wrong.

Also, I seriously doubt that a one-size-fits-all guideline about
setting shared_buffers will ever be right for every workload.
Workloads, by their nature, are complex beasts.  The size of the
workload varies, and which portions of it are how hot vary, and the
read-write mix varies, and those are not problems with PostgreSQL;
those are problems with data.  That is not to say that we can't do
anything to make PostgreSQL work better across a wider range of
settings for shared_buffers, but it is to say that no matter how much
work we do on the code, setting this optimally for every workload will
probably remain complex.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andres Freund
On 2014-05-07 11:45:04 -0700, Peter Geoghegan wrote:
 On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com wrote:
  *) raising shared buffers does not 'give more memory to postgres for
  caching' -- it can only reduce it via double paging
 
  That's absolutely not a necessary consequence. If pages are in s_b for a
  while the OS will be perfectly happy to throw them away.
 
 The biggest problem with double buffering is not that it wastes
 memory. Rather, it's that it wastes memory bandwidth.

Doesn't match my experience. Even with the current buffer manager
there's usually enough locality to keep important pages in s_b for a
meaningful time. I *have* seen workloads that should have fit into
memory not fit because of double buffering.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Robert Haas
On Wed, May 7, 2014 at 2:49 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-07 11:45:04 -0700, Peter Geoghegan wrote:
 On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  *) raising shared buffers does not 'give more memory to postgres for
  caching' -- it can only reduce it via double paging
 
  That's absolutely not a necessary consequence. If pages are in s_b for a
  while the OS will be perfectly happy to throw them away.

 The biggest problem with double buffering is not that it wastes
 memory. Rather, it's that it wastes memory bandwidth.

 Doesn't match my experience. Even with the current buffer manager
 there's usually enough locality to keep important pages in s_b for a
 meaningful time. I *have* seen workloads that should have fit into
 memory not fit because of double buffering.

Same here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 11:40 AM, Josh Berkus j...@agliodbs.com wrote:
 So, as one of several people who put literally hundreds of hours into
 the original benchmarking which established the sizing recommendations
 for shared_buffers (and other settings), I find the phrase "folk wisdom"
 personally offensive.  So, can we stop with this?

I have also put a lot of time into benchmarking. No personal offence
was intended, and I'm glad that we have some advice to give to users,
but the fact of the matter is that current *official* recommendations
are very vague.


-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 But that does not mean, as the phrase "folk
 wisdom" might be taken to imply, that we don't know anything at all
 about what actually works well in practice.

Folk wisdom doesn't imply that. It implies that we think this works,
and we may well be right, but there isn't all that much rigor behind
some of it. I'm not blaming anyone for this state of affairs. I've
heard plenty of people repeat the "don't exceed 8GB" rule - I
regularly repeated it myself. I cannot find any rigorous defense of
this, though. If you're aware of one, please point it out to me.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Josh Berkus
On 05/07/2014 11:52 AM, Peter Geoghegan wrote:
 On Wed, May 7, 2014 at 11:40 AM, Josh Berkus j...@agliodbs.com wrote:
 So, as one of several people who put literally hundreds of hours into
 the original benchmarking which established the sizing recommendations
 for shared_buffers (and other settings), I find the phrase "folk wisdom"
 personally offensive.  So, can we stop with this?
 
 I have also put a lot of time into benchmarking. No personal offence
 was intended, and I'm glad that we have some advice to give to users,
 but the fact of the matter is that current *official* recommendations
 are very vague.

Well, they should be vague; the only hard data we have is rather
out-of-date (I think 8.2 was our last set of tests).  If we gave users
specific, detailed recommendations, we'd be misleading them.

For that matter, our advice on shared_buffers ... and our design for it
... is going to need to change radically soon, since Linux is getting an
ARC with a frequency cache as well as a recency cache, and FreeBSD and
OpenSolaris already have them.

FWIW, if someone could fund me for a month, I'd be happy to create a
benchmarking setup where we could test these kinds of things; I have
pretty clear ideas how to build one.  I imagine some of our other
consultants could make the same offer.  However, it's too much work for
anyone to get done in their spare time.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 Doesn't match my experience. Even with the current buffer manager
 there's usually enough locality to keep important pages in s_b for a
 meaningful time. I *have* seen workloads that should have fit into
 memory not fit because of double buffering.

 Same here.

I think that it depends on whether or not you're thinking about the
worst case. Most people are not going to be in the category you
describe here. Plenty of people in the Postgres community run with
very large shared_buffers settings, on non i/o bound workloads, and
report good results - often massive, quickly apparent improvements.
I'm mostly concerned with obsoleting the 8GB hard ceiling rule here.

It probably doesn't matter whether and by how much one factor is worse
than the other, though. I found the section "5.2 Temporal Control:
Buffering" in the following paper, which discusses the subject, quite
interesting: http://db.cs.berkeley.edu/papers/fntdb07-architecture.pdf
-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Robert Haas
On Wed, May 7, 2014 at 2:58 PM, Peter Geoghegan p...@heroku.com wrote:
 On Wed, May 7, 2014 at 11:50 AM, Robert Haas robertmh...@gmail.com wrote:
 But that does not mean, as the phrase "folk
 wisdom" might be taken to imply, that we don't know anything at all
 about what actually works well in practice.

 Folk wisdom doesn't imply that. It implies that we think this works,
 and we may well be right, but there isn't all that much rigor behind
 some of it. I'm not blaming anyone for this state of affairs. I've
 heard plenty of people repeat the "don't exceed 8GB" rule - I
 regularly repeated it myself. I cannot find any rigorous defense of
 this, though. If you're aware of one, please point it out to me.

I'm not sure the level of rigor you'd like to see is going to be
available here.  Complex systems have complex behavior; that's life.

At any rate, I'm not aware of any rigorous defense of the "don't
exceed 8GB" rule.  But, #1, I'd never put it that simply.  What I've
found is more like this: If it's possible to size shared_buffers so
that the working set fits entirely within shared_buffers, that
configuration is worthy of strong consideration.  Otherwise, you
probably want to keep shared_buffers low in order to avoid
checkpoint-related I/O spikes and minimize double buffering; try 25%
of system memory up to 512MB on Windows or up to 2GB on 32-bit Linux
or up to 8GB on 64-bit Linux for starters, and then tune based on your
workload.
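
(That rule transcribed as arithmetic, purely for illustration:)

    /* 25% of system memory, capped per platform; a starting point to
     * tune from, not a guarantee. */
    static long
    shared_buffers_start_mb(long system_mb, int windows, int linux32)
    {
        long    cap_mb = windows ? 512 : (linux32 ? 2048 : 8192);
        long    sb_mb = system_mb / 4;

        return (sb_mb < cap_mb) ? sb_mb : cap_mb;
    }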

And #2, I think the origin of the 8GB number on 64-bit non-Windows
systems is that people found that checkpoint-related I/O spikes became
intolerable when you went too much above that number.  On some
systems, the threshold is lower than that - for example, I believe
Merlin and others have reported numbers more like 2GB than 8GB - and
on other systems, the threshold is higher - indeed, some people go way
higher and never hit it at all.  I agree that it would be nice to
better-characterize why different users hit it at different levels,
but it's probably highly dependent on hardware, workload, and kernel
version, so I tend to doubt it can be characterized very simply.

If I had to guess, I'd bet that fixing Linux's abominable behavior
around the fsync() call would probably go a long way toward making
higher values of shared_buffers more practical.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Jeff Janes
On Wed, May 7, 2014 at 11:04 AM, Josh Berkus j...@agliodbs.com wrote:

 On 05/06/2014 10:35 PM, Peter Geoghegan wrote:
  +1. In my view, we probably should have set it to a much higher
  absolute default value. The main problem with setting it to any
  multiple of shared_buffers that I can see is that shared_buffers is a
  very poor proxy for what effective_cache_size is supposed to
  represent. In general, the folk wisdom around sizing shared_buffers
  has passed its sell-by date.

 Unfortunately nobody has the time/resources to do the kind of testing
 required for a new recommendation for shared_buffers.


I think it is worse than that.  I don't think we know what such testing
would even look like.  SSD?  BBU? max_connections=2 with 256 cores?
 pgbench -N?  capture and replay of Amazon's workload?

If we could spell out/agree upon what kind of testing we would find
convincing, that would probably go a long way to getting some people to
work on carrying out the tests.  Unless the conclusion was "please have 3TB
of RAM and a 50 disk RAID", there might be few takers.

Cheers,

Jeff


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Josh Berkus
On 05/07/2014 01:36 PM, Jeff Janes wrote:
 On Wed, May 7, 2014 at 11:04 AM, Josh Berkus j...@agliodbs.com wrote:

 Unfortunately nobody has the time/resources to do the kind of testing
 required for a new recommendation for shared_buffers.

 I think it is worse than that.  I don't think we know what such testing
 would even look like.  SSD?  BBU? max_connections=2 with 256 cores?
  pgbench -N?  capture and replay of Amazon's workload?
 
 If we could spell out/agree upon what kind of testing we would find
 convincing, that would probably go a long way to getting some people to
 work on carrying out the tests.  Unless the conclusion was "please have 3TB
 of RAM and a 50 disk RAID", there might be few takers.

Well, step #1 would be writing some easy-to-run benchmarks which carry
out selected workloads and measure response times.  The minimum starting
set would include one OLTP/Web benchmark, and one DW benchmark.

I'm not talking about the software to run the workload; we have that, in
several varieties.  I'm talking about the actual database generator and
queries to run.  That's the hard work.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Jeff Janes
On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
 
  *) raising shared buffers does not 'give more memory to postgres for
  caching' -- it can only reduce it via double paging

 That's absolutely not a necessary consequence. If pages are in s_b for a
 while the OS will be perfectly happy to throw them away.


Is that an empirical observation?  I've run some simulations a couple years
ago, and also wrote some instrumentation to test that theory under
favorably engineered (but still plausible) conditions, and couldn't get
more than a small fraction of s_b to be so tightly bound in that the kernel
could forget about them.  Unless of course the entire workload or close to
it fits in s_b.

Cheers,

Jeff


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andres Freund
On 2014-05-07 13:51:57 -0700, Jeff Janes wrote:
 On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com wrote:
 
  On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
  
   *) raising shared buffers does not 'give more memory to postgres for
   caching' -- it can only reduce it via double paging
 
  That's absolutely not a necessary consequence. If pages are in s_b for a
  while the OS will be perfectly happy to throw them away.
 
 
 Is that an empirical observation?

Yes.

 I've run some simulations a couple years
 ago, and also wrote some instrumentation to test that theory under
 favorably engineered (but still plausible) conditions, and couldn't get
 more than a small fraction of s_b to be so tightly bound in that the kernel
 could forget about them.  Unless of course the entire workload or close to
 it fits in s_b.

I think it depends on your IO access patterns. If the whole working set
fits into the kernel's page cache and there's no other demand for pages
it will stay in. If you constantly rewrite most all your pages they'll
also stay in the OS cache because they'll get written out. If the churn
in shared_buffers is so high (because it's so small in comparison to the
core hot data set) that there'll be dozens if not hundreds of clock sweeps
a second, you'll also have no locality.
It's also *hugely* kernel version specific :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Merlin Moncure
On Wed, May 7, 2014 at 4:15 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-05-07 13:51:57 -0700, Jeff Janes wrote:
 On Wed, May 7, 2014 at 11:38 AM, Andres Freund and...@2ndquadrant.com wrote:

  On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
  
   *) raising shared buffers does not 'give more memory to postgres for
   caching' -- it can only reduce it via double paging
 
  That's absolutely not a necessary consequence. If pages are in s_b for a
  while the OS will be perfectly happy to throw them away.
 

 Is that an empirical observation?

 Yes.

 I ran some simulations a couple of years
 ago, and also wrote some instrumentation to test that theory under
 favorably engineered (but still plausible) conditions, and couldn't get
 more than a small fraction of s_b to be so tightly bound in that the kernel
 could forget about them.  Unless of course the entire workload or close to
 it fits in s_b.

 I think it depends on your IO access patterns. If the whole working set
 fits into the kernel's page cache and there's no other demand for pages
 it will stay in. If you constantly rewrite almost all your pages they'll
 also stay in the OS cache because they'll get written out. If the churn
 in shared_buffers is so high (because it's so small in comparison to the
 core hot data set) that there'll be dozens if not hundreds of clock
 sweeps a second you'll also have no locality.
 It's also *hugely* kernel version specific :(

right.  This is, IMNSHO, exactly the sort of language that belongs in the docs.

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 2:24 PM, Merlin Moncure mmonc...@gmail.com wrote:
 right.  This is, IMNSHO, exactly the sort of language that belongs in the 
 docs.

+1

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Andres Freund
On 2014-05-07 16:24:53 -0500, Merlin Moncure wrote:
 On Wed, May 7, 2014 at 4:15 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-05-07 13:51:57 -0700, Jeff Janes wrote:
  On Wed, May 7, 2014 at 11:38 AM, Andres Freund 
  and...@2ndquadrant.com wrote:
 
   On 2014-05-07 13:32:41 -0500, Merlin Moncure wrote:
   
*) raising shared buffers does not 'give more memory to postgres for
caching' -- it can only reduce it via double paging
  
   That's absolutely not a necessary consequence. If pages are in s_b for a
   while the OS will be perfectly happy to throw them away.
  
 
  Is that an empirical observation?
 
  Yes.
 
  I ran some simulations a couple of years
  ago, and also wrote some instrumentation to test that theory under
  favorably engineered (but still plausible) conditions, and couldn't get
  more than a small fraction of s_b to be so tightly bound in that the kernel
  could forget about them.  Unless of course the entire workload or close to
  it fits in s_b.
 
  I think it depends on your IO access patterns. If the whole working set
  fits into the kernel's page cache and there's no other demand for pages
  it will stay in. If you constantly rewrite almost all your pages they'll
  also stay in the OS cache because they'll get written out. If the churn
  in shared_buffers is so high (because it's so small in comparison to the
  core hot data set) that there'll be dozens if not hundreds of clock
  sweeps a second you'll also have no locality.
  It's also *hugely* kernel version specific :(
 
 right.  This is, IMNSHO, exactly the sort of language that belongs in the 
 docs.

Well, that's just the tip of the iceberg though. Whether you can accept
a small shared_buffers to counteract double buffering or not is also a
hard-to-answer question... That again heavily depends on the usage
patterns. If you have high concurrency and your working set has some
locality it's very important to have a high s_b lest you fall afoul of
the freelist lock. If you have high concurrency but 90%+ of your page
lookups *aren't* going to be in the cache you need to be very careful
with a large s_b because the clock sweeps to lower the usage counts can
increase the lock contention.
Then there are memory and cache efficiency questions around both the
PrivateRefCount array and the lwlocks.

In short: I think it's pretty hard to transfer this into language that's
a) agreed upon b) understandable to someone that hasn't discovered
several of the facts for him/herself.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-07 Thread Peter Geoghegan
On Wed, May 7, 2014 at 12:06 PM, Josh Berkus j...@agliodbs.com wrote:
 For that matter, our advice on shared_buffers ... and our design for it
 ... is going to need to change radically soon, since Linux is getting an
 ARC with a frequency cache as well as a recency cache, and FreeBSD and
 OpenSolaris already have them.

I knew about ZFS, but Linux is implementing ARC? There are good
reasons to avoid ARC. CAR seems like a more plausible candidate, since
it apparently acknowledges ARC's shortcomings and fixes them.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 8 October 2013 17:13, Bruce Momjian br...@momjian.us wrote:

 Patch applied with a default of 4x shared buffers.  I have added a 9.4
 TODO that we might want to revisit this.

I certainly want to revisit this patch and this setting.

How can we possibly justify a default setting that could be more than
physical RAM?

The maximum known safe value is the setting of shared_buffers itself,
without external knowledge. But how can we possibly set it even that
high?

Does anyone have any evidence at all on how to set this? How can we
possibly autotune it?

I prefer the idea of removing effective_cache_size completely, since
it has so little effect on workloads and is very frequently
misunderstood by users. It's just dangerous, without being useful.

Why do we autotune the much more important synch scan threshold, yet
allow tuning of e_c_s?

Lets fix e_c_s at 25% of shared_buffers and remove the parameter
completely, just as we do with so many other performance parameters.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Andres Freund
On 2014-05-06 15:09:15 +0100, Simon Riggs wrote:
 On 8 October 2013 17:13, Bruce Momjian br...@momjian.us wrote:
 
  Patch applied with a default of 4x shared buffers.  I have added a 9.4
  TODO that we might want to revisit this.
 
 I certainly want to revisit this patch and this setting.
 
 How can we possibly justify a default setting that could be more than
 physical RAM?

Because it doesn't hurt overly much if it's set too large?

 The maximum known safe value is the setting of shared_buffers itself,
 without external knowledge. But how can we possibly set it even that
 high?

 Does anyone have any evidence at all on how to set this? How can we
 possibly autotune it?

It's just a different default setting? I think the new value will cause
fewer problems than the old one, which frequently leads to index scans
not being used although they would be beneficial.

 I prefer the idea of removing effective_cache_size completely, since
 it has so little effect on workloads and is very frequently
 misunderstood by users. It's just dangerous, without being useful.

-many.

 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

That'd cause *massive* regression for many installations. Without
significantly overhauling costsize.c that's really not feasible. There's
lots of installations that use relatively small s_b settings for good
reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
indexes anymore.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

Apparently, you don't even understand what this parameter is for.
Setting it smaller than shared_buffers is insane.

regards, tom lane




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

You know you can't justify that comment and so do I. What workload is
so badly affected as to justify use of the word insane in this
context?

I can read code. But it appears nobody apart from me actually does, or
at least understands the behaviour that results.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 15:17, Andres Freund and...@2ndquadrant.com wrote:

 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 That'd cause *massive* regression for many installations. Without
 significantly overhauling costsize.c that's really not feasible. There's
 lots of installations that use relatively small s_b settings for good
 reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
 indexes anymore.

"Many queries" can't be correct. All this changes is the cost of
IndexScans that would use more than 25% of shared_buffers worth of
data. Hopefully not many of those in your workload. Changing the cost
doesn't necessarily prevent index scans either. And if there are many
of those in your workload AND you run more than one at same time, then
the larger setting will work against you. So the benefit window for
such a high setting is slim, at best.

I specifically picked 25% of shared_buffers because that is the point
at which sequential scans become more efficient and use the cache more
efficiently. If our cost models are correct, then switching away from
index scans shouldn't hurt at all.

Assuming we can use large tranches of memory for single queries has a
very bad effect on cache hit ratios. Encouraging such usage seems to
fall into the category of insane, from my perspective. Having it as a
default setting is bad.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

 You know you can't justify that comment and so do I.

What I meant is that your comments indicate complete lack of understanding
of the parameter.  It's *supposed* to be larger than shared_buffers, and
there is no safety risk involved in setting it too high.

regards, tom lane




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Andres Freund
On 2014-05-06 17:43:45 +0100, Simon Riggs wrote:
 On 6 May 2014 15:17, Andres Freund and...@2ndquadrant.com wrote:
 
  Lets fix e_c_s at 25% of shared_buffers and remove the parameter
  completely, just as we do with so many other performance parameters.
 
  That'd cause *massive* regression for many installations. Without
  significantly overhauling costsize.c that's really not feasible. There's
  lots of installations that use relatively small s_b settings for good
  reasons. If we fix e_c_s to 25% of s_b many queries on those won't use
  indexes anymore.
 
 "Many queries" can't be correct.

It is.

 All this changes is the cost of
 IndexScans that would use more than 25% of shared_buffers worth of
 data. Hopefully not many of those in your workload. Changing the cost
 doesn't necessarily prevent index scans either. And if there are many
 of those in your workload AND you run more than one at same time, then
 the larger setting will work against you. So the benefit window for
 such a high setting is slim, at best.

Why? There are many workloads where indexes are larger than shared buffers
but fit into the operating system's cache. And that's precisely what
effective_cache_size is about.
Especially on bigger machines shared_buffers can't be set big enough to
actually use all the machine's memory. It's not uncommon to have 4GB
shared buffers on a machine with 512GB RAM... It'd be absolutely
disastrous to set effective_cache_size to 1GB for an analytics workload.

 I specifically picked 25% of shared_buffers because that is the point
 at which sequential scans become more efficient and use the cache more
 efficiently. If our cost models are correct, then switching away from
 index scans shouldn't hurt at all.

More often than not indexes are smaller than the table size, so this
argument doesn't seem to make much sense.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Josh Berkus
On 05/06/2014 08:41 AM, Simon Riggs wrote:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.
 
 You know you can't justify that comment and so do I. What workload is
 so badly affected as to justify use of the word insane in this
 context?

Most of them?  Really?

I have to tell you, your post sounds like you've missed out on the last
12 years of PostgreSQL query tuning.  Which is a little shocking
considering where you've spent that 12 years.

 I can read code. But it appears nobody apart from me actually does, or
 at least understands the behaviour that results.

So, break it down for us: explain how we'll get desirable query plans
out of the current code if:

(1) Table & Index is larger than shared_buffers;
(2) Table & Index is smaller than RAM;
(3) Selectivity is 0.02
(4) ECS is set lower than shared_buffers

I think the current cost math does a pretty good job of choosing the
correct behavior if ECS is set correctly.  But if it's not, no.

If I'm wrong, then you've successfully found a bug in our costing math,
so I'd love to see it.
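
To make the scenario concrete, an experiment along these lines shows the
knob at work (table and column names here are invented for illustration;
this is a sketch, not a benchmark):

    SET effective_cache_size = '128MB';
    EXPLAIN SELECT * FROM big_table WHERE indexed_col BETWEEN 1 AND 50000;
    -- with a small assumed cache, repeated index page fetches are costed
    -- as random I/O, so a seq scan or bitmap scan tends to win
    SET effective_cache_size = '16GB';
    EXPLAIN SELECT * FROM big_table WHERE indexed_col BETWEEN 1 AND 50000;
    -- a large assumed cache cheapens the repeated fetches, tilting the
    -- same query toward an index scan at this selectivity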

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Jeff Janes
On Tue, May 6, 2014 at 7:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Simon Riggs si...@2ndquadrant.com writes:
  Lets fix e_c_s at 25% of shared_buffers and remove the parameter
  completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.


The e_c_s is assumed to be usable for each backend trying to run queries
sensitive to it.  If you have dozens of such queries running simultaneously
(not something I personally witness, but also not insane) and each of these
queries has its own peculiar working set, then having e_c_s smaller than
s_b makes sense.

I have a hard time believing that this is at all common, however.
Certainly not common enough to justify cranking the setting all the way
in the other direction and then removing the crank handle.

Cheers,

Jeff


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 18:08, Josh Berkus j...@agliodbs.com wrote:
 On 05/06/2014 08:41 AM, Simon Riggs wrote:
 On 6 May 2014 15:18, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 Lets fix e_c_s at 25% of shared_buffers and remove the parameter
 completely, just as we do with so many other performance parameters.

 Apparently, you don't even understand what this parameter is for.
 Setting it smaller than shared_buffers is insane.

 You know you can't justify that comment and so do I. What workload is
 so badly affected as to justify use of the word insane in this
 context?

 Most of them?  Really?

I didn't use the word "most" anywhere. So it's not really clear what you are saying.


 I have to tell you, your post sounds like you've missed out on the last
 12 years of PostgreSQL query tuning.  Which is a little shocking
 considering where you've spent that 12 years.

 I read the code, think what to say and then say what I think, not
rely on dogma.

I tried to help years ago by changing the docs on e_c_s, but that's
been mostly ignored down the years, as it is again here.


 I can read code. But it appears nobody apart from me actually does, or
 at least understands the behaviour that results.

 So, break it down for us: explain how we'll get desirable query plans
 out of the current code if:

 (1) Table & Index is larger than shared_buffers;
 (2) Table & Index is smaller than RAM;
 (3) Selectivity is 0.02
 (4) ECS is set lower than shared_buffers

Is that it? The above use case is the basis for a default setting??

It's a circular argument, since you're assuming we've all followed
your advice of setting shared_buffers to 25% of RAM, which then
presumes a large gap between (1) and (2). It also ignores that if ECS
is set low then it increases the cost, but does not actually preclude
index scans larger than that setting. It also ignores that if your
database fits in RAM, your random_page_cost setting is wrong and
lowering that appropriately will increase the incidence of index scans
again.
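
To spell out that last point, a session-level sketch (values
illustrative only):

    -- if the working set is known to be fully cached, narrowing the gap
    -- between random and sequential page costs re-enables index scans
    -- without touching effective_cache_size at all
    SET random_page_cost = 1.1;   -- default is 4.0
    SET seq_page_cost = 1.0;      -- the default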

You should also include

(5) You're only running one query at a time (which you know, how?)
(6) You don't care if you flush your cache for later queries
(7) You've got big tables yet are not partitioning them effectively

 I think the current cost math does a pretty good job of choosing the
 correct behavior if ECS is set correctly.  But if it's not, no.

 If I'm wrong, then you've successfully found a bug in our costing math,
 so I'd love to see it.

Setting it high generates lovely EXPLAINs for a single query, but do
we have any evidence that whole workloads are better off with higher
settings? And that represents the general case?

And it makes sense even if it makes it bigger than actual RAM??

If you assume that you can use all of that memory, you're badly wrong.
Presumably you also set work_mem larger than shared_buffers, since
that will induce exactly the same behaviour and have the same
downsides. (Large memory usage for single query, but causes cache
churn, plus problems if we try to overuse RAM because of concurrent
usage).

In the absence of performance measurements that show the genuine
effect on workloads, I am attempting to make a principle-based
argument. I suggested 25% of shared_buffers because we already use
that as the point where other features cut in to minimise cache churn.
I'm making the argument that if *that* setting is the right one to
control cache churn, then why is it acceptable for index scans to
churn up even bigger chunks of cache?

In case it wasn't clear, I am only suggesting 25% of shared_buffers
for large settings, not for micro-configurations. My proposal to
remove the setting completely was a rhetorical question, asking why we
have a setting for this parameter and yet no tunables for other
things.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 20:41, Jeff Janes jeff.ja...@gmail.com wrote:

 The e_c_s is assumed to be usable for each backend trying to run queries
 sensitive to it.  If you have dozens of such queries running simultaneously
 (not something I personally witness, but also not insane) and each of these
 queries has its own peculiar working set, then having e_c_s smaller than s_b
 makes sense.

 I have a hard time believing that this is at all common, however.

If larger queries are frequent enough to care about, they will happen together.

We should be acting conservatively with default settings. You can be
as aggressive as you like with your own config.

 Certainly
 not common enough to justify cranking the setting all the way in the
 other direction and then removing the crank handle.

Yes, that part was mostly rhetorical, I wasn't arguing for complete
removal, especially when autotuning is unclear.

I am worried about people that set effective_cache_size but not
shared_buffers, which is too common. If we link the two parameters it
should work in both directions by default.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Robert Haas
On Tue, May 6, 2014 at 4:38 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I read the code, think what to say and then say what I think, not
 rely on dogma.

 I tried to help years ago by changing the docs on e_c_s, but that's
 been mostly ignored down the years, as it is again here.

Well, for what it's worth, I've encountered systems where setting
effective_cache_size too low resulted in bad query plans, but I've
never encountered the reverse situation.  My personal sample size is
pretty small, though.  And, when I did a study of 100+
pgsql-performance reports for last year's PGCon talk, I didn't turn up
any that seemed related to effective_cache_size.  Here's the subset of
those reports that appeared settings-related:

https://sites.google.com/site/robertmhaas/query-performance/settings

I think the basic problem with effective_cache_size is that it's a
pretty weak knob.  I don't think it's a secret that we more often
seq-scan when we should have index-scanned than the other way
around.  So if I had to hard-code a value for effective_cache_size,
I'd probably pick positive infinity.  Yeah, that could be overkill -
but I bet I'd be able to compensate by frobbing seq_page_cost and
random_page_cost in a pinch.

I basically think the auto-tuning we've installed for
effective_cache_size is stupid.  Most people are going to run with
only a few GB of shared_buffers, so setting effective_cache_size to a
small multiple of that isn't going to make many more people happy than
just raising the value - say from the current default of 128MB to, oh,
4GB - especially because in my experience queries aren't very
sensitive to the exact value; it just has to not be way too small.  I
bet the number of PostgreSQL users who would be made happy by a much
higher hard-coded default is not too different from the number that
will be made happy by the (completely unprincipled) auto-tuning.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Andrew Dunstan


On 05/06/2014 05:54 PM, Robert Haas wrote:

On Tue, May 6, 2014 at 4:38 PM, Simon Riggs si...@2ndquadrant.com wrote:

  I read the code, think what to say and then say what I think, not
rely on dogma.

I tried to help years ago by changing the docs on e_c_s, but that's
been mostly ignored down the years, as it is again here.

Well, for what it's worth, I've encountered systems where setting
effective_cache_size too low resulted in bad query plans, but I've
never encountered the reverse situation.



I have encountered both. Recently I discovered that a client's 
performance problems were solved pretty instantly by reducing a 
ridiculously high setting down to something more reasonable (in this 
case about 50% of physical RAM is what we set it to).


cheers

andrew




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 22:54, Robert Haas robertmh...@gmail.com wrote:
 On Tue, May 6, 2014 at 4:38 PM, Simon Riggs si...@2ndquadrant.com wrote:
  I read the code, think what to say and then say what I think, not
 rely on dogma.

 I tried to help years ago by changing the docs on e_c_s, but that's
 been mostly ignored down the years, as it is again here.

 Well, for what it's worth, I've encountered systems where setting
 effective_cache_size too low resulted in bad query plans, but I've
 never encountered the reverse situation.

I agree with that.

Though that misses my point, which is that you can't know that all of
that memory is truly available on a server with many concurrent users.
Choosing settings that undercost memory intensive plans are not the
best choice for a default strategy in a mixed workload when cache may
be better used elsewhere, even if such settings make sense for some
individual users.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I basically think the auto-tuning we've installed for
 effective_cache_size is stupid.  Most people are going to run with
 only a few GB of shared_buffers, so setting effective_cache_size to a
 small multiple of that isn't going to make many more people happy than
 just raising the value - say from the current default of 128MB to, oh,
 4GB - especially because in my experience queries aren't very
 sensitive to the exact value; it just has to not be way too small.  I
 bet the number of PostgreSQL users who would be made happy by a much
 higher hard-coded default is not too different from the number that
 will be made happy by the (completely unprincipled) auto-tuning.

There is a lot to be said for that argument, especially considering
that we're not even really happy with the auto-tuning mechanism,
never mind the behavior it's trying to implement.

regards, tom lane




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Josh Berkus
On 05/06/2014 01:38 PM, Simon Riggs wrote:
 Most of them?  Really?
 
 I didn't use the word "most" anywhere. So it's not really clear what you
 are saying.

Sorry, those were supposed to be periods, not question marks.  As in
"Most of them.  Really."

 I have to tell you, your post sounds like you've missed out on the last
 12 years of PostgreSQL query tuning.  Which is a little shocking
 considering where you've spent that 12 years.
 
  I read the code, think what to say and then say what I think, not
 rely on dogma.
 
 I tried to help years ago by changing the docs on e_c_s, but that's
 been mostly ignored down the years, as it is again here.

Well, if you're going to buck the conventional wisdom, you need to
provide a factual and numerical basis for your arguments.  So far, I
haven't seen you do so, although I'll admit that I haven't read 100% of
hackers traffic.   So if you have previously presented benchmarking
results or math, please post a link to the archives.

 (1) Table & Index is larger than shared_buffers;
 (2) Table & Index is smaller than RAM;
 (3) Selectivity is 0.02
 (4) ECS is set lower than shared_buffers
 
 Is that it? The above use case is the basis for a default setting??

Are you just going to ask rhetorical questions?

 It's a circular argument, since you're assuming we've all followed
 your advice of setting shared_buffers to 25% of RAM, which then
 presumes a large gap between (1) and (2).

That 20% to 25% recommendation had a factual and numerical basis, based
on extensive testing using DBT2 at OSDL.  While we have reason to
believe that advice may be somewhat dated, nobody has undertaken the
benchmarking work to create a new advice basis. If you have done so, you
have not shared the results.

 Setting it high generates lovely EXPLAINs for a single query, but do
 we have any evidence that whole workloads are better off with higher
 settings? And that represents the general case?

So?  Create a benchmark.  Prove that you're right.  I'd love to see it,
we're suffering from a serious lack of data here.

 In the absence of  performance measurements that show the genuine
 effect on workloads, I am attempting to make a principle-based
 argument. I suggested 25% of shared_buffers because we already use
 that as the point where other features cut in to minimise cache churn.

That makes no sense whatsoever.  Again, show me the math.

 In case it wasn't clear, I am only suggesting 25% of shared_buffers
 for large settings, not for micro-configurations. My proposal to
 remove the setting completely was a rhetorical question, asking why we
 have a setting for this parameter and yet no tunables for other
 things.

Asking rhetorical questions based on extreme perspectives that you don't
really believe in is *definitionally* trolling.  If you're going to make
an argument in favor of different tuning advice, then do it based on
something in which you actually believe, based on hard evidence.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Josh Berkus
Robert, Tom:

On 05/06/2014 03:28 PM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 I basically think the auto-tuning we've installed for
 effective_cache_size is stupid.  Most people are going to run with
 only a few GB of shared_buffers, so setting effective_cache_size to a
 small multiple of that isn't going to make many more people happy than
 just raising the value - say from the current default of 128MB to, oh,
 4GB - especially because in my experience queries aren't very
 sensitive to the exact value; it just has to not be way too small.  I
 bet the number of PostgreSQL users who would be made happy by a much
 higher hard-coded default is not too different from the number that
 will be made happy by the (completely unprincipled) auto-tuning.
 
 There is a lot to be said for that argument, especially considering
 that we're not even really happy with the auto-tuning mechanism,
 never mind the behavior it's trying to implement.

Right, the decisive question with this patch is: does it improve things
over what would have happened with most users anyway?

Based on the users I deal with ... which skew rather strongly toward EC2
web applications and one-off data warehouses ... most users don't set
effective_cache_size *at all* until they hire me or chat me up on IRC.
This means that ECS is running with the default of 128MB, which means
that Postgres is seriously underestimating the probability of cached
data on most machines today.

The users I deal with are a lot more likely to have set shared_buffers
themselves, based on the 25% conventional wisdom (or based on some other
advice).  And, when they want to tinker with pushing index usage, they
change random_page_cost instead, sometimes to silly values (like 0.5).
  So, based solely on the users I deal with, I would find automatically
setting effective_cache_size to 3X or 4X shared_buffers to be a benefit
compared to leaving it at its current fixed low default.
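
Concretely, for the users described above, the committed 4x auto-tune
would behave like this (output values assumed for illustration):

    SHOW shared_buffers;          -- say the user set 2GB
    SHOW effective_cache_size;    -- 8GB under the 4 x shared_buffers
                                  -- auto-tune, versus the old fixed
                                  -- 128MB default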

Other people on this list deal with different kinds of users, so if
people work with a class of users where a default of 4 X shared_buffers
would be definitely a worse idea than the current default, then please
speak up.

ECS is definitely a weak knob, as Robert says, but it's *good* that
it's a weak knob.  This means that we can make large adjustments in it
without introducing a lot of highly variable behavior ... unlike
random_page_cost, which does.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 23:47, Josh Berkus j...@agliodbs.com wrote:

 If you're going to make
 an argument in favor of different tuning advice, then do it based on
 something in which you actually believe, based on hard evidence.

The proposed default setting of 4x shared_buffers is unprincipled
*and* lacks hard evidence from you and everybody else.

You've used the phrase "conventional wisdom" to describe things which
you have spoken loudly about.  I personally have not seen sufficient
evidence to rely on that as wisdom.

I note that my detailed comments as to why it is unsafe have been
ignored, again.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Simon Riggs
On 6 May 2014 23:28, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I basically think the auto-tuning we've installed for
 effective_cache_size is stupid.  Most people are going to run with
 only a few GB of shared_buffers, so setting effective_cache_size to a
 small multiple of that isn't going to make many more people happy than
 just raising the value - say from the current default of 128MB to, oh,
 4GB - especially because in my experience queries aren't very
 sensitive to the exact value; it just has to not be way too small.  I
 bet the number of PostgreSQL users who would be made happy by a much
 higher hard-coded default is not too different from the number that
 will be made happy by the (completely unprincipled) auto-tuning.

 There is a lot to be said for that argument, especially considering
 that we're not even really happy with the auto-tuning mechanism,
 never mind the behavior it's trying to implement.

+1

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Peter Geoghegan
On Tue, May 6, 2014 at 10:20 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On 6 May 2014 23:47, Josh Berkus j...@agliodbs.com wrote:

 If you're going to make
 an argument in favor of different tuning advice, then do it based on
 something in which you actually believe, based on hard evidence.

 The proposed default setting of 4x shared_buffers is unprincipled
 *and* lacks hard evidence from you and everybody else.

+1. In my view, we probably should have set it to a much higher
absolute default value. The main problem with setting it to any
multiple of shared_buffers that I can see is that shared_buffers is a
very poor proxy for what effective_cache_size is supposed to
represent. In general, the folk wisdom around sizing shared_buffers
has passed its sell-by date.

-- 
Peter Geoghegan




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2014-05-06 Thread Mark Kirkwood

On 07/05/14 17:35, Peter Geoghegan wrote:

On Tue, May 6, 2014 at 10:20 PM, Simon Riggs si...@2ndquadrant.com wrote:

On 6 May 2014 23:47, Josh Berkus j...@agliodbs.com wrote:


If you're going to make
an argument in favor of different tuning advice, then do it based on
something in which you actually believe, based on hard evidence.

The proposed default setting of 4x shared_buffers is unprincipled
*and* lacks hard evidence from you and everybody else.

+1. In my view, we probably should have set it to a much higher
absolute default value. The main problem with setting it to any
multiple of shared_buffers that I can see is that shared_buffers is a
very poor proxy for what effective_cache_size is supposed to
represent. In general, the folk wisdom around sizing shared_buffers
has passed its sell-by date.



+1. ISTM the only sensible approach to auto-tuning this requires us to
have a plugin to detect how much RAM the system has (and then setting it
to 1/2 of that, say). I wonder if it might be worthwhile writing plugins for
the handful of popular platforms. For the remainder maybe we could leave 
it defaulting to the current (small) value, and encourage volunteers to 
code the missing ones if they want something better.


Regards

Mark




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-10-08 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 05:14:37PM -0400, Bruce Momjian wrote:
 On Thu, Sep  5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
   I have developed the attached patch which implements an auto-tuned
   effective_cache_size which is 4x the size of shared buffers.  I had to
   set effective_cache_size to its old 128MB default so the EXPLAIN
   regression tests would pass unchanged.
  
  That's not really autotuning though. ISTM that making the *default* 4
  x shared_buffers might make perfect sense, but do we really need to
  hijack the value of -1 for that? That might be useful for some time
  when we have actual autotuning, that somehow inspects the system and
  tunes it from there.
 
  I also don't think it should be called autotuning, when it's just a
  smarter default value.
  
  I like the feature, though, just not the packaging.
 
 That auto-tuning text came from the wal_buffer documentation, which
 does exactly this based on shared_buffers:
 
 The contents of the WAL buffers are written out to disk at every
 transaction commit, so extremely large values are unlikely to
 provide a significant benefit.  However, setting this value to at
 least a few megabytes can improve write performance on a busy
 server where many clients are committing at once.  The auto-tuning
 selected by the default setting of -1 should give reasonable
 results in most cases.
 
 I am fine with rewording and not using -1, but we should change the
 wal_buffer default and documentation too then.  I am not sure what other
 value than -1 to use?  0?  I figure if we ever get better auto-tuning,
 we would just remove this functionality and make it better.

Patch applied with a default of 4x shared buffers.  I have added a 9.4
TODO that we might want to revisit this.
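
For comparison, the wal_buffers precedent can be observed directly
(sizes illustrative; -1 selects roughly 1/32 of shared_buffers, capped
at one WAL segment):

    SHOW wal_buffers;   -- with wal_buffers = -1 and shared_buffers = 2GB
                        -- this reports 16MB (1/32 of s_b would be 64MB,
                        -- capped at the 16MB WAL segment size)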

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-10-08 Thread Kevin Hale Boyes
The patch contains a small typo in config.sgml.  Probably just drop the
"is" from "is can".

+results if this database cluster is can utilize most of the memory

Kevin.


On 8 October 2013 10:13, Bruce Momjian br...@momjian.us wrote:

 On Thu, Sep  5, 2013 at 05:14:37PM -0400, Bruce Momjian wrote:
  On Thu, Sep  5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
I have developed the attached patch which implements an auto-tuned
effective_cache_size which is 4x the size of shared buffers.  I had
 to
set effective_cache_size to its old 128MB default so the EXPLAIN
regression tests would pass unchanged.
  
   That's not really autotuning though. ISTM that making the *default* 4
   x shared_buffers might make perfect sense, but do we really need to
   hijack the value of -1 for that? That might be useful for some time
   when we have actual autotuning, that somehow inspects the system and
   tunes it from there.
  
   I also don't think it should be called autotuning, when it's just a
   smarter default value.
  
   I like the feature, though, just not the packaging.
 
  That auto-tuning text came from the wal_buffer documentation, which
  does exactly this based on shared_buffers:
 
  The contents of the WAL buffers are written out to disk at every
  transaction commit, so extremely large values are unlikely to
  provide a significant benefit.  However, setting this value to at
  least a few megabytes can improve write performance on a busy
  server where many clients are committing at once.  The auto-tuning
  selected by the default setting of -1 should give reasonable
  results in most cases.
 
  I am fine with rewording and not using -1, but we should change the
  wal_buffer default and documentation too then.  I am not sure what other
  value than -1 to use?  0?  I figure if we ever get better auto-tuning,
  we would just remove this functionality and make it better.

 Patch applied with a default of 4x shared buffers.  I have added a 9.4
 TODO that we might want to revisit this.

 --
   Bruce Momjian  br...@momjian.us    http://momjian.us
   EnterpriseDB http://enterprisedb.com

   + It's impossible for everything to be true. +





Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-10-08 Thread Bruce Momjian
On Tue, Oct  8, 2013 at 01:04:18PM -0600, Kevin Hale Boyes wrote:
 The patch contains a small typo in config.sgml.  Probably just drop the
 "is" from "is can".
 
 +results if this database cluster is can utilize most of the memory
 
 Kevin.

Thank you, fixed.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Robert Haas
On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:
 I think that most of the arguments in this thread drastically
 overestimate the precision and the effect of effective_cache_size. The
 planner logic behind it basically only uses it to calculate things
 within a single index scan. That alone shows that any precise
 calculation cannot be very meaningful.
 It also does *NOT* directly influence how the kernel caches disk
 io. It's there to guess how likely it is something is still cached when
 accessing things repeatedly.

 Agreed.  I think we should take the patch as-is, and spend the rest of
 the 9.4 dev cycle arguing about 3x vs. 4x.

 ;-)

I'm happy with that option, but I think the larger point here is that
this only has a hope of being right if you're setting shared_buffers
to 25% of system memory.  And more and more, people are not doing
that, because of the other recommendation, not much discussed here, to
cap shared_buffers at about 8GB.  Systems whose total memory is far
larger than 32GB are becoming quite commonplace, and only figure to
become more so.  So while I don't particularly object to this proposal,
it would have had a lot more value if we'd done it 5 years ago.

Now the good news is that right now the default is 128MB, and under
any of these proposals the default will go up, quite a bit.  Default
shared_buffers is now 128MB, so we're looking at raising the default
to at least 384MB, and for people who also tune shared_buffers but
might not bother with effective cache size, it'll go up a lot more.
That's clearly a move in the right direction even if the accuracy of
the formula is suspect (which it is).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 10:08 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 11, 2013 at 3:40 PM, Josh Berkus j...@agliodbs.com wrote:
 I think that most of the arguments in this thread drastically
 overestimate the precision and the effect of effective_cache_size. The
 planner logic behind it basically only uses it to calculate things
 within a single index scan. That alone shows that any precise
 calculation cannot be very meaningful.
 It also does *NOT* directly influence how the kernel caches disk
 io. It's there to guess how likely it is something is still cached when
 accessing things repeatedly.

 Agreed.  I think we should take the patch as-is, and spend the rest of
 the 9.4 dev cycle arguing about 3x vs. 4x.

 ;-)

 I'm happy with that option, but I think the larger point here is that
 this only has a hope of being right if you're setting shared_buffers
 to 25% of system memory.  And more and more, people are not doing
 that, because of the other recommendation, not much discussed here, to
 cap shared_buffers at about 8GB.  Systems whose total memory is far
 larger than 32GB are becoming quite commonplace, and only figure to
 become more so.  So while I don't particularly object to this proposal,
 it would have had a lot more value if we'd done it 5 years ago.

 Now the good news is that right now the default is 128MB, and under
 any of these proposals the default will go up, quite a bit.  Default
 shared_buffers is now 128MB, so we're looking at raising the default
 to at least 384MB, and for people who also tune shared_buffers but
 might not bother with effective cache size, it'll go up a lot more.
 That's clearly a move in the right direction even if the accuracy of
 the formula is suspect (which it is).

This is a very important point: the 8GB cap is also too high.  We have
a very high transaction rate server here that exploded at 32GB, was
downgraded to 2GB and ran fine, then upgraded to 4GB (over my strenuous
objection) and exploded again.

The stock documentation advice probably needs to be revised so that
it's the lesser of 2GB and 25%.  I'm more and more coming around to
the opinion that in terms of shared buffers we have some major
problems that manifest in high-end servers.  So +1 to your point,
although I'm still OK with the auto-setting on the basis that for very
high-end servers most of the settings end up being manually tweaked
anyway.  We do need to be cautious though; it's not impossible that
improvements to the buffer system might cause the 25% advice to be
revised upwards in the not-too-distant future if some of the problems
get solved.

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
 The stock documentation advice probably needs to be revised so that
 it's the lesser of 2GB and 25%.

 I think that would be a pretty bad idea. There are lots of workloads
 where people have postgres happily chugging along with s_b lots bigger
 than that and see benefits.
 We have a couple people reporting mostly undiagnosed (because that turns
 out to be hard!) problems that seem to be avoided with smaller s_b. We
 don't even remotely know enough about the problem to make such general
 recommendations.

I happen to be one of those couple people.  Load goes from 0.1 to
500 without warning, then back to 0.1 equally without warning.
Unfortunately the server is in a different jurisdiction such that it
makes deep forensic analysis impossible.  I think this is happening
more and more often as postgres is becoming increasingly deployed on
high(er)-end servers.  I've personally (alone) dealt with 4-5
confirmed cases and there have been many more.  We have a problem.

But, to address your point, the big s_b benefits are equally hard to
quantify (unless your database happens to fit in s_b) -- they mostly
help high write activity servers where the write activity fits a very
specific pattern.  But the risks of low s_b (mostly slightly higher
i/o and query latency) are much easier to deal with than high s_b
(even if less likely); random inexplicable server stalls and other
weird manifestations.  My stock advice remains to set it to at most 2GB
until you have a reason (of which there can be many) to set it otherwise.

merlin




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
 The stock documentation advice probably needs to be revised so that
 it's the lesser of 2GB and 25%.

I think that would be a pretty bad idea. There are lots of workloads
where people have postgres happily chugging along with s_b lots bigger
than that and see benefits.
We have a couple people reporting mostly undiagnosed (because that turns
out to be hard!) problems that seem to be avoided with smaller s_b. We
don't even remotely know enough about the problem to make such general
recommendations.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 11:27:03 -0500, Merlin Moncure wrote:
 On Fri, Sep 13, 2013 at 11:07 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2013-09-13 10:50:06 -0500, Merlin Moncure wrote:
  The stock documentation advice probably needs to be revised so that
  it's the lesser of 2GB and 25%.
 
  I think that would be a pretty bad idea. There are lots of workloads
  where people have postgres happily chugging along with s_b lots bigger
  than that and see benefits.
  We have a couple people reporting mostly undiagnosed (because that turns
  out to be hard!) problems that seem to be avoided with smaller s_b. We
  don't even remotely know enough about the problem to make such general
  recommendations.

 I happen to be one of those couple people.  Load goes from 0.1 to
 500 without warning, then back to 0.1 equally without warning.
 Unfortunately the server is in a different jurisdiction such that it
 makes deep forensic analysis impossible.  I think this is happening
 more and more often as postgres is becoming increasingly deployed on
 high(er)-end servers.  I've personally (alone) dealt with 4-5
 confirmed cases and there have been many more.  We have a problem.

Absolutely not claiming the contrary. I think it sucks that we couldn't
fully figure out what's happening in detail. I'd love to get my hands on
a setup where it can be reliably reproduced.

 But, to address your point, the big s_b benefits are equally hard to
 quantify (unless your database happens to fit in s_b)

Databases where the hot dataset fits in s_b are a pretty honking big
use case, though. That's one of the primary reasons to buy machines with
craploads of memory.

That said, I think having a note in the docs that large s_b can cause
such a problem might not be a bad idea and I surely wouldn't argue
against it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 Absolutely not claiming the contrary. I think it sucks that we
 couldn't fully figure out what's happening in detail. I'd love to
 get my hands on a setup where it can be reliably reproduced.

I have seen two completely different causes for symptoms like this,
and I suspect that these aren't the only two.

(1)  The dirty page avalanche: PostgreSQL hangs on to a large
number of dirty buffers and then dumps a lot of them at once.  The
OS does the same.  When PostgreSQL dumps its buffers to the OS it
pushes the OS over a tipping point where it is writing dirty
buffers too fast for the controller's BBU cache to absorb them. 
Everything freezes until the controller writes and accepts OS
writes for a lot of data.  This can take several minutes, during
which time the database seems frozen.  Cure is some combination
of these: reduce shared_buffers, make the background writer more
aggressive, checkpoint more often, make the OS dirty page writing
more aggressive, add more BBU RAM to the controller.
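
For concreteness, here is a sketch of the knobs that combination of
cures maps to; every value below is an illustrative assumption, not a
recommendation, and the right numbers depend on the hardware:

    # postgresql.conf -- smaller s_b, busier background writer,
    # more frequent checkpoints (values are assumptions, not advice):
    #   shared_buffers = 4GB
    #   bgwriter_delay = 50ms
    #   bgwriter_lru_maxpages = 500
    #   checkpoint_timeout = 5min

    # Linux -- start writing dirty pages sooner and in smaller batches,
    # instead of hoarding them until the avalanche:
    sysctl -w vm.dirty_background_bytes=$((64 * 1024 * 1024))
    sysctl -w vm.dirty_bytes=$((512 * 1024 * 1024))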

(2)  Transparent huge page support goes haywire on its defrag work.
Clues on this include very high system CPU time during an
episode, and `perf top` shows more time in kernel spinlock
functions than anywhere else.  The database doesn't completely lock
up like with the dirty page avalanche, but it is slow enough that
users often describe it that way.  So far I have only seen this
cured by disabling THP support (in spite of some people urging that
just the defrag be disabled).  It does make me wonder whether there
is something we could do in PostgreSQL to interact better with
THPs.
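
A quick way to check for this pattern during an episode (a sketch; the
sysfs paths below are the mainline kernel ones):

    # High system CPU with kernel spinlock/compaction functions at the
    # top of `perf top` is the telltale sign.
    perf top

    # Is THP, and in particular its defragmentation, enabled here?
    cat /sys/kernel/mm/transparent_hugepage/enabled
    cat /sys/kernel/mm/transparent_hugepage/defrag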

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Merlin Moncure
On Fri, Sep 13, 2013 at 4:04 PM, Kevin Grittner kgri...@ymail.com wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 Absolutely not claiming the contrary. I think it sucks that we
 couldn't fully figure out what's happening in detail. I'd love to
 get my hand on a setup where it can be reliably reproduced.

 I have seen two completely different causes for symptoms like this,
 and I suspect that these aren't the only two.

 (1)  The dirty page avalanche: PostgreSQL hangs on to a large
 number of dirty buffers and then dumps a lot of them at once.  The
 OS does the same.  When PostgreSQL dumps its buffers to the OS it
 pushes the OS over a tipping point where it is writing dirty
 buffers too fast for the controller's BBU cache to absorb them.
 Everything freezes until the controller writes and accepts OS
 writes for a lot of data.  This can take several minutes, during
 which time the database seems frozen.  Cure is some combination
 of these: reduce shared_buffers, make the background writer more
 aggressive, checkpoint more often, make the OS dirty page writing
 more aggressive, add more BBU RAM to the controller.

Yeah -- I've seen this too, and it's a well-understood problem.
Getting the o/s to spin dirty pages out faster is the name of the game,
I think.  Storage is getting so fast that it's (mostly) moot anyways.
Also, this is under the umbrella of 'high I/O' -- the stuff I've been
seeing is low- or no-I/O.

 (2)  Transparent huge page support goes haywire on its defrag work.
 Clues on this include very high system CPU time during an
 episode, and `perf top` shows more time in kernel spinlock
 functions than anywhere else.  The database doesn't completely lock
 up like with the dirty page avalanche, but it is slow enough that
 users often describe it that way.  So far I have only seen this
 cured by disabling THP support (in spite of some people urging that
 just the defrag be disabled).  It does make me wonder whether there
 is something we could do in PostgreSQL to interact better with
 THPs.

Ah, that's a useful tip; need to research that, thanks.  Maybe Josh
might be able to give it a whirl...

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-13 Thread Andres Freund
On 2013-09-13 14:04:55 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  Absolutely not claiming the contrary. I think it sucks that we
  couldn't fully figure out what's happening in detail. I'd love to
  get my hand on a setup where it can be reliably reproduced.
 
 I have seen two completely different causes for symptoms like this,
 and I suspect that these aren't the only two.
 
 (1)  The dirty page avalanche: PostgreSQL hangs on to a large
 number of dirty buffers and then dumps a lot of them at once.  The
 OS does the same.  When PostgreSQL dumps its buffers to the OS it
 pushes the OS over a tipping point where it is writing dirty
 buffers too fast for the controller's BBU cache to absorb them. 
 Everything freezes until the controller writes and accepts OS
 writes for a lot of data.  This can take several minutes, during
 which time the database seems frozen.  Cure is some combination
 of these: reduce shared_buffers, make the background writer more
 aggressive, checkpoint more often, make the OS dirty page writing
 more aggressive, add more BBU RAM to the controller.

That should hopefully be diagnosable from other system stats like the
dirty rate.
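
E.g., sampling the kernel's dirty-page counters once a second would
show the avalanche building up and then flushing (a minimal sketch):

    # A large spike in Dirty followed by a long plateau of Writeback
    # while the database stalls matches pattern (1).
    while true; do
        grep -E '^(Dirty|Writeback):' /proc/meminfo
        sleep 1
    done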

 (2)  Transparent huge page support goes haywire on its defrag work.
 Clues on this include very high system CPU time during an
 episode, and `perf top` shows more time in kernel spinlock
 functions than anywhere else.  The database doesn't completely lock
 up like with the dirty page avalanche, but it is slow enough that
 users often describe it that way.  So far I have only seen this
 cured by disabling THP support (in spite of some people urging that
 just the defrag be disabled).  

Yes, I have seen that issue a couple of times now as well. I can confirm
that in at least two cases disabling defragmentation alone proved to be
enough to fix the issue.
Annoyingly enough there are different ways to disable
defragmentation/THP depending on whether you're using THP backported by
redhat or the upstream version...
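
For reference, a sketch of the two variants (the RHEL 6 backport lives
in a differently named sysfs directory):

    # Upstream kernels:
    echo never > /sys/kernel/mm/transparent_hugepage/defrag
    # RHEL 6 backported THP:
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/defrag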

 It does make me wonder whether there
 is something we could do in PostgreSQL to interact better with
 THPs.

The best thing I see is to just use explicit hugepages. I've previously
sent a prototype for that which has since been turned into an actual
implementation by Christian Kruse.
A colleague of mine is working on polishing that patch into something
committable.
If you use large s_b, the memory savings alone (some 100kB instead of
dozens of megabytes per backend) can be worth it, not to speak of actual
performance gains.

Updating the kernel helps as well, they've improved the efficiency of
defragmentation a good bit.
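
For what it's worth, a sketch of what the explicit-hugepage setup looks
like with the GUC that eventually shipped (huge_pages, PostgreSQL 9.4);
the sizing math assumes 2MB huge pages and shared_buffers = 4GB:

    # Reserve 2MB huge pages for shared_buffers plus some slack:
    # 2200 * 2MB ~= 4.3GB
    sysctl -w vm.nr_hugepages=2200

    # postgresql.conf:
    #   huge_pages = try    # fall back to normal pages if the
    #                       # reservation doesn't suffice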

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Bruce Momjian
On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
 Merlin,
 
  I vote 4x on the basis that for this setting (unlike almost all the
  other memory settings) the ramifications for setting it too high
  generally aren't too bad.  Also, the o/s and temporary memory usage as
  a share of total physical memory has been declining over time
 
 If we're doing that, then we should change our general advice on this
 setting as well.

Uh, what general advice?  I don't see 4x mentioned anywhere.

 Another argument in favor: this is a default setting, and by default,
 shared_buffers won't be 25% of RAM.

So, are you saying you like 4x now?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Bruce Momjian
On Wed, Sep 11, 2013 at 09:18:30AM -0400, Bruce Momjian wrote:
 On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
  Merlin,
  
   I vote 4x on the basis that for this setting (unlike almost all the
   other memory settings) the ramifications for setting it too high
   generally aren't too bad.  Also, the o/s and temporary memory usage as
   a share of total physical memory has been declining over time
  
  If we're doing that, then we should change our general advice on this
  setting as well.
 
 Uh, what general advice?  I don't see 4x mentioned anywhere.
 
  Another argument in favor: this is a default setting, and by default,
  shared_buffers won't be 25% of RAM.
 
 So, are you saying you like 4x now?

Here is an argument for 3x.  First, using the documented 25% of RAM, 3x
puts our effective_cache_size at 75% of RAM, giving us no room for
kernel, backend memory, and work_mem usage.  If anything it should be
lower than 3x, not higher.

Second, if the machine is not a dedicated machine, and say 10% of
RAM is used for shared_buffers, 4x would put effective_cache_size at 40%
of RAM, which again seems too high, considering others are using the
machine and filling the kernel cache.  3x also seems too high, but
acceptable at 30% of RAM.

I basically can't imagine a case where you set shared_buffers to a
reasonable value and would still have 4x of that available for kernel
cache.

Finally, for those who like the idea of 4x, you can think of
shared_buffers (1x) + effective_cache_size (3x) as totalling 4x.
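
To make the arithmetic concrete, take a hypothetical dedicated 16GB box
(numbers purely illustrative):

    # shared_buffers at the documented 25%:
    #   s_b   = 16GB * 0.25  = 4GB
    # effective_cache_size at 3x:
    #   e_c_s = 3 * 4GB      = 12GB  (75% of RAM)
    # On the 1x + 3x framing, s_b plus e_c_s totals 4x s_b = 16GB,
    # i.e. all of RAM.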

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Alvaro Herrera
Bruce Momjian wrote:

  So, are you saying you like 4x now?
 
 Here is an argument for 3x.  First, using the documented 25% of RAM, 3x
 puts our effective_cache_size at 75% of RAM, giving us no room for
 kernel, backend memory, and work_mem usage.  If anything it should be
 lower than 3x, not higher.

The other argument I see for the 3x value is that it is a compromise.
People with really large servers will want to increase it; people with
very small servers will want to reduce it.


 Finally, for those who like the idea of 4x, you can think of
 shared_buffers (1x) + effective_cache_size (3x) as totalling 4x.

This part of your argument doesn't work really, because AFAIR the
effective_cache_size value ought to consider that shared_buffers is part
of it (so e_c_s is shared_buffers + kernel cache).  So if you're seeing
the 4x as e_c_s + s_b, you would be counting s_b twice.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Bruce Momjian
On Wed, Sep 11, 2013 at 12:43:07PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
 
   So, are you saying you like 4x now?
  
  Here is an argument for 3x.  First, using the documented 25% of RAM, 3x
  puts our effective_cache_size at 75% of RAM, giving us no room for
  kernel, backend memory, and work_mem usage.  If anything it should be
  lower than 3x, not higher.
 
 The other argument I see for the 3x value is that it is a compromise.
 People with really large servers will want to increase it; people with
 very small servers will want to reduce it.

Yes, you could make the argument that 2x is the right default,
especially considering work_mem.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Josh Berkus
On 09/11/2013 08:27 AM, Bruce Momjian wrote:
 On Wed, Sep 11, 2013 at 09:18:30AM -0400, Bruce Momjian wrote:
 On Tue, Sep 10, 2013 at 03:08:24PM -0700, Josh Berkus wrote:
 Another argument in favor: this is a default setting, and by default,
 shared_buffers won't be 25% of RAM.

 So, are you saying you like 4x now?

I still prefer 3X.  However, I acknowledge that there are legitimate
arguments for higher values for a *default*.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Claudio Freire
On Wed, Sep 11, 2013 at 12:27 PM, Bruce Momjian br...@momjian.us wrote:
  Another argument in favor: this is a default setting, and by default,
  shared_buffers won't be 25% of RAM.

 So, are you saying you like 4x now?

 Here is an argument for 3x.  First, using the documented 25% of RAM, 3x
 puts our effective_cache_size at 75% of RAM, giving us no room for
 kernel, backend memory, and work_mem usage.  If anything it should be
 lower than 3x, not higher.


AFAIK, e_c_s must include shared_buffers, so 25% + 75% = 75%

And your statement seems to assume 25% + 75% = 100%. Which isn't
universally true, no matter what your math teacher probably taught you
;-)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Andres Freund
On 2013-09-11 12:53:29 -0400, Bruce Momjian wrote:
 On Wed, Sep 11, 2013 at 12:43:07PM -0300, Alvaro Herrera wrote:
  Bruce Momjian wrote:
  
So, are you saying you like 4x now?
   
   Here is an argument for 3x.  First, using the documented 25% of RAM, 3x
   puts our effective_cache_size at 75% of RAM, giving us no room for
   kernel, backend memory, and work_mem usage.  If anything it should be
   lower than 3x, not higher.
  
  The other argument I see for the 3x value is that it is a compromise.
  People with really large servers will want to increase it; people with
  very small servers will want to reduce it.
 
 Yes, you could make the argument that 2x is the right default,
 especially considering work_mem.

I think that most of the arguments in this thread drastically
overestimate the precision and the effect of effective_cache_size. The
planner logic behind it basically only uses it to calculate things
within a single index scan. That alone shows that any precise
calculation cannot be very meaningful.
It also does *NOT* directly influence how the kernel caches disk
io. It's there to guess how likely it is something is still cached when
accessing things repeatedly.
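
A minimal way to see that limited effect (the table and column names
here are hypothetical): nothing is allocated, only the estimated cost
of repeated index page fetches changes, so the index plan gets cheaper
as the setting grows.

    psql -c "SET effective_cache_size = '128MB';
             EXPLAIN SELECT * FROM big_table WHERE indexed_col < 1000;"
    psql -c "SET effective_cache_size = '16GB';
             EXPLAIN SELECT * FROM big_table WHERE indexed_col < 1000;"
    # Same query, same data; only the index scan's estimated cost moves.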

I think nearly all practical experience shows that setting it too small
is more likely to cause problems than setting it too large. We shouldn't be
too skimpy here.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-11 Thread Josh Berkus

 I think that most of the arguments in this thread drastically
 overestimate the precision and the effect of effective_cache_size. The
 planner logic behind it basically only uses it to calculate things
 within a single index scan. That alone shows that any precise
 calculation cannot be very meaningful.
 It also does *NOT* directly influence how the kernel caches disk
 io. It's there to guess how likely it is something is still cached when
 accessing things repeatedly.

Agreed.  I think we should take the patch as-is, and spend the rest of
the 9.4 dev cycle arguing about 3x vs. 4x.

;-)

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-10 Thread Jeff Janes
On Mon, Sep 9, 2013 at 6:29 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Sep  5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
 On 09/05/2013 03:30 PM, Merlin Moncure wrote:

  Standard advice we've given in the past is 25% shared buffers, 75%
  effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
   Maybe we're changing the conventional calculation, but I thought I'd
  point that out.
 
  This was debated upthread.

 Actually, no, it wasn't.  Tom threw out a suggestion that we use 4X for
 historical reasons.  That's all, there was no discussion.

 So, my point stands: our historical advice has been to set EFS to 75% of
 RAM.  Maybe we're changing that advice, but if so, let's change it.
 Otherwise 3X makes more sense.

 So, what do we want the effective_cache_size default to be?  3x or 4x?
 We clearly state:

 If you have a dedicated database server with 1GB or more of RAM,
 a reasonable starting value for shared_buffers is 25% of the
 memory in your system.  There are some workloads where even

 If we make the default 4x, that means that people using the above
 suggestion would be setting their effective_cache_size to 100% of RAM?
 If we go with 4x, which I believe was the majority opinion, what shall
 we answer to someone who asks about this contradiction?

I vote for 3x.  The past defaults had a different ratio, but we are
changing things to make them better, not to leave them the same.  We
should change it to be consistent with the advice we have long given.
Sure, 3 is not a power of 2, but I usually root for the underdog.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-10 Thread Merlin Moncure
On Tue, Sep 10, 2013 at 11:39 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 On Mon, Sep 9, 2013 at 6:29 PM, Bruce Momjian br...@momjian.us wrote:
 On Thu, Sep  5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
 On 09/05/2013 03:30 PM, Merlin Moncure wrote:

  Standard advice we've given in the past is 25% shared buffers, 75%
  effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
   Maybe we're changing the conventional calculation, but I thought I'd
  point that out.
 
  This was debated upthread.

 Actually, no, it wasn't.  Tom threw out a suggestion that we use 4X for
 historical reasons.  That's all, there was no discussion.

 So, my point stands: our historical advice has been to set EFS to 75% of
 RAM.  Maybe we're changing that advice, but if so, let's change it.
 Otherwise 3X makes more sense.

 So, what do we want the effective_cache_size default to be?  3x or 4x?
 We clearly state:

 If you have a dedicated database server with 1GB or more of RAM,
 a reasonable starting value for shared_buffers is 25% of the
 memory in your system.  There are some workloads where even

 If we make the default 4x, that means that people using the above
 suggestion would be setting their effective_cache_size to 100% of RAM?
 If we go with 4x, which I believe was the majority opinion, what shall
 we answer to someone who asks about this contradiction?

 I vote for 3x.  The past defaults had a different ratio, but we are
 changing things to make them better, not to leave them the same.  We
 should change it to be consistent with the advice we have long given.
 Sure, 3 is not a power of 2, but I usually root for the underdog.

I vote 4x on the basis that for this setting (unlike almost all the
other memory settings) the ramifications for setting it too high
generally aren't too bad.  Also, the o/s and temporary memory usage as
a share of total physical memory has been declining over time
(meaning, that if you have a 256gb memory server and follow the advice
to set to 64gb, your memory for caching is approximately 64gb).

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-10 Thread Josh Berkus
Merlin,

 I vote 4x on the basis that for this setting (unlike almost all the
 other memory settings) the ramifications for setting it too high
 generally aren't too bad.  Also, the o/s and temporary memory usage as
 a share of total physical memory has been declining over time

If we're doing that, then we should change our general advice on this
setting as well.

Another argument in favor: this is a default setting, and by default,
shared_buffers won't be 25% of RAM.

 (meaning, that if you have a 256gb memory server and follow the advice
 to set to 64gb, your memory for caching is approximately 64gb).

Wait, what?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-10 Thread Merlin Moncure
On Tue, Sep 10, 2013 at 5:08 PM, Josh Berkus j...@agliodbs.com wrote:
 Merlin,

 I vote 4x on the basis that for this setting (unlike almost all the
 other memory settings) the ramifications for setting it too high
 generally aren't too bad.  Also, the o/s and temporary memory usage as
 a share of total physical memory has been declining over time

 If we're doing that, then we should change our general advice on this
 setting as well.

 Another argument in favor: this is a default setting, and by default,
 shared_buffers won't be 25% of RAM.

 (meaning, that if you have a 256gb memory server and follow the advice
 to set to 64gb, your memory for caching is approximately 64gb).

oops, meant to say approximately 256gb.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-09 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 09:02:27PM -0700, Josh Berkus wrote:
 On 09/05/2013 03:30 PM, Merlin Moncure wrote:
 
  Standard advice we've given in the past is 25% shared buffers, 75%
  effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
   Maybe we're changing the conventional calculation, but I thought I'd
  point that out.
  
  This was debated upthread.
 
 Actually, no, it wasn't.  Tom threw out a suggestion that we use 4X for
 historical reasons.  That's all, there was no discussion.
 
 So, my point stands: our historical advice has been to set EFS to 75% of
 RAM.  Maybe we're changing that advice, but if so, let's change it.
 Otherwise 3X makes more sense.

So, what do we want the effective_cache_size default to be?  3x or 4x?
We clearly state:

If you have a dedicated database server with 1GB or more of RAM,
a reasonable starting value for shared_buffers is 25% of the
memory in your system.  There are some workloads where even

If we make the default 4x, that means that people using the above
suggestion would be setting their effective_cache_size to 100% of RAM? 
If we go with 4x, which I believe was the majority opinion, what shall
we answer to someone who asks about this contradiction?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-06 Thread Cédric Villemain
On Thursday, September 5, 2013 at 17:14:37, Bruce Momjian wrote:
 On Thu, Sep  5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
   I have developed the attached patch which implements an auto-tuned
   effective_cache_size which is 4x the size of shared buffers.  I had to
   set effective_cache_size to its old 128MB default so the EXPLAIN
   regression tests would pass unchanged.
  
  That's not really autotuning though. ISTM that making the *default* 4
  x shared_buffers might make perfect sense, but do we really need to
  hijack the value of -1 for that? That might be useful for some time
  when we have actual autotuning, that somehow inspects the system and
  tunes it from there.
  
  I also don't think it should be called autotuning, when it's just a
  smarter default value.
  
  I like the feature, though, just not the packaging.
 
 That auto-tuning text came from the wal_buffer documentation, which
 does exactly this based on shared_buffers:
 
 The contents of the WAL buffers are written out to disk at every
 transaction commit, so extremely large values are unlikely to
 provide a significant benefit.  However, setting this value to at
 least a few megabytes can improve write performance on a busy
 -- server where many clients are committing at once.  The auto-tuning
---
 selected by the default setting of -1 should give reasonable
 results in most cases.
 
 I am fine with rewording and not using -1, but we should change the
 wal_buffer default and documentation too then.  I am not sure what other
 value than -1 to use?  0?  I figure if we ever get better auto-tuning,
 we would just remove this functionality and make it better.

I'm fine with a -1 for auto-tune or intelligent default: it means (for me)
that you don't need to care about this parameter in most cases.

A negative impact of the simpler multiplier might be that if suddenly someone
reduces the shared_buffers size to fix some strange behavior, then he at the
same time needs to increase effective_cache_size manually (which remains the
sum of the caches on the system, at least on a machine dedicated to
PostgreSQL).

IMHO it is easy to know exactly how much of the memory is (or can be) used
for/by PostgreSQL; we can compute that and update effective_cache_size at
regular points in time. (Just an idea, I know there are arguments against
that too.)
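
A crude sketch of that idea, assuming GNU sed and a conventional
$PGDATA layout (one possible policy, not a recommendation):

    # Rough cache figure: free memory plus page cache, in kB.
    cache_kb=$(awk '/^MemFree:|^Cached:/ {sum += $2} END {print sum}' \
        /proc/meminfo)
    sed -i "s/^#\?effective_cache_size.*/effective_cache_size = $((cache_kb / 1024))MB/" \
        "$PGDATA/postgresql.conf"
    pg_ctl reload -D "$PGDATA"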

Maybe the argument for a 4x multiplier instead of 3x is that
effective_cache_size can be larger than required; it's not big trouble.
With all the things around NUMA we maybe just need to revisit that area
(non-linear memory access costs, double/triple caching, ...).
-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation



Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian br...@momjian.us wrote:
 I have developed the attached patch which implements an auto-tuned
 effective_cache_size which is 4x the size of shared buffers.  I had to
 set effective_cache_size to its old 128MB default so the EXPLAIN
 regression tests would pass unchanged.

 That's not really autotuning though. ISTM that making the *default* 4
 x shared_buffers might make perfect sense, but do we really need to
 hijack the value of -1 for that? That might be useful for some time
 when we have actual autotuning, that somehow inspects the system and
 tunes it from there.

Well, the real problem with this patch is that it documents what the
auto-tuning algorithm is; without that commitment, just saying -1 means
autotune might be fine.

Did you consider the alternative of just tweaking initdb to insert a
default for effective_cache_size that's 4x whatever it picks for
shared_buffers?  That would probably be about 3 lines of code, and it
wouldn't nail down any particular server-side behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Magnus Hagander
On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian br...@momjian.us wrote:
 On Tue, Jan  8, 2013 at 08:40:44PM -0500, Andrew Dunstan wrote:

 On 01/08/2013 08:08 PM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ...  And I don't especially like the idea of trying to
 make it depend directly on the box's physical RAM, for the same
 practical reasons Robert mentioned.
 For the record, I don't believe those problems would be particularly
 hard to solve.
 Well, the problem of finding out the box's physical RAM is doubtless
 solvable if we're willing to put enough sweat and tears into it, but
 I'm dubious that it's worth the trouble.  The harder part is how to know
 if the box is supposed to be dedicated to the database.  Bear in mind
 that the starting point of this debate was the idea that we're talking
 about an inexperienced DBA who doesn't know about any configuration knob
 we might provide for the purpose.
 
 I'd prefer to go with a default that's predictable and not totally
 foolish --- and some multiple of shared_buffers seems like it'd fit the
 bill.

 +1. That seems to be by far the biggest bang for the buck. Anything
 else will surely involve a lot more code for not much more benefit.

 I have developed the attached patch which implements an auto-tuned
 effective_cache_size which is 4x the size of shared buffers.  I had to
 set effective_cache_size to its old 128MB default so the EXPLAIN
 regression tests would pass unchanged.

That's not really autotuning though. ISTM that making the *default* 4
x shared_buffers might make perfect sense, but do we really need to
hijack the value of -1 for that? That might be useful for some time
when we have actual autotuning, that somehow inspects the system and
tunes it from there.

I also don't think it should be called autotuning, when it's just a
smarter default value.

I like the feature, though, just not the packaging.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 06:14:33PM +0200, Magnus Hagander wrote:
  I have developed the attached patch which implements an auto-tuned
  effective_cache_size which is 4x the size of shared buffers.  I had to
  set effective_cache_size to its old 128MB default so the EXPLAIN
  regression tests would pass unchanged.
 
 That's not really autotuning though. ISTM that making the *default* 4
 x shared_buffers might make perfect sense, but do we really need to
 hijack the value of -1 for that? That might be useful for some time
 when we have actual autotuning, that somehow inspects the system and
 tunes it from there.

 I also don't think it should be called autotuning, when it's just a
 smarter default value.
 
 I like the feature, though, just not the packaging.

That auto-tuning text came from the wal_buffer documentation, which
does exactly this based on shared_buffers:

The contents of the WAL buffers are written out to disk at every
transaction commit, so extremely large values are unlikely to
provide a significant benefit.  However, setting this value to at
least a few megabytes can improve write performance on a busy
-- server where many clients are committing at once.  The auto-tuning
   ---
selected by the default setting of -1 should give reasonable
results in most cases.

I am fine with rewording and not using -1, but we should change the
wal_buffer default and documentation too then.  I am not sure what other
value than -1 to use?  0?  I figure if we ever get better auto-tuning,
we would just remove this functionality and make it better.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 12:48:54PM -0400, Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
  On Thu, Sep 5, 2013 at 3:01 AM, Bruce Momjian br...@momjian.us wrote:
  I have developed the attached patch which implements an auto-tuned
  effective_cache_size which is 4x the size of shared buffers.  I had to
  set effective_cache_size to its old 128MB default so the EXPLAIN
  regression tests would pass unchanged.
 
  That's not really autotuning though. ISTM that making the *default* 4
  x shared_buffers might make perfect sense, but do we really need to
  hijack the value of -1 for that? That might be useful for some time
  when we have actual autotuning, that somehow inspects the system and
  tunes it from there.
 
 Well, the real problem with this patch is that it documents what the
 auto-tuning algorithm is; without that commitment, just saying -1 means
 autotune might be fine.

OK, but I did this based on wal_buffers, which has a -1 default, calls
it auto-tuning, and explains how the default is computed.

 Did you consider the alternative of just tweaking initdb to insert a
 default for effective_cache_size that's 4x whatever it picks for
 shared_buffers?  That would probably be about 3 lines of code, and it
 wouldn't nail down any particular server-side behavior.

The problem there is that many users are told to tune shared_buffers,
but don't touch effective cache size.  Having initdb set the
effective_cache_size value would not help there.  Again, this is all
based on the auto-tuning of wal_buffers.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Josh Berkus
On 09/05/2013 02:16 PM, Bruce Momjian wrote:
 Well, the real problem with this patch is that it documents what the
 auto-tuning algorithm is; without that commitment, just saying -1 means
 autotune might be fine.
 
 OK, but I did this based on wal_buffers, which has a -1 default, calls
 it auto-tuning, and explains how the default is computed.

I don't see a real problem with this.  For users who have set their
shared_buffers correctly, effective_cache_size should also be correct.

 The problem there is that many users are told to tune shared_buffers,
 but don't touch effective cache size.  Having initdb set the
 effective_cache_size value would not help there.  Again, this is all
 based on the auto-tuning of wal_buffers.

Standard advice we've given in the past is 25% shared buffers, 75%
effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
 Maybe we're changing the conventional calculation, but I thought I'd
point that out.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Merlin Moncure
On Thu, Sep 5, 2013 at 5:11 PM, Josh Berkus j...@agliodbs.com wrote:
 On 09/05/2013 02:16 PM, Bruce Momjian wrote:
 Well, the real problem with this patch is that it documents what the
 auto-tuning algorithm is; without that commitment, just saying -1 means
 autotune might be fine.

 OK, but I did this based on wal_buffers, which has a -1 default, calls
 it auto-tuning, and explains how the default is computed.

 I don't see a real problem with this.  For users who have set their
 shared_buffers correctly, effective_cache_size should also be correct.

Agreed.  I think -1 is the right setting for autotune as things stand
today. If we want something else, then we should change other settings
as well (like wal_buffers) and that is not in the scope of this patch.

 The problem there is that many users are told to tune shared_buffers,
 but don't touch effective cache size.  Having initdb set the
 effective_cache_size value would not help there.  Again, this is all
 based on the auto-tuning of wal_buffers.

 Standard advice we've given in the past is 25% shared buffers, 75%
 effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
  Maybe we're changing the conventional calculation, but I thought I'd
 point that out.

This was debated upthread.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Bruce Momjian
On Thu, Sep  5, 2013 at 03:11:53PM -0700, Josh Berkus wrote:
 On 09/05/2013 02:16 PM, Bruce Momjian wrote:
  Well, the real problem with this patch is that it documents what the
  auto-tuning algorithm is; without that commitment, just saying -1 means
  autotune might be fine.
  
  OK, but I did this based on wal_buffers, which has a -1 default, calls
  it auto-tuning, and explains how the default is computed.
 
 I don't see a real problem with this.  For users who have set their
 shared_buffers correctly, effective_cache_size should also be correct.
 
  The problem there is that many users are told to tune shared_buffers,
  but don't touch effective cache size.  Having initdb set the
  effective_cache_size value would not help there.  Again, this is all
  based on the auto-tuning of wal_buffers.
 
 Standard advice we've given in the past is 25% shared buffers, 75%
 effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
  Maybe we're changing the conventional calculation, but I thought I'd
 point that out.

Yes, I had wondered that myself, and 3x and 4x were thrown out as
options.  There were more people who liked 4x, but one of the reasons
was that 3x sounded odd --- not sure what to make of that, but I went
with the most popular.  I am fine with 3x, and I do think it logically
makes more sense, and is less likely to over-estimate than 4x.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-05 Thread Josh Berkus
On 09/05/2013 03:30 PM, Merlin Moncure wrote:

 Standard advice we've given in the past is 25% shared buffers, 75%
 effective_cache_size.  Which would make EFS *3X* shared_buffers, not 4X.
  Maybe we're changing the conventional calculation, but I thought I'd
 point that out.
 
 This was debated upthread.

Actually, no, it wasn't.  Tom threw out a suggestion that we use 4X for
historical reasons.  That's all, there was no discussion.

So, my point stands: our historical advice has been to set EFS to 75% of
RAM.  Maybe we're changing that advice, but if so, let's change it.
Otherwise 3X makes more sense.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-09-04 Thread Bruce Momjian
On Tue, Jan  8, 2013 at 08:40:44PM -0500, Andrew Dunstan wrote:
 
 On 01/08/2013 08:08 PM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 8, 2013 at 7:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 ...  And I don't especially like the idea of trying to
 make it depend directly on the box's physical RAM, for the same
 practical reasons Robert mentioned.
 For the record, I don't believe those problems would be particularly
 hard to solve.
 Well, the problem of finding out the box's physical RAM is doubtless
 solvable if we're willing to put enough sweat and tears into it, but
 I'm dubious that it's worth the trouble.  The harder part is how to know
 if the box is supposed to be dedicated to the database.  Bear in mind
 that the starting point of this debate was the idea that we're talking
 about an inexperienced DBA who doesn't know about any configuration knob
 we might provide for the purpose.
 
 I'd prefer to go with a default that's predictable and not totally
 foolish --- and some multiple of shared_buffers seems like it'd fit the
 bill.
 
 +1. That seems to be by far the biggest bang for the buck. Anything
 else will surely involve a lot more code for not much more benefit.

I have developed the attached patch which implements an auto-tuned
effective_cache_size which is 4x the size of shared buffers.  I had to
set effective_cache_size to its old 128MB default so the EXPLAIN
regression tests would pass unchanged.

I considered a new available_ram variable but that just gives us another
variable, and in a way shared_buffers is a fixed amount, while
effective_cache_size is an estimate, so I thought driving everything
from shared_buffers made sense.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
new file mode 100644
index 23ebc11..de2374b
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
*** include 'filename'
*** 2758,2764 
  
 para
  Random access to mechanical disk storage is normally much more expensive
! than four-times sequential access.  However, a lower default is used
  (4.0) because the majority of random accesses to disk, such as indexed
  reads, are assumed to be in cache.  The default value can be thought of
  as modeling random access as 40 times slower than sequential, while
--- 2758,2764 
  
 para
  Random access to mechanical disk storage is normally much more expensive
! than four times sequential access.  However, a lower default is used
  (4.0) because the majority of random accesses to disk, such as indexed
  reads, are assumed to be in cache.  The default value can be thought of
  as modeling random access as 40 times slower than sequential, while
*** include 'filename'
*** 2841,2849 
listitem
 para
  Sets the planner's assumption about the effective size of the
! disk cache that is available to a single query.  This is
! factored into estimates of the cost of using an index; a
! higher value makes it more likely index scans will be used, a
  lower value makes it more likely sequential scans will be
  used. When setting this parameter you should consider both
  productnamePostgreSQL/productname's shared buffers and the
--- 2841,2857 
listitem
 para
  Sets the planner's assumption about the effective size of the
! disk cache that is available to a single query.  The default
! setting of -1 selects a size equal to four times the size of xref
! linkend=guc-shared-buffers, but not less than the size of one
! shared buffer page, typically literal8kB/literal.  This value
! can be set manually if the automatic choice is too large or too
! small.
!/para
! 
!para
! This value is factored into estimates of the cost of using an index;
! a higher value makes it more likely index scans will be used, a
  lower value makes it more likely sequential scans will be
  used. When setting this parameter you should consider both
  productnamePostgreSQL/productname's shared buffers and the
*** include 'filename'
*** 2855,2862 
  memory allocated by productnamePostgreSQL/productname, nor
  does it reserve kernel disk cache; it is used only for estimation
  purposes.  The system also does not assume data remains in
! the disk cache between queries.  The default is 128 megabytes
! (literal128MB/).
 /para
/listitem
   /varlistentry
--- 2863,2872 
  memory allocated by productnamePostgreSQL/productname, nor
  does it reserve kernel disk cache; it is used only for 

Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-10 Thread Kevin Grittner
Josh Berkus wrote:

 The, shared_buffers, wal_buffers, and effective_cache_size (and possible
 other future settings) can be set to -1. If they are set to -1, then we
 use the figure:
 
 shared_buffers = available_ram * 0.25
 (with a ceiling of 8GB)
 wal_buffers = available_ram * 0.05
 (with a ceiling of 32MB)
 effective_cache_size = available_ram * 0.75
 (with a floor of 128MB)
 
 If they are set to an amount, then we use the amount they are set to.
 
 It would be nice to also automatically set work_mem, maint_work_mem,
 temp_buffers, etc. based on the above, but that would be considerably
 more difficult and require performance testing we haven't done yet.

My starting point for work_mem is usually:

work_mem = available_ram * 0.25 / max_connections

Like everything else, I might adjust from there, but it seems like
a sane starting point. Of course, one could easily argue for a
lower percentage or exclusion of some number of maint_work_mem
allocations.
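
Putting the proposed formulas and the work_mem rule above together for
a hypothetical dedicated 32GB server with max_connections = 100 (purely
illustrative):

    # shared_buffers       = 32GB * 0.25        = 8GB  (right at the ceiling)
    # wal_buffers          = 32GB * 0.05        = 1.6GB -> capped at 32MB
    # effective_cache_size = 32GB * 0.75        = 24GB
    # work_mem             = 32GB * 0.25 / 100  = ~80MB per sort/hash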

-Kevin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-10 Thread Jeff Janes
On Wed, Jan 9, 2013 at 12:38 AM, Benedikt Grundmann
bgrundm...@janestreet.com wrote:

 For what it is worth, even if it is a dedicated database box, 75% might be way
 too high. I remember investigating bad performance on our biggest database
 server, that in the end turned out to be a too high setting of
 effective_cache_size. From reading the code back then my rationale for it
 being too high was that the code that makes use of the effective_cache_size
 tries very hard to account for what the current query would do to the cache
 but doesn't take into account how many queries (on separate datasets!) are
 currently being executed (and competing for the same cache).  On that box we
 often have 100+ active connections and many looking at different big
 datasets.

I think that most busy installations either run a lot of small queries
(for which effective_cache_size is irrelevant), or a few large
queries.  Your case is probably somewhat rare, and so as far as
defaults go, it would be sacrificed for the common good. The docs do
anticipate the need to account for multiple concurrent queries to be
discounted in deciding how to set effective_cache_size, but perhaps
the wording could be improved.

Out of curiosity, what did your queries look like after you lowered
effective_cache_size?  Were there a lot of sequential scans, or did it
just choose different indexes than it had before?  If a lot of
sequential scans, were they mostly on just a few tables that each had
many sequential scans going on simultaneously, or was it 100+
different tables each with one sequential scan going on?  (You said
different big datasets, but I don't know if these are in different
tables, or in common tables with a column to distinguish them.)

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: Set effective_cache_size to greater of .conf value, shared_buffers

2013-01-09 Thread Benedikt Grundmann
On Wed, Jan 9, 2013 at 2:01 AM, Josh Berkus j...@agliodbs.com wrote:

 All,

  Well, the problem of finding out the box's physical RAM is doubtless
  solvable if we're willing to put enough sweat and tears into it, but
  I'm dubious that it's worth the trouble.  The harder part is how to know
  if the box is supposed to be dedicated to the database.  Bear in mind
  that the starting point of this debate was the idea that we're talking
  about an inexperienced DBA who doesn't know about any configuration knob
  we might provide for the purpose.

 For what it is worth, even if it is a dedicated database box, 75% might be
way too high. I remember investigating bad performance on our biggest
database server, that in the end turned out to be a too high setting of
effective_cache_size. From reading the code back then my rationale for it
being to high was that the code that makes use of the effective_cache_size
tries very hard to account for what the current query would do to the cache
but doesn't take into account how many queries (on separate datasets!) are
currently being executed (and competing for the same cache).  On that box
we often have 100+ active connections and many looking at different big
datasets.

Cheers,

bene

