Simon,

As a postgres DBA, I find your comments about how not to use effective_cache_size instructive, but I'm still not sure how I should arrive at a target value for it.

On most of the machines on which I admin postgres, I generally set shared_buffers to 10,000 (using what seems to have been the recent conventional wisdom of the lesser of 10,000 or 10% of RAM). I haven't really settled on an optimal value for effective_cache_size, and now I'm again confused as to how I might even benchmark it.

Here are the documents on which I've based my knowledge:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html#effcache
http://www.varlena.com/varlena/GeneralBits/Tidbits/annotated_conf_e.html
http://www.ca.postgresql.org/docs/momjian/hw_performance/node8.html

From Bruce's document, I gather that effective_cache_size would assume that either shared buffers or unused RAM were valid sources of cached pages for the purposes of assessing plans.

As a result, I was intending to inflate the value of effective_cache_size to closer to the amount of unused RAM on some of the machines I admin (once I've verified that they all have a unified buffer cache). Is that correct?

-tfo

--
Thomas F. O'Connell
Co-Founder, Information Architect
Sitening, LLC
http://www.sitening.com/
110 30th Avenue North, Suite 6
Nashville, TN 37203-6320
615-260-0005

On Oct 26, 2004, at 3:49 AM, Simon Riggs wrote:

On Mon, 2004-10-25 at 16:34, Jan Wieck wrote:
The problem is, with a too small directory ARC cannot guesstimate what
might be in the kernel buffers. Nor can it guesstimate what recently was
in the kernel buffers and got pushed out from there. That results in a
way too small B1 list, and therefore we don't get B1 hits when in fact
the data was found in memory. B1 hits is what increases the T1target,
and since we are missing them with a too small directory size, our
implementation of ARC is propably using a T2 size larger than the
working set. That is not optimal.

I think I have seen that the T1 list shrinks "too much", but need more tests...with some good test results

The effectiveness of ARC relies upon the balance between the often
conflicting requirements of "recency" and "frequency". It seems
possible, even likely, that pgsql's version of ARC may need some subtle
changes to rebalance it - if we are unlikely enough to find cases where
it genuinely is out of balance. Many performance tests are required,
together with a few ideas on extra parameters to include....hence my
support of Jan's ideas.

That's also why I called the B1+B2 hit ratio "turbulence" because it
relates to how much oscillation is happening between T1 and T2. In
physical systems, we expect the oscillations to be damped, but there is
no guarantee that we have a nearly critically damped oscillator. (Note
that the absence of turbulence doesn't imply that T1+T2 is optimally
sized, just that is balanced).

[...and all though the discussion has wandered away from my original
patch...would anybody like to commit, or decline the patch?]

If we would replace the dynamic T1 buffers with a max_backends*2 area of
shared buffers, use a C value representing the effective cache size and
limit the T1target on the lower bound to effective cache size - shared
buffers, then we basically moved the T1 cache into the OS buffers.

Limiting the minimum size of T1len to be 2* maxbackends sounds like an
easy way to prevent overbalancing of T2, but I would like to follow up
on ways to have T1 naturally stay larger. I'll do a patch with this idea
in, for testing. I'll call this "T1 minimum size" so we can discuss it.


Any other patches are welcome...

It could be that B1 is too small and so we could use a larger value of C
to keep track of more blocks. I think what is being suggested is two
GUCs: shared_buffers (as is), plus another one, larger, which would
allow us to track what is in shared_buffers and what is in OS cache.


I have comments on "effective cache size" below....

On Mon, 2004-10-25 at 17:03, Tom Lane wrote:
Jan Wieck <[EMAIL PROTECTED]> writes:
This all only holds water, if the OS is allowed to swap out shared
memory. And that was my initial question, how likely is it to find this
to be true these days?

I think it's more likely that not that the OS will consider shared
memory to be potentially swappable. On some platforms there is a shmctl
call you can make to lock your shmem in memory, but (a) we don't use it
and (b) it may well require privileges we haven't got anyway.

Are you saying we shouldn't, or we don't yet? I simply assumed that we did use that function - surely it must be at least an option? RHEL supports this at least....

It may well be that we don't have those privileges, in which case we
turn off the option. Often, we (or I?) will want to install a dedicated
server, so we should have all the permissions we need, in which case...

This has always been one of the arguments against making shared_buffers
really large, of course --- if the buffers aren't all heavily used, and
the OS decides to swap them to disk, you are worse off than you would
have been with a smaller shared_buffers setting.

Not really, just an argument against making them *too* large. Large *and* utilised is OK, so we need ways of judging optimal sizing.

However, I'm still really nervous about the idea of using
effective_cache_size to control the ARC algorithm.  That number is
usually entirely bogus.  Right now it is only a second-order influence
on certain planner estimates, and I am afraid to rely on it any more
heavily than that.

...ah yes, effective_cache_size.

The manual describes effective_cache_size as if it had something to do
with the OS, and some of this discussion has picked up on that.

effective_cache_size is used in only two places in the code (both in the
planner), as an estimate for calculating the cost of a) nonsequential
access and b) index access, mainly as a way of avoiding overestimates of
access costs for small tables.


There is absolutely no implication in the code that effective_cache_size
measures anything in the OS; what it gives is an estimate of the number
of blocks that will be available from *somewhere* in memory (i.e. in
shared_buffers OR OS cache) for one particular table (the one currently
being considered by the planner).


Crucially, the "size" referred to is the size of the *estimate*, not the
size of the OS cache (nor the size of the OS cache + shared_buffers). So
setting effective_cache_size = total memory available or setting
effective_cache_size = total memory - shared_buffers are both wildly
irrelevant things to do, or any assumption that directly links memory
size to that parameter. So talking about "effective_cache_size" as if it
were the OS cache isn't the right thing to do.


...It could be that we use a very high % of physical memory as
shared_buffers - in which case the effective_cache_size would represent
the contents of shared_buffers.

Note also that the planner assumes that all tables are equally likely to
be in cache. Increasing effective_cache_size in postgresql.conf seems
destined to give the wrong answer in planning unless you absolutely
understand what it does.


I will submit a patch to correct the description in the manual.

Further comments:
The two estimates appear to use effective_cache_size differently:
a) assumes that a table of size effective_cache_size will be 50% in
cache
b) assumes that effective_cache_size blocks are available, so for a
table of size == effective_cache_size, then it will be 100% available

IMHO the GUC should be renamed "estimated_cached_blocks", with the old
name deprecated to force people to re-read the manual description of
what effective_cache_size means and then set accordingly.....all of that
in 8.0....


--
Best Regards, Simon Riggs


---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddressHere" to [EMAIL PROTECTED])


---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org

Reply via email to