[PERFORM] Some ideas for comment

Chris Hoover Wed, 24 Aug 2005 10:38:58 -0700

Ok, there is always a lot of talk about tuning PostgreSQL on linux and
how PostgreSQL uses the linux kernel cache to cache the tables and
indexes.


My question is, is there anyway to see what files linux is caching at
this moment?

My reasoning behind this question is:

I have several database systems each with 1 PostgreSQL cluster. 
However, each cluster has  a large number of identical databases on
it. Since there can be a great amount of size disparity between the
databases, I am wondering if some of slowness we see might be caused
by the kernel cache having to reload a lot of data that keeps getting
swapped out. (most systems have at least 100GB of data/indexes on them
with 8 or 12GB ram).

If this is the case, what sort of symptoms would you expect?

To help mitigate this potential, I have been researching the
following, and am thinking of proposing it to management.  Any
comments would be appreciated.

1.  Implement a partition type layout using views and rules - This
will allow me to have one table in each view with the "active" data,
and the inactive data stored by year in other tables.

So I would have the following (for each major table):

Table View as
select * from active_table
union all 
select * from table_2005
union all
select * from table_2004
etc.

Each table would have identical indexes, however only the
"active_table" would have data that is actively being worked.  The
rules and a nightly job can keep the data correctly stored.

I am thinking that with this setup, the active table indexes should
almost always be in memory.  And, if they do happen to be pushed out,
they are much smaller than the indexes I have today (where all data is
in one table), so they should load faster with less i/o pressure.

From the testing I have done so far, I believe I can implement this
system with out having to ask for developer time.  This is a "Good
Thing".

Also, the database is not normalized and is very ugly, by using the
view to partition and abstract the actual data, I will be in a better
position to start normalizing some of the tables w/o developer time
(once again, a "Good Thing")


2.  I am also thinking of recommending we collapse all databases in a
cluster into one "mega" database.  I can then use schema's and views
to control database access and ensure that no customer can see another
customers data.

This would mean that there are only one set of indexes being loaded
into the cache.  While they would be larger, I think in combination
with the partition from idea 1, we would be ahead of the ball game. 
Since there would only be one set of indexes, everyone would be
sharing them so they should always be in memory.

I don't have real numbers to give you, but we know that our systems
are hurting i/o wise and we are growing by about 2GB+ per week (net). 
We actually grow by about 5GB/week/server.  However, when I run my
weekly maintenance of vacuum full, reindex, and the vacuum analyze, we
end up getting about 3GB back.  Unfortunately, I do not have the i/o
bandwidth to vacuum during the day as it causes major slowdowns on our
system.  Each night, I do run a vacuum analyze across all db's to try
and help.  I also have my fsm parameters set high (8000000 fsm pages,
and 5000 fsm relations) to try and compensate.

I believe this is only hurting us as any queries that choose to
tablescan are only getting slower and slower.  Also, obviously, our
indexes are continually growing.  The partitioning should help as the
actual number of records being worked on each table is a very small
percentage ( a site may have 1 million records, but only load and work
a few thousand each day).  The archive tables would be doing the most
growing while the active tables should stay small.  Most of the
queries that are tablescanning can not be fixed as the database
clusters have been initialized with a non-C locale and won't use
indexes on our queries that are using like with a wild card.


Right now, we are still on 7.3.4.  However, these ideas would be
implemented as part of an upgrade to 8.x (plus, we'll initialize the
new clusters with a C locale).

Anyway, I hope this makes since, and any comments, ideas, and/or
suggestions would be appreciated.

Thanks,

Chris

---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [EMAIL PROTECTED] so that your
       message can get through to the mailing list cleanly

[PERFORM] Some ideas for comment

Reply via email to