Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-21 Thread Andrew Sullivan
On Mon, Oct 20, 2003 at 05:42:52PM -0400, Tom Lane wrote:

 hold of these dead rows instead of recycling them.  I suspect you have
 a client process somewhere that is holding an open transaction for a
 long time ... probably not doing anything, just sitting there with an
 unclosed BEGIN ...

Which could be because you're doing something nasty with one of the
autocommit=off clients.  Most of the client libraries implement
this by doing commit;begin; at every commit.  This means you have
way more idle in transaction connections than you think.  Look in
pg_stat_activity, assuming you've turned on query echoing. 

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-21 Thread Gaetano Mendola
Andrew Sullivan wrote:
On Mon, Oct 20, 2003 at 05:42:52PM -0400, Tom Lane wrote:


hold of these dead rows instead of recycling them.  I suspect you have
a client process somewhere that is holding an open transaction for a
long time ... probably not doing anything, just sitting there with an
unclosed BEGIN ...


Which could be because you're doing something nasty with one of the
autocommit=off clients.  Most of the client libraries implement
this by doing commit;begin; at every commit.  This means you have
way more idle in transaction connections than you think.  Look in
pg_stat_activity, assuming you've turned on query echoing. 
Or is enough do a ps -eafwww | grep post
to see the state of the connections
Regards
Gaetano Mendola
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-21 Thread Seum-Lim Gan
Hi Tom,

1.)
OK. We have narrowed it down.
We did a few (like 5 to 8 times) vacuum analyze tablename (no full), the
pg_statistics relfilenode grew. There was no database operation when
we did this, no other client connections except the one that does
the vacuum.
If we do plain simple vacuum tablename (again no full), we see
pg_statistics_relid_att_index relfilenode grew instead of
pg_statistics.
So, overtime, these files will grow if we do vacuum.

Are these expected ?

The question now is, if we are not doing anything
to the database, why would they grow after a few vacuums ?
2.)
The other problem we have with
 DETAIL:  101802 dead row versions cannot be removed yet.

 DETAIL:  110900 dead row versions cannot be removed yet.

 DETAIL:  753064 dead row versions cannot be removed yet.

 DETAIL:  765328 dead row versions cannot be removed yet.
We will collect more data and see what we can get from the
the process. Offhand, the process is connecting to
the database through ODBC and we don't use any BEGIN in
our updates, just doing plain UPDATE repeatedly
with different keys randomly.
The database is defaulted to autocommit=true in postgresql.conf.
Thanks.

Gan

At 5:25 pm -0400 2003/10/20, Tom Lane wrote:
Seum-Lim Gan [EMAIL PROTECTED] writes:
 We tried one more thing: with the table not being updated
 at all and we did vacuum. Each time a vacuum is done,
 the index file becomes bigger.
It is not possible for plain vacuum to make the index bigger.

VACUUM FULL possibly could make the index bigger, since it has to
transiently create duplicate index entries for every row it moves.
If you want any really useful comments on your situation, you're going
to have to offer considerably more detail than you have done so far ---
preferably, a test case that lets someone else reproduce your results.
So far, all we can do is guess on the basis of very incomplete
information.  When you aren't even bothering to mention whether a vacuum
is FULL or not, I have to wonder whether I have any realistic picture of
what's going on.
			regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


--
++
| Seum-Lim GAN email : [EMAIL PROTECTED]  |
| Lucent Technologies|
| 2000 N. Naperville Road, 6B-403F  tel : (630)-713-6665 |
| Naperville, IL 60566, USA.fax : (630)-713-7272 |
|   web : http://inuweb.ih.lucent.com/~slgan |
++
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-21 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 We did a few (like 5 to 8 times) vacuum analyze tablename (no full), the
 pg_statistics relfilenode grew.

Well, sure.  ANALYZE puts new rows into pg_statistic, and obsoletes old
ones.  You need to vacuum pg_statistic every so often (not to mention
the other system catalogs).

 If we do plain simple vacuum tablename (again no full), we see
 pg_statistics_relid_att_index relfilenode grew instead of
 pg_statistics.

Don't think I believe that.  Plain vacuum won't touch pg_statistic
at all (unless it's the target table of course).  I'd expect ANALYZE
to make both the stats table and its index grow, though.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-20 Thread Seum-Lim Gan
Hi Tom, Josh,

We tried one more thing: with the table not being updated
at all and we did vacuum. Each time a vacuum is done,
the index file becomes bigger.
This is probably what is contributing to the index file
growing as well.
Thanks.

Gan

At 11:04 am -0500 2003/10/20, Seum-Lim Gan wrote:
Hi Josh, Tom,

OK. As I understand it, vacuum does not release the space
used by the index file.
However, it should be able to reuse the space for indexing.
I have observed that during initial updates of the table,
the index file did not grow and was steady but it did not last long
and keeps growing afterwards. Vacuum/vacuum analyze did not help.
In all the update testing, vacuum analyze was done every 1 minute.

Tom, something caught your attention the last time.

Any insight so far ? Is it a bug ?

Thanks.

Gan

Tom Lane wrote:

Seum-Lim Gan [EMAIL PROTECTED] writes:
 vacuum verbose analyze dsperf_rda_or_key;
 INFO:  vacuuming scncraft.dsperf_rda_or_key
 INFO:  index dsperf242_1105 now contains 30 row versions in 
12387 pages
 DETAIL:  3097702 index row versions were removed.
 0 index pages have been deleted, 0 are currently reusable.
Hm, interesting that you deleted 90% of the entries and still had no
empty index pages at all.  What was the pattern of your deletes and/or
updates with respect to this index's key?
 However, when I check the disk space usage, it has not changed.
It won't in any case.  Plain VACUUM is designed for maintaining a
steady-state level of free space in tables and indexes, not for
returning major amounts of space to the OS.  For that you need
more-invasive operations like VACUUM FULL or REINDEX.
			regards, tom lane

At 12:04 pm -0700 2003/10/19, Josh Berkus wrote:
Gan,

 Oh, so in order to reclaim the disk space, we must run
 reindex or vacuum full ?
 This will lock out the table and we won't be able to do anything.
 Looks like this is a problem. It means we cannot use it for
 24x7 operations without having to stop the process and do the vacuum full
 and reindex. Is there anything down the road that these operations
 will not lock out the table ?
I doubt it; the amount of page-shuffling required to reclaim 90% of the space
in an index for a table that has been mostly cleared is substantial, and
would prevent concurrent access.
Also, you seem to have set up an impossible situation for VACUUM.   If I'm
reading your statistics right, you have a large number of threads accessing
most of the data 100% of the time, preventing VACUUM from cleaning up the
pages.This is not, in my experience, a realistic test case ... there are
peak and idle periods for all databases, even webservers that have been
slashdotted.
--
Josh Berkus
Aglio Database Solutions
San Francisco
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


--
++
| Seum-Lim GAN email : [EMAIL PROTECTED]  |
| Lucent Technologies|
| 2000 N. Naperville Road, 6B-403F  tel : (630)-713-6665 |
| Naperville, IL 60566, USA.fax : (630)-713-7272 |
|   web : http://inuweb.ih.lucent.com/~slgan |
++
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


--
++
| Seum-Lim GAN email : [EMAIL PROTECTED]  |
| Lucent Technologies|
| 2000 N. Naperville Road, 6B-403F  tel : (630)-713-6665 |
| Naperville, IL 60566, USA.fax : (630)-713-7272 |
|   web : http://inuweb.ih.lucent.com/~slgan |
++
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-20 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 We tried one more thing: with the table not being updated
 at all and we did vacuum. Each time a vacuum is done,
 the index file becomes bigger.

It is not possible for plain vacuum to make the index bigger.

VACUUM FULL possibly could make the index bigger, since it has to
transiently create duplicate index entries for every row it moves.

If you want any really useful comments on your situation, you're going
to have to offer considerably more detail than you have done so far ---
preferably, a test case that lets someone else reproduce your results.
So far, all we can do is guess on the basis of very incomplete
information.  When you aren't even bothering to mention whether a vacuum
is FULL or not, I have to wonder whether I have any realistic picture of
what's going on.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-20 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 [ successive outputs from VACUUM ANALYZE ]

FWIW, I don't think your problem is really index bloat at all, it's
more like too-many-dead-rows bloat.  Note that the number of dead row
versions is climbing steadily from run to run:

 DETAIL:  101802 dead row versions cannot be removed yet.

 DETAIL:  110900 dead row versions cannot be removed yet.

 DETAIL:  753064 dead row versions cannot be removed yet.

 DETAIL:  765328 dead row versions cannot be removed yet.

It's hardly the index's fault that it's growing, when it has to keep
track of an ever-increasing number of rows.

The real question is what you're doing that requires the system to keep
hold of these dead rows instead of recycling them.  I suspect you have
a client process somewhere that is holding an open transaction for a
long time ... probably not doing anything, just sitting there with an
unclosed BEGIN ...

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-19 Thread Seum-Lim Gan
Hi Tom,

Thanks for info. I stoped the update and removed the process that's doing
the update and did vacuum analyze. This time the result says
the index row has been removed :
vacuum verbose analyze dsperf_rda_or_key;
INFO:  vacuuming scncraft.dsperf_rda_or_key
INFO:  index dsperf242_1105 now contains 30 row versions in 12387 pages
DETAIL:  3097702 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 2.86s/25.49u sec elapsed 54.16 sec.
INFO:  dsperf_rda_or_key: removed 3097702 row versions in 53726 pages
DETAIL:  CPU 6.29s/26.05u sec elapsed 78.23 sec.
INFO:  dsperf_rda_or_key: found 3097702 removable, 30 
nonremovable row versions in 58586 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 5 unused item pointers.
0 pages are entirely empty.
CPU 10.23s/53.79u sec elapsed 135.78 sec.
INFO:  analyzing scncraft.dsperf_rda_or_key
INFO:  dsperf_rda_or_key: 58586 pages, 3000 rows sampled, 176830 
estimated total rows
VACUUM

However, when I check the disk space usage, it has not changed.
Before and after the vacuum, it stayed the same :
/pg 822192 21% Sun Oct 19 09:34:25 CDT 2003
table /pg/data/base/17139/34048 Size=479936512 (relfilenode for table)
index /pg/data/base/17139/336727 Size=101474304 (relfilenode for index)
Any idea here ?

Another question, if we have a process that has different threads trying
to update PostgreSQL, is this going to post a problem if we do not have
the thread-safety option during configure ?
Thanks.

Gan

At 1:48 am -0400 2003/10/19, Tom Lane wrote:
Seum-Lim Gan [EMAIL PROTECTED] writes:
 INFO:  vacuuming craft.dsperf_rda_or_key
 INFO:  index hello242_1105 now contains 1792276 row versions in 6237 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.61s/0.36u sec elapsed 17.92 sec.
 INFO:  hello_rda_or_key: found 0 removable, 1791736 nonremovable
 row versions in 30892 pages
 DETAIL:  1492218 dead row versions cannot be removed yet.
You still haven't got an index-bloat problem.  I am, however, starting
to wonder why you have so many dead-but-unremovable rows.  I think you
must have some client process that's been holding an open transaction
for a long time.
			regards, tom lane


--
++
| Seum-Lim GAN email : [EMAIL PROTECTED]  |
| Lucent Technologies|
| 2000 N. Naperville Road, 6B-403F  tel : (630)-713-6665 |
| Naperville, IL 60566, USA.fax : (630)-713-7272 |
|   web : http://inuweb.ih.lucent.com/~slgan |
++
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-19 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 vacuum verbose analyze dsperf_rda_or_key;
 INFO:  vacuuming scncraft.dsperf_rda_or_key
 INFO:  index dsperf242_1105 now contains 30 row versions in 12387 pages
 DETAIL:  3097702 index row versions were removed.
 0 index pages have been deleted, 0 are currently reusable.

Hm, interesting that you deleted 90% of the entries and still had no
empty index pages at all.  What was the pattern of your deletes and/or
updates with respect to this index's key?

 However, when I check the disk space usage, it has not changed.

It won't in any case.  Plain VACUUM is designed for maintaining a
steady-state level of free space in tables and indexes, not for
returning major amounts of space to the OS.  For that you need
more-invasive operations like VACUUM FULL or REINDEX.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-18 Thread Seum-Lim Gan
Hi Josh,

Sample verbose analyze:

VACUUM VERBOSE ANALYZE hello_rda_or_key;
INFO:  vacuuming craft.hello_rda_or_key
INFO:  index hello242_1105 now contains 740813 row versions in 2477 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.42s/0.13u sec elapsed 4.76 sec.
INFO:  hello_rda_or_key: found 0 removable, 740813 nonremovable row 
versions in 12778 pages
DETAIL:  440813 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.78s/0.66u sec elapsed 6.41 sec.
INFO:  analyzing craft.hello_rda_or_key
INFO:  hello_rda_or_key: 12778 pages, 3000 rows sampled, 39388 
estimated total rows
VACUUM

Here is my postgresql.conf file:

# -
# PostgreSQL configuration file
# -
#
# This file consists of lines of the form:
#
#   name = value
#
# (The '=' is optional.) White space may be used. Comments are introduced
# with '#' anywhere on a line. The complete list of option names and
# allowed values can be found in the PostgreSQL documentation. The
# commented-out settings shown in this file represent the default values.
#
# Any option can also be given as a command line switch to the
# postmaster, e.g. 'postmaster -c log_connections=on'. Some options
# can be changed at run-time with the 'SET' SQL command.
#
# This file is read on postmaster startup and when the postmaster
# receives a SIGHUP. If you edit the file on a running system, you have
# to SIGHUP the postmaster for the changes to take effect, or use
# pg_ctl reload.
#---
# CONNECTIONS AND AUTHENTICATION
#---
# - Connection Settings -

#tcpip_socket = false
#max_connections = 100
max_connections = 600
# note: increasing max_connections costs about 500 bytes of shared
# memory per connection slot, in addition to costs from shared_buffers
# and max_locks_per_transaction.
#superuser_reserved_connections = 2
#port = 5432
port = 5333
#unix_socket_directory = ''
#unix_socket_group = ''
#unix_socket_permissions = 0777 # octal
#virtual_host = ''  # what interface to listen on; defaults to any
#rendezvous_name = ''   # defaults to the computer name
# - Security  Authentication -

#authentication_timeout = 60# 1-600, in seconds
#ssl = false
#password_encryption = true
#krb_server_keyfile = ''
#db_user_namespace = false
#---
# RESOURCE USAGE (except WAL)
#---
# - Memory -

#shared_buffers = 1000  # min 16, at least max_connections*2, 8KB each
shared_buffers = 1200   # min 16, at least max_connections*2, 8KB each
#sort_mem = 1024# min 64, size in KB
sort_mem = 40960# min 64, size in KB
#vacuum_mem = 8192  # min 1024, size in KB
vacuum_mem = 81920  # min 1024, size in KB
# - Free Space Map -

#max_fsm_pages = 2  # min max_fsm_relations*16, 6 bytes each
max_fsm_pages = 5  # min max_fsm_relations*16, 6 bytes each
#max_fsm_relations = 1000   # min 100, ~50 bytes each
max_fsm_relations = 1000# min 100, ~50 bytes each
# - Kernel Resource Usage -

#max_files_per_process = 1000   # min 25
#preload_libraries = ''
#---
# WRITE AHEAD LOG
#---
# - Settings -

#fsync = true   # turns forced synchronization on or off
fsync = false   # turns forced synchronization on or off
#wal_sync_method = fsync# the default varies across platforms:
# fsync, fdatasync, open_sync, or open_datasync
#wal_buffers = 8# min 4, 8KB each
# - Checkpoints -

#checkpoint_segments = 3# in logfile segments, min 1, 16MB each
#checkpoint_timeout = 300   # range 30-3600, in seconds
#checkpoint_warning = 30# 0 is off, in seconds
#commit_delay = 0   # range 0-10, in microseconds
#commit_siblings = 5# range 1-1000
#---
# QUERY TUNING
#---
# - Planner Method Enabling -

#enable_hashagg = true
#enable_hashjoin = true
#enable_indexscan = true
#enable_mergejoin = true
#enable_nestloop = true
#enable_seqscan = true
#enable_sort = true
#enable_tidscan = true
# - Planner Cost Constants -

#effective_cache_size = 1000# typically 8KB each
#random_page_cost = 4   # units are one sequential page fetch cost
#cpu_tuple_cost = 0.01  # (same)
#cpu_index_tuple_cost = 0.001   # (same)
#cpu_operator_cost = 0.0025 # (same)
# - 

Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-18 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 Sample verbose analyze:

 VACUUM VERBOSE ANALYZE hello_rda_or_key;
 INFO:  vacuuming craft.hello_rda_or_key
 INFO:  index hello242_1105 now contains 740813 row versions in 2477 pages

So what's the problem?  That doesn't seem like a particularly bloated
index.  You didn't say what datatype the index is on, but making the
most optimistic assumptions, index entries must use at least 16 bytes
each.  You're getting about 300 entries per page, compared to the
theoretical limit of 512 ... actually more, since I'm not allowing for
upper btree levels in this calculation ... which says to me that the
page loading is right around the expected btree loading of 2/3.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-18 Thread Seum-Lim Gan
Hi Tom,

I did that when I have stopped my updates.

Now, I am doing updates below is the output of vacuum.
After doing the vacuum verbose analyze, it reported the following :
INFO:  vacuuming craft.dsperf_rda_or_key
INFO:  index hello242_1105 now contains 1792276 row versions in 6237 pages
DETAIL:  0 index pages have been deleted, 0 are currently reusable.
CPU 0.61s/0.36u sec elapsed 17.92 sec.
INFO:  hello_rda_or_key: found 0 removable, 1791736 nonremovable 
row versions in 30892 pages
DETAIL:  1492218 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 1.95s/1.99u sec elapsed 26.95 sec.
INFO:  analyzing craft.dsperf_rda_or_key
INFO:  hello_rda_or_key: 30909 pages, 3000 rows sampled, 93292 
estimated total rows
VACUUM

Gan

At 10:21 pm -0400 2003/10/18, Tom Lane wrote:
Seum-Lim Gan [EMAIL PROTECTED] writes:
 Sample verbose analyze:

 VACUUM VERBOSE ANALYZE hello_rda_or_key;
 INFO:  vacuuming craft.hello_rda_or_key
 INFO:  index hello242_1105 now contains 740813 row versions in 2477 pages
So what's the problem?  That doesn't seem like a particularly bloated
index.  You didn't say what datatype the index is on, but making the
most optimistic assumptions, index entries must use at least 16 bytes
each.  You're getting about 300 entries per page, compared to the
theoretical limit of 512 ... actually more, since I'm not allowing for
upper btree levels in this calculation ... which says to me that the
page loading is right around the expected btree loading of 2/3.
			regards, tom lane


--
++
| Seum-Lim GAN email : [EMAIL PROTECTED]  |
| Lucent Technologies|
| 2000 N. Naperville Road, 6B-403F  tel : (630)-713-6665 |
| Naperville, IL 60566, USA.fax : (630)-713-7272 |
|   web : http://inuweb.ih.lucent.com/~slgan |
++
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PERFORM] index file bloating still in 7.4 ?

2003-10-18 Thread Tom Lane
Seum-Lim Gan [EMAIL PROTECTED] writes:
 INFO:  vacuuming craft.dsperf_rda_or_key
 INFO:  index hello242_1105 now contains 1792276 row versions in 6237 pages
 DETAIL:  0 index pages have been deleted, 0 are currently reusable.
 CPU 0.61s/0.36u sec elapsed 17.92 sec.
 INFO:  hello_rda_or_key: found 0 removable, 1791736 nonremovable 
 row versions in 30892 pages
 DETAIL:  1492218 dead row versions cannot be removed yet.

You still haven't got an index-bloat problem.  I am, however, starting
to wonder why you have so many dead-but-unremovable rows.  I think you
must have some client process that's been holding an open transaction
for a long time.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]