Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-05-16 Thread Bruce Momjian

Has this been done yet?  I don't think so.

---

Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
  On Thu, 2007-26-04 at 18:07 -0400, Neil Conway wrote:
  (1) I believe the reasoning for Tom's earlier change was not to reduce
  the I/O between the backend and the pgstat process [...]
 
  Tom, any comments on this? Your change introduced an undocumented
  regression into 8.2. I think you're on the hook for a documentation
  update at the very least, if not a revert.
 
 The documentation update seems the most prudent thing to me.  The
 problem with the prior behavior is that it guarantees that every table
 in the database will eventually have a pg_stat entry, even if
 stats_row_level and stats_block_level are both off.  In a DB with lots
 of tables that creates a significant overhead *for a feature the DBA
 probably thinks is turned off*.  This is not how it worked before 8.2,
 and so 8.2.0's behavior is arguably a performance regression compared
 to 8.1 and before.
 
 Now this patch went in before we realized that 8.2.x had a bug in
 computing the stats-file-update delay, and it could be that after fixing
 that the problem is not so pressing.  But I don't particularly care for
 new features that impose a performance penalty on those who aren't using
 them, and that's exactly what last_vacuum/last_analyze tracking does if
 we allow it to bloat the stats file in the default configuration.
 
 The long-term answer of course is to devise a more efficient stats
 reporting scheme, but I'm not sure offhand what that would look like.
 
   regards, tom lane
 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-05-06 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Thu, 2007-26-04 at 18:07 -0400, Neil Conway wrote:
 (1) I believe the reasoning for Tom's earlier change was not to reduce
 the I/O between the backend and the pgstat process [...]

 Tom, any comments on this? Your change introduced an undocumented
 regression into 8.2. I think you're on the hook for a documentation
 update at the very least, if not a revert.

The documentation update seems the most prudent thing to me.  The
problem with the prior behavior is that it guarantees that every table
in the database will eventually have a pg_stat entry, even if
stats_row_level and stats_block_level are both off.  In a DB with lots
of tables that creates a significant overhead *for a feature the DBA
probably thinks is turned off*.  This is not how it worked before 8.2,
and so 8.2.0's behavior is arguably a performance regression compared
to 8.1 and before.

Now this patch went in before we realized that 8.2.x had a bug in
computing the stats-file-update delay, and it could be that after fixing
that the problem is not so pressing.  But I don't particularly care for
new features that impose a performance penalty on those who aren't using
them, and that's exactly what last_vacuum/last_analyze tracking does if
we allow it to bloat the stats file in the default configuration.

The long-term answer of course is to devise a more efficient stats
reporting scheme, but I'm not sure offhand what that would look like.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-05-03 Thread Neil Conway
On Thu, 2007-26-04 at 18:07 -0400, Neil Conway wrote:
 (1) I believe the reasoning for Tom's earlier change was not to reduce
 the I/O between the backend and the pgstat process [...]

Tom, any comments on this? Your change introduced an undocumented
regression into 8.2. I think you're on the hook for a documentation
update at the very least, if not a revert.

-Neil



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-04-26 Thread Neil Conway
On Tue, 2007-04-24 at 17:38 -0400, Neil Conway wrote:
 which included other modifications to reduce the pgstat I/O volume in
 8.1. I don't think this particular change was wise

I looked into this a bit further:

(1) I believe the reasoning for Tom's earlier change was not to reduce
the I/O between the backend and the pgstat process: it was to keep the
in-memory stats hash tables small, and to reduce the amount of data that
needs to be written to disk. When the only stats messages we get for a
table are VACUUM or ANALYZE messages, we discard the message in the
pgstat daemon.

(2) If stats_row_level is false, there won't be a stats hash entry for
any tables, so we can skip sending the VACUUM or ANALYZE message in the
first place, by the same logic. (This is more debatable if the user just
disabled stats_row_level for the current session, although since only a
super-user can do that, perhaps that's OK.)

(3) I don't like the fact that the current coding is so willing to throw
away VACUUM and ANALYZE pgstat messages. I think it is quite plausible
that the DBA might be interested in the last-VACUUM and last-ANALYZE
information for a table which hasn't had live operations applied to it
recently. The rest of the pgstat code has a similarly disappointing
willingness to silently discard messages it doesn't think are worth
keeping (e.g. pgstat_recv_autovac() is ignored for databases with no
other activity, and pgstat_count_xact_commit/rollback() is a no-op
unless *either* row-level or block-level stats are enabled.)

If we're so concerned about saving space in the stats hash tables for
tables that don't see non-VACUUM / non-ANALYZE activity, why not arrange
to record the timestamps for database-wide VACUUMs and ANALYZEs
separately from table-local VACUUMs and ANALYZEs? That is, a table's
last_vacuum time could effectively be the max of the last database-wide
vacuum time and the last VACUUM on that particular table. (Recording the
time of the last database-wide VACUUM might be worth doing anyway, e.g.
for avoiding wraparound failure).

Comments?

-Neil



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-04-26 Thread Alvaro Herrera
Neil Conway wrote:

 (3) I don't like the fact that the current coding is so willing to throw
 away VACUUM and ANALYZE pgstat messages. I think it is quite plausible
 that the DBA might be interested in the last-VACUUM and last-ANALYZE
 information for a table which hasn't had live operations applied to it
 recently. The rest of the pgstat code has a similarly disappointing
 willingness to silently discard messages it doesn't think are worth
 keeping (e.g. pgstat_recv_autovac() is ignored for databases with no
 other activity, and pgstat_count_xact_commit/rollback() is a no-op
 unless *either* row-level or block-level stats are enabled.)

One thing to keep in mind is that autovac drives some decision from
whether the database has a pgstat entry or not.  In particular it means
it doesn't bother processing non-connectable databases, unless they are
close to Xid wraparound.

I think this behavior is a useful one, since usually vacuuming those
databases is a waste of time anyway.  Whether to drive it from pgstat or
from somewhere else is another matter, but if you want to drive it from
another mechanism, keep in mind that the autovacuum launcher (which is
the process that makes this decision) is not connected to any database
so it cannot examine any catalog's content.  There are of course ways
around that: for example you could put the information in the
pg_database flatfile.  But it's something to keep in mind if you want to
change it.

 If we're so concerned about saving space in the stats hash tables for
 tables that don't see non-VACUUM / non-ANALYZE activity, why not arrange
 to record the timestamps for database-wide VACUUMs and ANALYZEs
 separately from table-local VACUUMs and ANALYZEs? That is, a table's
 last_vacuum time could effectively be the max of the last database-wide
 vacuum time and the last VACUUM on that particular table. (Recording the
 time of the last database-wide VACUUM might be worth doing anyway, e.g.
 for avoiding wraparound failure).

Another thing to keep in mind is that autovacuum does not do
database-wide vacuums anymore -- they are not needed.  Xid wraparound
decisions are handled on a table-by-table basis, so information about
when the last database-wide vacuum was is not needed.

Note that Xid wraparound decisions are driven by information in
pg_class.  So it's not a problem that pgstat may lose the info from this
POV.

The bottom line is that the current pgstat behavior and autovacuum are
closely related.  So if you want to change pgstats you should also keep
an eye on how it's going to affect autovac.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [DOCS] row-level stats and last analyze time

2007-04-24 Thread Robert Treat
On Tuesday 24 April 2007 17:38, Neil Conway wrote:
 [ CC'ing -hackers ]

 On Sun, 2007-04-22 at 16:10 +0200, Guillaume Lelarge wrote:
  This patch adds a sentence on monitoring.sgml explaining that
  stats_row_level needs to be enabled if user wants to get last
  vacuum/analyze execution time.

 This behavior was introduced in r1.120 of postmaster/pgstat.c:

 Modify pgstats code to reduce performance penalties from
 oversized stats data files: avoid creating stats
 hashtable entries for tables that aren't being touched
 except by vacuum/analyze [...]

 which included other modifications to reduce the pgstat I/O volume in
 8.1. I don't think this particular change was wise: the reduction in
 pgstat volume is pretty marginal, and it is counter-intuitive for
 stats_row_level to effect whether the last ANALYZE / VACUUM is recorded.
 (Plus, the optimization is not even enabled with the default
 postgresql.conf settings.)


+1

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly