Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Cristian Gafton

On Tue, 29 Jan 2008, Cristian Gafton wrote:

I have a server with ~150GB of data, containing two databases that are 
active in mostly read mode. I have noticed lately that the global/pgstat.stat 
file is somewhere around 1MB freshly after a restart, but at some point it 
balloons to 74MB for no apparent reason after a few hours of uptime. Needless 
to say, having the stats collector dump 74MB of stuff to disk on every loop 
takes a big bite out of the I/O capabilities of this box.


Of course, leaving out the most important thing - this is postgresql 8.2.6 
on x86_64


Looking at all the other replicas I have of this database (but which are 
under a more lightweight read load), the pgstat.stat file is again rather 
small. Am I right to assume that a 74MB pgstat.stat file is not normal - 
and what might have caused it?


Cristian
--
Cristian Gafton
rPath, Inc.




[HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Cristian Gafton


Hello all,

I have a server with ~150GB of data, containing two databases that are 
active in mostly read mode. I have noticed lately that the global/pgstat.stat 
file is somewhere around 1MB freshly after a restart, but at some point it 
balloons to 74MB for no apparent reason after a few hours of uptime. Needless 
to say, having the stats collector dump 74MB of stuff to disk on every loop 
takes a big bite out of the I/O capabilities of this box.


Looking at all the other replicas I have of this database (but which are 
under a more lightweight read load), the pgstat.stat file is again rather 
small. Am I right to assume that a 74MB pgstat.stat file is not normal - 
and what might have caused it?


Thanks,

Cristian
--
Cristian Gafton
rPath, Inc.




Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Cristian Gafton

On Tue, 29 Jan 2008, Tom Lane wrote:


(Pokes around in the code...)  I think the problem here is that the only
active mechanism for flushing dead stats-table entries is
pgstat_vacuum_tabstat(), which is invoked by a VACUUM command or an
autovacuum.  Once-a-day VACUUM isn't gonna cut it for you under those
circumstances.  What you might do is just issue a VACUUM on some
otherwise-uninteresting small table, once an hour or however often you
need to keep the stats file bloat to a reasonable level.


I just ran a vacuumdb -a on the box - the pgstat file is still 90MB in 
size. If vacuum is supposed to clean up the cruft from pgstat, then I 
don't know if we're looking at the right cruft - I kind of expected the 
pgstat file to go down in size and the I/O storm to subside, but that is 
not happening after vacuum.


I will try to instrument the application to record the oids of the temp 
tables it creates and investigate from that angle, but in the meantime is 
there any way to reset the stats collector altogether? Could this be a 
corrupt stat file that gets read and written right back on every loop 
without any sort of validation?


Thanks,

Cristian
--
Cristian Gafton
rPath, Inc.




Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Cristian Gafton

On Tue, 29 Jan 2008, Tom Lane wrote:


Cristian Gafton [EMAIL PROTECTED] writes:

Autovacuum is disabled, since the database is mostly read only. There is a
vacuumdb -a -z running nightly on the box. However, the application that
queries it does a lot of work with temporary tables - would those bloat
the stats at all?


Conceivably, if you mean a lot of short-lived tables rather than a lot
of operations on a few tables.  However, I'd think that would result in
a steady accumulation of stats entries, not a sudden jump as you seemed
to describe.


We are churning through a bunch of short-lived temp tables. Since I 
reported the problem, the pgstat file is now sitting at 85M, yet the 
pg_stat* tables barely have any entries in them:


                          count(*)
 pg_stats                    298
 pg_statistic                298
 pg_stat_all_indexes          76
 pg_stat_all_tables           76
 pg_statio_all_tables         56
 pg_statio_all_indexes        76

Is there a way to inspect the pgstat file and see what's in it that is 
taking up all this space? (It's not the space that bothers me, it's the 
fact that the statistics collector has to dump 85MB of stuff to disk once 
a second...)


Thanks,

Cristian
--
Cristian Gafton
rPath, Inc.




Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Tom Lane
Cristian Gafton [EMAIL PROTECTED] writes:
 Autovacuum is disabled, since the database is mostly read only. There is a 
 vacuumdb -a -z running nightly on the box. However, the application that 
 queries it does a lot of work with temporary tables - would those bloat 
 the stats at all?

Conceivably, if you mean a lot of short-lived tables rather than a lot
of operations on a few tables.  However, I'd think that would result in
a steady accumulation of stats entries, not a sudden jump as you seemed
to describe.
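
For illustration, a minimal sketch of that accumulation pattern (the table
name is hypothetical; each short-lived temp table that sees any activity
gets its own collector entry, which lingers after the DROP until a VACUUM
prunes it):

    -- one iteration of the churn; the application repeats this with
    -- fresh table names many thousands of times
    CREATE TEMP TABLE scratch_tab (id int);
    INSERT INTO scratch_tab VALUES (1);   -- activity creates a stats entry
    DROP TABLE scratch_tab;               -- the table is gone, the entry stays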

regards, tom lane



Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Tom Lane
Cristian Gafton [EMAIL PROTECTED] writes:
 On Tue, 29 Jan 2008, Cristian Gafton wrote:
 I have a server with ~150GB of data, containing two databases that are 
 active in mostly read mode. I have noticed lately that the global/pgstat.stat 
 file is somewhere around 1MB freshly after a restart, but at some point it 
 balloons to 74MB for no apparent reason after a few hours of uptime. Needless 
 to say, having the stats collector dump 74MB of stuff to disk on every loop 
 takes a big bite out of the I/O capabilities of this box.

 Of course, leaving out the most important thing - this is postgresql 8.2.6 
 on x86_64

Hmm ... do you have autovacuum enabled?  If not, what's the vacuuming
policy on that box?  I'm wondering if this is triggered by something
deciding to vacuum or analyze a bunch of otherwise-unused tables, and
thereby causing stats entries to be created for those tables.

You could investigate by comparing the contents of the stats views
before and after the file balloons.  I would expect to see a lot more
rows, and the key is exactly what non-null activity is recorded in
the extra rows.
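
A minimal sketch of that comparison (the snapshot table name is hypothetical;
pg_stat_all_tables is one of the standard stats views):

    -- snapshot the view while the stats file is still small
    CREATE TABLE stats_snap AS
        SELECT relid, schemaname, relname FROM pg_stat_all_tables;

    -- after the file balloons, look for rows that appeared since,
    -- and at what activity is recorded in them
    SELECT s.schemaname, s.relname, s.seq_scan, s.n_tup_ins
    FROM pg_stat_all_tables s
    LEFT JOIN stats_snap b USING (relid)
    WHERE b.relid IS NULL;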

regards, tom lane



Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Cristian Gafton

On Tue, 29 Jan 2008, Tom Lane wrote:


On Tue, 29 Jan 2008, Cristian Gafton wrote:

I have a server with ~150GB of data, containing two databases that are 
active in mostly read mode. I have noticed lately that the global/pgstat.stat 
file is somewhere around 1MB freshly after a restart, but at some point it 
balloons to 74MB for no apparent reason after a few hours of uptime. Needless 
to say, having the stats collector dump 74MB of stuff to disk on every loop 
takes a big bite out of the I/O capabilities of this box.



Of course, leaving out the most important thing - this is postgresql 8.2.6
on x86_64


Hmm ... do you have autovacuum enabled?  If not, what's the vacuuming
policy on that box?  I'm wondering if this is triggered by something
deciding to vacuum or analyze a bunch of otherwise-unused tables, and
thereby causing stats entries to be created for those tables.


Autovacuum is disabled, since the database is mostly read only. There is a 
vacuumdb -a -z running nightly on the box. However, the application that 
queries it does a lot of work with temporary tables - would those bloat 
the stats at all?



You could investigate by comparing the contents of the stats views
before and after the file balloons.  I would expect to see a lot more
rows, and the key is exactly what non-null activity is recorded in
the extra rows.


Any one of the stats views in particular? Currently all of the stats_*
flags are set to on.
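
For reference, the 8.2-era flags in question; each can be checked with SHOW:

    SHOW stats_start_collector;
    SHOW stats_command_string;
    SHOW stats_block_level;
    SHOW stats_row_level;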

Thanks,

Cristian
--
Cristian Gafton
rPath, Inc.




Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Tom Lane
Cristian Gafton [EMAIL PROTECTED] writes:
 We are churning through a bunch of short-lived temp tables.

I think that's probably the root of the problem ...

 Since I 
 reported the problem, the pgstat file is now sitting at 85M, yet the 
 pg_stat* tables barely have any entries in them:

                           count(*)
  pg_stats                    298
  pg_statistic                298
  pg_stat_all_indexes          76
  pg_stat_all_tables           76
  pg_statio_all_tables         56
  pg_statio_all_indexes        76

Those views are joins against pg_class, so only tables that have live
pg_class rows can possibly show up there.  You could try remembering the
OIDs of some temp tables and probing the underlying pg_stat_get_xxx()
functions to see if there are stats-table entries for them.
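
A sketch of that probe (the OID below is a placeholder for one the
application recorded; these functions consult the collector's hashtable
directly, not pg_class, so a nonzero count for a long-dropped temp table
suggests its entry is still being carried around):

    SELECT pg_stat_get_numscans(123456),          -- placeholder OID
           pg_stat_get_tuples_inserted(123456);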

(Pokes around in the code...)  I think the problem here is that the only
active mechanism for flushing dead stats-table entries is
pgstat_vacuum_tabstat(), which is invoked by a VACUUM command or an
autovacuum.  Once-a-day VACUUM isn't gonna cut it for you under those
circumstances.  What you might do is just issue a VACUUM on some
otherwise-uninteresting small table, once an hour or however often you
need to keep the stats file bloat to a reasonable level.
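
A minimal sketch of that workaround (the table name is hypothetical; any
VACUUM in the affected database invokes pgstat_vacuum_tabstat() and sweeps
out the dead entries):

    -- run in the database where the temp tables churn, e.g. hourly from cron:
    VACUUM some_tiny_idle_table;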

There is a pgstat_drop_relation() function to tell the stats collector
to drop a single table entry, but it's not being called from anyplace.
We probably ought to try a bit harder to make that work.  The problem
is described here:

2007-07-08 18:23  tgl

* src/: backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
include/pgstat.h (REL8_1_STABLE), backend/postmaster/pgstat.c,
backend/storage/smgr/smgr.c, include/pgstat.h (REL8_2_STABLE),
backend/postmaster/pgstat.c, backend/storage/smgr/smgr.c,
include/pgstat.h: Remove the pgstat_drop_relation() call from
smgr_internal_unlink(), because we don't know at that point which
relation OID to tell pgstat to forget.  The code was passing the
relfilenode, which is incorrect, and could possibly cause some
other relation's stats to be zeroed out.  While we could try to
clean this up, it seems much simpler and more reliable to let the
next invocation of pgstat_vacuum_tabstat() fix things; which indeed
is how it worked before I introduced the buggy code into 8.1.3 and
later :-(.  Problem noticed by Itagaki Takahiro, fix is per
subsequent discussion.

regards, tom lane



Re: [HACKERS] Large pgstat.stat file causes I/O storm

2008-01-29 Thread Tom Lane
Cristian Gafton [EMAIL PROTECTED] writes:
 I just ran a vacuumdb -a on the box - the pgstat file is still 90MB in 
 size. If vacuum is supposed to clean up the cruft from pgstat, then I 
 don't know if we're looking at the right cruft - I kind of expected the 
 pgstat file to go down in size and the I/O storm to subside, but that is 
 not happening after vacuum.

Hmph ... I did a simple test here involving creating a lot of temp
tables, and indeed it made the stats file bigger, but the size went
right down again after vacuuming.  Is it possible that the vacuumdb
failed to connect to the particular database in which the temp tables
are coming and going?

 I will try to instrument the application to record the oids of the temp 
 tables it creates and investigate from that angle, but in the meantime is 
 there any way to reset the stats collector altogether? Could this be a 
 corrupt stat file that gets read and written right back on every loop 
 without any sort of validation?

There's stats_reset_on_server_start (sp?), and I think 8.2 also has a
stats-reset function.  But what might be more interesting is to pull the
file-reading function out of pgstat.c and dump out the stats file in
readable form to see what the heck is in there.  (If you decide to try
resetting the stats, I'd suggest saving a copy of the stats file first
for possible analysis later.)  I have the beginnings of such a program
lying around, which I'll attach --- note that it was last used for 8.1
and might require some tweaks for 8.2, and that you'd need to flesh it
out a lot if you want details about individual entries instead of just
a count.
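
A sketch of the reset options mentioned, assuming 8.2 (pg_stat_reset()
clears the counters for the current database only and requires superuser):

    -- immediate reset, run in each affected database:
    SELECT pg_stat_reset();

    -- or in postgresql.conf, to discard the stats at every server start:
    --   stats_reset_on_server_start = on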

regards, tom lane

/*
 * dumpstat --- simple standalone program to read and analyze a PG stats
 * file.  Based on pgstat_read_statsfile() from 8.1 sources.
 *
 * Currently works with either 8.0 or 8.1 formats depending on which
 * headers it is compiled against.
 */
#include "postgres.h"

#include "pgstat.h"


int
main(int argc, char **argv)
{
    PgStat_StatDBEntry dbbuf;
    PgStat_StatTabEntry tabbuf;
    PgStat_StatBeEntry beentry;
    FILE       *fpin;
    int32       format_id;
    int         maxbackends = 0;
    int         havebackends = 0;
    int         havedbs = 0;
    int         havetabs = 0;

    /*
     * Try to open the status file. If it doesn't exist, the backends simply
     * return zero for anything and the collector simply starts from scratch
     * with empty counters.
     */
    if ((fpin = fopen(argv[1], "rb")) == NULL)
    {
        perror(argv[1]);
        return 1;
    }

    /*
     * Verify it's of the expected format.
     */
#ifdef PGSTAT_FILE_FORMAT_ID
    if (fread(&format_id, 1, sizeof(format_id), fpin) != sizeof(format_id)
        || format_id != PGSTAT_FILE_FORMAT_ID)
    {
        fprintf(stderr, "corrupted pgstat.stat file\n");
        goto done;
    }
#endif

    /*
     * We found an existing collector stats file. Read it and put all the
     * hashtable entries into place.
     */
    for (;;)
    {
        switch (fgetc(fpin))
        {
                /*
                 * 'D'  A PgStat_StatDBEntry struct describing a database
                 * follows. Subsequently, zero to many 'T' entries will
                 * follow until a 'd' is encountered.
                 */
            case 'D':
                if (fread(&dbbuf, 1, sizeof(dbbuf), fpin) != sizeof(dbbuf))
                {
                    fprintf(stderr, "corrupted pgstat.stat file\n");
                    goto done;
                }
                havedbs++;
                break;

                /*
                 * 'd'  End of this database.
                 */
            case 'd':
                break;

                /*
                 * 'T'  A PgStat_StatTabEntry follows.
                 */
            case 'T':
                if (fread(&tabbuf, 1, sizeof(tabbuf), fpin) != sizeof(tabbuf))
                {
                    fprintf(stderr, "corrupted pgstat.stat file\n");
                    goto done;
                }
                havetabs++;
                break;

                /*
                 * 'M'  The maximum number of backends to expect follows.
                 */
            case 'M':