Hi
have strange situation - too many xlog files.
PostgreSQL 8.3.11 on i386-pc-solaris2.10, compiled by cc -Xa
config:
# select name, setting from pg_settings where name ~ 'checkpoint|wal' order
by 1;
name | setting
------------------------------+---------------
checkpoint_completion_target | 0.9
checkpoint_segments | 100
checkpoint_timeout | 900
checkpoint_warning | 30
log_checkpoints | on
wal_buffers | 2048
wal_sync_method | open_datasync
wal_writer_delay | 200
(8 rows)
as I understand, max number of xlog files in pg_xlog should be ( 1 + 2 *
checkpoint_segments ).
And in our case - it's more.
Added cronjob to log data about number of segments, current segment
name, number of segments in pg_xlog that are before current, and after
current. script is:
------------------------------------------------------------
#!/usr/bin/bash
LOGFILE=/home/postgres/logs/check_pg_xlog.out
LS_OUTPUT=$( ls -l /pgdata/main/pg_xlog | egrep -v
"xlogtemp|backup|status|total" | sort -k9 )
FIRST_SEGMENT_LINE=$( echo "$LS_OUTPUT" | head -1 )
LAST_SEGMENT_LINE=$( echo "$LS_OUTPUT" | tail -1 )
FIRST_SEGMENT=$( echo "$FIRST_SEGMENT_LINE" | awk '{print $NF}' )
LAST_SEGMENT=$( echo "$LAST_SEGMENT_LINE" | awk '{print $NF}' )
FIRST_SEGMENT_NUM=$( echo "$FIRST_SEGMENT" | awk '{print $NF}' | cut -b
9-16,23-24 )
LAST_SEGMENT_NUM=$( echo "$LAST_SEGMENT" | awk '{print $NF}' | cut -b
9-16,23-24 )
SEGMENT_COUNT=$( printf $'ibase=16\n1 + %s - %s\n' $LAST_SEGMENT_NUM
$FIRST_SEGMENT_NUM | bc )
CURRENT_WAL_FILE=$( /opt/pgsql8311/bin/psql -U postgres -qAtX -c 'select
file_name from pg_xlogfile_name_offset( pg_current_xlog_location())' )
CURRENT_WAL_FILE_NUM=$( echo "$CURRENT_WAL_FILE" | cut -b 9-16,23-24 )
SEGMENTS_BEFORE_CURRENT=$( printf $'ibase=16\n%s - %s\n' $CURRENT_WAL_FILE_NUM
$FIRST_SEGMENT_NUM | bc )
SEGMENTS_AFTER_CURRENT=$( printf $'ibase=16\n%s - %s\n' $LAST_SEGMENT_NUM
$CURRENT_WAL_FILE_NUM | bc )
CURRENT_SEGMENT_LINE=$( echo "$LS_OUTPUT" | grep "$CURRENT_WAL_FILE" )
(
date
printf $'First segment : %s\n' "$FIRST_SEGMENT_LINE"
printf $'Current segment : %s\n' "$CURRENT_SEGMENT_LINE"
printf $'Last segment : %s\n' "$LAST_SEGMENT_LINE"
printf $'Segment count : %s\n' "$SEGMENT_COUNT"
printf $'Current wal segment : %s\n' "$CURRENT_WAL_FILE"
printf $'Segments before current : %s\n' "$SEGMENTS_BEFORE_CURRENT"
printf $'Segments after current : %s\n' "$SEGMENTS_AFTER_CURRENT"
printf $'Last checkpoint time : %s\n' "$(
/opt/pgsql8311/bin/pg_controldata /pgdata/main | egrep '^Time of latest
checkpoint:' | sed 's/^[^:]*: *//' )"
/opt/pgsql8311/bin/psql -U postgres -c "select name, setting from
pg_settings where name =
any('{checkpoint_timeout,checkpoint_segments,archive_mode,archive_command}')"
) >> $LOGFILE
------------------------------------------------------------
sample output looks like this:
| Mon Nov 1 13:46:00 EDT 2010
| First segment : -rw------- 1 postgres postgres 16777216 Nov 1
13:16 000000010000376700000053
| Current segment : -rw------- 1 postgres postgres 16777216 Nov 1
13:45 000000010000376700000064
| Last segment : -rw------- 1 postgres postgres 16777216 Nov 1
13:01 000000010000376800000029
| Segment count : 215
| Current wal segment : 000000010000376700000064
| Segments before current : 17
| Segments after current : 197
| Last checkpoint time : Mon Nov 01 13:31:29 2010
| name | setting
| ---------------------+---------------
| archive_command | /usr/bin/true
| archive_mode | on
| checkpoint_segments | 100
| checkpoint_timeout | 900
| (4 rows)
As you can see, now we have 215 segments, with 17 that represent wal before
current location and 197 that are after current segment!
Here - you can see graph which plots number of wal segments in the last week
http://depesz.com/various/bad-wal.jpg
it virtually never goes below 215, and it spikes to 270-300.
In here: http://www.depesz.com/various/bad-wal.log.gz is log from my test
script since 20th of october.
Any ideas why number of segments is higher than expected?
Just so that I am clear: I do not want to lower it by changing
checkpoint_segments. I'm looking for information/enlightenment about why
it works the way it works, and what could be possibly wrong.
Best regards,
depesz
--
Linkedin: http://www.linkedin.com/in/depesz / blog: http://www.depesz.com/
jid/gtalk: [email protected] / aim:depeszhdl / skype:depesz_hdl / gg:6749007
--
Sent via pgsql-general mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general