Hi, We've detected what I think is some sort of index corruption in 8.1. The server is running 8.1.11, so AFAICT the problem with truncated pages in vacuum is already patched and accounted for (i.e. we reindexed, and a bit later the problem presented itself again). There haven't been any relevant fixes after that AFAICT.
What we see is that after a bit of updating the index, it starts having tuples that poing to heap entries which are marked unused. I detected one of these almost by accident, and then built a tool to discover them by examining pg_filedump output. Basically what it does is scan the heap, note which heap tuples are marked "unused", and then scan the index and for each index tuple in leaf pages, see if it points to an unused heap tuple. The number of occurrences is amazingly high. Right after a reindex there isn't any occurrence; but after a while of application load, a lot of them appear. I catalogued this as index corruption: the theory is that as soon as the unused heap tuple is reused, the index will have a pointer with the wrong index key pointing to a live heap tuple. (We saw an occurrence of this too, but I wasn't motivated enough to write a tool to verify the data in index vs. heap tuples.) However, seeing the high prevalence of the problem, I started considering whether the tool is misreading the output anyway -- i.e. maybe it's a bug in the tool, or a misconception on my part. The tool output looks like this: INDEX PTR TO UNUSED HEAP: index 273375 (78,18) -> (5530, 17) INDEX PTR TO UNUSED HEAP: index 273375 (96,84) -> (5436, 3) INDEX PTR TO UNUSED HEAP: index 273375 (111,1) -> (1317, 26) INDEX PTR TO UNUSED HEAP: index 273375 (145,1) -> (1665, 26) INDEX PTR TO UNUSED HEAP: index 273375 (174,1) -> (2656, 17) INDEX PTR TO UNUSED HEAP: index 273375 (199,1) -> (1745, 21) INDEX PTR TO UNUSED HEAP: index 273375 (207,1) -> (358, 26) INDEX PTR TO UNUSED HEAP: index 273375 (214,1) -> (2800, 17) (many more lines of the same stuff) What this means (first line) is that on index 273375, page 78, offset 18, there's a pointer to heap page 5530, offset 17; but that heap offset shows up as unused. This is from the heap: Block 5530 ******************************************************** <Header> ----- Block Offset: 0x02b34000 Offsets: Lower 136 (0x0088) Block: Size 8192 Version 3 Upper 992 (0x03e0) LSN: logid 130 recoff 0xd53b3090 Special 8192 (0x2000) Items: 29 Free Space: 856 Length (including item array): 140 [ ... snip other items ...] Item 17 -- Length: 0 Offset: 504 (0x01f8) Flags: 0x00 This is from the index: Block 78 ******************************************************** <Header> ----- Block Offset: 0x0009c000 Offsets: Lower 788 (0x0314) Block: Size 8192 Version 3 Upper 5104 (0x13f0) LSN: logid 131 recoff 0x01b3c6e0 Special 8176 (0x1ff0) Items: 192 Free Space: 4316 Length (including item array): 792 [ ... snip other items ... ] Item 18 -- Length: 16 Offset: 5184 (0x1440) Flags: USED Block Id: 5530 linp Index: 17 Size: 16 Has Nulls: 0 Has Varwidths: 0 Of course, we're very concerned about this problem. Personally I am prepared to believe that this could be a hardware problem, because no one else seems to be reporting this kind of thing, but I don't see how it could cause this particular issue and not more wide-ranging data corruption. I would like comments on whether this is really a problem or I am just misreading pg_filedump output. If we confirm this, we can try to proceed to investigate it further. I attach the checker tool in case it is of any interest. Thanks, -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
#!/usr/bin/perl use warnings; use strict; my $PG_FILEDUMP = "pg_filedump-8.1"; process_table(); exit(); sub process_table { my ($table, @indexes) = read_toc(); foreach my $elem ($table, @indexes) { die "file $elem does not exist" unless -f "$elem"; if (!-f "$elem.dump") { print STDERR "generating pg_filedump for $elem\n"; `$PG_FILEDUMP -i $elem > $elem.dump`; } } print STDERR "loading unused line pointers for table $table\n"; my $unused = get_heap_unused($table); foreach my $index (@indexes) { print STDERR "processing index $index\n"; process_index($unused, $index); } } # Reads a "toc" file, which is a description of a table and its indexes. A # table line is "table: xxx" where xxx is the relfilenode of the table, and an # index line is "index: xxx" where xxx is the relfilenode of the index. sub read_toc { my $table; my @indexes; print STDERR "now expecting a TOC in stdin...\n"; while (<>) { if (/^table: ([0-9]+)$/) { $table = $1; } if (/^index: ([0-9]+)$/) { push @indexes, $1; } } print STDERR "finished reading the TOC for table $table\n"; print STDERR ("indexes: ", join(", ", @indexes), "\n"); return $table, @indexes; } # Reads a pg_filedump -i report for the given table, and fetches into a hash # all the line pointers that are marked "unused". Returns the hash so built. sub get_heap_unused { my $table = shift; my $curblock; my $unused = {}; my $infile = "$table.dump"; open IN, "<", "$infile" or die "can't open $infile: $!"; while (<IN>) { if (/^Block\s+([0-9]+) /) { $curblock = $1; } if (/^ Item\s+([0-9]+) -- Length:.*Flags: 0x00/) { push @{$unused->{$curblock}}, $1; } } return $unused; } # Given an index and a hash built by get_heap_unused, reads the pg_filedump -i # report for that index and prints a list of all index pointers that point to # any element in the hash. sub process_index { my $unused = shift; my $index = shift; my $curblock; my $special = 0; my $isleaf = 0; my $collect = 0; my @lines = (); my $infile = "$index.dump"; open IN, "<", "$infile" or die "can't open $infile: $!"; while (<IN>) { if (/^Block\s+([0-9]+) /) { $curblock = $1; } if (/^<Data> --/) { $collect = 1; next; } if (/^<Special Section> -----$/) { $special = 1; next; } if ($collect) { push @lines, $_; } if ($special) { if (/^ Flags.*LEAF/) { $isleaf = 1; } } if (/^$/) { if ($special) { if ($isleaf) { report_broken_block($unused, $index, $curblock, @lines); } $isleaf = 0; $special = 0; @lines = (); } } } } # workhorse for process_index; gets an array of lines comprising an index # block Data entries, and prints those that match the unused hash. sub report_broken_block { my $unused = shift; my $indexid = shift; my $block = shift; my $item; foreach my $line (@_) { if ($line =~ /^ Item\s+([0-9]+)/) { $item = $1; } if ($line =~ /^ Block Id:\s+([0-9]+)\s+linp Index: ([0-9]+)/) { my $blk = $1; my $lp = $2; next unless defined $unused->{$blk}; foreach my $hlp (@{$unused->{$blk}}) { next unless $lp eq $hlp; print "INDEX PTR TO UNUSED HEAP: ". "index $indexid ($block,$item) -> ($blk, $lp)\n"; } } } }
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers