Tomas Szepe <[EMAIL PROTECTED]> writes:
>> Hmm, please see if you can get a stack trace from that (set the
>> breakpoint at errfinish()).  You might want to use vacuum verbose
>> first so that you can figure out which individual table is causing it.

> Ok, I recompiled CVS head with --enable-debug and with --enable-cassert
> and hit the following assert on "vacuum full verbose analyze":
> [etc]

It seems fairly clear that both of these symptoms mean that
empty_end_pages got to be larger than fraged_pages->num_pages.
In the first case the Assert catches that directly, but with
asserts disabled the code just allows num_pages to go negative
and then the space calculation in vac_update_fsm goes nuts.
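
To make the arithmetic concrete, here is a minimal sketch of that subtraction. It is not the actual vacuum code: the PageList struct, the trim_empty_end_pages() helper and its "checked" flag are all hypothetical names made up for illustration, and only num_pages and empty_end_pages correspond to anything mentioned above.

```c
#include <assert.h>

/* Hypothetical stand-in for the fraged_pages list; only the count matters here. */
typedef struct
{
    int num_pages;
} PageList;

/*
 * Remove the trailing empty pages from the list count and return the result.
 * checked != 0 imitates an assert-enabled build, where the bad state is
 * caught immediately; checked == 0 imitates a build without asserts, where
 * the count is simply allowed to go negative.
 */
static int
trim_empty_end_pages(PageList *fraged, int empty_end_pages, int checked)
{
    if (checked)
        assert(fraged->num_pages >= empty_end_pages);

    fraged->num_pages -= empty_end_pages;
    return fraged->num_pages;
}
```

With 2 listed pages and 3 empty end pages, the unchecked path returns -1, and any later space calculation that trusts that count is working from garbage.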

So the question is, how could that happen?  There are only three places
where empty_end_pages is incremented, and the first two definitely add
the page to fraged_pages as well.  What I'm thinking is you must have
had a few pages where notup was true but do_frag didn't get set, and
it's not quite clear how that could be.  It seems most likely that the
page contained only LP_DEAD tuples but didn't have free space large
enough to get it put into the fraged_pages list.  But the only place
that would mark tuples LP_DEAD is pruneheap.c, and it should have
done a PageRepairFragmentation() after doing so.

Do you perhaps have a ridiculously low fillfactor attached to
the system catalogs?

The fix should probably be to force pages to be put in fraged_pages
if notup is true, but first I want to understand exactly how it got
into this state --- there may be something else going on here.
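
As a rough illustration of both the suspected failure and the proposed fix, here is a toy model of the per-page bookkeeping. It is only a sketch under assumptions: PageInfo, ScanResult, scan_pages() and the force_notup switch are hypothetical, and the real scan does a great deal more than this. The only things taken from the discussion above are the notup and do_frag flags and the two counters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page summary. */
typedef struct
{
    bool notup;    /* page holds no live tuples */
    bool do_frag;  /* page has enough free space to be worth listing */
} PageInfo;

/* Hypothetical result of one pass over the table. */
typedef struct
{
    int fraged_num_pages;  /* pages added to the fraged_pages list */
    int empty_end_pages;   /* length of the trailing run of notup pages */
} ScanResult;

/*
 * Walk the pages in order.  With force_notup false, a page is listed only
 * when do_frag is set; with force_notup true, any notup page is listed too,
 * which is the proposed fix.
 */
static ScanResult
scan_pages(const PageInfo *pages, size_t n, bool force_notup)
{
    ScanResult r = {0, 0};

    for (size_t i = 0; i < n; i++)
    {
        if (pages[i].do_frag || (force_notup && pages[i].notup))
            r.fraged_num_pages++;

        if (pages[i].notup)
            r.empty_end_pages++;    /* extend the trailing empty run */
        else
            r.empty_end_pages = 0;  /* a live tuple resets the run */
    }
    return r;
}
```

Feed this a table that ends in two pages with notup set and do_frag clear: without the switch, empty_end_pages exceeds the list count, which is exactly the state the Assert complains about; with the switch, the list count can never be smaller than the trailing run.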

                        regards, tom lane

