Hi Tom, thanks for your review.
On 9/7/24 22:10, Tom Lane wrote:
I looked at this patch. I agree with making the change. However,
I don't agree with the CF entry's marking of "target version: stable"
(i.e., requesting back-patch). I think this falls somewhere in the
gray area between a bug fix and a definitional change. Also, people
are unlikely to be happy if they suddenly get new, not-comparable
numbers after a minor version update. So I think we should just fix
it in HEAD.
OK, I did the change.
As far as the patch itself goes, the one thing that is bothering me
is this comment change
/*
- * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
+ * It's not safe to call PageGetExactFreeSpace() on new pages, so we
* treat them as being free space for our purposes.
*/
which looks like it wasn't made with a great deal of thought.
Now it seems to me that the comment was already bogus when written:
there isn't anything uncertain about what will happen if you call
either of these functions on a "new" page. PageIsNew checks for
return ((PageHeader) page)->pd_upper == 0;
If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed
to return zero, even if pd_lower contains garbage. And then
Indeed. I failed to notice that LocationIndex was an unsigned int, so I
thought that pg_upper - pd_upper could be positive with garbage in pg_upper.
PageGetHeapFreeSpace will likewise return zero. Perhaps there
could be trouble if we got into the line-pointer-checking part
of PageGetHeapFreeSpace, but we can't. So this comment is wrong,
and is even more obviously wrong after the above change. I thought
for a moment about removing the PageIsNew test altogether, but
then I decided that it probably*is* what we want and is just
mis-explained. I think the comment should read more like
/*
* PageGetExactFreeSpace() will return zero for a "new" page,
* but it's actually usable free space, so count it that way.
*/
Now alternatively you could argue that a "new" page isn't usable free
space yet and so we should count it as zero, just as we don't count
dead tuples as usable free space. You need VACUUM to turn either of
those things into real free space. But that'd be a bigger definitional
change, and I'm not sure we want it. Thoughts?
Also, do we need any documentation change for this? I looked through
https://www.postgresql.org/docs/devel/pgstattuple.html
and didn't see anything that was being very specific about what
"free space" means, so maybe it's fine as-is.
It's not easy. Maybe something like this?
"For any initialized page, free space refers to anything that isn't page
metadata (header and special), a line pointer or a tuple pointed to by a
valid line pointer. In particular, a dead tuple is not free space
because there's still a valid line pointer pointer pointing to it, until
VACUUM or some other maintenance mechanism (e.g. page pruning) cleans up
the page. A dead line pointer is not free space either, but the tuple it
points to has become free space. An unused line pointer could be
considered free space, but pgstattuple doesn't take it into account."