I added some code to selfuncs.c to estimate the selectivity of CTID, including nullness, in my ongoing attempt to add TID range scans [1]. And as Tom pointed out [2], no system attribute can be null, so we might as well handle them all.
That's what the attached patch does.
I observed a few interesting things with outer join selectivity:
While system attributes aren't NULL in the table, they can be in queries
such as:
SELECT *
FROM a LEFT JOIN b ON a.id = b.id
WHERE b.ctid IS NULL;
And the patch does affect the estimates for such plans. But it's just
replacing one hardcoded nullness (0.005) for another (0.0), which seems no
worse than the original.
I was a bit concerned that with, for example,
CREATE TABLE a (id INTEGER);
INSERT INTO a SELECT * FROM generate_series(1,1000);
ANALYZE a;
CREATE TABLE b (id INTEGER, id2 INTEGER);
INSERT INTO b SELECT *, * FROM generate_series(1,10);
ANALYZE b;
EXPLAIN ANALYZE
SELECT * FROM a LEFT JOIN b ON a.id = b.id
WHERE b.ctid IS NULL;
you get a row estimate of 1 (vs the actual 990). It's not specific to
system attributes. Plain left-join selectivity calculation doesn't seem to
take into account the join selectivity, while anti-join calculation does.
I do not think this affects the usefulness of the present patch, but maybe
it's something we could improve.
Finally: I thought about introducing a macro to attnum.h:
/*
* AttrNumberIsForSystemAttr
* True iff the attribute number corresponds to a system attribute.
*/
#define AttrNumberIsForSystemAttr(attributeNumber) \
((bool) ((attributeNumber) < 0))
But there's a zillion places that could be changed to use it, so I haven't
in this version of the patch.
Edmund
[1]
https://www.postgresql.org/message-id/flat/31682.1545415852%40sss.pgh.pa.us#bdca5c18ed64f847f44c2645f98ea3a0
[2] https://www.postgresql.org/message-id/31682.1545415852%40sss.pgh.pa.us
v1-nullness-selectivity-for-system-cols.patch
Description: Binary data
