notnulls discussion is forked from UniqueKey stuff, you can see the
attachment
for the UnqiueKey introduction. Tom raised his opinion to track the
nullability
inside Var[1][2][3], this thread would start from there based on my
understanding.

Generally tracking the null attributes inside Var would have something like:

struct Var
{
...;
      int nullable;  // -1 unknown,  0 - not nullable.  1 - nullable
};

and then semantics of Var->nullable must be attached to a RelOptInfo. For
example:

CREATE TABLE t1(a int, b int);

SELECT abs(a) FROM t1 WHERE a > -100;

The var in RelOptInfo->reltarget should have nullable = 0 but the var in
RelOptInfo->baserestrictinfo should have nullable = 1;  The beauty of this
are: a). It can distinguish the two situations perfectly b). Whenever we
want
to know the nullable attribute of a Var for an expression, it is super easy
to
know. In summary, we need to maintain the nullable attribute at 2 different
places. one is the before the filters are executed(baserestrictinfo,
joininfo,
ec_list at least).  one is after the filters are executed
(RelOptInfo.reltarget
only?)

Come to JoinRel, we still need to maintain the 2 different cases as well.

As for the joinrel.reltarget, currently it looks up the inputrel's
reltarget to
get the Var, so it is easy to inherit from Var->nullable from inputrel, but
we need to consider the new  changes introduced by current join,
Like new NOT nullable attributes because of join clauses OR new nullable
attributes because of outer join.  Everything looks good for now.

The hard part is RelOptInfo.joininfo & root->eq_classes. All of them uses
the shared RestrictInfo, and it is unclear which Var->nullable should be
used in
them. To not provide a wrong answer, I think we can assume nullable=-1
(unknown)
and let the upper layer decides what to do (do we have known use cases to
use
the nullable attribute here?).

More considerations about this strategy:
1. We might use more memory for different var copies, the only known cases
   RelOptInfo->reltarget for now.
2. _equalVar() has more complex semantics: shall we consider nulls or not.

My recent experience reminds me of another interesting use case of UniqueKey
which may reduce the planning time a lot IIUC (Value 3 in then attachment).
Since
PG15 has just been released, I wonder if more people have time to discuss
this topic
again. Do I think the way in the right direction?

[1] https://www.postgresql.org/message-id/1551312.1613142245%40sss.pgh.pa.us
[2]
https://www.postgresql.org/message-id/CAApHDvrRwhWCPKUD5H-EQoezHf%3DfnUUsPgTAnXsEOV8f8SF7XQ%40mail.gmail.com
[3] https://www.postgresql.org/message-id/1664320.1625577290%40sss.pgh.pa.us

-- 
Best Regards
Andy Fan

Attachment: uniquekey.README
Description: Binary data

Reply via email to