Hi Tom: Thanks for your attention!
On Wed, May 18, 2022 at 1:25 AM Tom Lane <t...@sss.pgh.pa.us> wrote: > Andy Fan <zhihui.fan1...@gmail.com> writes: > > notnulls discussion is forked from UniqueKey stuff, you can see the > > attachment > > for the UnqiueKey introduction. Tom raised his opinion to track the > > nullability > > inside Var[1][2][3], this thread would start from there based on my > > understanding. > > I'm pretty certain that I never suggested this: > > > struct Var > > { > > ...; > > int nullable; // -1 unknown, 0 - not nullable. 1 - nullable > > }; > > You're free to pursue it if you like, but I think it will be a dead end. > OK, Here is a huge misunderstanding. I have my own solution at the beginning and then I think you want to go with this direction and I think it is really hard to understand, so I started this thread to make things clear. It is so great that the gap is filled now. The fundamental problem as you note is that equalVar() cannot do anything > sane with a field defined that way. Also, we'd have to generate Vars > initially with nullable = unknown (else, for example, ALTER SET/DROP NOT > NULL breaks stored views referring to the column). It'd be on the planner > to run through the tree and replace that with "nullable" or "not > nullable". It's hard to see how that's more advantageous than just > keeping the info in the associated RelOptInfo. > Agreed. > > Also, I think you're confusing two related but distinct issues. For > certain optimization issues, we'd like to keep track of whether a column > stored in a table is known NOT NULL. However, that's not the same thing > as the question that I've muttered about, which is how to treat a Var > that's been possibly forced to null due to null-extension of an outer > join. That is a different value from the Var as read from the table, > but we currently represent it the same within the planner, which causes > various sorts of undesirable complication. We cannot fix that by setting Var.nullable = true in above-the-join instances, because it might also > be true in below-the-join instances. "Known not null in the table" is > not the inverse of "potentially nulled by an outer join". Moreover, we > probably need to know *which* join is the one potentially nulling the Var, > so a bool is not likely enough anyway. > I read the above graph several times, but *I think probably my code can express better than my words*. It would be great that you can have a look at them. Just one point to mention now: Seems you didn't mention the case where the NULL values are filtered by qual, not sure it is negligible or by mistake. CREATE TABLE t(a int); SELECT * FROM t WHERE a > 1; My patch is my previous solution not the Inside Var one. > The schemes I've been toying with tend to look more like putting a > PlaceHolderVar-ish wrapper around the Var or expression that represents > the below-the-join value. The wrapper node could carry any join ID > info that we find necessary. The thing that I'm kind of stalled on is > how to define this struct so that it's not a big headache for join > strength reduction (which could remove the need for a wrapper altogether) > or outer-join reordering (which makes it a bit harder to define which > join we think is the one nulling the value). > > Not sure if the "NULL values are filtered by qual '' matters in this solution, and I'm pretty open for direction. But to avoid further misunderstanding from me, I would like to fill more gaps first by raising my patch now and continue talking in this direction. -- Best Regards Andy Fan
v1-0001-Introduce-notnull_attrs-for-RelOptInfo.patch
Description: Binary data