Hi Ashutosh:

   Nice to see you again!

On Tue, May 17, 2022 at 8:50 PM Ashutosh Bapat <ashutosh.bapat....@gmail.com>
wrote:

> On Sun, May 15, 2022 at 8:41 AM Andy Fan <zhihui.fan1...@gmail.com> wrote:
>
> >
> > The var in RelOptInfo->reltarget should have nullable = 0 but the var in
> > RelOptInfo->baserestrictinfo should have nullable = 1;  The beauty of this
> > are: a). It can distinguish the two situations perfectly b). Whenever we want
> > to know the nullable attribute of a Var for an expression, it is super easy to
> > know. In summary, we need to maintain the nullable attribute at 2 different
> > places. one is the before the filters are executed(baserestrictinfo, joininfo,
> > ec_list at least).  one is after the filters are executed (RelOptInfo.reltarget
> > only?)
>
> Thanks for identifying this. What you have written makes sense and it
> might open a few optimization opportunities. But let me put down some
> other thoughts here. You might want to take those into consideration
> when designing your solution.
>

Thanks.


>
> Do we want to just track nullable and non-nullable. May be we want
> expand this class to nullable (var may be null), non-nullable (Var is
> definitely non-NULL), null (Var will be always NULL).
>
>
Currently the patch doesn't support the "Var will always be NULL" state. Do
you have any use cases for it? I can't think of many cases where we could
derive such information, except something like "SELECT a FROM t WHERE a
IS NULL".
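
To make the three-state idea concrete, here is a rough sketch in C of what
such a classification could look like. None of these names exist in the
patch or in PostgreSQL; VarNullState and null_state_after_nulltest are
made up for illustration, and the sketch only covers how an IS NULL or
IS NOT NULL qual would refine the state.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical three-state classification; not part of the patch. */
typedef enum VarNullState
{
    VAR_MAYBE_NULL,     /* nullable: the Var may be NULL */
    VAR_NOT_NULL,       /* non-nullable: the Var is definitely non-NULL */
    VAR_ALWAYS_NULL     /* null: the Var will always be NULL */
} VarNullState;

/*
 * State of a Var in the rows that survive a qual of the form
 * "var IS NULL" (is_null_test = true) or "var IS NOT NULL" (false).
 *
 * "before" is only sanity-checked: whatever we knew earlier, every
 * surviving row satisfies the test.
 */
static VarNullState
null_state_after_nulltest(VarNullState before, bool is_null_test)
{
    assert(before == VAR_MAYBE_NULL ||
           before == VAR_NOT_NULL ||
           before == VAR_ALWAYS_NULL);

    return is_null_test ? VAR_ALWAYS_NULL : VAR_NOT_NULL;
}
```

With this, "SELECT a FROM t WHERE a IS NULL" would move a from
VAR_MAYBE_NULL to VAR_ALWAYS_NULL, which is the only source of the third
state I can think of so far.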

> But the other way to look at this is along the lines of equivalence
> classes. Equivalence classes record the expressions which are equal in
> the final result of the query. The equivalence class members are not
> equal at all the stages of query execution.  But because they are
> equal in the final result, we can impose that restriction on the lower
> levels as well. Can we think of nullable in that fashion? If a Var is
> non-nullable in the final result, we can impose that restriction on
> the intermediate stages since rows with NULL values for that Var will
> be filtered out somewhere. Similarly we could argue for null Var. But
> knowledge that a Var is nullable in the final result does not impose a
> NULL, non-NULL restriction on the intermediate stages. If we follow
> this thought process, we don't need to differentiate Var at different
> stages in query.
>

I agree this is an option.  If we go that way, we need to track it under the
PlannerInfo struct, but it would not be as fine-grained as my previous
proposal. Without the intermediate information, we can't know whether a
UniqueKey contains multiple NULLs. That would not be an issue for the "mark
Distinct as no-op" case, but I'm not sure it is OK for other UniqueKey use
cases.  So for now I still prefer to maintain the intermediate information,
unless we are sure that it costs too much or is too complex to implement,
which I don't think is the case at this point.  If you have time to look at
the attached patch, that would be great as well.
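
To illustrate what I mean by intermediate information, here is a minimal
sketch in C. It is only an illustration with made-up names (RelNotNullInfo,
add_strict_qual and the two lookup functions are hypothetical), not code
from the patch. It assumes at most 64 attributes per relation and relies on
the fact that a strict qual on a Var filters out the rows where that Var is
NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical per-relation bookkeeping; none of these names come from
 * the patch. Bit i is set when attribute i is known to be non-NULL.
 */
typedef struct RelNotNullInfo
{
    uint64_t    notnull_before_filters; /* e.g. from NOT NULL constraints */
    uint64_t    notnull_after_filters;  /* the above plus strict quals */
} RelNotNullInfo;

/* Record that a strict qual (such as "a > 1") references attribute attno. */
static void
add_strict_qual(RelNotNullInfo *info, int attno)
{
    assert(attno >= 0 && attno < 64);
    info->notnull_after_filters |= UINT64_C(1) << attno;
}

/* Nullability as seen by the quals themselves (baserestrictinfo etc.). */
static bool
is_notnull_before_filters(const RelNotNullInfo *info, int attno)
{
    assert(attno >= 0 && attno < 64);
    return (info->notnull_before_filters >> attno) & 1;
}

/* Nullability as seen in the relation's output (reltarget). */
static bool
is_notnull_after_filters(const RelNotNullInfo *info, int attno)
{
    assert(attno >= 0 && attno < 64);
    return (info->notnull_after_filters >> attno) & 1;
}
```

For a nullable column a and a qual like "a > 1", the first lookup still
reports "may be NULL" while the second reports "not NULL", which is the
distinction I would like to keep.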

-- 
Best Regards
Andy Fan
