[ 
https://issues.apache.org/jira/browse/CALCITE-6451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Victor Barua updated CALCITE-6451:
----------------------------------
    Description: 
SetOp overrides `deriveRowType()` and computes the output row type to be the 
least restrictive across all inputs 
[here|https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rel/core/SetOp.java#L116-L127].
 

For example, given
{code:java}
Input 1: (I64, I64, I64?, I64?)
Input 2: (I64, I64?, I64, I64?) {code}
where ? denotes nullable, the least restrictive row type is:
{code:java}
Output:  (I64, I64?, I64?, I64?) {code}
For UNION operations, these nullabilities are accurate.
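
As a rough illustration of the least-restrictive rule, here is a standalone sketch. It is not Calcite's implementation: the class and method names are hypothetical, and each column is modeled only by a boolean "nullable" flag rather than a full RelDataType.

```java
import java.util.Arrays;

class UnionNullability {
    // Hypothetical helper: a column is nullable in the output
    // if it is nullable in at least one input.
    static boolean[] leastRestrictive(boolean[]... inputs) {
        boolean[] out = new boolean[inputs[0].length];
        for (boolean[] input : inputs) {
            for (int i = 0; i < out.length; i++) {
                out[i] |= input[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // true = nullable (I64?), false = not nullable (I64)
        boolean[] input1 = {false, false, true, true};
        boolean[] input2 = {false, true, false, true};
        // prints [false, true, true, true], i.e. (I64, I64?, I64?, I64?)
        System.out.println(Arrays.toString(leastRestrictive(input1, input2)));
    }
}
```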

However, for MINUS and INTERSECT there is room for improvement.

*MINUS* only returns rows from the first input, so its output nullability 
should always match that of its first input:
{code:java}
Output: (I64, I64, I64?, I64?)  {code}
*INTERSECT* only returns rows that match across all inputs. If a column is 
non-nullable in at least one input, then it is non-nullable in the output, 
because no row with a null in that column can appear in every input:
{code:java}
Output: (I64, I64, I64, I64?)  {code}
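
The two proposed rules can be sketched in the same simplified model. This is only an assumption-laden illustration, not the actual patch: the class and method names are hypothetical, and columns are reduced to boolean "nullable" flags.

```java
import java.util.Arrays;

class SetOpNullability {
    // MINUS: every output row comes from the first input,
    // so the output nullability is that of the first input.
    static boolean[] minus(boolean[]... inputs) {
        return inputs[0].clone();
    }

    // INTERSECT: a column is nullable in the output only if
    // it is nullable in every input.
    static boolean[] intersect(boolean[]... inputs) {
        boolean[] out = inputs[0].clone();
        for (boolean[] input : inputs) {
            for (int i = 0; i < out.length; i++) {
                out[i] &= input[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // true = nullable (I64?), false = not nullable (I64)
        boolean[] input1 = {false, false, true, true};
        boolean[] input2 = {false, true, false, true};
        // prints [false, false, true, true], i.e. (I64, I64, I64?, I64?)
        System.out.println(Arrays.toString(minus(input1, input2)));
        // prints [false, false, false, true], i.e. (I64, I64, I64, I64?)
        System.out.println(Arrays.toString(intersect(input1, input2)));
    }
}
```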

  was:
SetOp overrides `deriveRowType()` and computes the output row type to be the 
least restrictive across all inputs 
[here|https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rel/core/SetOp.java#L116-L127].
 

So for example given

 
{code:java}
Input 1: (I64, I64, I64?, I64?)
Input 2: (I64, I64?, I64, I64?) {code}
where ? denotes nullable, the least restrictive output computes:

 

 
{code:java}
Output:  (I64, I64?, I64?, I64?) {code}
For UNION operations, these nullabilities are accurate.

However for MINUS and INTERSECT there is room for improvement.

*MINUS* only returns rows from the first input, as such its output nullability 
should always match that of its first input:

 
{code:java}
Output: (I64, I64, I64?, I64?)  {code}
*INTERSECT* only returns rows that match across all inputs. If a column is not 
nullable in any of the inputs, then it is not nullable in the output because no 
rows can be emitted in which that column is null:
{code:java}
Output: (I64, I64, I64, I64?)  {code}


> Improve Nullability Derivation for Intersect and Minus
> ------------------------------------------------------
>
>                 Key: CALCITE-6451
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6451
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Victor Barua
>            Assignee: Victor Barua
>            Priority: Minor
>
> SetOp overrides `deriveRowType()` and computes the output row type to be the 
> least restrictive across all inputs 
> [here|https://github.com/apache/calcite/blob/8ab0b03326730aa2cc6b476b2cbd8f99799bdacb/core/src/main/java/org/apache/calcite/rel/core/SetOp.java#L116-L127].
>  
> For example, given
> {code:java}
> Input 1: (I64, I64, I64?, I64?)
> Input 2: (I64, I64?, I64, I64?) {code}
> where ? denotes nullable, the least restrictive row type is:
> {code:java}
> Output:  (I64, I64?, I64?, I64?) {code}
> For UNION operations, these nullabilities are accurate.
> However, for MINUS and INTERSECT there is room for improvement.
> *MINUS* only returns rows from the first input, so its output 
> nullability should always match that of its first input:
> {code:java}
> Output: (I64, I64, I64?, I64?)  {code}
> *INTERSECT* only returns rows that match across all inputs. If a column is 
> non-nullable in at least one input, then it is non-nullable in the output, 
> because no row with a null in that column can appear in every input:
> {code:java}
> Output: (I64, I64, I64, I64?)  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
