Maryann Xue created CALCITE-1208:
------------------------------------

             Summary: Improve two-level column structure handling
                 Key: CALCITE-1208
                 URL: https://issues.apache.org/jira/browse/CALCITE-1208
             Project: Calcite
          Issue Type: Improvement
          Components: core
    Affects Versions: 1.7.0
            Reporter: Maryann Xue
            Assignee: Maryann Xue


Calcite now supports nested column structures in parsing and validation by 
representing inner-level columns as a RexFieldAccess based on a RexInputRef. 
However, it does not flatten the inner-level structure during wildcard 
expansion, which causes an UnsupportedOperationException in Avatica.
 
The idea is to take this nested structure into account during column 
resolution, but to flatten the structure when translating to RelNode/RexNode.
For example, if the table structure is defined as
{code}VARCHAR K0,
VARCHAR C1,
RecordType(INTEGER C0, INTEGER C1) F0,
RecordType(INTEGER C0, INTEGER C2) F1{code}
it should be viewed as a flat type like
{code}VARCHAR K0,
VARCHAR C1,
INTEGER F0.C0,
INTEGER F0.C1,
INTEGER F1.C0,
INTEGER F1.C2{code}
so that:
1) Column reference "K0" is translated as {{$0}}
2) Column reference "F0.C1" is translated as {{$3}}
3) Wildcard "*" is translated as: {{$0, $1, $2, $3, $4, $5}}
4) Complex-column wildcard "F1.*" is translated as {{$4, $5}}
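
To make the flat view concrete, here is a minimal sketch in plain Java. It is 
not Calcite code: the class FlattenSketch, its methods, and the convention of 
using a null value to mark a non-nested column are all hypothetical and exist 
only for illustration. It assumes ordinals are assigned in declaration order, 
as in the flat type listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names exist in Calcite. */
class FlattenSketch {

  /** The example table: a null value marks a non-nested column. */
  static Map<String, List<String>> exampleTable() {
    Map<String, List<String>> table = new LinkedHashMap<>();
    table.put("K0", null);
    table.put("C1", null);
    table.put("F0", Arrays.asList("C0", "C1"));
    table.put("F1", Arrays.asList("C0", "C2"));
    return table;
  }

  /** Flattens the two-level structure; the list index is the ordinal ($n). */
  static List<String> flatten(Map<String, List<String>> table) {
    List<String> flat = new ArrayList<>();
    for (Map.Entry<String, List<String>> e : table.entrySet()) {
      if (e.getValue() == null) {
        flat.add(e.getKey());
      } else {
        for (String inner : e.getValue()) {
          flat.add(e.getKey() + "." + inner);
        }
      }
    }
    return flat;
  }

  /**
   * Expands a wildcard into ordinals. A null outer name stands for "*";
   * otherwise only the columns under that first-level column are returned.
   */
  static List<Integer> expand(List<String> flat, String outer) {
    List<Integer> ordinals = new ArrayList<>();
    for (int i = 0; i < flat.size(); i++) {
      if (outer == null || flat.get(i).startsWith(outer + ".")) {
        ordinals.add(i);
      }
    }
    return ordinals;
  }

  public static void main(String[] args) {
    List<String> flat = flatten(exampleTable());
    System.out.println(flat);                  // [K0, C1, F0.C0, F0.C1, F1.C0, F1.C2]
    System.out.println(flat.indexOf("F0.C1")); // 3
    System.out.println(expand(flat, null));    // [0, 1, 2, 3, 4, 5]
  }
}
```
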
We would like to resolve columns according to the following rules (here we 
only consider the "suffix" part of the qualified names, i.e., table resolution 
is assumed to be done already):
a) A two-part column name is matched against a first-level column name 
followed by a second-level column name. For example, "F1.C0" corresponds to 
$4; "F1.X" will throw a column not found error.
b) A single-part column name is matched against non-nested columns first, and 
if there is no match, it is then matched against second-level column names. 
For example, "C1" will be matched as "$1" instead of "$3", since non-nested 
columns have a higher priority; "C2" will be matched as "$5"; "C0" will lead 
to an ambiguous column error, since it exists under both "F0" and "F1".
c) We would also like to have a way to define a "default first-level column" 
that takes precedence over other first-level columns in column resolution. 
For example, if "F0" is defined as default, "C0" will not cause an ambiguous 
column error, but will instead be matched as "$2".
d) A reference to a first-level column alone, without a wildcard, is not 
allowed, e.g., "F1".
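
Rules a) to d) can be sketched in plain Java as follows. This is only an 
illustration under stated assumptions, not the proposed implementation: the 
class ColumnResolverSketch, the resolve method, the defaultOuter parameter, 
and the use of IllegalArgumentException for every error are hypothetical. The 
order in which rule d) is checked relative to the second-level lookup is also 
an assumption, since the rules above do not specify it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Calcite. */
class ColumnResolverSketch {

  /** Flat example columns as {first-level, name}; the array index is $n. */
  static final String[][] COLUMNS = {
    {null, "K0"}, {null, "C1"},
    {"F0", "C0"}, {"F0", "C1"},
    {"F1", "C0"}, {"F1", "C2"}
  };

  /**
   * Resolves a one- or two-part column name to its flat ordinal.
   * defaultOuter is the "default first-level column" of rule c), or null.
   */
  static int resolve(String name, String defaultOuter) {
    String[] parts = name.split("\\.");
    if (parts.length == 2) {
      // Rule a): match first-level name, then second-level name.
      for (int i = 0; i < COLUMNS.length; i++) {
        if (parts[0].equals(COLUMNS[i][0]) && parts[1].equals(COLUMNS[i][1])) {
          return i;
        }
      }
      throw new IllegalArgumentException("Column not found: " + name);
    }
    // Rule b), first half: non-nested columns have priority.
    for (int i = 0; i < COLUMNS.length; i++) {
      if (COLUMNS[i][0] == null && COLUMNS[i][1].equals(name)) {
        return i;
      }
    }
    // Rule d): a bare first-level column is rejected (ordering assumed).
    for (String[] column : COLUMNS) {
      if (name.equals(column[0])) {
        throw new IllegalArgumentException("Not allowed without wildcard: " + name);
      }
    }
    // Rule b), second half, with rule c): the default first-level column wins.
    List<Integer> matches = new ArrayList<>();
    for (int i = 0; i < COLUMNS.length; i++) {
      if (COLUMNS[i][0] != null && COLUMNS[i][1].equals(name)) {
        if (COLUMNS[i][0].equals(defaultOuter)) {
          return i;
        }
        matches.add(i);
      }
    }
    if (matches.isEmpty()) {
      throw new IllegalArgumentException("Column not found: " + name);
    }
    if (matches.size() > 1) {
      throw new IllegalArgumentException("Ambiguous column: " + name);
    }
    return matches.get(0);
  }

  public static void main(String[] args) {
    System.out.println(resolve("F1.C0", null)); // 4
    System.out.println(resolve("C1", null));    // 1
    System.out.println(resolve("C2", null));    // 5
    System.out.println(resolve("C0", "F0"));    // 2
  }
}
```

With no default set, resolve("C0", null) throws the ambiguity error, and 
resolve("F1", null) is rejected as in rule d).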



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
