[
https://issues.apache.org/jira/browse/CALCITE-1208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292382#comment-15292382
]
Julian Hyde commented on CALCITE-1208:
--------------------------------------
bq. Can we add a comment or example to explain the three new StructKind values?
What are the implications of each StructKind?
Yes, we will improve the description of each StructKind.
bq. For the 3rd case in testStructType() "select c2 from struct.t", I'm not
sure if it makes sense to use "c2" to refer to a sub-field under field "F1". Are
we aware of any other system that would allow such behavior?
It isn't the "normal" behavior of SQL but it absolutely makes sense in
Phoenix/HBase because HBase organizes columns into column families and you want
to be able to optionally qualify a column with its family name. (Qualifying the
column becomes mandatory if the same name occurs in more than one column
family.) If your database doesn't need those semantics, don't use that
particular StructKind.
> Improve two-level column structure handling
> -------------------------------------------
>
> Key: CALCITE-1208
> URL: https://issues.apache.org/jira/browse/CALCITE-1208
> Project: Calcite
> Issue Type: Improvement
> Components: core
> Affects Versions: 1.7.0
> Reporter: Maryann Xue
> Assignee: Julian Hyde
> Fix For: 1.8.0
>
>
> Calcite now supports nested column structure in parsing and validation, by
> representing the inner-level columns as a RexFieldAccess based on a
> RexInputRef. However, it does not flatten the inner-level structure in
> wildcard expansion, which causes an UnsupportedOperationException in
> Avatica.
>
> The idea is to take into account this nested structure in column resolving,
> but to flatten the structure when translating to RelNode/RexNode.
> For example, if the table structure is defined as
> {code}VARCHAR K0,
> VARCHAR C1,
> RecordType(INTEGER C0, INTEGER C1) F0,
> RecordType(INTEGER C0, INTEGER C2) F1{code}
> , it should be viewed as a flat type like
> {code}VARCHAR K0,
> VARCHAR C1,
> INTEGER F0.C0,
> INTEGER F0.C1,
> INTEGER F1.C0,
> INTEGER F1.C2{code}
> , so that:
> 1) Column reference "K0" is translated as {{$0}}
> 2) Column reference "F0.C1" is translated as {{$3}}
> 3) Wildcard "*" is translated as: {{$0, $1, $2, $3, $4, $5}}
> 4) Complex-column wildcard "F1.*" is translated as {{$4, $5}}
> And we would like to resolve columns based on the following rules (here we
> only consider the "suffix" part of the qualified names, which means the table
> resolving is already done by this time):
> a) A two-part column name is matched with its first-level column name and its
> second-level column name. For example, "F1.C0" corresponds to $4; "F1.X" will
> throw a column not found error.
> b) A single-part column name is matched against non-nested columns first; if
> there is no match, it is then matched against the second-level column names.
> For example, "C1" will be matched as "$1" instead of "$3", since non-nested
> columns have a higher priority; "C2" will be matched as "$5"; "C0" will lead
> to an ambiguous column error, since it exists under both "F0" and "F1".
> c) We would also like to have a way for defining "default first-level column"
> so that it has a precedence in column resolving over other first-level
> columns. For example, if "F0" is defined as default, "C0" will not cause an
> ambiguous column error, but instead be matched as "$2".
> d) A reference to a nested first-level column on its own, without a wildcard
> or a second-level name, is not allowed, e.g., "F1".
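The flattening and rules a) to d) from the quoted description can be sketched
roughly as follows. This is only an illustration in Python, not Calcite code:
the names TABLE, flatten and resolve are hypothetical, and the sketch assumes
that plain (non-nested) columns still win over the default first-level column,
which the description does not state explicitly.

```python
# Illustrative sketch only; not Calcite's implementation.
# A plain column is (name, None); a nested column is (name, [sub-field names]).
TABLE = [
    ("K0", None),
    ("C1", None),
    ("F0", ["C0", "C1"]),
    ("F1", ["C0", "C2"]),
]


def flatten(table):
    """Return flat column names; the list index plays the role of $n."""
    flat = []
    for name, subs in table:
        if subs is None:
            flat.append(name)
        else:
            flat.extend(f"{name}.{sub}" for sub in subs)
    return flat


def resolve(table, ref, default=None):
    """Resolve a column reference (table part already stripped) to ordinals."""
    flat = flatten(table)
    nested = {name: subs for name, subs in table if subs is not None}

    if ref == "*":
        return list(range(len(flat)))

    parts = ref.split(".")
    if len(parts) > 2:
        raise LookupError(f"column not found: {ref}")

    if len(parts) == 2:
        first, second = parts
        if first not in nested:
            raise LookupError(f"column not found: {ref}")
        if second == "*":
            # Complex-column wildcard: every sub-field of one nested column.
            return [flat.index(f"{first}.{sub}") for sub in nested[first]]
        # Rule a): both levels must match.
        if second not in nested[first]:
            raise LookupError(f"column not found: {ref}")
        return [flat.index(ref)]

    name = parts[0]
    # Rule d): a nested column cannot be referenced on its own.
    if name in nested:
        raise LookupError(f"nested column needs a sub-field or wildcard: {name}")
    # Rule b): non-nested columns have priority.
    if name in flat:
        return [flat.index(name)]
    # Rule c): the default first-level column is tried next (assumed ordering).
    if default is not None and name in nested.get(default, []):
        return [flat.index(f"{default}.{name}")]
    # Rule b), second half: search the second-level names of all nested columns.
    owners = [first for first, subs in nested.items() if name in subs]
    if not owners:
        raise LookupError(f"column not found: {name}")
    if len(owners) > 1:
        raise LookupError(f"ambiguous column: {name}")
    return [flat.index(f"{owners[0]}.{name}")]
```

With the example table, resolve(TABLE, "F0.C1") gives [3], resolve(TABLE, "C2")
gives [5], and resolve(TABLE, "C0") raises an ambiguity error unless
default="F0" is passed, in which case it gives [2].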