[
https://issues.apache.org/jira/browse/TAJO-1359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14532895#comment-14532895
]
ASF GitHub Bot commented on TAJO-1359:
--------------------------------------
Github user jihoonson commented on a diff in the pull request:
https://github.com/apache/tajo/pull/422#discussion_r29865101
--- Diff: tajo-plan/src/main/java/org/apache/tajo/plan/nameresolver/NameResolver.java ---
@@ -240,33 +292,94 @@ static Column resolveAliasedName(LogicalPlan.QueryBlock block, ColumnReferenceEx
}
/**
- * It returns a pair of names, which the first value is ${database}.${table} and the second value
- * is a simple column name.
+ * Lookup a qualifier and a canonical name of column.
+ *
+ * It returns a pair of names, which the first value is the qualifier ${database}.${table} and
+ * the second value is column's simple name.
*
* @param block The current block
* @param columnRef The column name
* @return A pair of normalized qualifier and column name
* @throws PlanningException
*/
- static Pair<String, String> normalizeQualifierAndCanonicalName(LogicalPlan.QueryBlock block,
- ColumnReferenceExpr columnRef)
+ static Pair<String, String> lookupQualifierAndCanonicalName(LogicalPlan.QueryBlock block,
+ ColumnReferenceExpr columnRef)
throws PlanningException {
- String qualifier;
- String canonicalName;
+ Preconditions.checkArgument(columnRef.hasQualifier(), "ColumnReferenceExpr must be qualified.");
+
+ String [] qualifierParts = columnRef.getQualifier().split("\\.");
- if (CatalogUtil.isFQTableName(columnRef.getQualifier())) {
- qualifier = columnRef.getQualifier();
- canonicalName = columnRef.getCanonicalName();
+ // This method assumes that column name consists of two or more dot chained names.
+ // In this case, there must be three cases as follows:
+ //
+ // - dbname.tbname.column_name.nested_field...
+ // - tbname.column_name.nested_field...
+ // - column.nested_fieldX...
+
+ Set<RelationNode> guessedRelations = TUtil.newHashSet();
+
+ // this position indicates the index of column name in qualifierParts;
+ // It must be 0 or more because a qualified column is always passed to lookupQualifierAndCanonicalName().
+ int columnNamePosition = -1;
+
+ // check for dbname.tbname.column_name.nested_field
+ if (qualifierParts.length >= 2) {
+ RelationNode rel = lookupTable(block, CatalogUtil.buildFQName(qualifierParts[0], qualifierParts[1]));
+ if (rel != null) {
+ guessedRelations.add(rel);
+ columnNamePosition = 2;
+ }
+ }
+
+ // check for tbname.column_name.nested_field
+ if (qualifierParts.length >= 1) {
+ RelationNode rel = lookupTable(block, qualifierParts[0]);
+ if (rel != null) {
+ guessedRelations.add(rel);
+ columnNamePosition = 1;
+ }
+ }
+
+ // column.nested_fieldX...
+ if (guessedRelations.size() == 0 && qualifierParts.length == 1) {
--- End diff --
I have a question about the condition ```guessedRelations.size() == 0```.
The code appears to collect every candidate relation, but for flat fields it
stops searching as soon as any relation has been found. What is the reason for this?
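The question can be made concrete with a minimal sketch of the lookup order. This is a simplified model and not the actual NameResolver code: tables and columns are plain strings, `QualifierLookupSketch`, `candidates`, and `skipColumnCheckWhenTableFound` are hypothetical names, and the column lookup in the third branch is an assumption because the diff is cut off at that point.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Simplified, hypothetical model of the three checks in the diff.
// It does not use Tajo's LogicalPlan, RelationNode, or CatalogUtil.
class QualifierLookupSketch {

  // Returns every interpretation of the qualifier that matches.
  // knownTables stands in for lookupTable(block, ...).
  // knownColumns stands in for a column lookup that the diff does not show (assumption).
  static List<String> candidates(String qualifier,
                                 Set<String> knownTables,
                                 Set<String> knownColumns,
                                 boolean skipColumnCheckWhenTableFound) {
    String[] parts = qualifier.split("\\.");
    List<String> found = new ArrayList<>();

    // dbname.tbname.column_name.nested_field...
    if (parts.length >= 2 && knownTables.contains(parts[0] + "." + parts[1])) {
      found.add("table:" + parts[0] + "." + parts[1]);
    }

    // tbname.column_name.nested_field...
    if (parts.length >= 1 && knownTables.contains(parts[0])) {
      found.add("table:" + parts[0]);
    }

    // column.nested_fieldX...
    // With the guard enabled, this branch is skipped once any table matched,
    // which mirrors the guessedRelations.size() == 0 condition.
    boolean tryColumn = !skipColumnCheckWhenTableFound || found.isEmpty();
    if (tryColumn && parts.length == 1 && knownColumns.contains(parts[0])) {
      found.add("column:" + parts[0]);
    }
    return found;
  }
}
```

In this model, when both a table and a record column are named `employee`, the guarded variant reports only the table candidate for the qualifier `employee`, while the unguarded variant reports both and would leave the reference ambiguous.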
> Add nested field projector and language extension to project nested record
> --------------------------------------------------------------------------
>
> Key: TAJO-1359
> URL: https://issues.apache.org/jira/browse/TAJO-1359
> Project: Tajo
> Issue Type: Sub-task
> Components: parser, physical operator, planner/optimizer
> Reporter: Hyunsik Choi
> Assignee: Hyunsik Choi
> Fix For: 0.11.0
>
> Attachments: TAJO-1359.patch, TAJO-1359_2.patch, TAJO-1359_3.patch,
> TAJO-1359_4.patch, TAJO-1359_5.patch, TAJO-1359_6.patch, TAJO-1359_7.patch
>
>
> We need to improve the Projector class so that it can get nested record
> fields, and we also need to add a language extension for specifying nested
> records in a table schema. Both tasks should be done together. Otherwise, we
> need to test an entire work process.
> Using a dot '.' would be a good syntax for specifying nested fields. Many
> systems (Hive, Google BigQuery, and Drill) already use this syntax, so many
> users are probably familiar with this form.
> For example, if *employee* is a root nested record field that includes
> *age* and *name* fields, where *name* consists of two fields, lastname and
> firstname, we can specify them individually as follows:
> {code}
> SELECT employee.age, employee.name.lastname, employee.name.firstname FROM ...
> {code}
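As an illustration of what projecting a dotted path means, here is a minimal sketch that walks a path such as `employee.name.lastname` through a nested record. It is not Tajo's Projector: the record is modelled as nested `Map` instances, and `NestedProjectionSketch` and `project` are hypothetical names chosen for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: nested records are modelled as maps of field name to value.
class NestedProjectionSketch {

  // Follows a dot-separated path through nested maps.
  // Returns null if any step of the path is missing or is not a record.
  static Object project(Map<String, Object> record, String path) {
    Object current = record;
    for (String part : path.split("\\.")) {
      if (!(current instanceof Map)) {
        return null;
      }
      current = ((Map<?, ?>) current).get(part);
    }
    return current;
  }

  public static void main(String[] args) {
    // Sample data invented for the example.
    Map<String, Object> name = new LinkedHashMap<>();
    name.put("lastname", "Doe");
    name.put("firstname", "Jane");

    Map<String, Object> employee = new LinkedHashMap<>();
    employee.put("age", 30);
    employee.put("name", name);

    Map<String, Object> row = new LinkedHashMap<>();
    row.put("employee", employee);

    System.out.println(project(row, "employee.age"));
    System.out.println(project(row, "employee.name.lastname"));
    System.out.println(project(row, "employee.name.firstname"));
  }
}
```

Each path segment selects one field of the current record, so the three columns in the SELECT list above correspond to three independent walks over the same row.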