Robin Qiu created BEAM-10896:
--------------------------------

             Summary: Support UNNEST an array of structs
                 Key: BEAM-10896
                 URL: https://issues.apache.org/jira/browse/BEAM-10896
             Project: Beam
          Issue Type: New Feature
          Components: dsl-sql-zetasql
            Reporter: Robin Qiu
            Assignee: Robin Qiu


Currently, UNNEST on an array of structs does not work properly in Beam ZetaSQL:

 

e.g. SELECT p.some_field FROM table, UNNEST(table.array_of_structs) AS p

 

Executing such a query crashes with the following error in ProjectScanConverter:

Exception in thread "main" java.lang.AssertionError: Field ordinal 1 is invalid for  type 'RecordType(VARCHAR id)'
 at org.apache.beam.vendor.calcite.v1_20_0.org.apache.calcite.rex.RexBuilder.makeFieldAccess(RexBuilder.java:197)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccessInternal(ExpressionConverter.java:881)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:871)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:869)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertResolvedStructFieldAccess(ExpressionConverter.java:869)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromResolvedExpr(ExpressionConverter.java:309)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.convertRexNodeFromComputedColumnWithFieldList(ExpressionConverter.java:368)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ExpressionConverter.retrieveRexNode(ExpressionConverter.java:196)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert(ProjectScanConverter.java:45)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.ProjectScanConverter.convert(ProjectScanConverter.java:29)
 at org.apache.beam.sdk.extensions.sql.zetasql.translation.QueryStatementConverter.convertNode(QueryStatementConverter.java:99)

 

The root cause is that Uncollect in Calcite 1.20.0 "unwraps" the struct/row 
elements of an array being unnested, so the struct's fields become top-level 
columns instead of a single struct-typed column: 
https://github.com/apache/calcite/blob/calcite-1.20.0/core/src/main/java/org/apache/calcite/rel/core/Uncollect.java#L146-L152

 

The Calcite Uncollect API changed in 1.23.0 in a way that could let us bypass 
this "unwrapping": 
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/rel/core/Uncollect.java#L171-L172
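
To make the difference concrete, here is a minimal plain-Java sketch of the two row-type derivations. It does not use the Calcite API: SketchType, SketchField and UncollectSketch are hypothetical stand-ins, the two-field struct (id, some_field) is invented for illustration, and the idea that the 1.23.0 change lets the caller supply an alias per unnested item is my reading of the linked lines, not something confirmed in this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a SQL type; this is NOT Calcite's RelDataType.
final class SketchType {
  final String name;               // e.g. "VARCHAR" or "STRUCT"
  final List<SketchField> fields;  // non-empty only for struct types

  SketchType(String name, SketchField... fields) {
    this.name = name;
    this.fields = Arrays.asList(fields);
  }

  boolean isStruct() {
    return !fields.isEmpty();
  }
}

// Hypothetical stand-in for a named field inside a struct type.
final class SketchField {
  final String name;
  final SketchType type;

  SketchField(String name, SketchType type) {
    this.name = name;
    this.type = type;
  }
}

final class UncollectSketch {
  // Columns produced by unnesting one array whose element type is `element`.
  // alias == null models the "unwrapping" behavior described above;
  // a non-null alias models the assumed bypass: keep the struct as one column.
  static List<String> deriveColumns(SketchType element, String alias) {
    List<String> out = new ArrayList<>();
    if (alias != null) {
      out.add(alias + ":" + element.name);
    } else if (element.isStruct()) {
      // The struct is flattened: each of its fields becomes a top-level column.
      for (SketchField f : element.fields) {
        out.add(f.name + ":" + f.type.name);
      }
    } else {
      out.add("EXPR$0:" + element.name);
    }
    return out;
  }

  public static void main(String[] args) {
    SketchType varchar = new SketchType("VARCHAR");
    SketchType struct = new SketchType("STRUCT",
        new SketchField("id", varchar), new SketchField("some_field", varchar));
    System.out.println(deriveColumns(struct, null)); // [id:VARCHAR, some_field:VARCHAR]
    System.out.println(deriveColumns(struct, "p"));  // [p:STRUCT]
  }
}
```

In the flattened case there is no struct-typed column "p" left for p.some_field to resolve against, which is consistent with the field-access failure in the stack trace above.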



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
