Hi Brian,
Here is my code to create the PCollection<Row>.
PCollection<FileIO.ReadableFile> files = pipeline
.apply(FileIO.match().filepattern(path))
.apply(FileIO.readMatches());
PCollection<Row> input = files
.apply(ParquetIO.readFiles(avroSchema))
.apply(MapElements
.into(TypeDescriptors.rows())
.via(AvroUtils.getGenericRecordToRowFunction(AvroUtils.toBeamSchema(avroSchema))))
.setCoder(RowCoder.of(AvroUtils.toBeamSchema(avroSchema)));
From: Brian Hulette <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, March 2, 2021 at 10:31 AM
To: user <[email protected]>
Subject: Re: A problem with ZetaSQL
Thanks for reporting this Tao - could you share what the type of your input
PCollection is?
On Tue, Mar 2, 2021 at 9:33 AM Tao Li <[email protected]<mailto:[email protected]>>
wrote:
Hi all,
I was following the instructions from this doc to play with ZetaSQL
https://beam.apache.org/documentation/dsls/sql/overview/<https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fbeam.apache.org%2Fdocumentation%2Fdsls%2Fsql%2Foverview%2F&data=04%7C01%7Ctaol%40zillow.com%7Cde9c6a92756146a41b8308d8dda95de7%7C033464830d1840e7a5883784ac50e16f%7C0%7C0%7C637503066785410226%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=jv7rLyLR5pHlokEv1Ngnglfp%2Fvw6Ui5Mzn%2BfvJ4B104%3D&reserved=0>
The query is really simple:
options.as<https://nam11.safelinks.protection.outlook.com/?url=http%3A%2F%2Foptions.as%2F&data=04%7C01%7Ctaol%40zillow.com%7Cde9c6a92756146a41b8308d8dda95de7%7C033464830d1840e7a5883784ac50e16f%7C0%7C0%7C637503066785410226%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1000&sdata=q0epUsinWTFpWWJ%2BjjtAFw5RRasgT2ivm5%2FG%2FrXU1Hg%3D&reserved=0>(BeamSqlPipelineOptions.class).setPlannerName("org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner")
input.apply(SqlTransform.query("SELECT * from PCOLLECTION"))
I am seeing this error with ZetaSQL :
Exception in thread "main" java.lang.UnsupportedOperationException: Unknown
Calcite type: INTEGER
at
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSqlCalciteTranslationUtils.toZetaSqlType(ZetaSqlCalciteTranslationUtils.java:114)
at
org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.addFieldsToTable(SqlAnalyzer.java:359)
at
org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.addTableToLeafCatalog(SqlAnalyzer.java:350)
at
org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.lambda$createPopulatedCatalog$1(SqlAnalyzer.java:225)
at
com.google.common.collect.ImmutableList.forEach(ImmutableList.java:406)
at
org.apache.beam.sdk.extensions.sql.zetasql.SqlAnalyzer.createPopulatedCatalog(SqlAnalyzer.java:225)
at
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLPlannerImpl.rel(ZetaSQLPlannerImpl.java:102)
at
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRelInternal(ZetaSQLQueryPlanner.java:180)
at
org.apache.beam.sdk.extensions.sql.zetasql.ZetaSQLQueryPlanner.convertToBeamRel(ZetaSQLQueryPlanner.java:168)
at
org.apache.beam.sdk.extensions.sql.impl.BeamSqlEnv.parseQuery(BeamSqlEnv.java:114)
at
org.apache.beam.sdk.extensions.sql.SqlTransform.expand(SqlTransform.java:140)
at
org.apache.beam.sdk.extensions.sql.SqlTransform.expand(SqlTransform.java:86)
This query works fine when using Calcite (by just removing setPlannerName
call). Am I missing anything here? For example I am specifying
'com.google.guava:guava:23.0' as the dependency.
Thanks!