can-sun opened a new issue, #5768:
URL: https://github.com/apache/iceberg/issues/5768
### Apache Iceberg version
0.13.0
### Query engine
Spark
### Please describe the bug 🐞
iceberg-spark3-runtime version: 0.13.0
I attempted to skip the Glue table name validation by setting
**glue.skip-name-validation** to true, but none of the Spark SQL statements
below succeeded.

Iceberg catalog properties:
```
spark-shell --packages $DEPENDENCIES \
  --conf spark.sql.catalog.my_catalog=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.my_catalog.glue.skip-name-validation=true \
  --conf spark.sql.catalog.my_catalog.warehouse=<s3-placeholder> \
  --conf spark.sql.catalog.my_catalog.catalog-impl=org.apache.iceberg.aws.glue.GlueCatalog \
  --conf spark.sql.catalog.my_catalog.io-impl=org.apache.iceberg.aws.s3.S3FileIO
```
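As a sanity check that the `--conf` flags actually reach the session (not part
of my original run), the property can be read back from the Spark conf inside
spark-shell:
```scala
// Sanity check inside spark-shell: confirm the catalog property was passed
// through. Expected to print "true" if the --conf flag took effect.
println(spark.conf.get("spark.sql.catalog.my_catalog.glue.skip-name-validation"))
```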
Spark SQL queries tried so far:
```
spark.sql("""CREATE TABLE IF NOT EXISTS my_catalog.db.`iceberg-table` (
    id string,
    creation_date string,
    last_update_time string)
  LOCATION '<my-s3-bucket>'
  TBLPROPERTIES ('table_type'='ICEBERG', 'format'='parquet', 'glue.skip-name-validation'=true)""")

spark.sql("""CREATE TABLE IF NOT EXISTS my_catalog.db.`iceberg-table` (
    id string,
    creation_date string,
    last_update_time string)
  USING iceberg
  OPTIONS ('glue.skip-name-validation'=true)
  LOCATION '<my-s3-bucket>'""")
```
Error stack trace:
```
java.lang.IllegalArgumentException: Invalid table identifier: db.iceberg-table
  at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:217)
  at org.apache.iceberg.BaseMetastoreCatalog$BaseMetastoreCatalogTableBuilder.<init>(BaseMetastoreCatalog.java:115)
  at org.apache.iceberg.BaseMetastoreCatalog.buildTable(BaseMetastoreCatalog.java:68)
  at org.apache.iceberg.spark.SparkCatalog.newBuilder(SparkCatalog.java:578)
  at org.apache.iceberg.spark.SparkCatalog.createTable(SparkCatalog.java:148)
  at org.apache.iceberg.spark.SparkCatalog.createTable(SparkCatalog.java:92)
```
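For what it's worth, the failure can presumably be reproduced without Spark SQL
at all, since the stack trace shows the check firing inside
`BaseMetastoreCatalog.buildTable`. A minimal sketch (the schema and property
map are illustrative, not from my actual run):
```scala
// Minimal sketch (illustrative): drive the same code path from the stack
// trace directly, with Spark SQL taken out of the picture.
import org.apache.iceberg.Schema
import org.apache.iceberg.aws.glue.GlueCatalog
import org.apache.iceberg.catalog.TableIdentifier
import org.apache.iceberg.types.Types

val props = new java.util.HashMap[String, String]()
props.put("warehouse", "<s3-placeholder>")
props.put("glue.skip-name-validation", "true")

val catalog = new GlueCatalog()
catalog.initialize("my_catalog", props)

val schema = new Schema(
  Types.NestedField.optional(1, "id", Types.StringType.get()))

// If the property is not honored, this should throw the same
// IllegalArgumentException: Invalid table identifier: db.iceberg-table
catalog.buildTable(TableIdentifier.of("db", "iceberg-table"), schema).create()
```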
Besides the DDL attempts, I also tried to write data to the Glue table, which
failed as well. So the Iceberg table cannot be created via Spark, although I
can create such a table using an Athena query.
```
df.writeTo("my_catalog.db.`iceberg-table`").append()
```
This fails with a table-not-found error:
```
org.apache.spark.sql.AnalysisException: Table or view not found: my_catalog.db.`iceberg-table`;
'AppendData 'UnresolvedRelation [my_catalog, db, iceberg-table], [], false, true
+- Project [_1#3 AS id#10, _2#4 AS creation_date#11, _3#5 AS last_update_time#12]
   +- LocalRelation [_1#3, _2#4, _3#5]

  at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1(CheckAnalysis.scala:134)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.$anonfun$checkAnalysis$1$adapted(CheckAnalysis.scala:94)
  at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:302)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis(CheckAnalysis.scala:94)
  at org.apache.spark.sql.catalyst.analysis.CheckAnalysis.checkAnalysis$(CheckAnalysis.scala:91)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.checkAnalysis(Analyzer.scala:172)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:195)
  at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
  at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:192)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:90)
  at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
  at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:224)
  at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
  at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:224)
  at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:90)
  at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:88)
  at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:95)
  at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:93)
  at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:136)
  at org.apache.spark.sql.DataFrameWriterV2.runCommand(DataFrameWriterV2.scala:194)
  at org.apache.spark.sql.DataFrameWriterV2.append(DataFrameWriterV2.scala:148)
```
I know I am not following the Glue/Athena best practices here:
https://docs.aws.amazon.com/athena/latest/ug/glue-best-practices.html. However,
for backwards-compatibility reasons, I am still trying to figure out whether it
is viable to use dashes in an Iceberg table name.
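One thing I could not rule out is whether the 0.13.0 runtime reads this
property at all. A rough check, assuming the key is exposed as a constant named
`GLUE_CATALOG_SKIP_NAME_VALIDATION` on `org.apache.iceberg.aws.AwsProperties`
(an assumption on my part), would be reflection against the runtime jar:
```scala
// Hedged sanity check: does the runtime jar define the skip-name-validation
// property at all? The field name below is an assumption about AwsProperties.
try {
  val field = Class.forName("org.apache.iceberg.aws.AwsProperties")
    .getField("GLUE_CATALOG_SKIP_NAME_VALIDATION")
  println(s"Property key defined in this runtime: ${field.get(null)}")
} catch {
  case _: NoSuchFieldException | _: ClassNotFoundException =>
    println("Property not defined in this runtime; the setting would be ignored.")
}
```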