arunb2w opened a new issue, #5867:
URL: https://github.com/apache/iceberg/issues/5867
### Apache Iceberg version
0.14.0
### Query engine
EMR
### Please describe the bug 🐞
Facing an error when creating an Iceberg table on EMR using the Glue catalog.
Spark version: 3.2.1
Iceberg version: 0.14.0
**Sample code:**
```
from pyspark.sql import SparkSession

catalog = "glue_dev"
warehouse_path = "s3_bucket"
database = "test"
table_name = "EPAYMENT"

spark = SparkSession \
    .builder \
    .config(f'spark.sql.catalog.{catalog}',
            'org.apache.iceberg.spark.SparkCatalog') \
    .config(f'spark.sql.catalog.{catalog}.warehouse',
            f'{warehouse_path}') \
    .config(f'spark.sql.catalog.{catalog}.catalog-impl',
            'org.apache.iceberg.aws.glue.GlueCatalog') \
    .config(f'spark.sql.catalog.{catalog}.io-impl',
            'org.apache.iceberg.aws.s3.S3FileIO') \
    .config('spark.sql.extensions',
            'org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions') \
    .config('spark.sql.catalog.spark_catalog',
            'org.apache.iceberg.spark.SparkSessionCatalog') \
    .config('spark.sql.catalog.spark_catalog.type', 'hive') \
    .appName("IcebergDatalake") \
    .getOrCreate()

df = spark.createDataFrame([
    ("100", "2015-01-01", "2015-01-01T13:51:39.340396Z"),
    ("101", "2015-01-01", "2015-01-01T12:14:58.597216Z"),
    ("102", "2015-01-01", "2015-01-01T13:51:40.417052Z"),
    ("103", "2015-01-01", "2015-01-01T13:51:40.519832Z")
], ["id", "creation_date", "last_update_time"])

df.writeTo(f"{catalog}.{database}.{table_name}").using("iceberg").create()
```
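One thing I noticed while reducing this: the table name `EPAYMENT` is uppercase, and I believe the Glue catalog validates identifiers against Glue's naming rules (lowercase letters, digits, and underscores only). That is my assumption, not something I confirmed in the Iceberg source. A quick standalone check of that assumed rule, with the regex being my approximation rather than the actual validation pattern:

```python
import re

# Assumed Glue naming rule: lowercase letters, digits, underscores.
# (Approximation for illustration -- not copied from the Iceberg source.)
GLUE_NAME_PATTERN = re.compile(r"^[a-z0-9_]+$")

def is_valid_glue_name(name: str) -> bool:
    """Return True if the name conforms to the assumed Glue naming rule."""
    return GLUE_NAME_PATTERN.match(name) is not None

print(is_valid_glue_name("EPAYMENT"))  # uppercase name fails the check
print(is_valid_glue_name("epayment"))  # lowercase name passes
```

If that assumption holds, lowercasing the table name (or the database name, if it were mixed-case) would be a possible workaround.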
**Spark command used to run:**
`spark-submit --deploy-mode cluster --packages
org.apache.iceberg:iceberg-spark-runtime-3.2_2.12:0.14.0,software.amazon.awssdk:bundle:2.17.257,software.amazon.awssdk:url-connection-client:2.17.257
--conf spark.yarn.submit.waitAppCompletion=true --conf
"spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=\"/opt/spark\"" --conf spark.dynamicAllocation.enabled=true
--conf spark.executor.maxMemory=32g --conf
spark.dynamicAllocation.executorIdleTimeout=300 --conf
spark.shuffle.service.enabled=true --driver-memory 8g --num-executors 1
--executor-memory 32g --executor-cores 5 iceberg_main.py`
**Error stacktrace:**
```
Traceback (most recent call last):
File "iceberg_main.py", line 899, in <module>
bootstrap_table(tableName, spark, write_type, is_local_run,
hive_sync_enabled, database, catalog)
File "iceberg_main.py", line 428, in bootstrap_table
bootstrap_to_iceberg(table_name, write_type, spark_session,
is_local_run, hive_sync_enabled, database, catalog, stacks)
File "iceberg_main.py", line 407, in bootstrap_to_iceberg
df.writeTo(f"{catalog}.{database}." +
table_name).using("iceberg").create()
File
"/mnt/yarn/usercache/hadoop/appcache/application_1664278990474_0004/container_1664278990474_0004_01_000001/pyspark.zip/pyspark/sql/readwriter.py",
line 1129, in create
File
"/mnt/yarn/usercache/hadoop/appcache/application_1664278990474_0004/container_1664278990474_0004_01_000001/py4j-0.10.9.3-src.zip/py4j/java_gateway.py",
line 1322, in __call__
File
"/mnt/yarn/usercache/hadoop/appcache/application_1664278990474_0004/container_1664278990474_0004_01_000001/pyspark.zip/pyspark/sql/utils.py",
line 117, in deco
pyspark.sql.utils.IllegalArgumentException: Invalid table identifier:
test.EPAYMENT
```
Please provide insights on what I am missing.
The same code works fine if I use the Hadoop catalog instead of Glue.
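For reference, this is roughly the Hadoop-catalog configuration I used for the working comparison (reconstructed from memory, since the actual session setup is not shown above; the catalog name `hadoop_dev` is just a placeholder). Built as a plain dict so each entry can be applied with `.config(k, v)`:

```python
def hadoop_catalog_conf(catalog: str, warehouse_path: str) -> dict:
    """Spark conf entries for an Iceberg Hadoop catalog (sketch, assumed setup)."""
    return {
        f"spark.sql.catalog.{catalog}": "org.apache.iceberg.spark.SparkCatalog",
        # Hadoop catalog: table metadata lives under the warehouse path,
        # so no Glue (and no name validation against Glue rules) is involved.
        f"spark.sql.catalog.{catalog}.type": "hadoop",
        f"spark.sql.catalog.{catalog}.warehouse": warehouse_path,
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    }

conf = hadoop_catalog_conf("hadoop_dev", "s3://s3_bucket/warehouse")
```

With this configuration the same `df.writeTo(...).using("iceberg").create()` call succeeds, which is why I suspect something Glue-specific.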
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]