ignaski opened a new issue, #8265:
URL: https://github.com/apache/iceberg/issues/8265
### Apache Iceberg version
1.3.1 (latest release)
### Query engine
Spark
### Please describe the bug 🐞
Creating a table via Spark SQL respects the catalog's default table properties (`table-default.*`); however, they are not applied when the table is created via the DataFrame API.
The issue can be reproduced using the quickstart example:
```
spark-shell --packages org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.3.1 \
  --conf spark.sql.extensions=org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions \
  --conf spark.sql.catalog.local=org.apache.iceberg.spark.SparkCatalog \
  --conf spark.sql.catalog.local.catalog-impl=org.apache.iceberg.rest.RESTCatalog \
  --conf spark.sql.catalog.local.uri=http://rest:8181 \
  --conf spark.sql.catalog.local.io-impl=org.apache.iceberg.aws.s3.S3FileIO \
  --conf spark.sql.catalog.local.warehouse=s3a://warehouse/wh/ \
  --conf spark.sql.catalog.local.s3.endpoint=http://minio:9000 \
  --conf spark.sql.defaultCatalog=local \
  --conf spark.sql.catalog.local.table-default.write.metadata.delete-after-commit.enabled=true
```
Creation via Spark SQL:
```
scala> spark.sql("CREATE TABLE local.nyc.taxis (vendor_id bigint)
PARTITIONED BY (vendor_id);")
res0: org.apache.spark.sql.DataFrame = []
scala> spark.sql("show create table local.nyc.taxis").show(truncate=false)
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|createtab_stmt                                                                                                                                                                                                                                                                                                   |
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|CREATE TABLE local.nyc.taxis (\n  vendor_id BIGINT)\nUSING iceberg\nPARTITIONED BY (vendor_id)\nLOCATION 's3://warehouse/nyc/taxis'\nTBLPROPERTIES (\n  'current-snapshot-id' = 'none',\n  'format' = 'iceberg/parquet',\n  'format-version' = '1',\n  'write.metadata.delete-after-commit.enabled' = 'true')\n|
+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
Creation via the DataFrame API:
```
scala> import org.apache.spark.sql.types._
import org.apache.spark.sql.types._
scala> import org.apache.spark.sql.Row
import org.apache.spark.sql.Row
scala> val schema = StructType( Array(
| StructField("vendor_id", LongType,true)
| ))
schema: org.apache.spark.sql.types.StructType = StructType(StructField(vendor_id,LongType,true))
scala> val df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row],schema)
df: org.apache.spark.sql.DataFrame = [vendor_id: bigint]
scala> df.writeTo("local.nyc.taxis_df").create()
scala> spark.sql("show create table local.nyc.taxis_df").show(truncate=false)
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|createtab_stmt                                                                                                                                                                                                                                                                                      |
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|CREATE TABLE local.nyc.taxis_df (\n  vendor_id BIGINT)\nUSING iceberg\nLOCATION 's3://warehouse/nyc/taxis_df'\nTBLPROPERTIES (\n  'created-at' = '2023-08-09T06:41:27.531135128Z',\n  'current-snapshot-id' = '6638980767440031836',\n  'format' = 'iceberg/parquet',\n  'format-version' = '1')\n|
+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
```
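A possible workaround until the default is honored (just a sketch, assuming the `DataFrameWriterV2.tableProperty` method; the table name `local.nyc.taxis_df2` is only an example) is to set the property explicitly on the writer:
```
// Workaround sketch (untested here): set the property explicitly per table
// instead of relying on the catalog's table-default.* configuration.
import org.apache.spark.sql.types._
import org.apache.spark.sql.Row

val schema = StructType(Array(StructField("vendor_id", LongType, true)))
val df = spark.createDataFrame(spark.sparkContext.emptyRDD[Row], schema)

df.writeTo("local.nyc.taxis_df2")  // hypothetical table name
  .tableProperty("write.metadata.delete-after-commit.enabled", "true")
  .create()
```
This only sidesteps the problem for a single table; the expectation is still that `table-default.*` properties are applied on every create path.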