itaise opened a new issue #4212:
URL: https://github.com/apache/iceberg/issues/4212
We are trying to write field comments using Spark, but the comments are not
written for timestamp (without time zone) fields.
For timestamp with time zone fields, the comments are written.
Here is a minimal example that reproduces the issue:
```
from pyspark.sql import SparkSession

# Make new Iceberg tables use timestamp without time zone.
spark = (
    SparkSession.builder.master('local[1]').appName('example')
    .config("spark.sql.iceberg.use-timestamp-without-timezone-in-new-tables", "true")
    .getOrCreate()
)

# The column comment is passed as field metadata on the alias.
field_metadata = {'comment': '{"is_test": true}'}
df = spark.sql("select CAST(1000 AS TIMESTAMP)")
df = df.select([df[col_name].alias('some_field', metadata=field_metadata)
                for col_name in df.columns])

spark.sql("use iprod")  # catalog
spark.sql("CREATE SCHEMA IF NOT EXISTS iprod.test_schema")
df.write.mode("overwrite").format("parquet").saveAsTable("iprod.test_schema.timestamp_example")
```
When the config (use-timestamp-without-timezone-in-new-tables) is set to
"true", the column type is timestamp without time zone, but the field
comment is missing.
When it is set to "false", the column type is timestamp with time zone and
the field comment is present.
(Spark version is 3.1.2)
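To check which columns lost their comment after the write, a small helper along these lines might be useful. This is only a sketch: it assumes rows shaped like the (col_name, data_type, comment) output of `DESCRIBE TABLE`, and the helper name `missing_comments` and the sample rows are hypothetical, not output from a real run.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Assumed row shape: (col_name, data_type, comment), as in DESCRIBE TABLE output.
DescribeRow = Tuple[str, str, Optional[str]]


def missing_comments(rows: Iterable[DescribeRow], expected: Dict[str, str]) -> List[str]:
    """Return the expected column names whose comment is absent or different."""
    seen: Dict[str, Optional[str]] = {}
    for name, _data_type, comment in rows:
        # Keep only the first row per column; later rows with the same name
        # (for example in a partitioning section) are ignored.
        if name in expected and name not in seen:
            seen[name] = comment
    return [name for name, want in expected.items() if seen.get(name) != want]


if __name__ == "__main__":
    # Hypothetical rows for illustration only, not real DESCRIBE TABLE output.
    sample_rows = [("some_field", "timestamp", "")]
    print(missing_comments(sample_rows, {"some_field": '{"is_test": true}'}))
```

With a live session, the rows could come from `spark.sql("DESCRIBE TABLE iprod.test_schema.timestamp_example").collect()`, since Spark `Row` objects unpack like tuples.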
Thanks a lot!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]