[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-36086.
---------------------------------
Fix Version/s: 3.2.0
Resolution: Fixed
Issue resolved by pull request 33576
[https://github.com/apache/spark/pull/33576]
> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
> Key: SPARK-36086
> URL: https://issues.apache.org/jira/browse/SPARK-36086
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.1.1
> Reporter: Yuming Wang
> Priority: Major
> Fix For: 3.2.0
>
>
> How to reproduce this issue:
> {noformat}
> 1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
> 2. bin/spark-shell --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
> {noformat}
> {code:scala}
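> // Source table with lower-case column names, and a view over it.
> spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
> spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
> // Same CTAS against the view with upper-case column references; only the provider differs.
> spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> // Compare the column-name case reported for each table.
> spark.sql("desc extended t2").show(false)
> spark.sql("desc extended t3").show(false)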
> {code}
> {noformat}
> scala> spark.sql("desc extended t2").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |lower_id                    |bigint                                                                    |       |
> |id                          |bigint                                                                    |       |
> |                            |                                                                          |       |
> |# Partitioning              |                                                                          |       |
> |Part 0                      |lower_id                                                                  |       |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Name                        |default.t2                                                                |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
> |Provider                    |delta                                                                     |       |
> |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> scala> spark.sql("desc extended t3").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |ID                          |bigint                                                                    |null   |
> |LOWER_ID                    |bigint                                                                    |null   |
> |# Partition Information     |                                                                          |       |
> |# col_name                  |data_type                                                                 |comment|
> |LOWER_ID                    |bigint                                                                    |null   |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Database                    |default                                                                   |       |
> |Table                       |t3                                                                        |       |
> |Owner                       |yumwang                                                                   |       |
> |Created Time                |Mon Jul 12 14:07:16 CST 2021                                              |       |
> |Last Access                 |UNKNOWN                                                                   |       |
> |Created By                  |Spark 3.1.1                                                               |       |
> |Type                        |MANAGED                                                                   |       |
> |Provider                    |PARQUET                                                                   |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t3|       |
> |Serde Library               |org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe               |       |
> |InputFormat                 |org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat             |       |
> |OutputFormat                |org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat            |       |
> |Partition Provider          |Catalog                                                                   |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> {noformat}
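>
> The difference in case can also be checked without comparing the two listings by eye. The following is a minimal sketch and not part of the original report: the column names are copied by hand from the desc extended output above, and caseMismatches is a hypothetical helper introduced only for illustration.
> {code:scala}
> // Column names copied from the two "desc extended" outputs above.
> val deltaCols   = Seq("lower_id", "id")   // t2 (delta)
> val parquetCols = Seq("ID", "LOWER_ID")   // t3 (parquet)
>
> // Hypothetical helper: returns pairs of names that are equal when case is
> // ignored but differ when compared case-sensitively.
> def caseMismatches(left: Seq[String], right: Seq[String]): Seq[(String, String)] =
>   for {
>     l <- left
>     r <- right
>     if l.equalsIgnoreCase(r) && l != r
>   } yield (l, r)
>
> val mismatches = caseMismatches(deltaCols, parquetCols)
>
> // Prints:
> //   delta: lower_id, parquet: LOWER_ID
> //   delta: id, parquet: ID
> mismatches.foreach { case (d, p) => println(s"delta: $d, parquet: $p") }
> {code}
> In the same spark-shell session, the two name lists could presumably be read from spark.table("t2").schema.fieldNames and spark.table("t3").schema.fieldNames instead of being typed in.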