[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395858#comment-17395858 ]

Apache Spark commented on SPARK-36086:
--------------------------------------

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33686

> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Assignee: angerszhu
>            Priority: Major
>             Fix For: 3.2.0
>
>
> How to reproduce this issue:
> {noformat}
> 1. Add delta-core_2.12-1.0.0-SNAPSHOT.jar to ${SPARK_HOME}/jars.
> 2. bin/spark-shell --conf spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension --conf spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog
> {noformat}
> {code:scala}
> spark.sql("create table t1 using parquet as select id, id as lower_id from range(5)")
> spark.sql("CREATE VIEW v1 as SELECT * FROM t1")
> spark.sql("CREATE TABLE t2 USING DELTA PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("CREATE TABLE t3 USING PARQUET PARTITIONED BY (LOWER_ID) SELECT LOWER_ID, ID FROM v1")
> spark.sql("desc extended t2").show(false)
> spark.sql("desc extended t3").show(false)
> {code}
> {noformat}
> scala> spark.sql("desc extended t2").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |lower_id                    |bigint                                                                    |       |
> |id                          |bigint                                                                    |       |
> |                            |                                                                          |       |
> |# Partitioning              |                                                                          |       |
> |Part 0                      |lower_id                                                                  |       |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Name                        |default.t2                                                                |       |
> |Location                    |file:/Users/yumwang/Downloads/spark-3.1.1-bin-hadoop2.7/spark-warehouse/t2|       |
> |Provider                    |delta                                                                     |       |
> |Table Properties            |[Type=MANAGED,delta.minReaderVersion=1,delta.minWriterVersion=2]          |       |
> +----------------------------+--------------------------------------------------------------------------+-------+
> scala> spark.sql("desc extended t3").show(false)
> +----------------------------+--------------------------------------------------------------------------+-------+
> |col_name                    |data_type                                                                 |comment|
> +----------------------------+--------------------------------------------------------------------------+-------+
> |ID                          |bigint                                                                    |null   |
> |LOWER_ID                    |bigint                                                                    |null   |
> |# Partition Information     |                                                                          |       |
> |# col_name                  |data_type                                                                 |comment|
> |LOWER_ID                    |bigint                                                                    |null   |
> |                            |                                                                          |       |
> |# Detailed Table Information|                                                                          |       |
> |Database                    |default                                                                   |       |
> |Table                       |t3                                                                        |       |
> |Owner                       |yumwang                                                                   |       |
> |Created Time                |Mon Jul 12 14:07:16 CST 2021
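
The two `desc extended` outputs in the report can be read as two different answers to one question: when a query writes `LOWER_ID` and the source schema stores `lower_id`, which spelling ends up in the new table? The following is a minimal plain-Scala sketch of those two outcomes. It does not use Spark and is not how Spark or Delta implement CTAS; `ColumnCaseSketch`, `resolve`, `keepQuerySpelling` and `keepSchemaSpelling` are hypothetical names introduced only for illustration.

```scala
// Illustrative sketch only: plain Scala, no Spark dependency.
// All names here are hypothetical and not part of any Spark or Delta API.
object ColumnCaseSketch {

  // Case-insensitive lookup of a requested column name in an existing schema.
  def resolve(schema: Seq[String], requested: String): Option[String] =
    schema.find(_.equalsIgnoreCase(requested))

  // Outcome A: keep the spelling written in the query
  // (what the parquet table t3 shows in the report).
  def keepQuerySpelling(schema: Seq[String], requested: Seq[String]): Seq[String] =
    requested.filter(r => resolve(schema, r).isDefined)

  // Outcome B: keep the spelling stored in the source schema
  // (what the delta table t2 shows in the report).
  def keepSchemaSpelling(schema: Seq[String], requested: Seq[String]): Seq[String] =
    requested.flatMap(r => resolve(schema, r))

  def main(args: Array[String]): Unit = {
    val viewSchema = Seq("id", "lower_id") // columns of v1 in the report
    val selectList = Seq("LOWER_ID", "ID") // spelling used in the CTAS statements

    println(keepQuerySpelling(viewSchema, selectList).mkString(", "))  // LOWER_ID, ID
    println(keepSchemaSpelling(viewSchema, selectList).mkString(", ")) // lower_id, id
  }
}
```

Both functions resolve the same columns; they differ only in which spelling they return, which is the inconsistency the ticket title describes.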
[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17395856#comment-17395856 ]

Apache Spark commented on SPARK-36086:
--------------------------------------

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33685

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17391679#comment-17391679 ]

Wenchen Fan commented on SPARK-36086:
-------------------------------------

[~krivosheinruslan] please open a ticket if you are working to improve the v2 describe table command. This ticket is resolved because the column name case difference is fixed.

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17389802#comment-17389802 ]

Apache Spark commented on SPARK-36086:
--------------------------------------

User 'AngersZh' has created a pull request for this issue:
https://github.com/apache/spark/pull/33576

> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17388156#comment-17388156 ]

Ruslan Krivoshein commented on SPARK-36086:
-------------------------------------------

Let me get on with it, please

> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major

[jira] [Commented] (SPARK-36086) The case of the delta table is inconsistent with parquet
[ https://issues.apache.org/jira/browse/SPARK-36086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17383195#comment-17383195 ]

Wenchen Fan commented on SPARK-36086:
-------------------------------------

Seems we should improve the v2 describe table command to include more information.

> The case of the delta table is inconsistent with parquet
> --------------------------------------------------------
>
>                 Key: SPARK-36086
>                 URL: https://issues.apache.org/jira/browse/SPARK-36086
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.1.1
>            Reporter: Yuming Wang
>            Priority: Major