[ https://issues.apache.org/jira/browse/SPARK-34060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wenchen Fan resolved SPARK-34060.
---------------------------------
Fix Version/s: 3.2.0
Resolution: Fixed
Issue resolved by pull request 31112
[https://github.com/apache/spark/pull/31112]
> ALTER TABLE .. DROP PARTITION uncaches Hive table while updating table stats
> ----------------------------------------------------------------------------
>
> Key: SPARK-34060
> URL: https://issues.apache.org/jira/browse/SPARK-34060
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.2.0
> Reporter: Maxim Gekk
> Assignee: Maxim Gekk
> Priority: Major
> Fix For: 3.2.0
>
>
> The example below illustrates the issue:
> {code:scala}
> scala> spark.conf.set("spark.sql.statistics.size.autoUpdate.enabled", true)
> scala> sql(s"CREATE TABLE tbl (id int, part int) USING hive PARTITIONED BY (part)")
> 21/01/10 13:19:59 WARN HiveMetaStore: Location: file:/Users/maximgekk/proj/apache-spark/spark-warehouse/tbl specified for non-external table:tbl
> res12: org.apache.spark.sql.DataFrame = []
> scala> sql("INSERT INTO tbl PARTITION (part=0) SELECT 0")
> res13: org.apache.spark.sql.DataFrame = []
> scala> sql("INSERT INTO tbl PARTITION (part=1) SELECT 1")
> res14: org.apache.spark.sql.DataFrame = []
> scala> sql("CACHE TABLE tbl")
> res15: org.apache.spark.sql.DataFrame = []
> scala> sql("SELECT * FROM tbl").show(false)
> +---+----+
> |id |part|
> +---+----+
> |0  |0   |
> |1  |1   |
> +---+----+
> scala> spark.catalog.isCached("tbl")
> res17: Boolean = true
> scala> sql("ALTER TABLE tbl DROP PARTITION (part=0)")
> res18: org.apache.spark.sql.DataFrame = []
> scala> spark.catalog.isCached("tbl")
> res19: Boolean = false
> {code}