[
https://issues.apache.org/jira/browse/SPARK-34084?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Apache Spark reassigned SPARK-34084:
------------------------------------
Assignee: Apache Spark
> ALTER TABLE .. ADD PARTITION does not update table stats
> --------------------------------------------------------
>
> Key: SPARK-34084
> URL: https://issues.apache.org/jira/browse/SPARK-34084
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 3.0.2, 3.2.0, 3.1.1
> Reporter: Maxim Gekk
> Assignee: Apache Spark
> Priority: Major
>
> The example below illustrates the issue:
> {code:sql}
> spark-sql> create table tbl (col0 int, part int) partitioned by (part);
> spark-sql> insert into tbl partition (part = 0) select 0;
> spark-sql> set spark.sql.statistics.size.autoUpdate.enabled=true;
> spark-sql> alter table tbl add partition (part = 1);
> {code}
> There are no stats:
> {code:sql}
> spark-sql> describe table extended tbl;
> col0 int NULL
> part int NULL
> # Partition Information
> # col_name data_type comment
> part int NULL
> # Detailed Table Information
> Database default
> Table tbl
> Owner maximgekk
> Created Time Tue Jan 12 12:00:03 MSK 2021
> Last Access UNKNOWN
> Created By Spark 3.2.0-SNAPSHOT
> Type MANAGED
> Provider hive
> Table Properties [transient_lastDdlTime=1610442003]
> Location file:/Users/maximgekk/proj/fix-stats-in-add-partition/spark-warehouse/tbl
> Serde Library org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
> InputFormat org.apache.hadoop.mapred.TextInputFormat
> OutputFormat org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> Storage Properties [serialization.format=1]
> Partition Provider Catalog
> {code}
> *As we can see, there are no stats.* In contrast, ALTER TABLE .. DROP
> PARTITION does update stats:
> {code:sql}
> spark-sql> alter table tbl drop partition (part = 1);
> spark-sql> describe table extended tbl;
> col0 int NULL
> part int NULL
> # Partition Information
> # col_name data_type comment
> part int NULL
> # Detailed Table Information
> ...
> Statistics 2 bytes
> {code}
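> A possible workaround until this is fixed (a sketch only; it assumes the
> table from the example above, and it was not part of the original
> report) is to recompute the table size explicitly after adding the
> partition, and then check the Statistics row again:
> {code:sql}
> -- Add the partition as in the example above.
> spark-sql> alter table tbl add partition (part = 1);
> -- Recompute size-only table stats without scanning the data.
> spark-sql> analyze table tbl compute statistics noscan;
> -- The output should now include a Statistics row.
> spark-sql> describe table extended tbl;
> {code}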