sunchao commented on PR #43629:
URL: https://github.com/apache/spark/pull/43629#issuecomment-1789852270

Thanks @dongjoon-hyun for the quick reply.

> According to the title and first sentence of PR description, is this related to another JIRA

Not really. The title means that this PR proposes that the `ANALYZE TABLE` command update partition stats in addition to table stats.

> Just a question. Why don't we use `REPAIR TABLE` before this?

Hmm, I think `REPAIR TABLE` serves a different purpose: it is used to recover partitions for an existing table that was created from a directory containing sub-directories for partitions. `ANALYZE TABLE`, on the other hand, can be used to update table and partition stats. For instance, a partition could already exist for a table, but its stats could be out of sync, for example because data was written to the partition directory without going through Spark.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
