pan3793 commented on code in PR #52168:
URL: https://github.com/apache/spark/pull/52168#discussion_r2308947087

##########
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala:
##########
@@ -1616,11 +1616,10 @@ class StatisticsSuite extends StatisticsCollectionTestBase with TestHiveSingleto
       Seq(tbl, ext_tbl).foreach { tblName =>
         sql(s"INSERT INTO $tblName VALUES (1, 'a', '2019-12-13')")
-        val expectedSize = 690
         // analyze table
         sql(s"ANALYZE TABLE $tblName COMPUTE STATISTICS NOSCAN")
         var tableStats = getTableStats(tblName)
-        assert(tableStats.sizeInBytes == expectedSize)
+        val expectedSize = tableStats.sizeInBytes

Review Comment:
   I read the original PR. The intention of this test is to make sure partition stats get updated, even when they equal the existing table stats. The exact value of the table's sizeInBytes does not really matter here.

   Generally, asserting the exact size of binary data files like Parquet/ORC does not make sense, because the size can vary with metadata changes. As you pointed out, the difference here is likely caused by the version string change. The size might also be affected by the compression codec: the compressed data length can differ across Snappy versions or platforms.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
For additional commands, e-mail: reviews-h...@spark.apache.org