Hi all,

I have a question about the ANALYZE TABLE command when it is run from Spark on a Hive table.
Question: If I run the 'ANALYZE TABLE' command on the Hive client before running it on the spark-sql client, the wrong statistics show up.

For example:

1. Run the analyze table command on the Hive client:
   - create table test_anaylze (id int) partitioned by (dt string);
   - insert into test_anaylze partition (dt = "2023-11-24") values(1321);
   - analyze table test_anaylze partition(dt = "2023-11-24") COMPUTE STATISTICS;
2. Run the analyze table command on the spark-sql client:
   - analyze table test_anaylze partition(dt = "2023-11-24") COMPUTE STATISTICS;
   - DESC EXTENDED test_anaylze PARTITION (dt = "2023-11-24");

The first time, I got the correct statistics. I then inserted a second value using spark-sql and ran 'ANALYZE TABLE' on the spark-sql client, and numRows and totalSize were still correct. But after I inserted a third value into the table from Hive, ran 'ANALYZE TABLE' on the Hive client, and then ran 'ANALYZE TABLE' on the spark-sql client, the Partition Statistics shown by Spark were wrong.

It seems that Spark checks the Hive metastore, and if Hive's own parameters (numRows, totalSize) are already correct, it no longer updates the Spark parameters (spark.sql.statistics.numRows, spark.sql.statistics.totalSize).

Can anyone explain why this situation occurs?

[image: 1516f391eab71a4533593f0cf167c4e.png]
[image: 5b8b8067878e22875b524b49b39fa3c.png]
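To make the hypothesis above concrete, here is a small toy model of it in Python. It is only a sketch of the guess stated in this post, not Spark's actual implementation: the function names, the rule "skip the update when Hive's parameters already match", the read path that prefers the spark.sql.statistics.* parameters, the assumption that an insert from Spark leaves Hive's parameters stale, and the 5 bytes per row are all assumptions made for illustration.

```python
# Toy model of the hypothesis in this post; NOT Spark's actual implementation.
HIVE_KEYS = ("numRows", "totalSize")
SPARK_PREFIX = "spark.sql.statistics."


def hive_analyze(params, rows, size):
    """Hive's ANALYZE: overwrite Hive's own parameters with the real values."""
    updated = dict(params)
    updated["numRows"] = rows
    updated["totalSize"] = size
    return updated


def spark_analyze_hypothesized(params, rows, size):
    """Spark's ANALYZE as hypothesized: skip if Hive's parameters already match."""
    if params.get("numRows") == rows and params.get("totalSize") == size:
        return dict(params)  # assumed: nothing is written in this case
    updated = dict(params)
    updated[SPARK_PREFIX + "numRows"] = rows
    updated[SPARK_PREFIX + "totalSize"] = size
    return updated


def spark_reported_stats(params):
    """Assumed read path: prefer Spark's parameters, fall back to Hive's."""
    return tuple(params.get(SPARK_PREFIX + key, params.get(key)) for key in HIVE_KEYS)


def replay(bytes_per_row=5):
    """Replay the three rounds from the post; bytes_per_row is a made-up size."""
    params = {}
    reported = []

    # Round 1: insert via Hive, ANALYZE on Hive, then ANALYZE on Spark.
    params = hive_analyze(params, 1, 1 * bytes_per_row)
    params = spark_analyze_hypothesized(params, 1, 1 * bytes_per_row)
    reported.append(spark_reported_stats(params))

    # Round 2: insert via Spark (Hive's parameters assumed stale), ANALYZE on Spark.
    params = spark_analyze_hypothesized(params, 2, 2 * bytes_per_row)
    reported.append(spark_reported_stats(params))

    # Round 3: insert via Hive, ANALYZE on Hive, then ANALYZE on Spark.
    params = hive_analyze(params, 3, 3 * bytes_per_row)
    params = spark_analyze_hypothesized(params, 3, 3 * bytes_per_row)
    reported.append(spark_reported_stats(params))

    return reported
```

Under these assumptions, replay() returns [(1, 5), (2, 10), (2, 10)]: the first two rounds report correct values, while the third stays at the round-two values instead of (3, 15), which matches the sequence described above. Whether Spark really behaves this way is exactly what I am asking.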