Jinhua Fu created SPARK-25701:
---------------------------------

             Summary: Supports calculation of table statistics from partition's 
catalog statistics
                 Key: SPARK-25701
                 URL: https://issues.apache.org/jira/browse/SPARK-25701
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.3.2
            Reporter: Jinhua Fu


When obtaining table statistics, if the `totalSize` of the table is not 
defined, we fall back to HDFS to get the table statistics when 
`spark.sql.statistics.fallBackToHdfs` is `true`; otherwise the default 
value (`spark.sql.defaultSizeInBytes`) is used.
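
The resolution order described above can be sketched roughly as follows. This is 
only an illustrative model in plain Python, not Spark's actual code: the 
function `resolve_table_size` and its parameters are hypothetical names, and 
treating a non-positive `totalSize` as "not defined" is an assumption.

```python
from typing import Callable, Optional


def resolve_table_size(
    total_size: Optional[int],
    fall_back_to_hdfs: bool,
    default_size_in_bytes: int,
    hdfs_size: Callable[[], int],
) -> int:
    """Pick a table size estimate in the order described in the issue."""
    # Assumption: a missing or non-positive totalSize counts as "not defined".
    if total_size is not None and total_size > 0:
        return total_size
    if fall_back_to_hdfs:
        # Models spark.sql.statistics.fallBackToHdfs = true: ask the file
        # system. hdfs_size is a hypothetical callback, invoked only here.
        return hdfs_size()
    # Models falling back to spark.sql.defaultSizeInBytes.
    return default_size_in_bytes
```

Passing the HDFS lookup as a callback keeps the potentially expensive file 
system scan off the path where `totalSize` is already known.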

Fortunately, in most cases the data is written into the table by an insertion 
command that saves the data size in the metadata, so it is possible to 
calculate the table statistics from that metadata (the partitions' catalog 
statistics).
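
One possible way to derive a table-level size from per-partition metadata is 
sketched below. The issue does not say how partitions without a recorded size 
should be handled, so the policy here (give up and return `None` if any 
partition lacks a usable `totalSize`) is an assumption, as are the function 
name `size_from_partitions` and the representation of partition parameters as 
string-valued dictionaries.

```python
from typing import Iterable, Mapping, Optional


def size_from_partitions(
    partitions: Iterable[Mapping[str, str]],
) -> Optional[int]:
    """Sum the per-partition totalSize values, or return None if unusable."""
    total = 0
    seen_any = False
    for params in partitions:
        try:
            size = int(params.get("totalSize"))
        except (TypeError, ValueError):
            # Missing or malformed value: the sum would underestimate the
            # table, so report "unknown" instead (assumed policy).
            return None
        if size <= 0:
            return None
        total += size
        seen_any = True
    # A table with no partitions yields no estimate.
    return total if seen_any else None
```

A caller could try this first and use the HDFS or default-size fallback only 
when it returns `None`.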



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]