wangyum commented on a change in pull request #24715: [SPARK-25474][SQL] Data source tables support fallback to HDFS for size estimation
URL: https://github.com/apache/spark/pull/24715#discussion_r317194291
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -1216,8 +1216,12 @@ object SQLConf {
.createWithDefault(true)
val ENABLE_FALL_BACK_TO_HDFS_FOR_STATS =
buildConf("spark.sql.statistics.fallBackToHdfs")
- .doc("If the table statistics are not available from table metadata enable fall back to hdfs." +
- " This is useful in determining if a table is small enough to use auto broadcast joins.")
+ .doc("This flag is effective only if it is Hive table. When true, it will fall back to HDFS " +
+ "if the table statistics are not available from table metadata. This is useful in " +
+ "determining if a table is small enough to use auto broadcast joins. " +
+ "For non-partitioned data source table, it will be automatically recalculated if table " +
+ "statistics are not available. For partitioned data source table, It is " +
+ s"'${DEFAULT_SIZE_IN_BYTES.key}' if table statistics are not available.")
.booleanConf
Review comment:
cc @shahidki31
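
Editorial note: the rules stated in the proposed doc string can be read as a small decision table. The sketch below is a minimal, Spark-free model of that reading, not the code in this pull request. All names in it (`TableKind`, `SizeEstimationSketch`, `estimateSizeInBytes`, `sizeOnFileSystem`) are hypothetical. It also assumes that a Hive table with the flag disabled and no statistics uses the default size, which the doc string does not state, and it models "automatically recalculated" as a file-system size lookup.

```scala
// Hypothetical model of the doc string's rules; none of these names are Spark APIs.
sealed trait TableKind
case object HiveTable extends TableKind
case object DataSourceTable extends TableKind

object SizeEstimationSketch {
  def estimateSizeInBytes(
      kind: TableKind,
      partitioned: Boolean,
      metadataSize: Option[BigInt],   // statistics from table metadata, if any
      fallBackToHdfs: Boolean,        // models spark.sql.statistics.fallBackToHdfs
      sizeOnFileSystem: () => BigInt, // stand-in for summing file sizes
      defaultSizeInBytes: BigInt      // models the DEFAULT_SIZE_IN_BYTES setting
  ): BigInt =
    // Statistics from table metadata always win when they are available.
    metadataSize.getOrElse {
      kind match {
        // Hive table: the flag decides whether the file system is consulted.
        case HiveTable if fallBackToHdfs => sizeOnFileSystem()
        // Assumption: with the flag off, a Hive table uses the default size.
        case HiveTable => defaultSizeInBytes
        // Non-partitioned data source table: recalculated regardless of the flag.
        case DataSourceTable if !partitioned => sizeOnFileSystem()
        // Partitioned data source table: default size regardless of the flag.
        case DataSourceTable => defaultSizeInBytes
      }
    }
}
```

Under this reading, the flag changes the outcome only in the Hive-table rows, which is what the first sentence of the new doc string says.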