HyukjinKwon commented on a change in pull request #26200: [SPARK-29542] Make
the description of spark.sql.files.maxPartitionBytes be clearly
URL: https://github.com/apache/spark/pull/26200#discussion_r337363240
##########
File path:
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -980,7 +980,8 @@ object SQLConf {
.createWithDefault(true)
   val FILES_MAX_PARTITION_BYTES = buildConf("spark.sql.files.maxPartitionBytes")
-    .doc("The maximum number of bytes to pack into a single partition when reading files.")
+    .doc("The maximum number of bytes to pack into a single partition when reading files" +
+      " for data source table.")
Review comment:
I think @wangyum meant that, if we enable
`spark.sql.hive.convertMetastoreParquet` or
`spark.sql.hive.convertMetastoreOrc`, reading Hive tables is also affected by
this configuration. So, saying "for data source table" is not entirely correct.
"Data source table" is also incorrect in the sense of DataFrame or Dataset
reads. Also, IIRC, it does not apply to other regular data source tables; it
only applies to file-based sources.
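
To make that scope concrete, here is a minimal sketch (not from the PR; the
path and the 32 MB value are hypothetical) of where the setting does and does
not take effect, assuming the behavior described in the comment above:

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of where spark.sql.files.maxPartitionBytes applies.
// Paths and values are hypothetical.
object MaxPartitionBytesDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("maxPartitionBytes-demo")
      .master("local[*]")
      // Pack at most 32 MB of input file data into each partition.
      .config("spark.sql.files.maxPartitionBytes", 32 * 1024 * 1024L)
      .getOrCreate()

    // Applies: file-based sources read directly (Parquet, ORC, JSON, CSV, ...),
    // whether or not any "table" is involved.
    val df = spark.read.parquet("/tmp/events.parquet") // hypothetical path
    println(s"Input partitions: ${df.rdd.getNumPartitions}")

    // Also applies to Hive Parquet/ORC tables, but only when
    // spark.sql.hive.convertMetastoreParquet / convertMetastoreOrc are enabled,
    // because the read is then routed through the file-based source path.
    // Does NOT apply to non-file sources such as JDBC or Kafka.

    spark.stop()
  }
}
```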
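For reference, a simplified standalone model of how the value is combined with
`spark.sql.files.openCostInBytes` and the default parallelism to pick the
actual split size (loosely modeled on `FilePartition.maxSplitBytes`; this is a
sketch, not the exact Spark source):

```scala
// Simplified model of Spark's split sizing for file-based sources.
def maxSplitBytes(
    maxPartitionBytes: Long,  // spark.sql.files.maxPartitionBytes
    openCostInBytes: Long,    // spark.sql.files.openCostInBytes
    defaultParallelism: Int,  // scheduler default parallelism
    totalFileBytes: Long      // total bytes across all selected files
): Long = {
  // Bytes each core would get if the input were spread evenly.
  val bytesPerCore = totalFileBytes / defaultParallelism
  // Cap at maxPartitionBytes, but never go below the per-file open cost.
  math.min(maxPartitionBytes, math.max(openCostInBytes, bytesPerCore))
}

// Example: 10 GB of files and parallelism 8 give bytesPerCore ~1.25 GB,
// so with the 128 MB / 4 MB defaults the split size is capped at 128 MB.
```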