HyukjinKwon commented on a change in pull request #26200:
[SPARK-29542][SQL][DOC] Make the descriptions of spark.sql.files.* be clearly
URL: https://github.com/apache/spark/pull/26200#discussion_r337816686
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -980,33 +980,36 @@ object SQLConf {
     .createWithDefault(true)

   val FILES_MAX_PARTITION_BYTES = buildConf("spark.sql.files.maxPartitionBytes")
-    .doc("The maximum number of bytes to pack into a single partition when reading files.")
+    .doc("The maximum number of bytes to pack into a single partition when Spark file-based " +
+      "sources are used to read files.")
     .bytesConf(ByteUnit.BYTE)
     .createWithDefault(128 * 1024 * 1024) // parquet.block.size

   val FILES_OPEN_COST_IN_BYTES = buildConf("spark.sql.files.openCostInBytes")
     .internal()
-    .doc("The estimated cost to open a file, measured by the number of bytes could be scanned in" +
-      " the same time. This is used when putting multiple files into a partition. It's better to" +
-      " over estimated, then the partitions with small files will be faster than partitions with" +
-      " bigger files (which is scheduled first).")
+    .doc("The estimated cost to open a file, measured by the number of bytes could be scanned in " +
+      "the same time. This is used when putting multiple file-source files into a partition. " +
Review comment:
file-source files sounds odd ...
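For context on what these two options do together, here is a simplified, illustrative model of the greedy file-packing heuristic the doc strings describe: each file "costs" its size plus `spark.sql.files.openCostInBytes`, and a partition is closed once it would exceed `spark.sql.files.maxPartitionBytes`. This is only a sketch; the actual logic lives in Spark's Scala sources and differs in details (e.g. how the split size is derived), and the function name below is invented for illustration.

```python
# Simplified model of how Spark packs files into read partitions.
# Controlled by spark.sql.files.maxPartitionBytes and
# spark.sql.files.openCostInBytes. Illustrative only, not Spark's
# exact algorithm.

def pack_files(file_sizes, max_partition_bytes, open_cost_in_bytes):
    """Greedily pack files (largest first) into partitions.

    Each file 'costs' its size plus open_cost_in_bytes, so many tiny
    files do not all collapse into one partition: overestimating the
    open cost yields more, smaller partitions for small files, which
    the doc string argues is the safer direction.
    """
    partitions = []
    current, current_bytes = [], 0
    for size in sorted(file_sizes, reverse=True):
        cost = size + open_cost_in_bytes
        if current and current_bytes + cost > max_partition_bytes:
            partitions.append(current)
            current, current_bytes = [], 0
        current.append(size)
        current_bytes += cost
    if current:
        partitions.append(current)
    return partitions

# Ten 1 MB files, a 4 MB open cost, and a 16 MB partition cap:
# each file "weighs" 5 MB, so at most three fit per partition.
MB = 1024 * 1024
parts = pack_files([1 * MB] * 10, 16 * MB, 4 * MB)
print([len(p) for p in parts])  # → [3, 3, 3, 1]
```

With a zero open cost the same ten files would all land in one partition, which illustrates why the doc string says it is better to overestimate the cost.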
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]