srowen commented on a change in pull request #23625: [SPARK-26700][CORE] enable
fetch-big-block-to-memory by default
URL: https://github.com/apache/spark/pull/23625#discussion_r250223566
##########
File path: core/src/main/scala/org/apache/spark/internal/config/package.scala
##########
@@ -631,17 +631,17 @@ package object config {
   private[spark] val MAX_REMOTE_BLOCK_SIZE_FETCH_TO_MEM =
     ConfigBuilder("spark.maxRemoteBlockSizeFetchToMem")
       .doc("Remote block will be fetched to disk when size of the block is above this threshold " +
-        "in bytes. This is to avoid a giant request takes too much memory. We can enable this " +
-        "config by setting a specific value(e.g. 200m). Note this configuration will affect " +
-        "both shuffle fetch and block manager remote block fetch. For users who enabled " +
-        "external shuffle service, this feature can only be worked when external shuffle" +
-        "service is newer than Spark 2.2.")
+        "in bytes. This is to avoid a giant request takes too much memory. Note this " +
+        "configuration will affect both shuffle fetch and block manager remote block fetch. " +
+        "For users who enabled external shuffle service, this feature can only work when " +
+        "external shuffle service is newer than Spark 2.2.")
       .bytesConf(ByteUnit.BYTE)
       // fetch-to-mem is guaranteed to fail if the message is bigger than 2 GB, so we might
       // as well use fetch-to-disk in that case. The message includes some metadata in addition
       // to the block data itself (in particular UploadBlock has a lot of metadata), so we leave
       // extra room.
-      .createWithDefault(Int.MaxValue - 512)
+      .checkValue(_ <= Int.MaxValue - 512, "maxRemoteBlockSizeFetchToMem must be less than 2GB.")
Review comment:
If someone specifies '2g', this will fail, right? That might be surprising
given the message. What about reusing that lower limit in the message?
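
To make the suggestion concrete, here is a rough, standalone sketch of a check whose error message quotes the actual limit. It deliberately avoids Spark's ConfigBuilder so it runs on its own; `FetchToMemLimit`, `MaxFetchToMemBytes`, `limitMessage`, and `validate` are hypothetical names for illustration only, not Spark API, and the message wording is just one possibility.

```scala
object FetchToMemLimit {
  // Limit taken from the diff: Int.MaxValue minus 512 bytes of headroom
  // for message metadata (2147483135 bytes).
  val MaxFetchToMemBytes: Long = Int.MaxValue - 512

  // Hypothetical message that states the real bound instead of "2GB".
  def limitMessage: String =
    s"maxRemoteBlockSizeFetchToMem must be less than or equal to $MaxFetchToMemBytes bytes."

  // Hypothetical stand-in for the checkValue predicate plus its message:
  // Right(bytes) when the value is accepted, Left(message) when it is rejected.
  def validate(bytes: Long): Either[String, Long] =
    if (bytes <= MaxFetchToMemBytes) Right(bytes) else Left(limitMessage)
}
```

With this shape, `FetchToMemLimit.validate(2L * 1024 * 1024 * 1024)` (that is, '2g') is rejected, and the error names the exact byte limit rather than a rounded "2GB".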
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services