Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/4550#issuecomment-74152340
@sryza spillToPartitionFiles is the "fake" spill case: when there are fewer than
bypassMergeThreshold (default 200) partitions, the sort shuffle "pretends" to be
like the hash shuffle by first writing all of the data to per-partition files and
then combining those files at the end. My understanding is that we don't count this
as "spill" because it's different from the other spill case (where we actually need
to spill because we've run out of in-memory space): with fewer than
bypassMergeThreshold partitions we always "spill" all of the data, and this is kind
of a hack that we don't want to expose to the user. We already include the time to
write this "spill" data in the shuffle write time, so including the time to open
those files is consistent with that.
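
For anyone less familiar with this path, here is a minimal, self-contained sketch of the idea, not the actual ExternalSorter code; the names (BypassMergeSketch, writeTimeNanos, writePartitioned) are illustrative only. It just shows per-partition files being opened and written under the same timer, then concatenated at the end:

```scala
import java.io.{BufferedOutputStream, File, FileOutputStream}
import java.nio.file.Files

// Hypothetical sketch only. When the number of reduce partitions is small,
// every record is written straight to a per-partition file and the files are
// concatenated at the end, instead of being sorted and spilled under memory
// pressure.
object BypassMergeSketch {
  val bypassMergeThreshold = 200

  def writePartitioned(records: Iterator[(Int, String)],
                       numPartitions: Int,
                       dir: File): (File, Long) = {
    require(numPartitions <= bypassMergeThreshold)
    var writeTimeNanos = 0L

    // Opening one file per partition is itself disk work, so it is charged
    // to the same timer as the record writes.
    var start = System.nanoTime()
    val partFiles = Array.tabulate(numPartitions)(i => new File(dir, s"part-$i"))
    val outs = partFiles.map(f => new BufferedOutputStream(new FileOutputStream(f)))
    writeTimeNanos += System.nanoTime() - start

    start = System.nanoTime()
    records.foreach { case (pid, value) =>
      outs(pid).write((value + "\n").getBytes("UTF-8"))
    }
    outs.foreach(_.close())
    writeTimeNanos += System.nanoTime() - start

    // Concatenate the per-partition files into one output file, as the
    // bypass path does at the end of the task.
    val merged = new File(dir, "merged")
    val mergedOut = new FileOutputStream(merged)
    partFiles.foreach(f => Files.copy(f.toPath, mergedOut))
    mergedOut.close()
    (merged, writeTimeNanos)
  }
}
```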
I was initially thinking that we should also track the time to open the file in
writePartitionedFile, but I did some measurements on a small cluster and found that
the time to open the file there ends up being essentially insignificant, because
it's only one file (compared to spillToPartitionFiles, where we may open up to 200
files, which can take a long time when the disk is busy). I wrote up these
measurements in the JIRA: https://issues.apache.org/jira/browse/SPARK-3570
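
To make the one-file-versus-200-files argument concrete, here is a rough timing sketch (not the benchmark from the JIRA; OpenTimeSketch and timeOpens are made-up names) that compares the cost of opening a single file against opening 200:

```scala
import java.io.{File, FileOutputStream}
import java.nio.file.Files

// Hypothetical illustration: measure how long it takes to open n files,
// to compare the single open in writePartitionedFile against the up-to-200
// opens in spillToPartitionFiles.
object OpenTimeSketch {
  def timeOpens(dir: File, n: Int): Long = {
    val start = System.nanoTime()
    val streams = (0 until n).map(i => new FileOutputStream(new File(dir, s"f-$i")))
    val elapsed = System.nanoTime() - start
    streams.foreach(_.close())
    elapsed
  }

  def main(args: Array[String]): Unit = {
    val dir = Files.createTempDirectory("open-time").toFile
    println(s"1 file:    ${timeOpens(dir, 1)} ns")
    println(s"200 files: ${timeOpens(dir, 200)} ns")
  }
}
```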