Sandy Ryza created SPARK-5581:
---------------------------------
Summary: When writing sorted map output file, avoid open / close between each partition
Key: SPARK-5581
URL: https://issues.apache.org/jira/browse/SPARK-5581
Project: Spark
Issue Type: Improvement
Affects Versions: 1.3.0
Reporter: Sandy Ryza
{code}
// Bypassing merge-sort; get an iterator by partition and just write
// everything directly.
for ((id, elements) <- this.partitionedIterator) {
  if (elements.hasNext) {
    val writer = blockManager.getDiskWriter(
      blockId, outputFile, ser, fileBufferSize,
      context.taskMetrics.shuffleWriteMetrics.get)
    for (elem <- elements) {
      writer.write(elem)
    }
    writer.commitAndClose()
    val segment = writer.fileSegment()
    lengths(id) = segment.length
  }
}
{code}
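A rough sketch of the idea in the summary, using only plain java.io and no Spark classes: open the output file once, write every partition through the same stream, and record each partition's byte count as it goes. The object name SingleOpenWriteSketch, the helper writePartitions, and the newline-delimited string records are all assumptions made for illustration, not existing Spark APIs. An actual change would presumably still need to end each partition's serialization and compression stream so that segments stay independently readable.

```scala
import java.io.{BufferedOutputStream, File, FileOutputStream}
import java.nio.charset.StandardCharsets

object SingleOpenWriteSketch {
  // Hypothetical helper, not a Spark API: writes all partitions through one
  // stream and returns the number of bytes written for each partition id.
  def writePartitions(
      outputFile: File,
      numPartitions: Int,
      partitions: Iterator[(Int, Iterator[String])]): Array[Long] = {
    val lengths = new Array[Long](numPartitions)
    // Open the file once for the whole map output.
    val out = new BufferedOutputStream(new FileOutputStream(outputFile))
    try {
      for ((id, elements) <- partitions) {
        var written = 0L
        for (elem <- elements) {
          // Assumed record format: UTF-8 text, one record per line.
          val bytes = (elem + "\n").getBytes(StandardCharsets.UTF_8)
          out.write(bytes)
          written += bytes.length
        }
        // Empty partitions simply record a length of 0.
        lengths(id) = written
      }
    } finally {
      // Close the file once, after the last partition.
      out.close()
    }
    lengths
  }

  def main(args: Array[String]): Unit = {
    val file = File.createTempFile("map-output", ".data")
    file.deleteOnExit()
    val parts = Iterator(
      0 -> Iterator("a", "bb"),
      1 -> Iterator[String](),
      2 -> Iterator("ccc"))
    val lengths = writePartitions(file, 3, parts)
    println(lengths.mkString(",")) // prints 5,0,4
  }
}
```

With this shape, the number of open / close calls no longer depends on the number of non-empty partitions; the per-partition lengths come from counting bytes rather than from closing a writer to obtain its file segment.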