We can see that when the number of objects written equals
serializerBatchSize, flush() is called. But what happens if the data
written exceeds the default buffer size before the batch count is
reached? In that situation, will flush() be called automatically?
private[this] def spillMemoryIteratorToDisk(
    inMemoryIterator: WritablePartitionedIterator): SpilledFile = {
  // ignore some code here
  try {
    while (inMemoryIterator.hasNext) {
      val partitionId = inMemoryIterator.nextPartition()
      require(partitionId >= 0 && partitionId < numPartitions,
        s"partition Id: ${partitionId} should be in the range [0, ${numPartitions})")
      inMemoryIterator.writeNext(writer)
      elementsPerPartition(partitionId) += 1
      objectsWritten += 1

      if (objectsWritten == serializerBatchSize) {
        flush()
      }
    }
    // ignore some code here
    SpilledFile(file, blockId, batchSizes.toArray, elementsPerPartition)
  }
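
For the buffer part of the question, here is a minimal, self-contained
sketch of the same count-and-flush pattern over a plain
java.io.BufferedOutputStream. It is not the Spark implementation:
batchSize, the 8-byte buffer, the int records, and the
ByteArrayOutputStream sink are all made up for illustration. It only
shows the general JDK behavior, namely that a buffered stream writes
through to its underlying sink on its own whenever its internal buffer
fills, independently of the explicit flush() in the loop.

import java.io.{BufferedOutputStream, ByteArrayOutputStream, DataOutputStream}

object BatchFlushSketch {
  def main(args: Array[String]): Unit = {
    val batchSize = 5                        // stand-in for serializerBatchSize
    val sink = new ByteArrayOutputStream()   // stand-in for the spill file
    // An 8-byte buffer, deliberately tiny so it overflows between batches.
    val out = new DataOutputStream(new BufferedOutputStream(sink, 8))

    var objectsWritten = 0
    for (value <- 1 to 10) {
      out.writeInt(value)                    // 4 bytes per record
      objectsWritten += 1
      if (objectsWritten == batchSize) {
        out.flush()                          // the explicit, counted flush
        objectsWritten = 0
        println(s"after explicit flush: ${sink.size()} bytes in sink")
      } else {
        // Bytes can reach the sink here too: BufferedOutputStream empties
        // its internal buffer on its own once the buffer is full.
        println(s"after record $value: ${sink.size()} bytes in sink")
      }
    }
    out.close()
  }
}

Running this shows the sink growing between the explicit flushes, so in
this sketch, data leaving the buffer and flush() being called are two
separate events.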