We can see that when the number of objects written equals
serializerBatchSize, flush() is called. But what happens if the objects
written exceed the default buffer size? In that situation, will flush() be
called automatically?

  private[this] def spillMemoryIteratorToDisk(inMemoryIterator: WritablePartitionedIterator)
      : SpilledFile = {

    // ignore some code here
    try {
      while (inMemoryIterator.hasNext) {
        val partitionId = inMemoryIterator.nextPartition()
        require(partitionId >= 0 && partitionId < numPartitions,
          s"partition Id: ${partitionId} should be in the range [0, ${numPartitions})")
        inMemoryIterator.writeNext(writer)
        elementsPerPartition(partitionId) += 1
        objectsWritten += 1
        if (objectsWritten == serializerBatchSize) {
          flush()
        }
      }

    // ignore some code here
    SpilledFile(file, blockId, batchSizes.toArray, elementsPerPartition)
  }
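
For context, here is a minimal, self-contained sketch of the two layers I
have in mind (hypothetical names and sizes, not the actual
DiskBlockObjectWriter code): an explicit flush() issued every
serializerBatchSize objects, sitting on top of a plain
java.io.BufferedOutputStream. A BufferedOutputStream writes its buffer
through to the underlying file on its own whenever the buffer fills, without
any flush() call; whether Spark's writer stack behaves the same way once the
buffer size is exceeded is exactly what I am asking about.

  import java.io.{BufferedOutputStream, FileOutputStream}
  import scala.collection.mutable.ArrayBuffer

  object BatchFlushSketch {
    val serializerBatchSize = 10000          // objects per explicit flush()
    val bufferSize = 32 * 1024               // stream buffer size, in bytes

    def main(args: Array[String]): Unit = {
      val out = new BufferedOutputStream(new FileOutputStream("spill.tmp"), bufferSize)
      val batchSizes = ArrayBuffer[Int]()    // per-batch object counts (illustration only)
      var objectsWritten = 0

      for (i <- 0 until 25000) {
        // Each record is 8 bytes here; once more than bufferSize bytes accumulate,
        // the BufferedOutputStream writes through to the file by itself.
        out.write(Array.fill[Byte](8)(i.toByte))
        objectsWritten += 1
        if (objectsWritten == serializerBatchSize) {
          out.flush()                        // explicit flush at the batch boundary
          batchSizes += objectsWritten       // record the batch, then start a new one
          objectsWritten = 0
        }
      }
      out.close()                            // close() flushes any remaining bytes
      println(s"batches recorded: ${batchSizes.size}")
    }
  }

In this sketch, each 80,000-byte batch overflows the 32 KB buffer several
times before the explicit flush(), so the buffered stream has already written
most of the bytes to the file on its own by the time flush() runs.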
