cxzl25 commented on issue #26323: [SPARK-29657][CORE] Iterator spill supporting radix sort with null prefix
URL: https://github.com/apache/spark/pull/26323#issuecomment-547856822

When `UnsafeInMemorySorter#getSortedIterator` uses radix sort and some records have a null key prefix, it returns a `ChainedIterator`:
https://github.com/apache/spark/blob/44a27bdccdc39d5394ee95d935455eb7ff4b84c2/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L364-L376

See `readingIterator.spill`:
https://github.com/apache/spark/blob/44a27bdccdc39d5394ee95d935455eb7ff4b84c2/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java#L519-L524

The following is the log of an error we encountered in our production environment:

[Executor task launch worker for task 66055] INFO TaskMemoryManager: Memory used in task 66055
[Executor task launch worker for task 66055] INFO TaskMemoryManager: Acquired by org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@39dd866e: 64.0 KB
[Executor task launch worker for task 66055] INFO TaskMemoryManager: Acquired by org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@74d17927: 4.6 GB
[Executor task launch worker for task 66055] INFO TaskMemoryManager: Acquired by org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter@31478f9c: 61.0 MB
[Executor task launch worker for task 66055] INFO TaskMemoryManager: 0 bytes of memory were used by task 66055 but are not associated with specific consumers
[Executor task launch worker for task 66055] INFO TaskMemoryManager: 4962998749 bytes of memory are used for execution and 2218326 bytes of memory are used for storage
[Executor task launch worker for task 66055] ERROR Executor: Exception in task 42.3 in stage 29.0 (TID 66055)
SparkOutOfMemoryError: Unable to acquire 3436 bytes of memory, got 0
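
One possible reading of the two linked snippets is sketched below in Java. It is a simplified model, not Spark's code: the class `SpillGuardSketch`, its nested iterators, and the `spill(upstream, bytesHeld)` signature are all hypothetical stand-ins, and the guard shown (only a plain sorted iterator can be spilled) is an assumption about what the linked lines of `UnsafeExternalSorter` do. Under that assumption, a sorter whose in-memory iterator is a `ChainedIterator` releases nothing when asked to spill, which would be consistent with the "got 0" in the log above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.Queue;

// Hypothetical, simplified model; none of these are Spark's actual classes.
final class SpillGuardSketch {

    /** Stand-in for the plain sorted iterator over in-memory records. */
    static final class SortedIterator implements Iterator<Long> {
        private final Iterator<Long> records;
        SortedIterator(List<Long> records) { this.records = records.iterator(); }
        @Override public boolean hasNext() { return records.hasNext(); }
        @Override public Long next() { return records.next(); }
    }

    /** Stand-in for the chained iterator: drains each part in order. */
    static final class ChainedIterator implements Iterator<Long> {
        private final Queue<Iterator<Long>> parts;
        ChainedIterator(List<? extends Iterator<Long>> parts) {
            this.parts = new ArrayDeque<>(parts);
        }
        @Override public boolean hasNext() {
            // Drop exhausted parts so the head always has a record, if any remain.
            while (!parts.isEmpty() && !parts.peek().hasNext()) {
                parts.remove();
            }
            return !parts.isEmpty();
        }
        @Override public Long next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return parts.peek().next();
        }
    }

    /**
     * Assumed guard: only a plain SortedIterator is spilled.
     * Returns the number of bytes released (0 means nothing was freed).
     */
    static long spill(Iterator<Long> upstream, long bytesHeld) {
        if (!(upstream instanceof SortedIterator)) {
            return 0L; // a chained iterator falls through here and frees nothing
        }
        return bytesHeld; // pretend everything held was written to disk
    }

    public static void main(String[] args) {
        Iterator<Long> plain = new SortedIterator(Arrays.asList(1L, 2L, 3L));
        Iterator<Long> chained = new ChainedIterator(Arrays.asList(
            new SortedIterator(Arrays.asList(0L)),        // records with a null prefix
            new SortedIterator(Arrays.asList(1L, 2L))));  // radix-sorted records
        System.out.println("plain released:   " + spill(plain, 1024L));
        System.out.println("chained released: " + spill(chained, 1024L));
    }
}
```

In this model the plain iterator releases the full 1024 bytes and the chained one releases 0, so a spill path that recognizes the chained case would be needed for the memory to be reclaimed.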
