I do not have a repro either.
However, taking a quick look at 'UnsafeInMemorySorter.java', I am 
concerned that the following line may have a cast issue similar to 
https://issues.apache.org/jira/browse/SPARK-18458:
https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L156
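For what it's worth, the failure mode in that class of bug is a narrowing cast that silently overflows. A minimal, self-contained illustration (not Spark's actual code; the values are made up to show the wrap-around):

```java
public class CastOverflowDemo {
    public static void main(String[] args) {
        // A pointer array already sized near Integer.MAX_VALUE entries...
        long currentCapacity = 1_200_000_000L;
        // ...doubled as a long, which is still arithmetically correct:
        long doubled = currentCapacity * 2;   // 2,400,000,000
        // ...but a narrowing cast to int silently wraps to a negative value,
        // which can later make an allocation or bounds check fail.
        int truncated = (int) doubled;
        System.out.println(truncated);        // prints a negative number
    }
}
```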

Regards,
Kazuaki Ishizaki



From:   Reynold Xin <r...@databricks.com>
To:     Nicholas Chammas <nicholas.cham...@gmail.com>
Cc:     Spark dev list <dev@spark.apache.org>
Date:   2016/12/07 14:27
Subject:        Re: Reduce memory usage of UnsafeInMemorySorter



This is not supposed to happen. Do you have a repro?


On Tue, Dec 6, 2016 at 6:11 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:
[Re-titling thread.]
OK, I see that the exception from my original email is being triggered 
from this part of UnsafeInMemorySorter:
https://github.com/apache/spark/blob/v2.0.2/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L209-L212
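For anyone else reading along, here is a tiny mock of the invariant that check enforces; this is hypothetical and simplified, not Spark's actual implementation, but it reproduces the shape of the failure: a fixed-size pointer array that throws once it can no longer hold another (pointer, prefix) pair.

```java
// Hypothetical, simplified mock of the sorter's capacity check.
// Class and field names are illustrative, not Spark's.
public class SorterCapacityDemo {
    static final class FixedPointerArray {
        private final long[] slots;   // interleaved (pointer, prefix) pairs
        private int pos = 0;
        FixedPointerArray(int capacity) { slots = new long[capacity]; }
        boolean hasSpaceForAnotherRecord() { return pos + 2 <= slots.length; }
        void insertRecord(long pointer, long prefix) {
            if (!hasSpaceForAnotherRecord()) {
                throw new IllegalStateException("There is no space for new record");
            }
            slots[pos++] = pointer;
            slots[pos++] = prefix;
        }
    }

    public static void main(String[] args) {
        FixedPointerArray sorter = new FixedPointerArray(4); // room for 2 records
        sorter.insertRecord(1L, 10L);
        sorter.insertRecord(2L, 20L);
        System.out.println(sorter.hasSpaceForAnotherRecord()); // false
        try {
            sorter.insertRecord(3L, 30L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // same message as the stack trace
        }
    }
}
```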
So I can ask a more refined question now: How can I ensure that 
UnsafeInMemorySorter has room to insert new records? In other words, how 
can I ensure that hasSpaceForAnotherRecord() returns true?
Do I need:
- More, smaller partitions?
- More memory per executor?
- Some Java or Spark option enabled?
- etc.
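In configuration terms, I understand those levers to map to something like the following (illustrative values only; I don't know which, if any, avoids this particular exception, and the keys are standard Spark 2.x settings):

```java
import org.apache.spark.SparkConf;

// Illustrative values only; whether any of these helps depends on the job.
SparkConf conf = new SparkConf()
    .set("spark.sql.shuffle.partitions", "2000") // more, smaller partitions
    .set("spark.executor.memory", "8g")          // more memory per executor
    .set("spark.memory.fraction", "0.7");        // larger unified memory pool
```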
I’m running Spark 2.0.2 on Java 7 and YARN. Would Java 8 help here? 
(Unfortunately, I cannot upgrade at this time, but it would be good to 
know regardless.)
This is morphing into a user-list question, so please accept my 
apologies. Since I can’t find any information about this anywhere else, 
and the question is about internals like UnsafeInMemorySorter, I hope 
it’s OK here.
Nick
On Mon, Dec 5, 2016 at 9:11 AM Nicholas Chammas nicholas.cham...@gmail.com 
wrote:
I was testing out a new project at scale on Spark 2.0.2 running on YARN, 
and my job failed with an interesting error message:
TaskSetManager: Lost task 37.3 in stage 31.0 (TID 10684, server.host.name): java.lang.IllegalStateException: There is no space for new record
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.insertRecord(UnsafeInMemorySorter.java:211)
    at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:127)
    at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:244)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$anonfun$8$anon$1.hasNext(WholeStageCodegenExec.scala:370)
    at scala.collection.Iterator$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

I’ve never seen this before, and searching on Google/DDG/JIRA doesn’t 
yield any results. There are no other errors coming from that executor, 
whether related to memory, storage space, or otherwise.
Could this be a bug? If so, how would I narrow down the source? Otherwise, 
how might I work around the issue?
Nick


