This is not supposed to happen. Do you have a repro?
On Tue, Dec 6, 2016 at 6:11 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

> [Re-titling thread.]
>
> OK, I see that the exception from my original email is being triggered
> from this part of UnsafeInMemorySorter:
>
> https://github.com/apache/spark/blob/v2.0.2/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L209-L212
>
> So I can ask a more refined question now: How can I ensure that
> UnsafeInMemorySorter has room to insert new records? In other words, how
> can I ensure that hasSpaceForAnotherRecord() returns true?
>
> Do I need:
>
> - More, smaller partitions?
> - More memory per executor?
> - Some Java or Spark option enabled?
> - etc.
>
> I’m running Spark 2.0.2 on Java 7 and YARN. Would Java 8 help here?
> (Unfortunately, I cannot upgrade at this time, but it would be good to
> know regardless.)
>
> This is morphing into a user-list question, so please accept my
> apologies. Since I can’t find any information anywhere else about this,
> and the question is about internals like UnsafeInMemorySorter, I hope
> this is OK here.
>
> Nick
>
> On Mon, Dec 5, 2016 at 9:11 AM Nicholas Chammas
> <nicholas.cham...@gmail.com> wrote:
>
>> I was testing out a new project at scale on Spark 2.0.2 running on YARN,
>> and my job failed with an interesting error message:
>>
>> TaskSetManager: Lost task 37.3 in stage 31.0 (TID 10684, server.host.name):
>> java.lang.IllegalStateException: There is no space for new record
>> 05:27:09.573 at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.insertRecord(UnsafeInMemorySorter.java:211)
>> 05:27:09.574 at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:127)
>> 05:27:09.574 at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:244)
>> 05:27:09.575 at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
>> 05:27:09.575 at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>> 05:27:09.576 at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>> 05:27:09.576 at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>> 05:27:09.577 at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>> 05:27:09.577 at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>> 05:27:09.577 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
>> 05:27:09.578 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
>> 05:27:09.578 at org.apache.spark.scheduler.Task.run(Task.scala:86)
>> 05:27:09.578 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>> 05:27:09.579 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> 05:27:09.579 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> 05:27:09.579 at java.lang.Thread.run(Thread.java:745)
>>
>> I’ve never seen this before, and searching on Google/DDG/JIRA doesn’t
>> yield any results. There are no other errors coming from that executor,
>> whether related to memory, storage space, or otherwise.
>>
>> Could this be a bug? If so, how would I narrow down the source?
>> Otherwise, how might I work around the issue?
>>
>> Nick
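
For readers hitting the same exception: the options Nick lists above all aim at reducing how many records each task's in-memory sorter has to hold. Below is a minimal, illustrative Java sketch (Java because the job runs on Java 7) of how those knobs could be applied. It is not a confirmed fix for this exception: the input/output paths and the column name someKey are hypothetical, and the partition count of 2000 is an arbitrary example. spark.sql.shuffle.partitions and spark.memory.fraction are standard Spark 2.0 settings; per-executor memory (the second option) is set at submit time, e.g. spark-submit --executor-memory 8G.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class SorterHeadroomSketch {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("sorter-headroom-sketch")
        // More, smaller shuffle partitions: fewer records per task, so each
        // task's in-memory sorter holds less before spilling. Default: 200.
        .config("spark.sql.shuffle.partitions", "2000")
        // Larger share of the heap for execution memory (sorts, aggregations,
        // shuffles). Default in Spark 2.0 is 0.6.
        .config("spark.memory.fraction", "0.7")
        .getOrCreate();

    // Hypothetical input and key column, just to show where an explicit
    // repartition fits relative to the wide aggregation in the stack trace.
    Dataset<Row> df = spark.read().parquet("hdfs:///path/to/input");
    Dataset<Row> counts = df.repartition(2000).groupBy("someKey").count();
    counts.write().parquet("hdfs:///path/to/output");

    spark.stop();
  }
}

Whether any of these actually prevents hasSpaceForAnotherRecord() from returning false in this case is exactly the open question in the thread; the sketch only maps the listed options onto concrete settings.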