The line I pointed out works correctly. This is because the result type of this division is double, and the d2i conversion handles the overflow case correctly (it saturates at Integer.MAX_VALUE rather than wrapping).
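A minimal sketch (illustrative, not code from the thread) of the d2i behavior described above: narrowing an out-of-range double to int saturates at Integer.MAX_VALUE, whereas a long-to-int cast silently wraps. The size value below is made up.

public class D2iSaturation {
    public static void main(String[] args) {
        long size = 4_000_000_000L;         // hypothetical length beyond the int range
        double asDouble = size / 1.5;       // division is carried out in double (~2.67e9)

        int viaD2i = (int) asDouble;        // d2i: clamps to Integer.MAX_VALUE
        int viaL2i = (int) (size * 2 / 3);  // l2i on the same quotient: wraps to a negative value

        System.out.println(viaD2i);         // 2147483647
        System.out.println(viaL2i);         // -1628300630
    }
}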
Kazuaki Ishizaki

From: Nicholas Chammas <nicholas.cham...@gmail.com>
To: Kazuaki Ishizaki/Japan/IBM@IBMJP, Reynold Xin <r...@databricks.com>
Cc: Spark dev list <dev@spark.apache.org>
Date: 2016/12/08 10:56
Subject: Re: Reduce memory usage of UnsafeInMemorySorter

Unfortunately, I don't have a repro, and I'm only seeing this at scale. But I was able to get around the issue by fiddling with the distribution of my data before asking GraphFrames to process it. (I think that's where the error was being thrown from.)

On Wed, Dec 7, 2016 at 7:32 AM Kazuaki Ishizaki <ishiz...@jp.ibm.com> wrote:

I do not have a repro either. But when I took a quick browse through the file 'UnsafeInMemorySorter.java', I am afraid there is a cast issue similar to https://issues.apache.org/jira/browse/SPARK-18458 at the following line:
https://github.com/apache/spark/blob/master/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L156

Regards,
Kazuaki Ishizaki

From: Reynold Xin <r...@databricks.com>
To: Nicholas Chammas <nicholas.cham...@gmail.com>
Cc: Spark dev list <dev@spark.apache.org>
Date: 2016/12/07 14:27
Subject: Re: Reduce memory usage of UnsafeInMemorySorter

This is not supposed to happen. Do you have a repro?

On Tue, Dec 6, 2016 at 6:11 PM, Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

[Re-titling thread.]

OK, I see that the exception from my original email is being triggered from this part of UnsafeInMemorySorter:
https://github.com/apache/spark/blob/v2.0.2/core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeInMemorySorter.java#L209-L212

So I can ask a more refined question now: How can I ensure that UnsafeInMemorySorter has room to insert new records? In other words, how can I ensure that hasSpaceForAnotherRecord() returns a true value? Do I need:

- More, smaller partitions?
- More memory per executor?
- Some Java or Spark option enabled?
- etc.

I'm running Spark 2.0.2 on Java 7 and YARN. Would Java 8 help here? (Unfortunately, I cannot upgrade at this time, but it would be good to know regardless.)

This is morphing into a user-list question, so accept my apologies. Since I can't find any information anywhere else about this, and the question is about internals like UnsafeInMemorySorter, I hope this is OK here.

Nick
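A paraphrased, simplified sketch (my reading of the lines linked above, not the exact Spark source) of the check behind hasSpaceForAnotherRecord(): each record takes two slots in an in-memory pointer array (the record pointer and its sort-key prefix), so the sorter refuses the insert with the IllegalStateException seen in the trace when the array cannot hold another pair. A plain long[] stands in for Spark's LongArray here.

public class InMemorySorterSketch {
    private final long[] array;          // stand-in for Spark's LongArray of record pointers
    private final int usableCapacity;    // slots usable for new records
    private int pos = 0;                 // next free slot

    InMemorySorterSketch(int capacity) {
        this.array = new long[capacity];
        this.usableCapacity = capacity;
    }

    boolean hasSpaceForAnotherRecord() {
        return pos + 1 < usableCapacity; // a record needs two slots: pointer + key prefix
    }

    void insertRecord(long recordPointer, long keyPrefix) {
        if (!hasSpaceForAnotherRecord()) {
            throw new IllegalStateException("There is no space for new record");
        }
        array[pos++] = recordPointer;    // slot 1: pointer to the record's bytes
        array[pos++] = keyPrefix;        // slot 2: the sort key prefix
    }
}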
On Mon, Dec 5, 2016 at 9:11 AM Nicholas Chammas <nicholas.cham...@gmail.com> wrote:

I was testing out a new project at scale on Spark 2.0.2 running on YARN, and my job failed with an interesting error message:

TaskSetManager: Lost task 37.3 in stage 31.0 (TID 10684, server.host.name): java.lang.IllegalStateException: There is no space for new record
05:27:09.573 at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.insertRecord(UnsafeInMemorySorter.java:211)
05:27:09.574 at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:127)
05:27:09.574 at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:244)
05:27:09.575 at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
05:27:09.575 at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
05:27:09.576 at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
05:27:09.576 at org.apache.spark.sql.execution.WholeStageCodegenExec$anonfun$8$anon$1.hasNext(WholeStageCodegenExec.scala:370)
05:27:09.577 at scala.collection.Iterator$anon$11.hasNext(Iterator.scala:408)
05:27:09.577 at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
05:27:09.577 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:79)
05:27:09.578 at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:47)
05:27:09.578 at org.apache.spark.scheduler.Task.run(Task.scala:86)
05:27:09.578 at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
05:27:09.579 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
05:27:09.579 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
05:27:09.579 at java.lang.Thread.run(Thread.java:745)

I've never seen this before, and searching on Google/DDG/JIRA doesn't yield any results. There are no other errors coming from that executor, whether related to memory, storage space, or otherwise.

Could this be a bug? If so, how would I narrow down the source? Otherwise, how might I work around the issue?

Nick
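A hedged sketch of the kind of workaround discussed later in the thread (redistributing the data before the heavy aggregation) together with the knobs Nicholas asks about. The input path, column name, partition count, and memory figure are illustrative assumptions, not values from the thread, and the thread never confirms which of them actually resolves the error.

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class RepartitionBeforeAggregate {
    public static void main(String[] args) {
        // Memory per executor is fixed at submit time, e.g.:
        //   spark-submit --executor-memory 8g --conf spark.sql.shuffle.partitions=2000 ...
        SparkSession spark = SparkSession.builder()
                .appName("repartition-before-aggregate")
                .getOrCreate();

        Dataset<Row> edges = spark.read().parquet("hdfs:///tmp/edges");  // hypothetical input

        // More, smaller partitions mean fewer records per task, so each task's
        // in-memory sorter holds fewer pointers before it has to spill.
        Dataset<Row> redistributed = edges.repartition(2000, edges.col("src"));

        redistributed.groupBy("src").count().write().parquet("hdfs:///tmp/degrees");
        spark.stop();
    }
}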