[ https://issues.apache.org/jira/browse/SPARK-32872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17195252#comment-17195252 ]

Ankur Dave edited comment on SPARK-32872 at 9/14/20, 7:30 AM:
--------------------------------------------------------------

Thanks, [~dongjoon]! Based on a quick look at the history, I believe this issue 
was introduced by [PR #6159|https://github.com/apache/spark/pull/6159] 
([SPARK-7251|https://issues.apache.org/jira/browse/SPARK-7251], commit 
[f2faa7af30662e3bdf15780f8719c71108f8e30b|https://github.com/apache/spark/commit/f2faa7af30662e3bdf15780f8719c71108f8e30b]).
If this is true, the bug dates back to Spark 1.4.0. I updated the "Affects 
Versions" field accordingly.


> BytesToBytesMap at MAX_CAPACITY exceeds growth threshold
> --------------------------------------------------------
>
>                 Key: SPARK-32872
>                 URL: https://issues.apache.org/jira/browse/SPARK-32872
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.1, 1.5.2, 1.6.3, 2.0.2, 2.1.3, 2.2.3, 2.3.4, 2.4.7, 3.0.1
>            Reporter: Ankur Dave
>            Assignee: Ankur Dave
>            Priority: Major
>
> When BytesToBytesMap is at {{MAX_CAPACITY}} and reaches the growth threshold,
> {{numKeys >= growthThreshold}} is true but {{longArray.size() / 2 < MAX_CAPACITY}}
> is false. This correctly prevents the map from growing, but {{canGrowArray}}
> incorrectly remains true. Therefore the map keeps accepting new keys and
> exceeds its growth threshold. If we attempt to spill the map in this state,
> the UnsafeKVExternalSorter will not be able to reuse the long array for
> sorting, causing grouping aggregations to fail with the following error:
> {{2020-09-13 18:33:48,765 ERROR Executor - Exception in task 0.0 in stage 7.0 (TID 69)
> org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 12982025696 bytes of memory, got 0
>       at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:160)
>       at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:100)
>       at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:118)
>       at org.apache.spark.sql.execution.UnsafeFixedWidthAggregationMap.destructAndCreateExternalSorter(UnsafeFixedWidthAggregationMap.java:253)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doConsume_0$(Unknown Source)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithKeys_0$(Unknown Source)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.agg_doAggregateWithoutKey_0$(Unknown Source)
>       at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage2.processNext(Unknown Source)
>       at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>       at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:733)
>       at org.apache.spark.sql.execution.collect.UnsafeRowBatchUtils$.encodeUnsafeRows(UnsafeRowBatchUtils.scala:80)
>       at org.apache.spark.sql.execution.collect.Collector.$anonfun$processFunc$1(Collector.scala:187)
>       at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>       at org.apache.spark.scheduler.Task.doRunTask(Task.scala:144)
>       at org.apache.spark.scheduler.Task.run(Task.scala:117)
>       at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$9(Executor.scala:660)
>       at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1581)
>       at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:663)
>       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)}}
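To make the failure mode above concrete: each key appears to occupy two slots in {{longArray}}, which would make {{longArray.size() / 2}} the map's current capacity, so once the map is at {{MAX_CAPACITY}} the strict {{<}} comparison is false and the grow branch is skipped. Below is a minimal, self-contained sketch of that growth check. It is an illustration of the logic described in this issue, not Spark's actual BytesToBytesMap; the names mirror the description, while the load factor and layout are illustrative assumptions.

{code:java}
// Minimal model of the growth check described in this issue.
// NOT Spark's actual BytesToBytesMap implementation.
public class GrowthCheckSketch {
  static final int MAX_CAPACITY = 1 << 29;    // cap on entries, as in BytesToBytesMap

  int capacity = MAX_CAPACITY;                // backing array already at its maximum
  int growthThreshold = capacity / 2;         // grow once half full (assumed load factor)
  int numKeys = growthThreshold;              // the map has just reached the threshold
  boolean canGrowArray = true;

  // Two long slots per key, so longArraySize() / 2 == capacity.
  long longArraySize() { return 2L * capacity; }

  boolean append() {
    if (numKeys >= growthThreshold) {
      if (longArraySize() / 2 < MAX_CAPACITY) {
        capacity *= 2;                        // normal path: double and rehash
        growthThreshold = capacity / 2;
      }
      // BUG (this issue): when longArraySize() / 2 == MAX_CAPACITY the branch
      // above is skipped, but canGrowArray is never cleared, so inserts keep
      // being accepted past the growth threshold.
    }
    if (!canGrowArray) {
      return false;                           // would force the caller to spill
    }
    numKeys++;                                // insert accepted anyway
    return true;
  }

  public static void main(String[] args) {
    GrowthCheckSketch m = new GrowthCheckSketch();
    boolean accepted = m.append();
    // Prints: accepted=true overThreshold=true -- the map exceeded its threshold.
    System.out.println("accepted=" + accepted
        + " overThreshold=" + (m.numKeys > m.growthThreshold));
  }
}
{code}

Under this reading, one way to close the gap would be to clear {{canGrowArray}} whenever the map declines to grow, so the next append is rejected and the caller spills while the long array is still small enough to reuse; whether the eventual fix takes exactly this shape is not confirmed here.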


