[ 
https://issues.apache.org/jira/browse/SPARK-49977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wenchen Fan resolved SPARK-49977.
---------------------------------
    Fix Version/s: 4.0.0
       Resolution: Fixed

Issue resolved by pull request 48481
[https://github.com/apache/spark/pull/48481]

> Use stack-based iterative computation to avoid creating many Scala List 
> objects for deep expression trees
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-49977
>                 URL: https://issues.apache.org/jira/browse/SPARK-49977
>             Project: Spark
>          Issue Type: Task
>          Components: Optimizer
>    Affects Versions: 4.0.0
>            Reporter: Utkarsh Agarwal
>            Assignee: Utkarsh Agarwal
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0
>
>
> In some use cases with deep expression trees, the driver's heap contains many 
> {{scala.collection.immutable.$colon$colon}} objects. These objects are 
> allocated by deep recursion in the {{gatherCommutative}} method, which calls 
> {{flatMap}} recursively. Each invocation of {{flatMap}} creates a new temporary 
> Scala collection. Our claim is based on the following stack trace (>1K lines) 
> of a driver thread, truncated here for brevity:
>  
> {code:java}
> "HiveServer2-Background-Pool: Thread-9867" #9867 daemon prio=5 os_prio=0 tid=0x00007f35080bf000 nid=0x33e7 runnable [0x00007f3393372000]
>    java.lang.Thread.State: RUNNABLE
>         at scala.collection.immutable.List$Appender$1.apply(List.scala:350)
>         at scala.collection.immutable.List$Appender$1.apply(List.scala:341)
>         at scala.collection.immutable.List.flatMap(List.scala:431)
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression.gatherCommutative(Expression.scala:1479)
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression.$anonfun$gatherCommutative$1(Expression.scala:1479)
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression$$Lambda$5280/143713747.apply(Unknown Source)
>         at scala.collection.immutable.List.flatMap(List.scala:366)
>         ....
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression.gatherCommutative(Expression.scala:1479)
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression.$anonfun$gatherCommutative$1(Expression.scala:1479)
>         at org.apache.spark.sql.catalyst.expressions.CommutativeExpression$$Lambda$5280/143713747.apply(Unknown Source)
>         at scala.collection.immutable.List.flatMap(List.scala:366)
>         ....
> {code}
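
The contrast between recursive {{flatMap}} gathering and a stack-based traversal can be sketched on a toy tree. This is only an illustration under stated assumptions: {{Expr}}, {{Leaf}}, {{Add}}, {{Mul}} and {{GatherSketch}} are hypothetical names, not Spark's Catalyst classes, and the code is not taken from the pull request.

```scala
import scala.collection.mutable

// Hypothetical toy expression tree (assumption: NOT Spark's Expression classes).
sealed trait Expr
final case class Leaf(name: String) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr
final case class Mul(left: Expr, right: Expr) extends Expr

object GatherSketch {
  // Recursive shape similar to the one described in the issue: every nested
  // Add triggers another flatMap call, which allocates a temporary List and
  // adds frames to the call stack.
  def gatherRecursive(e: Expr): Seq[Expr] = e match {
    case Add(l, r) => Seq(l, r).flatMap(gatherRecursive)
    case other     => Seq(other)
  }

  // Iterative variant: an explicit stack replaces the call stack, and one
  // buffer collects all operands, so no per-level temporary lists are built.
  def gatherIterative(root: Expr): Seq[Expr] = {
    val stack = mutable.Stack[Expr](root)
    val result = mutable.ArrayBuffer.empty[Expr]
    while (stack.nonEmpty) {
      stack.pop() match {
        case Add(l, r) =>
          // Push the right child first so the left child is visited first,
          // which keeps the same operand order as the recursive version.
          stack.push(r)
          stack.push(l)
        case other =>
          result += other
      }
    }
    result.toSeq
  }
}
```

On a small tree both functions should return the same operands in the same order. On a left-deep chain of many thousands of {{Add}} nodes, the iterative version keeps its explicit stack at a constant small size, whereas the recursive version's call depth grows with the tree and may overflow the thread stack.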



