Ivan Tsukanov created SPARK-28742:
-------------------------------------

             Summary: StackOverflowError when using otherwise(col()) in a loop
                 Key: SPARK-28742
                 URL: https://issues.apache.org/jira/browse/SPARK-28742
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.4.3, 2.4.0
            Reporter: Ivan Tsukanov


The following code

 
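{code:scala}
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{col, when}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Assumes sparkContext and sparkSession are already in scope.
val rdd = sparkContext.makeRDD(Seq(Row("1")))
val schema = StructType(Seq(
  StructField("c1", StringType)
))

val df = sparkSession.createDataFrame(rdd, schema)
// The expression references c1 twice: in the condition and in otherwise().
val column = when(col("c1").isin("1"), "1").otherwise(col("c1"))

// Overwrite c1 with the same expression nine times, running an action each time.
(1 to 9).foldLeft(df) { case (acc, _) =>
  val res = acc.withColumn("c1", column)
  res.take(1)
  res
}
{code}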
fails with

 
{code:java}
java.lang.StackOverflowError
   at org.codehaus.janino.CodeContext.flowAnalysis(CodeContext.java:395)
   ...{code}
 

The likely cause is that Spark generates an inexplicably large physical plan:

!image-2019-08-15-15-10-13-397.png!
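One possible explanation for the plan size can be sketched with a toy model in plain Scala (this is not Spark code). It assumes that the optimizer merges the stacked projections by inlining the previous definition of c1 into every reference to it. That assumption is not confirmed by this report. The names Expr, Ref, Lit, IsIn, WhenOtherwise and PlanGrowth are hypothetical and exist only for this illustration.

```scala
// Toy expression tree; a hypothetical stand-in, not Spark's Catalyst classes.
sealed trait Expr
case class Ref(name: String) extends Expr
case class Lit(value: String) extends Expr
case class IsIn(child: Expr, value: String) extends Expr
case class WhenOtherwise(cond: Expr, thenValue: Expr, elseValue: Expr) extends Expr

object PlanGrowth {
  // Replace every reference to `name` inside `e` with `replacement`.
  def substitute(e: Expr, name: String, replacement: Expr): Expr = e match {
    case Ref(n) if n == name => replacement
    case r: Ref              => r
    case l: Lit              => l
    case IsIn(c, v)          => IsIn(substitute(c, name, replacement), v)
    case WhenOtherwise(c, t, o) =>
      WhenOtherwise(
        substitute(c, name, replacement),
        substitute(t, name, replacement),
        substitute(o, name, replacement))
  }

  // Count the nodes in an expression tree.
  def size(e: Expr): Int = e match {
    case _: Ref | _: Lit        => 1
    case IsIn(c, _)             => 1 + size(c)
    case WhenOtherwise(c, t, o) => 1 + size(c) + size(t) + size(o)
  }

  // Mirrors when(col("c1").isin("1"), "1").otherwise(col("c1")): c1 appears twice.
  val column: Expr = WhenOtherwise(IsIn(Ref("c1"), "1"), Lit("1"), Ref("c1"))

  // Node counts after 1..n rounds of inlining the previous c1 into `column`.
  def sizes(n: Int): Seq[Int] =
    (1 to n)
      .scanLeft(Ref("c1"): Expr)((acc, _) => substitute(column, "c1", acc))
      .tail
      .map(size)
}
```

Because c1 is referenced twice, each round roughly doubles the tree in this model, so PlanGrowth.sizes(9) ends at 2045 nodes where a single application has 5. If Spark behaves similarly, the generated code would grow exponentially with the number of iterations, which would be consistent with the overflow in Janino's flow analysis.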

 

 


