Hi,

While working on an issue with whole-stage codegen, as reported at https://stackoverflow.com/q/48026060/1305344, I found out that spark.sql.codegen.wholeStage=false does *not* turn whole-stage codegen off completely.
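A minimal sketch of how one might check what the setting does at the planner level, assuming a local session with Spark SQL on the classpath. The object name, app name and sample query are illustrative only and are not taken from the SO question. It counts WholeStageCodegenExec nodes in the executed plan. It does not show whether SparkPlan.newPredicate generates code, which is a separate matter.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.WholeStageCodegenExec

// Hypothetical helper object, for illustration only.
object WholeStageCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("wholestage-check")
      .config("spark.sql.codegen.wholeStage", "false")
      .getOrCreate()

    // A simple filter query, so that the physical plan contains a FilterExec.
    val query = spark.range(10).filter("id > 5")
    val plan = query.queryExecution.executedPlan

    // Collect the whole-stage codegen wrapper nodes present in the plan.
    val wholeStageNodes = plan.collect { case w: WholeStageCodegenExec => w }

    val configured = spark.conf.get("spark.sql.codegen.wholeStage")
    println(s"spark.sql.codegen.wholeStage = $configured")
    println(s"WholeStageCodegenExec nodes in plan: ${wholeStageNodes.size}")
    println(plan.treeString)

    spark.stop()
  }
}
```

If the planner honors the flag, no WholeStageCodegenExec node should show up in the printed plan. The predicate inside FilterExec may still be compiled separately when the query executes.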
It looks like SparkPlan.newPredicate [1] gets called regardless of the value of the spark.sql.codegen.wholeStage property.

    $ ./bin/spark-shell --conf spark.sql.codegen.wholeStage=false
    ...
    scala> spark.sessionState.conf.wholeStageEnabled
    res7: Boolean = false

That leads to the whole-stage codegen issue from the SO question regardless of the value:

    ...
      at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:385)
      at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:214)
      at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:213)
      at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:816)
    ...

Is this a bug or does it work as intended? Why?

[1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala?utf8=%E2%9C%93#L386

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski