Hi,

While working on an issue with whole-stage codegen reported at
https://stackoverflow.com/q/48026060/1305344, I found that
spark.sql.codegen.wholeStage=false does *not* turn whole-stage codegen
off completely.

It looks like SparkPlan.newPredicate [1] gets called regardless of the
value of the spark.sql.codegen.wholeStage property.

$ ./bin/spark-shell --conf spark.sql.codegen.wholeStage=false
...
scala> spark.sessionState.conf.wholeStageEnabled
res7: Boolean = false
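
For anyone who wants to check this outside spark-shell, here is a minimal sketch of one way to see whether any whole-stage codegen node ends up in the physical plan with the property set to false. The object name WholeStageCheck and the sample query are only illustrations (they are not from the SO question), and I am assuming a Spark 2.x build with spark-sql on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.WholeStageCodegenExec

// Hypothetical standalone check; the object name and the query are illustrative only.
object WholeStageCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("whole-stage-check")
      .config("spark.sql.codegen.wholeStage", "false") // same setting as the --conf above
      .getOrCreate()

    // A simple query with a filter, so the physical plan contains a FilterExec.
    val q = spark.range(10).filter("id % 2 = 0")

    // Collect the WholeStageCodegenExec nodes (if any) from the executed plan.
    val stages = q.queryExecution.executedPlan.collect {
      case w: WholeStageCodegenExec => w
    }
    println(s"WholeStageCodegenExec nodes in the plan: ${stages.size}")

    // Print the physical plan and run the query so the filter is actually evaluated.
    q.explain()
    q.collect()

    spark.stop()
  }
}
```

If the property behaves the way I understand it, the plan should contain no WholeStageCodegenExec nodes, while the filter's predicate may still be compiled through SparkPlan.newPredicate when the query runs.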

As a result, the stack trace in the SO question still goes through
SparkPlan.newPredicate, regardless of the value of the property:

...
  at org.apache.spark.sql.execution.SparkPlan.newPredicate(SparkPlan.scala:385)
  at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:214)
  at org.apache.spark.sql.execution.FilterExec$$anonfun$18.apply(basicPhysicalOperators.scala:213)
  at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndexInternal$1$$anonfun$apply$24.apply(RDD.scala:816)
...

Is this a bug or does it work as intended? Why?

[1]
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala?utf8=%E2%9C%93#L386

Regards,
Jacek Laskowski
----
https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski
