davidvrba commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with single branch to If
URL: https://github.com/apache/spark/pull/26294#issuecomment-548297522

@maropu I reran the query from the description (in a different order) and got a slightly more conservative result: 52s for the query with `when` and 40s for the query with `if`. I also ran your query on current master (this time on a different laptop with 8 cores):

```
val df = spark.range(10000000000L)
val ifVer = df.withColumn("r", expr("if(id % 2 = 0, 1, 0)")).agg(sum($"r"))
val whenVer = df.withColumn("r", when($"id" % lit(2) === lit(0), lit(1)).otherwise(lit(0))).agg(sum($"r"))
spark.time(ifVer.write.format("noop").mode("overwrite").save())
spark.time(whenVer.write.format("noop").mode("overwrite").save())
```

I ran each `save` 10 times and averaged the reported `Time taken`: 3980.3ms for `ifVer` vs 4630.6ms for `whenVer`, which is a 14% reduction in time.

@JoshRosen Could you please also provide some performance benchmarks for this?
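For anyone who wants to repeat the averaging step, here is a minimal sketch of how the 10 runs could be timed and compared. The helpers `avgMillis` and `reductionPct` are hypothetical names introduced for illustration; they are not part of Spark, and this measures wall-clock time around the call rather than parsing the `Time taken` output of `spark.time`.

```scala
// Hypothetical helper (not a Spark API): runs `body` `runs` times and
// returns the mean wall-clock time in milliseconds.
def avgMillis(runs: Int)(body: => Unit): Double = {
  require(runs > 0, "runs must be positive")
  val samples = (1 to runs).map { _ =>
    val start = System.nanoTime()
    body
    (System.nanoTime() - start) / 1e6
  }
  samples.sum / samples.size
}

// Relative reduction of `faster` with respect to `slower`, in percent.
def reductionPct(slower: Double, faster: Double): Double =
  (slower - faster) / slower * 100.0

// In spark-shell, with ifVer and whenVer defined as in the comment:
//   val ifMs   = avgMillis(10)(ifVer.write.format("noop").mode("overwrite").save())
//   val whenMs = avgMillis(10)(whenVer.write.format("noop").mode("overwrite").save())
//   println(reductionPct(whenMs, ifMs))

// Applied to the averages reported in the comment:
val pct = reductionPct(4630.6, 3980.3)
println(f"$pct%.1f%% reduction")
```

Averaging over in-process runs like this does not separate JIT warm-up from steady-state time, so discarding the first run or two may give a fairer comparison.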
