davidvrba commented on issue #26294: [SPARK-28477] [SQL] Rewrite CaseWhen with 
single branch to If
URL: https://github.com/apache/spark/pull/26294#issuecomment-548297522
 
 
   @maropu I reran the query from the description (in a different order) and 
got a slightly more conservative result: 52s for the query with `when` and 40s 
for the query with `if`. 
   
   I also ran your query on current master (this time on a different laptop 
with 8 cores): 
   
   ```
   val df = spark.range(10000000000L)
   val ifVer = df.withColumn("r", expr("if(id % 2 = 0, 1, 0)")).agg(sum($"r"))
   val whenVer = df.withColumn("r", when($"id" % lit(2) === lit(0), lit(1)).otherwise(lit(0))).agg(sum($"r"))
   
   spark.time(ifVer.write.format("noop").mode("overwrite").save())
   spark.time(whenVer.write.format("noop").mode("overwrite").save())
   ```
   I ran each `save` 10 times and averaged the `Time taken` values: 3980.3 ms 
for `ifVer` vs. 4630.6 ms for `whenVer`, which is a 14% reduction in time.
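   
   For reference, a minimal sketch of how the averaging and the 14% figure 
could be reproduced. The helpers `averageMillis` and `reductionPercent` are 
hypothetical names introduced only for this illustration; they are not part of 
Spark or of this PR, and this is not necessarily how the numbers above were 
collected.
   
   ```scala
   // Hypothetical helper (not a Spark API): evaluate `body` `runs` times and
   // return the mean wall-clock time in milliseconds.
   def averageMillis(runs: Int)(body: => Unit): Double = {
     require(runs > 0, "runs must be positive")
     val timings = (1 to runs).map { _ =>
       val start = System.nanoTime()
       body
       (System.nanoTime() - start) / 1e6
     }
     timings.sum / runs
   }
   
   // Hypothetical helper: relative reduction of `faster` with respect to
   // `slower`, in percent.
   def reductionPercent(slower: Double, faster: Double): Double =
     (slower - faster) / slower * 100.0
   
   // Averages reported above: (4630.6 - 3980.3) / 4630.6, roughly 14%.
   val reduction = reductionPercent(slower = 4630.6, faster = 3980.3)
   println(s"reduction: $reduction %")
   ```
   
   In a `spark-shell` session with `ifVer` defined as above, the helper could 
be used as 
`averageMillis(10)(ifVer.write.format("noop").mode("overwrite").save())`.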
   
   @JoshRosen Could you please also provide some performance benchmarks for 
this?
