Yuming Wang created SPARK-39248:
-----------------------------------
Summary: Decimal divide much slower than multiply
Key: SPARK-39248
URL: https://issues.apache.org/jira/browse/SPARK-39248
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 3.4.0
Reporter: Yuming Wang
How to reproduce this issue:
{code:scala}
import org.apache.spark.benchmark.Benchmark
val valuesPerIteration = 2880404L
val dir = "/tmp/spark/benchmark"
spark.range(valuesPerIteration).selectExpr("cast(id as DECIMAL(9,2)) as d").write.mode("Overwrite").parquet(dir)
val benchmark = new Benchmark("Benchmark decimal", valuesPerIteration, minNumIters = 5)
// Multiply case: filter on d * 2 and discard the output through the noop sink.
benchmark.addCase("d * 2 > 0") { _ =>
  spark.read.parquet(dir).where("d * 2 > 0").write.format("noop").mode("Overwrite").save()
}
// Divide case: the same query with d / 2 in the filter.
benchmark.addCase("d / 2 > 0") { _ =>
  spark.read.parquet(dir).where("d / 2 > 0").write.format("noop").mode("Overwrite").save()
}
benchmark.run()
{code}
{noformat}
Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Benchmark decimal:    Best Time(ms)   Avg Time(ms)   Stdev(ms)   Rate(M/s)   Per Row(ns)   Relative
---------------------------------------------------------------------------------------------------
d * 2 > 0                       435            558         151         6.6         150.9       1.0X
d / 2 > 0                      5569           6208         734         0.5        1933.2       0.1X
{noformat}
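For a rough reading of the numbers above, the sketch below recomputes the per-row cost and the divide/multiply slowdown from the reported best times. It is plain Scala with no Spark dependency; the names `rows`, `bestMs`, `perRowNs` and `slowdown` are illustrative only and are not part of the Benchmark API. Because it starts from the rounded best times printed in the table, the per-row values may differ slightly from the Per Row(ns) column.

```scala
// Figures copied from the benchmark output in this report.
val rows = 2880404L
val bestMs = Map("d * 2 > 0" -> 435.0, "d / 2 > 0" -> 5569.0)

// Per-row cost in nanoseconds: best time (ms) * 1e6 / number of rows.
val perRowNs = bestMs.map { case (name, ms) => name -> ms * 1e6 / rows }

// Slowdown of the divide case relative to the multiply case, using best times.
val slowdown = bestMs("d / 2 > 0") / bestMs("d * 2 > 0")

perRowNs.foreach { case (name, ns) => println(f"$name%-10s $ns%.1f ns/row") }
println(f"divide is about $slowdown%.1fx slower than multiply")
```

On these figures the divide case is roughly 12.8 times slower than the multiply case, which the Relative column rounds to 0.1X.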