Yuming Wang created SPARK-39248:
-----------------------------------

             Summary: Decimal divide much slower than multiply
                 Key: SPARK-39248
                 URL: https://issues.apache.org/jira/browse/SPARK-39248
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 3.4.0
            Reporter: Yuming Wang


How to reproduce this issue:

{code:scala}
    import org.apache.spark.benchmark.Benchmark

    val valuesPerIteration = 2880404L
    val dir = "/tmp/spark/benchmark"
    spark.range(2880404L).selectExpr("cast(id as DECIMAL(9,2)) as d").write.mode("Overwrite").parquet(dir)

    val benchmark = new Benchmark("Benchmark decimal", valuesPerIteration, minNumIters = 5)
    benchmark.addCase("d * 2 > 0") { _ =>
      spark.read.parquet(dir).where("d * 2 > 0").write.format("noop").mode("Overwrite").save()
    }

    benchmark.addCase("d / 2 > 0") { _ =>
      spark.read.parquet(dir).where("d / 2 > 0").write.format("noop").mode("Overwrite").save()
    }
    benchmark.run()
{code}

{noformat}
Java HotSpot(TM) 64-Bit Server VM 1.8.0_281-b09 on Mac OS X 10.15.7
Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
Benchmark decimal:                        Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
------------------------------------------------------------------------------------------------------------------------
d * 2 > 0                                           435            558         151          6.6         150.9       1.0X
d / 2 > 0                                          5569           6208         734          0.5        1933.2       0.1X
{noformat}
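
The Relative column is rounded to one decimal place, so it understates the size of the gap. As a rough sketch, the slowdown can be recomputed from the reported best times. The names below (`rows`, `bestMultiplyMs`, `bestDivideMs`, `perRowNs`) are illustrative and not part of the benchmark harness, and because the inputs are the rounded millisecond figures from the table, the per-row values may differ slightly from the Per Row(ns) column.

```scala
// Figures copied from the benchmark output above.
val rows = 2880404L
val bestMultiplyMs = 435.0
val bestDivideMs = 5569.0

// Per-row cost in nanoseconds: milliseconds * 1e6 / rows.
def perRowNs(bestMs: Double): Double = bestMs * 1e6 / rows

// How many times slower the divide case is than the multiply case.
val slowdown = bestDivideMs / bestMultiplyMs

println(f"multiply: ${perRowNs(bestMultiplyMs)}%.1f ns/row")
println(f"divide:   ${perRowNs(bestDivideMs)}%.1f ns/row")
println(f"slowdown: $slowdown%.1fx")
```

By this calculation the divide filter takes about 12.8 times as long as the multiply filter on the same data.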




