On Fri, 10 Feb 2023 10:00:05 GMT, Xiaowei Lu <d...@openjdk.org> wrote:
> [JDK-8269667](https://bugs.openjdk.org/browse/JDK-8269667) has uncovered the poor performance of BigDecimal.divide under certain circumstances.
>
> We confront similar situations when benchmarking Spark3 on the TPC-DS test kit. According to the flame graph below, it is StripZeros that accounts for most of the time spent in BigDecimal.divide. Hence we propose this patch to optimize stripping zeros.
>
> 
>
> Currently, createAndStripZerosToMatchScale() is performed linearly. That is, the target value is parsed from back to front, each time stripping out a single '0'. To optimize, we can adopt a binary-search approach: each time, we try to strip out ${scale/2} '0's.
>
> The performance looks good. Theoretically, the time complexity of our method is O(log n), while the current one is O(n). In practice, benchmarks on Spark3 show that 1/3 less time (102s -> 68s) is spent on TPC-DS query4. We also ran jtreg and JCK to check correctness, and it seems fine.
>
> More about the environment: we ran Spark 3.3.0 on OpenJDK 11, but the JDK version doesn't seem to have much impact on BigDecimal. The Spark cluster consists of a main node and 2 core nodes, each with 4 cores, 16 GB of memory, and 4x500 GB of storage.

As for TPC-DS, [AUTO-RESULT] QueryTotal=1968s vs [AUTO-RESULT] QueryTotal=1934s gives ~1.7% of performance difference. Are you sure that this small diff is a real difference, and not run-to-run variance?

-------------

PR: https://git.openjdk.org/jdk/pull/12509
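For readers following along, here is a minimal standalone sketch of the binary-search idea described above: instead of dividing by 10 once per trailing zero, try to remove a block of zeros at once and halve the block size on failure. This is an illustrative approximation, not the actual patch in the PR; the method name and structure are hypothetical.

```java
import java.math.BigInteger;

public class StripZerosSketch {
    // Hypothetical sketch: strip trailing decimal zeros from a BigInteger
    // by attempting to divide by 10^k and halving k when the remainder is
    // nonzero, so roughly O(log n) division steps instead of O(n).
    static BigInteger stripTrailingZeros(BigInteger value) {
        if (value.signum() == 0) {
            return value;
        }
        // Initial guess: half the digit count (cheap upper-bound heuristic).
        int k = Math.max(1, value.toString().length() / 2);
        while (k > 0) {
            BigInteger pow = BigInteger.TEN.pow(k);
            BigInteger[] qr = value.divideAndRemainder(pow);
            if (qr[1].signum() == 0) {
                value = qr[0];   // stripped k zeros in one step
            } else {
                k /= 2;          // too aggressive; halve the block size
            }
        }
        return value;
    }

    public static void main(String[] args) {
        System.out.println(stripTrailingZeros(new BigInteger("12300000")));
        System.out.println(stripTrailingZeros(new BigInteger("4500")));
    }
}
```

With the inputs above this prints `123` and `45`; each successful division removes a whole block of zeros, which is where the O(log n) behavior comes from.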