On Wed, 19 Feb 2025 09:49:54 GMT, Roberto Castañeda Lozano <rcastaned...@openjdk.org> wrote:
>>> > Hi Thomas, this looks very useful, thanks! I will run some
>>> > Oracle-internal functional and performance testing and come back with the
>>> > results next week.
>>>
>>> Functional test results (Oracle internal tier1-tier5) look good.
>>>
>>> I measured C2 execution time before and after the changeset using DaCapo 23
>>> and did not find any statistically significant difference, except for a
>>> 2-3% regression on the jython benchmark (using large input size). This
>>> small regression is IMO acceptable, particularly given that these changes
>>> can be seen as an investment to improve compiler resource utilization in
>>> the long run.
>>
>> Hi @robcasloz, interesting, I did not expect this. What did you measure?
>> With Compilation statistic vs without, or with old vs new, but both enabled?
>> (best, give me both sets of command line args)
>
> I measured and compared C2 speed in bytecodes/s as reported by `-XX:+CITime`
> (averaged over a number of repetitions). I wanted to test that the feature
> does not affect C2's execution time when not used, so I simply compared C2
> compilation speed for `jdk-25+10` vs. `jdk-25+10` with this changeset applied
> on top (both release builds) and `-XX:+CITime -Xbatch -XX:-TieredCompilation`
> on both builds (the last two flags for better stability across benchmark
> repetitions). I could observe the regression on both linux-x64 and
> macosx-aarch64 platforms. Let me know if you need more details.

> @robcasloz I identified and hopefully fixed a small issue that hit the
> "disabled" path. Turns out we allocate arena chunks a lot more frequently
> than I thought, and the new unconditional call to Thread::current() in there
> was hurting a bit. I now avoid this unless I know the statistic is enabled.
>
> With this patch, on my machine the difference between unpatched and patched
> JVM with stats disabled is below one standard deviation for the benchmark in
> question.

Great, thanks! Will re-run benchmarking and report results early next week.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23530#issuecomment-2671587462