On Wed, 19 Feb 2025 09:49:54 GMT, Roberto Castañeda Lozano 
<rcastaned...@openjdk.org> wrote:

>>> > Hi Thomas, this looks very useful, thanks! I will run some 
>>> > Oracle-internal functional and performance testing and come back with the 
>>> > results next week.
>>> 
>>> Functional test results (Oracle internal tier1-tier5) look good.
>>> 
>>> I measured C2 execution time before and after the changeset using DaCapo 23 
>>> and did not find any statistically significant difference, except for a 
>>> 2-3% regression on the jython benchmark (using large input size). This 
>>> small regression is IMO acceptable, particularly given that these changes 
>>> can be seen as an investment to improve compiler resource utilization in 
>>> the long run.
>> 
>> Hi @robcasloz, interesting, I did not expect this. What did you measure? 
>> With the compilation statistic vs. without, or old vs. new with both enabled? 
>> (Ideally, give me both sets of command-line args.)
>
> I measured and compared C2 speed in bytecodes/s as reported by `-XX:+CITime` 
> (averaged over a number of repetitions). I wanted to test that the feature 
> does not affect C2's execution time when not used, so I simply compared C2 
> compilation speed for `jdk-25+10` vs. `jdk-25+10` with this changeset applied 
> on top. Both were release builds run with `-XX:+CITime -Xbatch 
> -XX:-TieredCompilation` (the last two flags for better stability across 
> benchmark repetitions). I observed the regression on both linux-x64 and 
> macosx-aarch64. Let me know if you need more details.

> @robcasloz I identified and hopefully fixed a small issue that hit the 
> "disabled" path. Turns out we allocate arena chunks a lot more frequently 
> than I thought, and the new unconditional call to Thread::current() in there 
> was hurting a bit. I now avoid this unless I know the statistic is enabled.
> 
> With this patch, on my machine the difference between unpatched and patched 
> JVM with stats disabled is below one standard deviation for the benchmark in 
> question.
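
For readers following along, the guard described in the quoted message can be sketched roughly as below. This is only an illustration of the pattern (skip the thread lookup entirely when the statistic is off), not the code from the changeset. Every name in it (`g_stats_enabled`, `g_thread_lookups`, `FakeThread`, `current_thread`, `on_chunk_allocated`) is a hypothetical stand-in, and the lookup counter exists only to make the effect observable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins; none of these names come from the HotSpot sources.
static bool g_stats_enabled = false;  // models "the statistic is enabled"
static int  g_thread_lookups = 0;     // counts how often the lookup runs

struct FakeThread {
  std::size_t arena_bytes = 0;        // per-thread running total
};

// Models a thread-local lookup in the spirit of Thread::current().
static FakeThread* current_thread() {
  static thread_local FakeThread t;
  ++g_thread_lookups;
  return &t;
}

// Models the hot chunk-allocation path.
static void on_chunk_allocated(std::size_t bytes) {
  if (!g_stats_enabled) {
    return;  // disabled path: no thread lookup, no accounting
  }
  current_thread()->arena_bytes += bytes;
}
```

With the flag off, `on_chunk_allocated` costs one branch on a global; the lookup is only paid for when the statistic is actually being collected.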

Great, thanks! Will re-run benchmarking and report results early next week.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23530#issuecomment-2671587462
