[
https://issues.apache.org/jira/browse/MNG-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685319#comment-17685319
]
Petr Široký commented on MNG-7677:
----------------------------------
I can confirm I am seeing better results when using the patch
[https://gist.github.com/cstamas/2c0955ae83bb2132424b5b8f21715daf], the build
times are _almost_ back to the ones with 3.8.7 (I will try to run all the tests
again and post the whole table, similarly as in the description).
That being said, I could not quite get to the exact same build times as with
3.8.7. There still seems to be some very small overhead in 3.9.0, usually
around several hundred milliseconds to 1.5s. However, with these small numbers,
it is quite hard to be sure. I certainly did not run the build enough times to
have statistically relevant results.
> Maven 3.9.0 is ~10% slower than 3.8.7 in large multi-module builds
> ------------------------------------------------------------------
>
> Key: MNG-7677
> URL: https://issues.apache.org/jira/browse/MNG-7677
> Project: Maven
> Issue Type: Bug
> Components: Performance
> Affects Versions: 3.9.0
> Reporter: Petr Široký
> Priority: Minor
> Attachments: profile-alloc-mvn-validate-3.8.7.html,
> profile-alloc-mvn-validate-3.9.0.html, profile-cpu-mvn-validate-3.8.7.html,
> profile-cpu-mvn-validate-3.9.0.html,
> profile-method-DefaultArtifact._init_-3.8.7.html,
> profile-method-DefaultArtifact._init_-3.9.0.html,
> profile-method-DefaultArtifact._init_-3.9.1-SNAPSHOT-reverted-PR166.html
>
>
> When testing the upcoming [Maven release
> 3.9.0|https://repository.apache.org/content/repositories/maven-1862/org/apache/maven/apache-maven/3.9.0/apache-maven-3.9.0-bin.tar.gz]
> I noticed the builds for large multi-module projects are somewhat slower.
> I was mainly using the [Quarkus
> projects|https://github.com/quarkusio/quarkus], which has more than 1k
> modules. I was specifically bulding the {{2.16.1.Final}} tag. Below are the
> numbers I got.
> *First batch:*
> {code:java}
> MAVEN_OPTS="-XX:+UseParallelGC -Xms2g -Xmx2g"{code}
> ||Maven cmd||Build time - Maven 3.8.7||Build time - Maven 3.9.0||
> |clean|2.9s|3.2s|
> |validate|02:17min|02:34min|
> |validate -T8|29s|33s|
> |validate -Dversion.enforcer.plugin=3.2.1|29s|40s|
> |validate -Dversion.enforcer.plugin=3.2.1 -T8|9s|14s|
> *Second batch (bigger heap):*
> {code:java}
> MAVEN_OPTS="-XX:+UseParallelGC -Xms5g -Xmx5g"{code}
> ||Maven cmd||Build time - Maven 3.8.7||Build time - Maven 3.9.0||
> |clean|2.9s|3.2s|
> |validate|02:11min|02:22min|
> |validate -T8|25s|28s|
> |validate -Dversion.enforcer.plugin=3.2.1|29s|35s|
> |validate -Dversion.enforcer.plugin=3.2.1 -T8|9s|11s|
> |install -DskipTests -Dversion.enforcer.plugin=3.2.1 -T8|01:49min|01:55min|
> *Notes:*
> * The numbers are taken from the Maven output (the Wall Clock time). When
> testing with the time utility, the "real" execution time is about 0.6s longer
> (the time JVM needs to start-up). The values in the tables are averages over
> several runs (3 to 10 depending on the time the specific run actually took).
> * All builds were run on Linux x64 (kernel 6.1.8) with JDK 19.0.1
> * HW used during testing: AMD Ryzen 5800x (8 physical cores), 32GB RAM and
> SSD Samsung 980 Pro 1TB (these should not really matter for the comparison
> itself, but I am including it for completeness)
> * Using {{ParallelGC}} since it should be the one with highest throughput
> (at the cost of longer pause times, which I don't really care about for Maven
> builds). Quick comparison with {{G1GC}} shows the builds are slightly faster,
> but it is almost negligible.
> There has already been some discussion regarding this in this older merged PR
> [https://github.com/apache/maven-resolver/pull/166] (which may be the cause
> behind the slowdown).
> See the attachments for CPU / allocation profiles (flame graphs), running
> {{mvn validate}} with both 3.8.7 and 3.9.0.
> *My analysis of the attached CPU / allocation profiles:*
> I was mainly looking at the differences in those two profiles (3.8.7 vs
> 3.9.0), not at the absolute numbers. There are possibly other places that
> could use some optimization, but that is outside of scope for this one.
> Both CPU and allocation profiles show that the CPU time / allocations
> increased the most in a code which seems to be doing some sort of dependency
> resolution, ultimately boiling down to a call to
> {{{}org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies{}}}.
> The flame graphs show a bit different method calls inside Aether
> (maven-resolver), mainly because there are some structural changes (e.g.
> different class names) to the Aether dependency resolution between
> {{maven-resolver-1.6.3}} (used in {{{}3.8.7{}}}) and {{maven-resolver-1.9.4}}
> (used in {{{}3.9.0{}}}). The dependency resolution algorithm should be the
> same though (or at least as far as I can tell).
> Digging further down, the biggest difference seems to be in the number of
> calls to {{{}org.eclipse.aether.artifact.DefaultArtifact.<init>{}}}. The
> constructor includes merging two property maps together, which results in
> quite some work (CPU) and lots of garbage (HashMap$Node, etc).
> With the above assumption, I went on to profile the number of method
> (constructor) calls to {{org.eclipse.aether.artifact.DefaultArtifact.<init>}}
> (see the attachments
> profile-method-DefaultArtifact._init_-(3.8.7|3.9.0).html). The absolute
> numbers say that Maven 3.8.7 allocated ~9mil of these objects, while Maven
> 3.9.0 allocated ~26mil instances. That is about 3x times as much. When
> drilling down to the method
> {{org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies}}
> and summing all the instances (there are three), we get ~2mil for instances
> for Maven 3.8.7 and ~19mil instances for Maven 3.9.0. Even if this would not
> be the root cause the "slowness" I am seeing, it is definitely something to
> look at.
> After some further digging in the -Aether- maven-resolver code I found the
> already mentioned change done in PR
> [https://github.com/apache/maven-resolver/pull/166]. It basically replaces
> strong reference to {{org.eclipse.aether.artifact.DefaultArtifact.<init>}}
> with a weak reference, which likely means the instances are cleaned-up much
> faster, and next time they are needed they must be constructed from scratch
> again. The change basically trades lower memory consumption for higher CPU
> usage (and more garbage).
> I then tried to revert the change in
> https://github.com/apache/maven-resolver/pull/166, build maven-resolver
> 1.9.5-SNAPSHOT, update the resolver in maven-3.9.x branch and build from
> that. With this change reverted, I am basically seeing the same numbers as
> with Maven 3.8.7. Running the method profiling for
> {{org.eclipse.aether.artifact.DefaultArtifact.<init>}} with the updated Maven
> also shows the number of instances is back ~9mil as with Maven 3.8.7. Thus,
> at this point, I believe the change in
> https://github.com/apache/maven-resolver/pull/166 is the root behind the
> "slowness". If the change itself should remain, then we should probably take
> another look at how to better cache the instances.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)