[
https://issues.apache.org/jira/browse/MNG-7677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17685849#comment-17685849
]
Tamas Cservenak commented on MNG-7677:
--------------------------------------
My last addition to this investigation: i believe [~psiroky] did identify the
real cause okay.
By inspecting and comparing "exact object allocation" concluded the following:
Callers:
387:
org.apache.maven.model.building.DefaultModelBuildingRequest 2363
<-
org.apache.maven.repository.internal.DefaultArtifactDescriptorReader.loadPom(RepositorySystemSession,
ArtifactDescriptorRequest, ArtifactDescriptorResult)
DefaultArtifactDescriptorReader.java 1235
<-
org.apache.maven.project.DefaultProjectBuilder.getModelBuildingRequest(DefaultProjectBuilder$InternalConfig)
DefaultProjectBuilder.java 1060
<-
org.apache.maven.model.building.DefaultModelBuilder.importDependencyManagement(Model,
ModelBuildingRequest, DefaultModelProblemCollector, Collection)
DefaultModelBuilder.java 68
390:
org.apache.maven.model.building.DefaultModelBuildingRequest 83819
<-
org.apache.maven.repository.internal.DefaultArtifactDescriptorReader.loadPom(RepositorySystemSession,
ArtifactDescriptorRequest, ArtifactDescriptorResult)
DefaultArtifactDescriptorReader.java 82698
<-
org.apache.maven.project.DefaultProjectBuilder.getModelBuildingRequest(DefaultProjectBuilder$InternalConfig)
DefaultProjectBuilder.java 1060
<-
org.apache.maven.model.building.DefaultModelBuilder.importDependencyManagement(Model,
ModelBuildingRequest, DefaultModelProblemCollector, Collection)
DefaultModelBuilder.java 61
Conclusion: 390 DefaultArtifactDescriptorReader.loadPom( is "sus": has x66
difference in outgoing calls (new ModelBuildingRequest).
My test: added this line and built locally (3.8.7 tag, 3.9.x branch and 3.9.x
branch w/ patched resolver
[https://gist.github.com/cstamas/2c0955ae83bb2132424b5b8f21715daf).] Let's call
them 387, 390 and 390-w-patch:
{noformat}
LoggerFactory.getLogger( getClass()).info( "XXXX: {}: REL:{}", pomArtifact, a
!= request.getArtifact() );{noformat}
To this location:
[https://github.com/apache/maven/blob/b89d5959fcde851dcb1c8946a785a163f14e1e29/maven-resolver-provider/src/main/java/org/apache/maven/repository/internal/DefaultArtifactDescriptorReader.java#L270]
(and corresponding in 3.9.0, a bit different due reformat). This line will
log, when loadPom is requested and model builder is about to be invoked (the
ModelBuildingRequest is being created).
Invoked mvn validate -Denforcer.skip on quarkus with all of them, 387, 390 and
390-patched, processed the logs as this:
{noformat}
$ mvn validate -Denforcer.skip > log.txt
$ cat log.txt | grep XXXX | sort | uniq -c | sort -n >
log-xxxx-387.txt{noformat}
So, the txt files contains UNIQUE artifacts prepended with COUNT how many times
was their POM actually loaded from disk.
Results are here:
[https://gist.github.com/cstamas/748001fa1b549835feae45ea130ccfd6]
Observations:
* all 3 of them operate on same count of unique artifacts (1195) which is good
and expected: the project defines this
* 387 parses almost all POMs once, with exception of some dozen or a bit more
* 390 parses almost about 1/3 of all POMs once, and the rest twice or even
more, probably the problem Petr pointed out, GC "ates" cached things
* 390-w-patch does almost the same thing as 387.
> Maven 3.9.0 is ~10% slower than 3.8.7 in large multi-module builds
> ------------------------------------------------------------------
>
> Key: MNG-7677
> URL: https://issues.apache.org/jira/browse/MNG-7677
> Project: Maven
> Issue Type: Bug
> Components: Performance
> Affects Versions: 3.9.0
> Reporter: Petr Široký
> Priority: Minor
> Attachments: profile-alloc-mvn-validate-3.8.7.html,
> profile-alloc-mvn-validate-3.9.0.html, profile-cpu-mvn-validate-3.8.7.html,
> profile-cpu-mvn-validate-3.9.0.html,
> profile-method-DefaultArtifact._init_-3.8.7.html,
> profile-method-DefaultArtifact._init_-3.9.0.html,
> profile-method-DefaultArtifact._init_-3.9.1-SNAPSHOT-reverted-PR166.html
>
>
> When testing the upcoming [Maven release
> 3.9.0|https://repository.apache.org/content/repositories/maven-1862/org/apache/maven/apache-maven/3.9.0/apache-maven-3.9.0-bin.tar.gz]
> I noticed the builds for large multi-module projects are somewhat slower.
> I was mainly using the [Quarkus
> projects|https://github.com/quarkusio/quarkus], which has more than 1k
> modules. I was specifically bulding the {{2.16.1.Final}} tag. Below are the
> numbers I got.
> *First batch:*
> {code:java}
> MAVEN_OPTS="-XX:+UseParallelGC -Xms2g -Xmx2g"{code}
> ||Maven cmd||Build time - Maven 3.8.7||Build time - Maven 3.9.0||
> |clean|2.9s|3.2s|
> |validate|02:17min|02:34min|
> |validate -T8|29s|33s|
> |validate -Dversion.enforcer.plugin=3.2.1|29s|40s|
> |validate -Dversion.enforcer.plugin=3.2.1 -T8|9s|14s|
> *Second batch (bigger heap):*
> {code:java}
> MAVEN_OPTS="-XX:+UseParallelGC -Xms5g -Xmx5g"{code}
> ||Maven cmd||Build time - Maven 3.8.7||Build time - Maven 3.9.0||
> |clean|2.9s|3.2s|
> |validate|02:11min|02:22min|
> |validate -T8|25s|28s|
> |validate -Dversion.enforcer.plugin=3.2.1|29s|35s|
> |validate -Dversion.enforcer.plugin=3.2.1 -T8|9s|11s|
> |install -DskipTests -Dversion.enforcer.plugin=3.2.1 -T8|01:49min|01:55min|
> *Notes:*
> * The numbers are taken from the Maven output (the Wall Clock time). When
> testing with the time utility, the "real" execution time is about 0.6s longer
> (the time JVM needs to start-up). The values in the tables are averages over
> several runs (3 to 10 depending on the time the specific run actually took).
> * All builds were run on Linux x64 (kernel 6.1.8) with JDK 19.0.1
> * HW used during testing: AMD Ryzen 5800x (8 physical cores), 32GB RAM and
> SSD Samsung 980 Pro 1TB (these should not really matter for the comparison
> itself, but I am including it for completeness)
> * Using {{ParallelGC}} since it should be the one with highest throughput
> (at the cost of longer pause times, which I don't really care about for Maven
> builds). Quick comparison with {{G1GC}} shows the builds are slightly faster,
> but it is almost negligible.
> There has already been some discussion regarding this in this older merged PR
> [https://github.com/apache/maven-resolver/pull/166] (which may be the cause
> behind the slowdown).
> See the attachments for CPU / allocation profiles (flame graphs), running
> {{mvn validate}} with both 3.8.7 and 3.9.0.
> *My analysis of the attached CPU / allocation profiles:*
> I was mainly looking at the differences in those two profiles (3.8.7 vs
> 3.9.0), not at the absolute numbers. There are possibly other places that
> could use some optimization, but that is outside of scope for this one.
> Both CPU and allocation profiles show that the CPU time / allocations
> increased the most in a code which seems to be doing some sort of dependency
> resolution, ultimately boiling down to a call to
> {{{}org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies{}}}.
> The flame graphs show a bit different method calls inside Aether
> (maven-resolver), mainly because there are some structural changes (e.g.
> different class names) to the Aether dependency resolution between
> {{maven-resolver-1.6.3}} (used in {{{}3.8.7{}}}) and {{maven-resolver-1.9.4}}
> (used in {{{}3.9.0{}}}). The dependency resolution algorithm should be the
> same though (or at least as far as I can tell).
> Digging further down, the biggest difference seems to be in the number of
> calls to {{{}org.eclipse.aether.artifact.DefaultArtifact.<init>{}}}. The
> constructor includes merging two property maps together, which results in
> quite some work (CPU) and lots of garbage (HashMap$Node, etc).
> With the above assumption, I went on to profile the number of method
> (constructor) calls to {{org.eclipse.aether.artifact.DefaultArtifact.<init>}}
> (see the attachments
> profile-method-DefaultArtifact._init_-(3.8.7|3.9.0).html). The absolute
> numbers say that Maven 3.8.7 allocated ~9mil of these objects, while Maven
> 3.9.0 allocated ~26mil instances. That is about 3x times as much. When
> drilling down to the method
> {{org.eclipse.aether.internal.impl.DefaultRepositorySystem.collectDependencies}}
> and summing all the instances (there are three), we get ~2mil for instances
> for Maven 3.8.7 and ~19mil instances for Maven 3.9.0. Even if this would not
> be the root cause the "slowness" I am seeing, it is definitely something to
> look at.
> After some further digging in the -Aether- maven-resolver code I found the
> already mentioned change done in PR
> [https://github.com/apache/maven-resolver/pull/166]. It basically replaces
> strong reference to {{org.eclipse.aether.artifact.DefaultArtifact.<init>}}
> with a weak reference, which likely means the instances are cleaned-up much
> faster, and next time they are needed they must be constructed from scratch
> again. The change basically trades lower memory consumption for higher CPU
> usage (and more garbage).
> I then tried to revert the change in
> https://github.com/apache/maven-resolver/pull/166, build maven-resolver
> 1.9.5-SNAPSHOT, update the resolver in maven-3.9.x branch and build from
> that. With this change reverted, I am basically seeing the same numbers as
> with Maven 3.8.7. Running the method profiling for
> {{org.eclipse.aether.artifact.DefaultArtifact.<init>}} with the updated Maven
> also shows the number of instances is back ~9mil as with Maven 3.8.7. Thus,
> at this point, I believe the change in
> https://github.com/apache/maven-resolver/pull/166 is the root behind the
> "slowness". If the change itself should remain, then we should probably take
> another look at how to better cache the instances.
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)