Hello
Le 2025-05-26 à 22 h 55, Sergey Chernov a écrit :
Just tried the 4.0.0-rc-3 comparing with 3.9.9, it's 4x times slower
(!) on a project of ~900 modules (700 of them are jar).
Regarding performance, some months ago I noticed that Maven contains
loops over elements of a list with a call to `List.contains(…)` inside
the loop. Because `List.contains(…)` itself iterates over the elements,
we implicitly have nested loops, which have a performance cost of O(N²)
where N is the number of elements. Replacing some `ArrayList` by
`LinkedHashMap` may change some O(N²) operations into O(N) operations.
The difference between O(N) and O(N²) is unnoticeable when N is small (I
guess that this is the case of Maven's test suite). But when N become
large enough, the difference between O(N) or O(N²) can become the
difference between completing an operation in a few minutes or requiring
the age of the univers.
I did not verified if Maven 4 has more nested loops than Maven 3.
Neither I verified if the nested loops happen in algorithms where N can
possibly be large. We do not need (neither we should) replace all
`ArrayList` by `LinkedHashMap`. However, I'm not sure if we have
investigated where such replacements should be done. We may not be able
to do this investigation now, but we could keep the door open for future
performance improvements by replacing `List` by `Collection` in the API.
Even if the implementation stay `ArrayList` for now, it would allow us
to replace some of them by `LinkedHashMap` in the future if we identify
that it could bring a performance benefit.
However, an objection to this proposal was that the order of elements
matter. We had no good solution in Java 17. But if the upgrade to Java
21, proposed in a separated thread, is accepted, Maven could replace all
methods returning `java.util.List` by methods returning
`java.util.SequencedCollection` in the API. No change in the
implementation would be done at this stage (we keep the current
`ArrayList`), but a future Maven version would have more flexibility for
addressing the O(N²) kind of bottleneck, if any.
Martin