What I forgot to mention: in my case the trees in the gist are about "resolving maven-core 3.5.8", but I guess you figured that out from the tree....
T

On Mon, May 2, 2022 at 3:55 PM Tamás Cservenák <ta...@cservenak.net> wrote:

> Howdy,
>
> I did an experiment that (partially reusing your code to dump the
> reverse tree) produces this output:
> https://gist.github.com/cstamas/598a3266f943984442c00df30520294f
>
> (Note: resolver 1.8.0 has two collector implementations, the original
> depth-first one and the new breadth-first one, called DF and BF
> respectively.)
>
> The code is not pushed anywhere yet, but I plan to make an API for this,
> and as you can see, it works for both collector implementations. Also,
> I hook ONLY into the collector, as that's the place where the graph is
> being built, but this is logically equivalent to your "More interesting
> ... 2nd case".
>
> Will ping once again when I have the changes....
>
> Thanks
> Tamas
>
> On Thu, Apr 28, 2022 at 9:01 PM Tamás Cservenák <ta...@cservenak.net>
> wrote:
>
>> Howdy,
>>
>> This is very cool, I was actually tinkering with very similar issues in
>> the resolver, coming from a totally different angle.
>>
>> And yes, the resolver collector is not quite "extension" friendly, but we
>> will make it right.
>> Just FYI, in the latest resolver (1.8.0) there are actually two
>> implementations: depth-first (original) and breadth-first.
>>
>> Looking at your code: collection is the most critical part of the
>> resolver regarding performance and memory, so "hooking" into it (like
>> sending events at each step) might not be the best idea, but still, what
>> kind of extension points would you envision in the collector?
>>
>> For example, to achieve what you want, it would be completely enough to
>> receive the final CollectResult (the full graph), no?
>> From a resolver perspective that would be simplest, especially now that
>> we have two collector implementations...
>>
>> Also, in case of multithreading, your shared stack would not cut it,
>> would it?
>>
>> I personally was also looking into these, especially after some of the
>> latest additions to resolver in 1.8.0 and current master....
>>
>>
>> Thanks
>> T
>>
>>
>> On Thu, Apr 28, 2022 at 12:45 PM Grzegorz Grzybek <gr.grzy...@gmail.com>
>> wrote:
>>
>>> Hello
>>>
>>> TL;DR: https://github.com/grgrzybek/tracking-maven-extension
>>>
>>> I'd like to share a proof of concept I made. It all started with a
>>> question: "why am I getting log4j:log4j:1.2.12 in my local Maven
>>> repository when building a trivial project with a fresh local repo?"
>>>
>>> I knew it's possible to `grep -r --include=*.pom 1.2.12` for the poms
>>> that declare the old log4j, but I needed something better.
>>>
>>> In short, I managed to persist the information available in the
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>>> stack.
>>> I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext or
>>> used with "-Dmaven.ext.class.path" which does two things:
>>>
>>> 1. adds an org.eclipse.aether.RepositoryListener component that writes
>>>    some information when a dependency is FIRST downloaded from a remote
>>>    repository
>>> 2. adds an org.eclipse.aether.impl.DependencyCollector component (an
>>>    extension of
>>>    org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
>>>    that writes some information when a dependency is resolved against
>>>    the local repository when it's already there (where no download is
>>>    needed)
>>>
>>> In the first case, I write something like this:
>>>
>>> ~~~
>>> Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>> -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>> -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
>>> -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
>>> -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
>>> -> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 () (context: plugin)
>>> Reading descriptor for artifact log4j:log4j:jar::1.2.12 (context: plugin) (scope: ?) (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>> Transitive dependencies collection for org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
>>> Resolution of plugin org.apache.maven.plugins:maven-site-plugin:3.11.0 (org.apache:apache:25)
>>> ~~~
>>> Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>> Resolution of plugin com.mycila:license-maven-plugin:3.0 (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
>>>
>>> I simply write some information from the available
>>> org.eclipse.aether.RepositoryEvent and the event's
>>> org.eclipse.aether.RequestTrace.
>>>
>>> More interesting information is written in the 2nd case. Because I
>>> wanted to track ALL attempts to resolve log4j:log4j:1.2.12 (and any
>>> other dependency), I needed some structure.
>>> And I decided this:
>>>
>>> - every dependency directory (where e.g. _remote.repositories is
>>>   written along with the jar/pom/sha1/md5/...) gets a ".tracking"
>>>   directory
>>> - in the ".tracking" directory I write files with names of this pattern:
>>>   "groupId_artifactId_type_classifier_version.dep", e.g.
>>>   org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>>> - each such file contains a _reverse dependency tree_ that shows me
>>>   why a given dependency was resolved.
>>>
>>> For example, consider
>>> ~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>>> (the path itself already shows that
>>> org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends, directly
>>> or indirectly, on log4j:log4j:1.2.12).
>>> The content of this file is:
>>>
>>> log4j:log4j:pom:1.2.12
>>> -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>> -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
>>> -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
>>> -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4 (compile) (context: plugin)
>>> -> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0 (compile) (context: plugin)
>>> -> org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2 () (context: plugin)
>>>
>>> It's kind of obvious: dependency-plugin, through maven-reporting-impl,
>>> doxia, velocity, commons-digester and commons-logging, "depends" on the
>>> malicious log4j:1.2.12 library every security scanner screams about.
>>>
>>> Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext and
>>> build everything at work with it. Now I know why my
>>> ~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory contains 57
>>> different versions of plexus-utils, for example. Why 1.0.4 from 2005?
>>>
>>> org.codehaus.plexus:plexus-utils:pom:1.0.4
>>> -> org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1 (compile) (context: plugin)
>>> -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile) (context: plugin)
>>> -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
>>> -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 () (context: plugin)
>>>
>>> Why Guava 10.0.1?
>>>
>>> com.google.guava:guava:pom:10.0.1
>>> -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 (compile) (context: plugin)
>>> -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile) (context: plugin)
>>> -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context: plugin)
>>> -> org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0 (runtime) (context: plugin)
>>> -> org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12 () (context: plugin)
>>>
>>> yes - Spring Boot 2.5.12...
>>>
>>> Why Log4j 2.10.0?
>>>
>>> org.apache.logging.log4j:log4j-api:pom:2.10.0
>>> -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile) (context: project)
>>> -> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE (compile) (context: project)
>>> -> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE (compile) (context: project)
>>> -> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE (compile) (context: project)
>>> -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1 (context: project)
>>>
>>> (see - this time the context is "project", not "plugin").
>>>
>>> And so on and so on.
>>>
>>> What is my motivation with this email? I don't know yet - ideally I'd
>>> like to have this ".tracking" information created together with
>>> "_remote.repositories" and "*.lastUpdated" metadata by Maven Resolver.
>>> It could be optional of course (the overhead is really minimal: 1 more
>>> minute when building Camel 3, i.e. 1 hour instead of 59 minutes).
>>>
>>> The only problem I had is that I had to fork/shade the
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector class
>>> because I had to manipulate the
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>>> stack around the call to
>>> org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
>>> Besides this, normal plexus/sisu components are used.
>>>
>>> The repository is https://github.com/grgrzybek/tracking-maven-extension
>>> and I'd be happy to see some comments about this ;)
>>>
>>> kind regards
>>> Grzegorz Grzybek
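
For anyone skimming the thread: the ".tracking" file naming and the reverse tree described in the quoted message could be sketched roughly as follows. This is only a minimal illustration in plain Java, not code taken from the tracking-maven-extension repository. The class and method names (TrackingSketch, trackingFileName, reverseTree) are made up for this sketch, skipping an empty classifier is an assumption based on the example file name in the message, and the one-space-per-level indentation is a guess as well.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the real extension.
public class TrackingSketch {

    /**
     * Builds a file name following the pattern from the message:
     * "groupId_artifactId_type_classifier_version.dep".
     * Assumption: an empty or null classifier is simply left out, as in
     * "org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep".
     */
    public static String trackingFileName(String groupId, String artifactId,
                                          String type, String classifier,
                                          String version) {
        StringBuilder sb = new StringBuilder();
        sb.append(groupId).append('_').append(artifactId).append('_').append(type);
        if (classifier != null && !classifier.isEmpty()) {
            sb.append('_').append(classifier);
        }
        return sb.append('_').append(version).append(".dep").toString();
    }

    /**
     * Renders a reverse dependency tree from a collection path.
     * The path is given root first (e.g. the plugin) and ends with the
     * dependency that was resolved; the output starts with that dependency
     * and walks back up to the root.
     */
    public static String reverseTree(List<String> pathFromRoot) {
        StringBuilder out = new StringBuilder();
        StringBuilder indent = new StringBuilder();
        int last = pathFromRoot.size() - 1;
        for (int i = last; i >= 0; i--) {
            if (i < last) {
                // every ancestor gets an arrow and one more space of indent
                out.append(indent).append("-> ");
            }
            out.append(pathFromRoot.get(i)).append('\n');
            indent.append(' ');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(trackingFileName(
                "org.apache.maven.plugins", "maven-dependency-plugin",
                "jar", "", "3.1.2"));
        System.out.print(reverseTree(Arrays.asList(
                "org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2",
                "commons-logging:commons-logging:jar:1.1",
                "log4j:log4j:pom:1.2.12")));
    }
}
```

In the real extension the path would come from the collector's node stack rather than a hand-written list; the sketch only shows how such a stack maps onto the file name and the file content.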