Howdy,

I ran an experiment that (partially reusing your code to dump the reverse
tree) produces this output:
https://gist.github.com/cstamas/598a3266f943984442c00df30520294f

(note: 1.8.0 resolver has two collector implementations: original
Depth-First and new Breadth-First called DF and BF respectively)

The code is not pushed anywhere yet, but I plan to make an API for this,
and as you can see, it works for both collector implementations. Also, I
hook ONLY into the collector, as that's the place where the graph is being
built, but this is logically equivalent to your "More interesting
... 2nd case".

Will ping once again when I have the changes....

Thanks
Tamas

On Thu, Apr 28, 2022 at 9:01 PM Tamás Cservenák <ta...@cservenak.net> wrote:

> Howdy,
>
> This is very cool, I was actually tinkering on very similar issues in
> resolver coming from totally different angles.
>
> And yes, the resolver collector is not quite "extension" friendly, but we
> will make it right.
> Just FYI, in the latest resolver (1.8.0) there are actually two
> implementations: depth-first (original) and breadth-first (new).
>
> By looking at your code: collection is most critical regarding performance
> and memory in the resolver, so "hooking" into it (like sending events per
> each step) might not be the best, but still, what kind of extension points
> would you envision in the collector?
>
> For example, to achieve what you want, it would be completely enough to
> receive the final CollectResult (the full graph), no?
> As -- from a resolver perspective -- that would be simplest, especially
> now that we have two collector implementations...
>
> Also, with multithreading, your shared stack would not cut it, would it?
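
One way around the shared-stack concern, sketched here only as an illustration: give every path element a link to its parent, so the path to the root can be rebuilt from any node without mutable state shared between threads. The class name `PathNode` and the `example:...` coordinates are hypothetical; this is an assumption about a possible design, not how the resolver or the extension does it.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical immutable path element: each node remembers its parent, so no shared stack is needed. */
final class PathNode {
    final String coordinates;
    final PathNode parent; // null for the root of the graph

    PathNode(String coordinates, PathNode parent) {
        this.coordinates = coordinates;
        this.parent = parent;
    }

    /** Creates a child path without modifying this one, so concurrent branches cannot interfere. */
    PathNode child(String childCoordinates) {
        return new PathNode(childCoordinates, this);
    }

    /** Walks the parent links: this node first, root last (the "reverse tree" order). */
    List<String> pathToRoot() {
        List<String> path = new ArrayList<>();
        for (PathNode n = this; n != null; n = n.parent) {
            path.add(n.coordinates);
        }
        return path;
    }

    public static void main(String[] args) {
        // Made-up coordinates, only to show two sibling branches sharing one root.
        PathNode root = new PathNode("example:root:jar:1.0", null);
        PathNode a = root.child("example:a:jar:1.0");
        PathNode b = root.child("example:b:jar:1.0");
        System.out.println(a.pathToRoot());
        System.out.println(b.pathToRoot());
    }
}
```

Because a path is never modified after creation, the same idea should also fit a breadth-first traversal, where the current path is not sitting on a call stack.
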
>
> I personally was also looking into these, especially after some of the
> latest additions to resolver in 1.8.0 and current master....
>
>
> Thanks
> T
>
>
> On Thu, Apr 28, 2022 at 12:45 PM Grzegorz Grzybek <gr.grzy...@gmail.com>
> wrote:
>
>> Hello
>>
>> TL;DR: https://github.com/grgrzybek/tracking-maven-extension
>>
>> I'd like to share a proof of concept I made. It all started with the
>> question: "why am I getting log4j:log4j:1.2.12 in my local Maven repository
>> when building a trivial project with a fresh local repo?"
>>
>> I knew it's possible to `grep -r --include=*.pom 1.2.12` the poms that
>> declare old log4j, but I needed something better.
>>
>> In short, I managed to persist the information available in the
>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>> stack.
>> I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext or used
>> with "-Dmaven.ext.class.path" and that does two things:
>>
>>    1. adds an org.eclipse.aether.RepositoryListener component that writes some
>>    information when a dependency is FIRST downloaded from a remote repository
>>    2. adds an org.eclipse.aether.impl.DependencyCollector component (an extension
>>    of org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
>>    that writes some information when a dependency is resolved against the local
>>    repository because it's already there (no download is needed)
>>
>> In the first case, I write something like this:
>>
>> ~~~
>> Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>    -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>      -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
>>        -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
>>          -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
>>            -> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 () (context: plugin)
>>   Reading descriptor for artifact log4j:log4j:jar::1.2.12 (context: plugin) (scope: ?) (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>     Transitive dependencies collection for org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
>>       Resolution of plugin org.apache.maven.plugins:maven-site-plugin:3.11.0 (org.apache:apache:25)
>> ~~~
>> Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
>>   Resolution of plugin com.mycila:license-maven-plugin:3.0 (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
>>
>> I simply write some information from the available
>> org.eclipse.aether.RepositoryEvent and the event's
>> org.eclipse.aether.RequestTrace.
>>
>> More interesting information is written in 2nd case. Because I wanted to
>> track ALL attempts to resolve log4j:log4j:1.2.12 (and any other
>> dependency), I needed some structure. And I decided this:
>>
>>    - every dependency directory (where e.g., _remote.repositories is
>>    written along with the jar/pom/sha1/md5/...) gets a ".tracking" directory
>>    - in the ".tracking" directory I write files with names of this pattern:
>>    "groupId_artifactId_type_classifier_version.dep", e.g.,
>>    org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>>    - each such file contains a _reverse dependency tree_ that shows me why
>>    the given dependency was resolved.
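
A minimal sketch of how such a file name and its ".tracking" directory could be computed. The class and method names (`TrackingPaths`, `fileName`, `trackingDir`) are hypothetical and not taken from the extension. Skipping an empty classifier is an assumption inferred from the example name above, and the directory is built following the default local repository layout (groupId with dots turned into directories, then artifactId, then version).

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.StringJoiner;

/** Hypothetical helper; names are illustrative, not taken from the extension. */
final class TrackingPaths {

    private TrackingPaths() {}

    /** Builds "groupId_artifactId_type[_classifier]_version.dep"; an empty classifier is skipped (assumption). */
    static String fileName(String groupId, String artifactId, String type,
                           String classifier, String version) {
        StringJoiner name = new StringJoiner("_", "", ".dep");
        name.add(groupId).add(artifactId).add(type);
        if (classifier != null && !classifier.isEmpty()) {
            name.add(classifier);
        }
        name.add(version);
        return name.toString();
    }

    /** Directory of the tracked artifact in the default local repository layout, plus ".tracking". */
    static Path trackingDir(Path localRepo, String groupId, String artifactId, String version) {
        return localRepo
                .resolve(groupId.replace('.', '/'))
                .resolve(artifactId)
                .resolve(version)
                .resolve(".tracking");
    }

    public static void main(String[] args) {
        // "repo" stands in for the local repository root, e.g. ~/.m2/repository
        Path dir = trackingDir(Paths.get("repo"), "log4j", "log4j", "1.2.12");
        String file = fileName("org.apache.maven.plugins", "maven-dependency-plugin", "jar", "", "3.1.2");
        System.out.println(dir.resolve(file));
    }
}
```
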
>>
>> For example, in
>>
>> ~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>> (the path itself already contains the information that
>> org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends, directly or
>> indirectly, on log4j:log4j:1.2.12),
>> the content of this file is:
>>
>> log4j:log4j:pom:1.2.12
>>  -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>    -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
>>      -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
>>        -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4 (compile) (context: plugin)
>>          -> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0 (compile) (context: plugin)
>>            -> org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2 () (context: plugin)
>>
>> It's kind of obvious - dependency-plugin through maven-reporting-impl,
>> through doxia, velocity, commons-digester and commons-logging "depends" on
>> malicious log4j:1.2.12 library every security scanner screams about.
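
A rough sketch of how a reverse tree in this shape could be rendered from a stack of nodes, assuming the stack holds the root at the bottom and the most recently visited dependency on top. The class `ReverseTree` and the plain `String` labels are hypothetical simplifications; the real Args#nodes stack holds resolver node objects, not strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: renders a collector-style node stack as a reverse dependency tree. */
final class ReverseTree {

    private ReverseTree() {}

    /**
     * @param stack node labels with the most recently visited dependency on top
     *              (Deque iteration order: top of the stack first, root last)
     */
    static String render(Deque<String> stack) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (String node : stack) {
            if (depth > 0) {
                // One space for the first level, two more for each further level.
                for (int i = 0; i < 2 * depth - 1; i++) {
                    out.append(' ');
                }
                out.append("-> ");
            }
            out.append(node).append('\n');
            depth++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Deque<String> stack = new ArrayDeque<>();
        // push() puts each element on top, as a depth-first traversal would while descending.
        stack.push("org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2 () (context: plugin)");
        stack.push("commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)");
        stack.push("log4j:log4j:pom:1.2.12");
        System.out.print(render(stack));
    }
}
```

The stack is only read, never popped, so a collector could keep descending after writing the file.
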
>>
>> Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext and use it
>> for everything I build at work. Now I know why my
>> ~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory contains 57
>> different versions of plexus-utils. For example, why 1.0.4 from 2005?
>>
>> org.codehaus.plexus:plexus-utils:pom:1.0.4
>>  -> org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1 (compile) (context: plugin)
>>    -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile) (context: plugin)
>>      -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
>>        -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 () (context: plugin)
>>
>> Why Guava 10.0.1?
>>
>> com.google.guava:guava:pom:10.0.1
>>  -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 (compile) (context: plugin)
>>    -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile) (context: plugin)
>>      -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context: plugin)
>>        -> org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0 (runtime) (context: plugin)
>>          -> org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12 () (context: plugin)
>>
>> yes - Spring Boot 2.5.12...
>>
>> Why Log4j 2.10.0?
>>
>> org.apache.logging.log4j:log4j-api:pom:2.10.0
>>  -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile) (context: project)
>>    -> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE (compile) (context: project)
>>      -> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE (compile) (context: project)
>>        -> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE (compile) (context: project)
>>          -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1 (context: project)
>>
>> (see - this time the context is "project", not "plugin").
>>
>> And so on and so on.
>>
>> What is my motivation with this email? I don't know yet - ideally I'd like
>> to have this ".tracking" information created together with
>> "_remote.repositories" and "*.lastUpdated" metadata by Maven Resolver. It
>> could be optional of course (the overhead is really minimal: 1 more minute
>> when building Camel 3, i.e. 1 hour instead of 59 minutes).
>>
>> The only problem I had is that I had to fork/shade the
>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector class
>> because I had to manipulate the
>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>> stack around the call to
>> org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
>> Besides this, normal plexus/sisu components are used.
>>
>> The repository is https://github.com/grgrzybek/tracking-maven-extension
>> and I'd be happy to see some comments about this ;)
>>
>> kind regards
>> Grzegorz Grzybek
>>
>
