What I forgot to mention: in my case the trees in the gist are about
"resolving maven-core 3.5.8", but I guess you figured that out from the
tree...

T

On Mon, May 2, 2022 at 3:55 PM Tamás Cservenák <ta...@cservenak.net> wrote:

> Howdy,
>
> I did an experiment that (partially reusing your code to dump the reverse
> tree) produces this output:
> https://gist.github.com/cstamas/598a3266f943984442c00df30520294f
>
> (note: 1.8.0 resolver has two collector implementations: original
> Depth-First and new Breadth-First called DF and BF respectively)
>
> The code is not pushed anywhere yet, but I plan to make an API for this,
> and as you can see, it works for both collector implementations. Also, I
> hook ONLY into the collector, as that's the place where the graph is being
> built, but this is logically equivalent to your "More interesting ... 2nd
> case".
>
> Will ping once again when I have the changes....
>
> Thanks
> Tamas
>
> On Thu, Apr 28, 2022 at 9:01 PM Tamás Cservenák <ta...@cservenak.net>
> wrote:
>
>> Howdy,
>>
>> This is very cool, I was actually tinkering with very similar issues in
>> the resolver, coming from a totally different angle.
>>
>> And yes, the resolver collector is not quite "extension" friendly, but we
>> will make it right.
>> Just FYI, in the latest resolver (1.8.0) there are actually two
>> implementations: depth-first (original) and breadth-first (new).
>>
>> Looking at your code: collection is the most critical part of the resolver
>> regarding performance and memory, so "hooking" into it (like sending
>> events for each step) might not be the best idea, but still, what kind of
>> extension points would you envision in the collector?
>>
>> For example, to achieve what you want, it would be completely enough to
>> receive the final CollectResult (the full graph), no?
>> From a resolver perspective that would be the simplest, especially
>> now that we have two collector implementations...
>>
>> Also, in case of multi-threading, your shared stack would not cut it,
>> would it?
>>
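To make the shared-stack concern concrete, here is a minimal sketch of one possible alternative: keeping the "current path" per thread instead of in one shared structure. This is purely illustrative. The class PerThreadPath and its methods are hypothetical names, not code from the resolver or from the extension, and it only helps if a single traversal stays on a single thread.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: not part of Maven Resolver or the extension.
class PerThreadPath {

    // Each thread lazily gets its own stack, so concurrent traversals
    // cannot see or corrupt each other's path.
    private static final ThreadLocal<Deque<String>> PATH =
            ThreadLocal.withInitial(ArrayDeque::new);

    static void enter(String node) {
        PATH.get().push(node);
    }

    static void leave() {
        PATH.get().pop();
    }

    // Copy of the current thread's path, most recently entered node first.
    static List<String> snapshot() {
        return new ArrayList<>(PATH.get());
    }

    // Runs two traversals on two threads and records what each one saw.
    static Map<String, List<String>> demo() {
        Map<String, List<String>> seen = new ConcurrentHashMap<>();
        Runnable walk = () -> {
            String name = Thread.currentThread().getName();
            enter(name + "-root");
            enter(name + "-child");
            seen.put(name, snapshot());
            leave();
            leave();
        };
        Thread a = new Thread(walk, "A");
        Thread b = new Thread(walk, "B");
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Each thread only ever sees the nodes it entered itself, which is the property a single shared stack cannot give.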
>> I personally was also looking into these, especially after some of the
>> latest additions to resolver in 1.8.0 and current master....
>>
>>
>> Thanks
>> T
>>
>>
>> On Thu, Apr 28, 2022 at 12:45 PM Grzegorz Grzybek <gr.grzy...@gmail.com>
>> wrote:
>>
>>> Hello
>>>
>>> TL;DR: https://github.com/grgrzybek/tracking-maven-extension
>>>
>>> I'd like to share a proof of concept I made. It all started with the
>>> question: "Why am I getting log4j:log4j:1.2.12 in my local Maven
>>> repository when building a trivial project with a fresh local repo?"
>>>
>>> I knew I could `grep -r --include=*.pom 1.2.12` for the poms that
>>> declare the old log4j, but I needed something better.
>>>
>>> In short, I managed to persist the information available in the
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>>> stack.
>>> I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext or used
>>> with "-Dmaven.ext.class.path", which does two things:
>>>
>>>    1. adds an org.eclipse.aether.RepositoryListener component that writes
>>>    some information when a dependency is FIRST downloaded from a remote
>>>    repository
>>>    2. adds an org.eclipse.aether.impl.DependencyCollector component (an
>>>    extension of
>>>    org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
>>>    that writes some information when a dependency is resolved against the
>>>    local repository when it's already there (where no download is needed)
>>>
>>> In the first case, I write something like this:
>>>
>>> ~~~
>>> Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (
>>> https://repo.maven.apache.org/maven2, default, releases))
>>>    -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>>      -> commons-digester:commons-digester:jar:1.8 (compile) (context:
>>> plugin)
>>>        -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context:
>>> plugin)
>>>          -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1
>>> (compile)
>>> (context: plugin)
>>>            -> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
>>> (context: plugin)
>>>   Reading descriptor for artifact log4j:log4j:jar::1.2.12 (context:
>>> plugin)
>>> (scope: ?) (repository: central (https://repo.maven.apache.org/maven2,
>>> default, releases))
>>>     Transitive dependencies collection for
>>> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
>>>       Resolution of plugin
>>> org.apache.maven.plugins:maven-site-plugin:3.11.0 (org.apache:apache:25)
>>>
>>> Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (
>>> https://repo.maven.apache.org/maven2, default, releases))
>>>   Resolution of plugin com.mycila:license-maven-plugin:3.0
>>> (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
>>> ~~~
>>>
>>> I simply write some information from the available
>>> org.eclipse.aether.RepositoryEvent and the event's
>>> org.eclipse.aether.RequestTrace.
>>>
>>> More interesting information is written in the 2nd case. Because I wanted
>>> to track ALL attempts to resolve log4j:log4j:1.2.12 (and any other
>>> dependency), I needed some structure. I decided on this:
>>>
>>>    - every dependency directory (where e.g., _remote.repositories is
>>>    written along with the jar/pom/sha1/md5/...) gets a ".tracking"
>>>    directory
>>>    - in the ".tracking" directory I write files with names of this pattern:
>>>    "groupId_artifactId_type_classifier_version.dep", e.g.,
>>>    org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>>>    - each such file contains a _reverse dependency tree_ that shows me
>>>    why a given dependency was resolved.
>>>
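As a rough illustration of the naming scheme in the list above, here is a minimal sketch. The class TrackingPaths and its methods are hypothetical and are not taken from the extension's sources. It assumes the default local repository layout (groupId dots become directories) and that an empty classifier is simply skipped, as the example file name suggests.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: illustrates the naming pattern only.
class TrackingPaths {

    // Builds "groupId_artifactId_type[_classifier]_version.dep".
    // Assumption: an empty or null classifier contributes no segment.
    static String fileName(String groupId, String artifactId, String type,
                           String classifier, String version) {
        List<String> parts = new ArrayList<>();
        parts.add(groupId);
        parts.add(artifactId);
        parts.add(type);
        if (classifier != null && !classifier.isEmpty()) {
            parts.add(classifier);
        }
        parts.add(version);
        return String.join("_", parts) + ".dep";
    }

    // ".tracking" directory of the tracked dependency, assuming the
    // default local repository layout: group/path/artifactId/version.
    static Path trackingDir(Path localRepo, String groupId,
                            String artifactId, String version) {
        return localRepo.resolve(groupId.replace('.', '/'))
                .resolve(artifactId)
                .resolve(version)
                .resolve(".tracking");
    }

    public static void main(String[] args) {
        // Tracked dependency: log4j:log4j:1.2.12
        Path dir = trackingDir(Paths.get("repository"), "log4j", "log4j", "1.2.12");
        // Artifact whose resolution pulled it in: maven-dependency-plugin 3.1.2
        String name = fileName("org.apache.maven.plugins",
                "maven-dependency-plugin", "jar", "", "3.1.2");
        System.out.println(dir.resolve(name));
    }
}
```

Note that the directory identifies the dependency being tracked, while the file name identifies the artifact at the root of the resolution.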
>>> For example, take
>>>
>>> ~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
>>> (the path itself already tells us that
>>> org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends, directly
>>> or indirectly, on log4j:log4j:1.2.12).
>>> The content of this file is:
>>>
>>> log4j:log4j:pom:1.2.12
>>>  -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
>>>    -> commons-digester:commons-digester:jar:1.8 (compile) (context:
>>> plugin)
>>>      -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context:
>>> plugin)
>>>        -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4 (compile)
>>> (context: plugin)
>>>          -> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0
>>> (compile) (context: plugin)
>>>            -> org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2
>>> ()
>>> (context: plugin)
>>>
>>> It's kind of obvious: dependency-plugin, through maven-reporting-impl,
>>> doxia, velocity, commons-digester and commons-logging, "depends" on the
>>> vulnerable log4j:1.2.12 library every security scanner screams about.
>>>
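The reverse tree format shown above can be reproduced from a root-first path (such as the contents of a collector's node stack) with a few lines. This is only a sketch under that assumption. ReverseTree and render are hypothetical names, the indentation is inferred from the sample output, and String.repeat needs Java 11 or newer.

```java
import java.util.List;

// Hypothetical sketch: renders a root-first dependency path leaf-first.
class ReverseTree {

    // rootFirstPath: element 0 is the root (e.g. the plugin), the last
    // element is the dependency whose origin we want to explain.
    static String render(List<String> rootFirstPath) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (int i = rootFirstPath.size() - 1; i >= 0; i--) {
            if (depth > 0) {
                // 1 space for the first parent, then 2 more per level
                out.append(" ".repeat(2 * depth - 1)).append("-> ");
            }
            out.append(rootFirstPath.get(i)).append('\n');
            depth++;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(List.of(
                "org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 ()",
                "org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile)",
                "org.codehaus.plexus:plexus-velocity:jar:1.2 (compile)",
                "org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1 (compile)",
                "org.codehaus.plexus:plexus-utils:pom:1.0.4")));
    }
}
```

The tracked dependency comes first and each following line is the node that pulled in the one before it, ending at the root of the resolution.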
>>> Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext and
>>> build everything at work with it. Now I know why my
>>> ~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory contains 57
>>> different versions of plexus-utils. For example, why 1.0.4 from 2005?
>>>
>>> org.codehaus.plexus:plexus-utils:pom:1.0.4
>>>  -> org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1
>>> (compile) (context: plugin)
>>>    -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile) (context:
>>> plugin)
>>>      -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile)
>>> (context: plugin)
>>>        -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 ()
>>> (context: plugin)
>>>
>>> Why Guava 10.0.1?
>>>
>>> com.google.guava:guava:pom:10.0.1
>>>  -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 (compile)
>>> (context: plugin)
>>>    -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile) (context:
>>> plugin)
>>>      -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context: plugin)
>>>        -> org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0
>>> (runtime) (context: plugin)
>>>          -> org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12
>>> ()
>>> (context: plugin)
>>>
>>> yes - Spring Boot 2.5.12...
>>>
>>> Why Log4j 2.10.0?
>>>
>>> org.apache.logging.log4j:log4j-api:pom:2.10.0
>>>  -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile)
>>> (context:
>>> project)
>>>    ->
>>> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE
>>> (compile) (context: project)
>>>      -> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE
>>> (compile) (context: project)
>>>        ->
>>> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE
>>> (compile) (context: project)
>>>          -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1
>>> (context: project)
>>>
>>> (see - this time the context is "project", not "plugin").
>>>
>>> And so on and so on.
>>>
>>> What is my motivation with this email? I don't know yet. Ideally I'd like
>>> to have this ".tracking" information created together with the
>>> "_remote.repositories" and "*.lastUpdated" metadata by Maven Resolver. It
>>> could be optional of course (the overhead is really minimal: 1 more minute
>>> when building Camel 3, i.e. 1 hour instead of 59 minutes).
>>>
>>> The only problem was that I had to fork/shade the
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector class,
>>> because I had to manipulate the
>>>
>>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
>>> stack around the call to
>>>
>>> org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
>>> Besides this, normal plexus/sisu components are used.
>>>
>>> The repository is https://github.com/grgrzybek/tracking-maven-extension
>>> and
>>> I'd be happy to see some comments about this ;)
>>>
>>> kind regards
>>> Grzegorz Grzybek
>>>
>>
