Hello!

Thanks for your comments and the PR. I had to switch to other tasks,
but soon (next week?) I'm going to spend more time on it. I have yet to get
a feel for the graph/stack that could be passed around,
and to check the DF/BF dependency collectors (I didn't see them in
resolver 1.6.3). I'll keep the
https://issues.apache.org/jira/browse/MRESOLVER-248 tab open until I check
it ;)

kind regards
Grzegorz Grzybek


On Wed, 11 May 2022 at 18:40, Tamás Cservenák <ta...@cservenak.net> wrote:

> Howdy,
>
> https://github.com/apache/maven-resolver/pull/176
>
> So here is an implementation "demo" (that could be made into an extension
> point), as explained in the draft PR description.
> BUT, as also written in the PR, I am getting a feeling that doing this is
> "dangerous", and that a simple callback with the whole collected graph would
> be better....
>
>
> WDYT?
>
> Tamas
>
> On Mon, May 2, 2022 at 4:18 PM Tamás Cservenák <ta...@cservenak.net>
> wrote:
>
> > Howdy,
> >
> > just a few short answers:
> > - 1st: Personally, from a Resolver perspective, I'd just provide an API
> > (basically, what the author extending resolver should implement) and make it
> > simple to "click in" (Sisu component discovery).
> > - 2nd: resolver IMHO should not provide any out-of-the-box component
> > implementation at all
> >
> > So the 1st would provide a "stable" extension point for users who would like
> > to "integrate" with resolver at this point (like you did), but it would
> > become possible using simply this new API, instead of the hoops and loops
> > your code was forced to go through (as resolver is quite "closed" in this
> > respect).
> >
> > As for the 2nd point, while I do like your idea of "decorating" the local
> > repository, I'd try a slightly different route: I'd integrate
> > https://github.com/lambdazen/bitsy, which makes it possible to use Apache
> > Tinkerpop's Gremlin queries to ask about the built graph, for example...
> >
> > And one big remark: the collector is the "hottest point" in resolver (heap
> > and CPU wise), so ANY "new API" implementation should be aware that each
> > "lost" millisecond directly affects resolver collection speed. But I think
> > "research kind" of stuff, or just "recording the process result", should
> > fit in just fine. I don't see this as a "standard" feature of Maven, but
> > who knows? :)
> >
> > Just my 5 cents...
> >
> > HTH
> > Tamas
> >
> > On Mon, May 2, 2022 at 4:09 PM Grzegorz Grzybek <gr.grzy...@gmail.com>
> > wrote:
> >
> >> Thank you Tamás for checking my experiment
> >>
> >> I'm just finishing my work before tomorrow's national holiday, but will
> >> read your information more carefully soon.
> >>
> >> Whether it's DFS or BFS, as long as there's tracking from initial to
> >> ultimate dependency, it's enough. DFS sounds more "natural" here though. I
> >> didn't check the CollectResult class yet - is it created per dependency or
> >> for the entire project?
> >>
> >> And yes - I didn't check multithreading, as in the normal scenario (just
> >> `mvn clean install`) I didn't observe concurrency issues accessing the stack.
> >> Mind that I know a bit about Maven "components", but there are definitely
> >> a few things missing in my understanding.
> >>
> >> Checking your output, I see there are two aspects of this potential
> >> enhancement to the resolver:
> >>  - 1st - how to effectively collect the "reverse dependency tree" in
> >> context of DFS/BFS/multithreading
> >>  - 2nd - how to write the information
> >>
> >> 2nd aspect could include:
> >>  - whether there should be ".tracking" for each GAV directory in local repo
> >> (tracking for the purpose of entire local repository)
> >>  - maybe there should be configurable output location for single report of
> >> a build? (tracking for the purpose of single project)
> >>  - which format to use (human consumable or machine readable?)
> >>
> >> For now I've used resolver 1.6.3 from Maven 3.8.5, but I'll look at the
> >> `main` branch too.
> >>
> >> kind regards
> >> Grzegorz Grzybek
> >>
> >>
> >> On Mon, 2 May 2022 at 15:57, Tamás Cservenák <ta...@cservenak.net> wrote:
> >>
> >> > What I forgot to mention: in my case the trees in the gist are about
> >> > "resolving maven-core 3.5.8", but I guess you figured it out from the
> >> > tree....
> >> >
> >> > T
> >> >
> >> > On Mon, May 2, 2022 at 3:55 PM Tamás Cservenák <ta...@cservenak.net>
> >> > wrote:
> >> >
> >> > > Howdy,
> >> > >
> >> > > I did an experiment that (partially re-using your code to dump the
> >> > > reverse tree) produces this output:
> >> > > https://gist.github.com/cstamas/598a3266f943984442c00df30520294f
> >> > >
> >> > > (note: 1.8.0 resolver has two collector implementations: original
> >> > > Depth-First and new Breadth-First called DF and BF respectively)
> >> > >
> >> > > The code is not pushed anywhere yet, but I plan to make an API for this,
> >> > > and as you can see, it works for both collector implementations.
> >> > > Also, I hook ONLY into the collector, as that's the place where the graph
> >> > > is being built, but this is logically equivalent to your "More
> >> > > interesting ... 2nd case".
> >> > >
> >> > > Will ping once again when I have the changes....
> >> > >
> >> > > Thanks
> >> > > Tamas
> >> > >
> >> > > On Thu, Apr 28, 2022 at 9:01 PM Tamás Cservenák <ta...@cservenak.net>
> >> > > wrote:
> >> > >
> >> > >> Howdy,
> >> > >>
> >> > >> This is very cool, I was actually tinkering with very similar issues in
> >> > >> resolver, coming from a totally different angle.
> >> > >>
> >> > >> And yes, the resolver collector is not quite "extension" friendly, but
> >> > >> we will make it right.
> >> > >> Just FYI, in the latest resolver (1.8.0) there are actually two
> >> > >> implementations: depth-first (original) and breadth-first (new).
> >> > >>
> >> > >> By looking at your code: collection is the most critical part of the
> >> > >> resolver regarding performance and memory, so "hooking" into it (like
> >> > >> sending events for each step) might not be the best, but still, what
> >> > >> kind of extension points would you envision in the collector?
> >> > >>
> >> > >> For example, to achieve what you want, it would be completely enough to
> >> > >> receive the final CollectResult (the full graph), no?
> >> > >> As -- from a resolver perspective -- that would be simplest, especially
> >> > >> now that we have two collector implementations...
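
To illustrate the "receive the full graph" idea above, here is a minimal sketch in plain Java of deriving reverse dependency paths from an already collected graph with one depth-first walk. The `Node` class and the `pathsTo` method are hypothetical stand-ins invented for this example; they are not resolver API, and the sketch assumes the graph has no cycles.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ReversePaths {

    // Hypothetical stand-in for a node of the collected graph (not resolver API).
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label) {
            this.label = label;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Returns every path that ends at `target`, listed target-first, root-last.
    static List<List<String>> pathsTo(Node root, String target) {
        List<List<String>> result = new ArrayList<>();
        walk(root, target, new ArrayDeque<>(), result);
        return result;
    }

    // Depth-first walk; assumes an acyclic graph (no cycle detection here).
    private static void walk(Node node, String target, Deque<String> stack, List<List<String>> out) {
        stack.push(node.label); // push adds at the head, so iteration order is newest-first
        if (node.label.equals(target)) {
            out.add(new ArrayList<>(stack)); // snapshot: target first, root last
        }
        for (Node child : node.children) {
            walk(child, target, stack, out);
        }
        stack.pop();
    }
}
```

Because the stack lives only inside one walk over the finished graph, nothing is shared between collector threads in this variant.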
> >> > >>
> >> > >> Also, in case of multi threading, your shared stack would not cut it,
> >> > >> would it?
> >> > >>
> >> > >> I personally was also looking into these, especially after some of the
> >> > >> latest additions to resolver in 1.8.0 and current master....
> >> > >>
> >> > >>
> >> > >> Thanks
> >> > >> T
> >> > >>
> >> > >>
> >> > >> On Thu, Apr 28, 2022 at 12:45 PM Grzegorz Grzybek <gr.grzy...@gmail.com>
> >> > >> wrote:
> >> > >>
> >> > >>> Hello
> >> > >>>
> >> > >>> TL;DR: https://github.com/grgrzybek/tracking-maven-extension
> >> > >>>
> >> > >>> I'd like to share a proof of concept I made. It all started with the
> >> > >>> question "why am I getting log4j:log4j:1.2.12 in my local Maven
> >> > >>> repository when building a trivial project with a fresh local repo?"
> >> > >>>
> >> > >>> I knew it's possible to `grep -r --include=*.pom 1.2.12` for the POMs
> >> > >>> that declare old log4j, but I needed something better.
> >> > >>>
> >> > >>> In short - I managed to persist the information available in the
> >> > >>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
> >> > >>> stack.
> >> > >>> I wrote a Maven extension that can be put into $MAVEN_HOME/lib/ext or
> >> > >>> used with "-Dmaven.ext.class.path", which does two things:
> >> > >>>
> >> > >>>    1. adds an org.eclipse.aether.RepositoryListener component that writes
> >> > >>>    some information when a dependency is FIRST downloaded from a remote
> >> > >>>    repository
> >> > >>>    2. adds an org.eclipse.aether.impl.DependencyCollector component (an
> >> > >>>    extension of
> >> > >>>    org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector)
> >> > >>>    that writes some information when a dependency is resolved against the
> >> > >>>    local repository when it's already there (where no download is needed)
> >> > >>>
> >> > >>> In the first case, I write something like this:
> >> > >>>
> >> > >>> ~~~
> >> > >>> Downloaded artifact log4j:log4j:pom::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
> >> > >>>    -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
> >> > >>>      -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
> >> > >>>        -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
> >> > >>>          -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
> >> > >>>            -> org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 () (context: plugin)
> >> > >>>   Reading descriptor for artifact log4j:log4j:jar::1.2.12 (context: plugin) (scope: ?) (repository: central (https://repo.maven.apache.org/maven2, default, releases))
> >> > >>>     Transitive dependencies collection for org.apache.maven.plugins:maven-site-plugin:jar:3.11.0 ()
> >> > >>>       Resolution of plugin org.apache.maven.plugins:maven-site-plugin:3.11.0 (org.apache:apache:25)
> >> > >>> ~~~
> >> > >>> Downloaded artifact log4j:log4j:jar::1.2.12 (repository: central (https://repo.maven.apache.org/maven2, default, releases))
> >> > >>>   Resolution of plugin com.mycila:license-maven-plugin:3.0 (org.apache.camel:camel-buildtools:3.17.0-SNAPSHOT)
> >> > >>>
> >> > >>> I simply write some information from the available
> >> > >>> org.eclipse.aether.RepositoryEvent and the event's
> >> > >>> org.eclipse.aether.RequestTrace.
> >> > >>>
> >> > >>> More interesting information is written in the 2nd case. Because I
> >> > >>> wanted to track ALL attempts to resolve log4j:log4j:1.2.12 (and any other
> >> > >>> dependency), I needed some structure. And I decided on this:
> >> > >>>
> >> > >>>    - every dependency directory (where e.g., _remote.repositories is
> >> > >>>    written along with the jar/pom/sha1/md5/...) gets a ".tracking"
> >> > >>>    directory
> >> > >>>    - in the ".tracking" directory I write files with names of this pattern:
> >> > >>>    "groupId_artifactId_type_classifier_version.dep", e.g.,
> >> > >>>    org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
> >> > >>>    - each such file contains a _reverse dependency tree_ that shows me why
> >> > >>>    the given dependency was resolved.
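
The naming scheme in the list above could be sketched roughly as follows. This is only an illustration: the class and method names are hypothetical and not taken from the extension, and skipping an empty classifier is an assumption inferred from the example file name, which has no classifier segment.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class TrackingPaths {

    // Builds "groupId_artifactId_type[_classifier]_version.dep".
    // Assumption: an empty or null classifier is left out, as in the example
    // org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep.
    static String fileName(String groupId, String artifactId, String type,
                           String classifier, String version) {
        List<String> parts = new ArrayList<>();
        parts.add(groupId);
        parts.add(artifactId);
        parts.add(type);
        if (classifier != null && !classifier.isEmpty()) {
            parts.add(classifier);
        }
        parts.add(version);
        return String.join("_", parts) + ".dep";
    }

    // Locates the ".tracking" directory of the tracked dependency inside a
    // local repository that uses the default groupId/artifactId/version layout.
    static Path trackingDir(Path localRepo, String groupId, String artifactId, String version) {
        Path dir = localRepo;
        for (String segment : groupId.split("\\.")) {
            dir = dir.resolve(segment);
        }
        return dir.resolve(artifactId).resolve(version).resolve(".tracking");
    }
}
```

The tracked dependency determines the directory, while the artifact that pulled it in determines the file name.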
> >> > >>>
> >> > >>> For example, in
> >> > >>> ~/.m2/repository/log4j/log4j/1.2.12/.tracking/org.apache.maven.plugins_maven-dependency-plugin_jar_3.1.2.dep
> >> > >>> (the path itself already tells us that
> >> > >>> org.apache.maven.plugins:maven-dependency-plugin:3.1.2 depends, directly
> >> > >>> or indirectly, on log4j:log4j:1.2.12),
> >> > >>> the content of this file is:
> >> > >>>
> >> > >>> log4j:log4j:pom:1.2.12
> >> > >>>  -> commons-logging:commons-logging:jar:1.1 (compile) (context: plugin)
> >> > >>>    -> commons-digester:commons-digester:jar:1.8 (compile) (context: plugin)
> >> > >>>      -> org.apache.velocity:velocity-tools:jar:2.0 (compile) (context: plugin)
> >> > >>>        -> org.apache.maven.doxia:doxia-site-renderer:jar:1.7.4 (compile) (context: plugin)
> >> > >>>          -> org.apache.maven.reporting:maven-reporting-impl:jar:3.0.0 (compile) (context: plugin)
> >> > >>>            -> org.apache.maven.plugins:maven-dependency-plugin:jar:3.1.2 () (context: plugin)
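
The layout of the file content above (resolved dependency on the first line, each dependent one level deeper behind "->") can be reproduced with a few lines of Java. This is a sketch only: `ReverseTree.render` is a hypothetical name, and the indentation rule of one space plus two per additional level is read off the example, not taken from the extension's source.

```java
import java.util.List;

final class ReverseTree {

    // Renders a path ordered from the resolved dependency up to the root
    // (plugin or project). The first entry is printed as-is; entry i > 0 is
    // prefixed with (2 * i - 1) spaces and "-> ".
    static String render(List<String> path) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < path.size(); i++) {
            if (i > 0) {
                for (int s = 0; s < 2 * i - 1; s++) {
                    sb.append(' ');
                }
                sb.append("-> ");
            }
            sb.append(path.get(i)).append('\n');
        }
        return sb.toString();
    }
}
```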
> >> > >>>
> >> > >>> It's kind of obvious - dependency-plugin, through maven-reporting-impl,
> >> > >>> doxia, velocity, commons-digester and commons-logging, "depends" on the
> >> > >>> vulnerable log4j:1.2.12 library every security scanner screams about.
> >> > >>>
> >> > >>> Since I wrote this extension, I keep it in my $MAVEN_HOME/lib/ext and
> >> > >>> build everything at work with it. Now I know why my
> >> > >>> ~/.m2/repository/org/codehaus/plexus/plexus-utils/ directory contains 57
> >> > >>> different versions of plexus-utils. For example, why 1.0.4 from
> >> > >>> 2005?
> >> > >>>
> >> > >>> org.codehaus.plexus:plexus-utils:pom:1.0.4
> >> > >>>  -> org.codehaus.plexus:plexus-container-default:jar:1.0-alpha-9-stable-1 (compile) (context: plugin)
> >> > >>>    -> org.codehaus.plexus:plexus-velocity:jar:1.2 (compile) (context: plugin)
> >> > >>>      -> org.apache.maven.doxia:doxia-site-renderer:jar:1.11.1 (compile) (context: plugin)
> >> > >>>        -> org.apache.maven.plugins:maven-javadoc-plugin:jar:3.3.2 () (context: plugin)
> >> > >>>
> >> > >>> Why Guava 10.0.1?
> >> > >>>
> >> > >>> com.google.guava:guava:pom:10.0.1
> >> > >>>  -> org.eclipse.sisu:org.eclipse.sisu.plexus:jar:0.0.0.M5 (compile) (context: plugin)
> >> > >>>    -> org.apache.maven:maven-plugin-api:jar:3.1.1 (compile) (context: plugin)
> >> > >>>      -> org.apache.maven:maven-core:jar:3.1.1 (compile) (context: plugin)
> >> > >>>        -> org.apache.maven.shared:maven-common-artifact-filters:jar:3.2.0 (runtime) (context: plugin)
> >> > >>>          -> org.springframework.boot:spring-boot-maven-plugin:jar:2.5.12 () (context: plugin)
> >> > >>>
> >> > >>> yes - Spring Boot 2.5.12...
> >> > >>>
> >> > >>> Why Log4j 2.10.0?
> >> > >>>
> >> > >>> org.apache.logging.log4j:log4j-api:pom:2.10.0
> >> > >>>  -> org.apache.logging.log4j:log4j-to-slf4j:jar:2.10.0 (compile) (context: project)
> >> > >>>    -> org.springframework.boot:spring-boot-starter-logging:jar:2.0.5.RELEASE (compile) (context: project)
> >> > >>>      -> org.springframework.boot:spring-boot-starter:jar:2.0.5.RELEASE (compile) (context: project)
> >> > >>>        -> org.springframework.boot:spring-boot-starter-web:jar:2.0.5.RELEASE (compile) (context: project)
> >> > >>>          -> org.keycloak:keycloak-spring-boot-2-adapter:jar:17.0.1 (context: project)
> >> > >>>
> >> > >>> (see - this time the context is "project", not "plugin").
> >> > >>>
> >> > >>> And so on and so on.
> >> > >>>
> >> > >>> What is my motivation with this email? I don't know yet - ideally I'd
> >> > >>> like to have this ".tracking" information created together with the
> >> > >>> "_remote.repositories" and "*.lastUpdated" metadata by Maven Resolver.
> >> > >>> It could be optional of course (the overhead is really minimal: 1 more
> >> > >>> minute when building Camel 3, i.e. 1 hour instead of 59 minutes).
> >> > >>>
> >> > >>> The only problem I had is that I had to fork/shade the
> >> > >>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector class,
> >> > >>> because I had to manipulate the
> >> > >>> org.eclipse.aether.internal.impl.collect.DefaultDependencyCollector.Args#nodes
> >> > >>> stack around the call to
> >> > >>> org.jboss.fuse.mvnplugins.tracker.TrackingDependencyCollector#processDependency().
> >> > >>> Besides this, normal plexus/sisu components are used.
> >> > >>>
> >> > >>> The repository is https://github.com/grgrzybek/tracking-maven-extension
> >> > >>> and I'd be happy to see some comments about this ;)
> >> > >>>
> >> > >>> kind regards
> >> > >>> Grzegorz Grzybek
> >> > >>>
> >> > >>
> >> >
> >>
> >
>
