Re: build q: getting the full DAG or all maven dependencies

2024-02-13 Thread Ayush Saxena
Does the hadoop-dist module help? It does the packaging for hadoop. IIRC
it declares all the projects so that dist kicks in after everything else is
built [1], and its pom references the scripts which do the packaging work.

For protobuf, I think the yarn modules don't define a scope, so it
defaults to compile. Putting the scope in the parent pom [2] should help;
I tried that locally and it did. Otherwise you need to define the scope in
every POM which uses protobuf.
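
To find which POMs would need that change, something like the following
could list direct dependencies that carry no explicit scope. This is a
minimal sketch, not part of the Hadoop build: the function name and the
sample POM are hypothetical, and a missing <scope> only marks a candidate,
since the effective scope may still come from dependencyManagement in a
parent pom.

```python
import xml.etree.ElementTree as ET

# Standard Maven POM namespace.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical POM used only to exercise the function.
SAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <dependencies>
    <dependency>
      <groupId>com.google.protobuf</groupId>
      <artifactId>protobuf-java</artifactId>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
"""


def unscoped_dependencies(pom_xml, artifact_id):
    """Return (groupId, artifactId) for direct deps on artifact_id lacking <scope>."""
    root = ET.fromstring(pom_xml)
    hits = []
    # Only <project>/<dependencies> is inspected, not <dependencyManagement>.
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        name = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
        if name != artifact_id:
            continue
        if dep.find("m:scope", POM_NS) is None:
            group = dep.findtext("m:groupId", default="", namespaces=POM_NS)
            hits.append((group, name))
    return hits
```

Run over each module's pom.xml, a non-empty result marks a module where
protobuf falls back to whatever scope the parent pom manages, or to compile
if the parent sets none.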


-Ayush

[1] https://github.com/apache/hadoop/blob/trunk/hadoop-dist/pom.xml#L32
[2] https://github.com/apache/hadoop/commit/c373d3fa39013e8a8f6a6122c3ca230b4aa10abe


On Tue, 13 Feb 2024 at 23:37, Steve Loughran wrote:

> It does, but I'm not sure there is a single module where you can ask for
> it and get the full list.
>
> For the verification project I've got, I may declare more poms as
> dependencies so I can do the aggregate scan there. This would also let me
> run mvn dependency:tree in verbose mode, save the output to a file and see
> what is there. That would let us define lists of libraries we don't want
> in distributions.


Re: build q: getting the full DAG or all maven dependencies

2024-02-13 Thread Sangjin Lee
I tried running mvn dependency:tree at the root project (hadoop) and got a
meaningful dependency tree. It prints an exhaustive, transitive tree of
dependencies.

As for log4j with your patch, I see two ways log4j is introduced:
- log4j -> hadoop-common@2.8.5 ->
  hadoop-yarn-server-timelineservice-hbase-tests
- log4j -> solr:solr-core@8.11.2 ->
  hadoop-yarn-applications-catalog-webapp:war

That said, both are test-scoped. I'm not sure why we're packaging test-only
dependencies into the hadoop distro. Is it a known thing?
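
Paths like the two above can be pulled out of saved dependency:tree output
mechanically. The following is a rough sketch, assuming the plugin's default
(non-verbose) text format with an "[INFO] " prefix and three-character
indentation per level; the function name and the sample tree are
hypothetical, not real Hadoop output.

```python
import re

# Each nesting level in the text tree is three characters wide:
# "+- ", "\- ", "|  " or "   ".
_PREFIX = re.compile(r"^((?:\+- |\\- |\|  |   )*)(.*)$")

# Hypothetical output used only to exercise the parser.
SAMPLE_TREE = r"""
[INFO] org.example:app:jar:1.0
[INFO] +- org.example:lib-a:jar:1.0:compile
[INFO] |  \- log4j:log4j:jar:1.2.17:compile
[INFO] \- org.example:lib-b:jar:1.0:test
[INFO]    \- org.slf4j:slf4j-api:jar:1.7.36:test
"""


def paths_to(tree_text, needle):
    """Return root-to-node paths for every node whose coordinate starts with needle."""
    stack = []   # coordinates from the module root down to the current node
    found = []
    for raw in tree_text.splitlines():
        line = raw[len("[INFO] "):] if raw.startswith("[INFO] ") else raw
        match = _PREFIX.match(line)
        indent, rest = match.group(1), match.group(2)
        tokens = rest.split()
        # Skip banners and blank lines: a node looks like group:artifact:type:...
        if not tokens or tokens[0].count(":") < 2:
            continue
        coord = tokens[0]
        depth = len(indent) // 3
        del stack[depth:]          # drop siblings and deeper nodes
        stack.append(coord)
        if coord.startswith(needle + ":"):
            found.append(list(stack))
    return found
```

Each returned path starts at the module that owns the tree and ends at the
matching artifact, with the scope as the last field of every dependency
coordinate, so test-scoped paths are easy to spot.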

Sangjin


On Tue, Feb 13, 2024 at 10:07 AM Steve Loughran wrote:

> It does, but I'm not sure there is a single module where you can ask for
> it and get the full list.
>
> For the verification project I've got, I may declare more poms as
> dependencies so I can do the aggregate scan there. This would also let me
> run mvn dependency:tree in verbose mode, save the output to a file and see
> what is there. That would let us define lists of libraries we don't want
> in distributions.


Re: build q: getting the full DAG or all maven dependencies

2024-02-13 Thread Steve Loughran
It does, but I'm not sure there is a single module where you can ask for
it and get the full list.

For the verification project I've got, I may declare more poms as
dependencies so I can do the aggregate scan there. This would also let me
run mvn dependency:tree in verbose mode, save the output to a file and see
what is there. That would let us define lists of libraries we don't want
in distributions.
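
The "libraries we don't want in distributions" check could look roughly like
this. It is only a sketch under stated assumptions: the deny-list patterns
and function names are hypothetical, chosen to match the two jars discussed
in this thread, and are not an agreed project policy.

```python
import fnmatch
import os

# Hypothetical deny-list of jar file-name patterns (illustrative only).
BANNED = ["log4j-1.*.jar", "protobuf-java-2.5*.jar"]


def banned_paths(paths, patterns=BANNED):
    """Return, sorted, the paths whose file name matches any banned pattern."""
    hits = []
    for path in paths:
        name = os.path.basename(path)
        if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
            hits.append(path)
    return sorted(hits)


def jars_under(dist_root):
    """Yield every .jar path below an unpacked distribution, relative to its root."""
    for dirpath, _dirs, files in os.walk(dist_root):
        for name in files:
            if name.endswith(".jar"):
                yield os.path.relpath(os.path.join(dirpath, name), dist_root)
```

Calling banned_paths(jars_under(root)) on an unpacked distribution would then
list the offending jars together with the lib directory they landed in.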

On Mon, 12 Feb 2024 at 18:03, Sangjin Lee wrote:

> Does the maven dependency plugin help? I might try mvn dependency:tree and
> see if it takes you somewhere.
>
> Sangjin


Re: build q: getting the full DAG or all maven dependencies

2024-02-12 Thread Sangjin Lee
Does the maven dependency plugin help? I might try mvn dependency:tree and
see if it takes you somewhere.

Sangjin


On Mon, Feb 12, 2024 at 9:50 AM Steve Loughran wrote:

> How can we work out the entire DAG of dependencies in a hadoop distro?
>
> I'm asking as there are things in 3.4.0 that we shouldn't need (protobuf
> 2.5), and when I add the PR to move off log4j 1.2.17 to reload4j, I still
> find a log4j jar in the yarn timeline lib dir:
> https://github.com/apache/hadoop/pull/6547
>
> See HADOOP-19074 for a list of what is in 3.4.0 RC0, which predates the new
> shaded jar.


build q: getting the full DAG or all maven dependencies

2024-02-12 Thread Steve Loughran
How can we work out the entire DAG of dependencies in a hadoop distro?

I'm asking as there are things in 3.4.0 that we shouldn't need (protobuf
2.5), and when I add the PR to move off log4j 1.2.17 to reload4j, I still
find a log4j jar in the yarn timeline lib dir:
https://github.com/apache/hadoop/pull/6547

See HADOOP-19074 for a list of what is in 3.4.0 RC0, which predates the new
shaded jar.