Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-07 Thread Sunil Govindan
hadoop-sls is used for perf tests.

Thanks
Sunil

On Thu, Apr 7, 2022 at 11:57 AM Owen O'Malley wrote:

> We really need input from the users as well. In trunk, hadoop-ozone and
> hadoop-ant are already gone.
>
> Used:
> hadoop-aliyun
> hadoop-aws
> hadoop-distcp
> hadoop-azure-datalake
> hadoop-azure
> hadoop-dynamometer
> hadoop-federation-balance
>
> Questions:
> hadoop-archive-logs
> hadoop-archives
> hadoop-datajoin
> hadoop-extras
> hadoop-fs2img
> hadoop-gridmix
> hadoop-kafka
> hadoop-openstack
> hadoop-pipes
> hadoop-resourceestimator
> hadoop-rumen
> hadoop-sls
> hadoop-streaming


Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-07 Thread Owen O'Malley
We really need input from the users as well. In trunk, hadoop-ozone and
hadoop-ant are already gone.

Used:
hadoop-aliyun
hadoop-aws
hadoop-distcp
hadoop-azure-datalake
hadoop-azure
hadoop-dynamometer
hadoop-federation-balance

Questions:
hadoop-archive-logs
hadoop-archives
hadoop-datajoin
hadoop-extras
hadoop-fs2img
hadoop-gridmix
hadoop-kafka
hadoop-openstack
hadoop-pipes
hadoop-resourceestimator
hadoop-rumen
hadoop-sls
hadoop-streaming

On Thu, Apr 7, 2022 at 3:17 PM Ayush Saxena wrote:

> Attic will not take this, or it will be very painful for us; they in general
> take the entire project. Had a word with infra folks as well; we can
> probably move them under a separate repo under hadoop and then archive it.
> Dependabot and such don’t scan archived repos. This should also serve our
> purpose.
> Does that make sense?
>
> -Ayush


Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-07 Thread Ayush Saxena
Attic will not take this, or it will be very painful for us; they in general
take the entire project. Had a word with infra folks as well; we can probably
move them under a separate repo under hadoop and then archive it.
Dependabot and such don’t scan archived repos. This should also serve our
purpose.
Does that make sense?

-Ayush

> On 07-Apr-2022, at 7:03 PM, Steve Loughran wrote:
>
> do that and we still have to worry about CVEs, dependabot complaints,
> build-breaking changes, etc.
>
> I am being more ruthless: can we move these into some attic repo?

-
To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-dev-h...@hadoop.apache.org



Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-07 Thread Steve Loughran
do that and we still have to worry about CVEs, dependabot complaints,
build-breaking changes, etc.

I am being more ruthless: can we move these into some attic repo?

On Fri, 1 Apr 2022 at 09:39, Vinayakumar B wrote:

> We can move the unused ones to a separate repo under hadoop, and introduce a
> separate, independent release cycle if required.
>
> Any thoughts on this?
> -Vinay


Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-07 Thread Steve Loughran
hey, we aren't to touch distcp. everyone uses that.

good to know of the other two

On Fri, 1 Apr 2022 at 09:01, Ayush Saxena wrote:

> Distcp (I have a use case for it)
> Hadoop-federation-balance (RBF uses it; we will take care of it if it
> causes trouble)
> Hadoop-Dynamometer (judging by the JIRA activity, I feel this is being used)
>
> So we should let these three stay as is.
> Of the others, the object store ones are active; you know which are needed.
> For the rest, we can call for a vote and drop them if everyone agrees.
>
> -Ayush


Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-01 Thread Vinayakumar B
We can move the unused ones to a separate repo under hadoop, and introduce a
separate, independent release cycle if required.

Any thoughts on this?
-Vinay

On Fri, 1 Apr 2022 at 1:31 PM, Ayush Saxena wrote:

> Distcp (I have a use case for it)
> Hadoop-federation-balance (RBF uses it; we will take care of it if it
> causes trouble)
> Hadoop-Dynamometer (judging by the JIRA activity, I feel this is being used)
>
> So we should let these three stay as is.
> Of the others, the object store ones are active; you know which are needed.
> For the rest, we can call for a vote and drop them if everyone agrees.
>
> -Ayush


Re: [DISCUSS] cull unused modules from under hadoop-tools

2022-04-01 Thread Ayush Saxena
Distcp (I have a use case for it)
Hadoop-federation-balance (RBF uses it; we will take care of it if it causes trouble)
Hadoop-Dynamometer (judging by the JIRA activity, I feel this is being used)

So we should let these three stay as is.
Of the others, the object store ones are active; you know which are needed.
For the rest, we can call for a vote and drop them if everyone agrees.

-Ayush

> On 31-Mar-2022, at 4:40 PM, Steve Loughran wrote:
>
> how many of the modules under hadoop-tools get used/maintained?
> 
> hadoop-aliyun
> hadoop-ant
> hadoop-archive-logs
> hadoop-archives
> hadoop-aws
> hadoop-azure
> hadoop-azure-datalake
> hadoop-datajoin
> hadoop-distcp
> hadoop-dynamometer
> hadoop-extras
> hadoop-federation-balance
> hadoop-fs2img
> hadoop-ftp
> hadoop-gridmix
> hadoop-kafka
> hadoop-openstack
> hadoop-ozone
> hadoop-pipes
> hadoop-resourceestimator
> hadoop-rumen
> hadoop-sls
> hadoop-streaming
> 
> I know distcp is universal, and the aws, azure, aliyun modules are
> active. hadoop-azure-datalake doesn't get maintenance, but it should stay
> around until Microsoft remove the gen1 ADLS service.
>
> But what about all the others? the hadoop-openstack one hasn't been touched
> or tested for a few years, and IMO could be cut immediately. what about
> others? does hadoop-streaming or hadoop-pipes get used any more?
> 
> Existing code may use these, but having them in the codebase only creates
> maintenance work, especially if security fixes need to go in on the code or
> are caused by dependencies.




[DISCUSS] cull unused modules from under hadoop-tools

2022-03-31 Thread Steve Loughran
how many of the modules under hadoop-tools get used/maintained?

hadoop-aliyun
hadoop-ant
hadoop-archive-logs
hadoop-archives
hadoop-aws
hadoop-azure
hadoop-azure-datalake
hadoop-datajoin
hadoop-distcp
hadoop-dynamometer
hadoop-extras
hadoop-federation-balance
hadoop-fs2img
hadoop-ftp
hadoop-gridmix
hadoop-kafka
hadoop-openstack
hadoop-ozone
hadoop-pipes
hadoop-resourceestimator
hadoop-rumen
hadoop-sls
hadoop-streaming

I know distcp is universal, and the aws, azure, aliyun modules are
active. hadoop-azure-datalake doesn't get maintenance, but it should stay
around until Microsoft remove the gen1 ADLS service.

But what about all the others? the hadoop-openstack one hasn't been touched
or tested for a few years, and IMO could be cut immediately. what about
others? does hadoop-streaming or hadoop-pipes get used any more?

Existing code may use these, but having them in the codebase only creates
maintenance work, especially if security fixes need to go in on the code or
are caused by dependencies.