Yes, I think it would make sense to deprecate Hadoop 2 in Druid 27 if we're
planning to remove support in Druid 28.

I haven't looked into what it would take to make the Hadoop integration into an 
optional extension. That would be really nice though. Has anyone on this list 
looked into it?

Gian

On 2023/06/29 19:49:42 Xavier Léauté wrote:
> +1, does this mean we would mark Hadoop 2 deprecated in Druid 27?
> 
> Also, do we have a broader plan to remove Hadoop in general from core
> dependencies and make it an optional extension?
> 
> On Tue, Jun 27, 2023 at 11:53 PM Karan Kumar <karankumar1...@gmail.com>
> wrote:
> 
> > In favour of dropping Hadoop 2 support. Another point is the lack of
> > security and vulnerability fixes in Hadoop 2.
> >
> >
> >
> > On Wed, Jun 28, 2023 at 12:17 PM Clint Wylie <cwy...@apache.org> wrote:
> >
> > > obvious +1 from me
> > >
> > > On Tue, Jun 27, 2023 at 11:42 PM Gian Merlino <g...@apache.org> wrote:
> > > >
> > > > I'd like to propose dropping support for Hadoop 2 in Druid 28. Not the
> > > > very next release (which I assume will be Druid 27) but the one after
> > > > that, likely late 2023 timeframe.
> > > >
> > > > In 2021, we had a discussion about moving away from Hadoop 2:
> > > > https://lists.apache.org/thread/zmc389trnkh6x444so8mdb2h0x0noqq4. For
> > > > various reasons, it didn't seem like the right time. However, I believe
> > > > now is the right time:
> > > >
> > > > 1) We didn't support Hadoop 3 in 2021, but we support it now. There is
> > > > now a Hadoop 3 build profile, as well as convenience binaries on
> > > > https://druid.apache.org/downloads.html.
> > > >
> > > > 2) We have SQL-based ingest with MSQ tasks, which provides a built-in /
> > > > scalable / robust alternative to using Hadoop at all.
> > > >
> > > > 3) It has been an additional two years. Hadoop 2 is that much older,
> > > > that much more time has passed since it was superseded by Hadoop 3, and
> > > > people have had that much more time to migrate.
> > > >
> > > > 4) The original main reason for wanting to move away from Hadoop 2 is
> > > > still relevant. It keeps us on various old dependencies, including an
> > > > ancient version of Guava, which in turn has been keeping us on an
> > > > ancient version of Calcite. The Calcite community has graciously
> > > > decided to support this old version of Guava for at least one release,
> > > > but plans to drop support by Calcite 1.36, leaving us back in the same
> > > > position. Managing this situation is time-consuming for both Druid and
> > > > Calcite maintainers.
> > > >
> > > > 5) Other solutions beyond dropping Hadoop 2 support were proposed in
> > > > 2021, such as reworking Hadoop support to be purely extension based,
> > > > and reworking extensions to be more isolated from each other. However,
> > > > these are both substantially more complex than dropping support, and in
> > > > the two years since the original thread, these more complex solutions
> > > > have not been implemented. So, I think we need to move on with the
> > > > simpler solution of dropping support.
> > > >
> > > > Gian
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscr...@druid.apache.org
> > > For additional commands, e-mail: dev-h...@druid.apache.org
> > >
> > >
> >
> > --
> > Thanks
> > Karan
> >
> 
