On Mon, 4 Nov 2024 at 09:02, Fokko Driesprong <fo...@apache.org> wrote:
> Hi everyone,
>
> Breaking the radio silence from my end, I was enjoying paternity leave.
>
> I have wanted to bring this up for a while. In Parquet we're still supporting
> Hadoop 2.7.3, which was released in August 2016
> <https://hadoop.apache.org/release/2.7.3.html>. For things like JDK21
> support, we have to drop these old versions. I was curious about what
> everyone thinks is a reasonable lower bound.
>
> My suggested route is to bump it to Hadoop 2.9.3
> <https://github.com/apache/parquet-java/pull/2944/> (November 2019) for
> Parquet 1.15.0, and then drop Hadoop 2 in the major release after that. Any
> thoughts, questions or concerns?
>

I'd be ruthless and say Hadoop 3.3.x only. Hadoop 2.x is nominally "Java 7"
only. Really. Hadoop 3.3.x is Java 8, but you really need to be on Hadoop
3.4.x to get a set of dependencies which work OK with Java 17+.

Staying with older releases hampers Parquet in terms of testing, maintenance,
the inability to use improvements written in the past five or more years,
and more.

My proposal would be:

- 1.14.x: move to Hadoop 2.9.3
- 1.15.x: Hadoop 3.3.x only
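
For what it's worth, the bump itself is essentially a one-property change in
the parent pom. A rough sketch below; the property name and exact patch
versions are assumptions for illustration, not the actual parquet-java build
configuration:

```xml
<!-- Hypothetical parent-pom fragment; the real property name in
     parquet-java's pom.xml may differ. -->
<properties>
  <!-- 1.14.x: raise the floor from 2.7.3 to 2.9.3 -->
  <hadoop.version>2.9.3</hadoop.version>

  <!-- 1.15.x and later: Hadoop 3.3.x only, e.g. -->
  <!-- <hadoop.version>3.3.6</hadoop.version> -->
</properties>
```

The real work is less the property change than verifying that no code paths
still depend on Hadoop-2-only APIs once the floor moves.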