hadoop-api-shim?

On Thu, 1 Jun 2023 at 04:07, Ayush Saxena <ayush...@gmail.com> wrote:

> +1, for the new repo.
>
> The name sounds fine, but it would be good to have a “hadoop-” prefix if
> there is scope for it; we have that for almost all of the subprojects/modules.
>
> Could hadoop-shims, hadoop-shims-api, or something along similar lines work?
>
> -Ayush
>
> > On 01-Jun-2023, at 1:18 AM, Steve Loughran <ste...@cloudera.com.invalid> wrote:
> >
> > I want to create a new repository for a shim library that allows previous
> > releases to access the more recent hadoop filesystem APIs. Currently the
> > open source implementations of Parquet and ORC can't use vectored IO, in
> > particular, even though we can in Cloudera. Providing a shim opens them up
> > to all *and* gets the APIs more broadly stressed/tested.
> > This needs to be in its own repository, not just for rapid initial release,
> > but because it is designed to be built against as old a version of hadoop as
> > we can reasonably support, which IMO means hadoop 3.1.0+. I know parquet still
> > wants to build against 2.8.x, but to claim support for hadoop 2 means
> > "build and test on java7", which is unrealistic in 2023.
> >
> > Initial WiP implementation, which works with 3.1.0 and tests against others:
> > https://github.com/steveloughran/fs-api-shim
> > The complexity is in testing this: I have contract tests which then need
> > to be executed on every supported hadoop release, which will need a
> > separate module for each one.
> >
> > I can create the repo easily enough; I would just like approval. And is the
> > name OK?
> >
> > steve
>
