That may break many downstream projects. At the very least, we cannot break
parquet-hadoop (or any other existing module). If you can add a new module,
such as parquet-core, that provides limited reader/writer features without
Hadoop support, and then make parquet-hadoop depend on parquet-core, that
would be acceptable.

One possible workaround is to replace the various Hadoop dependencies in
parquet-mr with hadoop-client-api and hadoop-client-runtime. This may make
it much easier for users to add the Hadoop dependency. However, these
artifacts are only available from Hadoop 3.0.0 onward.

On Fri, Jun 9, 2023 at 3:18 PM Atour Mousavi Gourabi <[email protected]> wrote:

> Hi Gang,
>
> Backward compatibility does indeed seem challenging here. Especially as
> I'd rather see the writers/readers moved out of parquet-hadoop after
> they've been decoupled. What are your thoughts on this?
>
> Best regards,
> Atour
> ________________________________
> From: Gang Wu <[email protected]>
> Sent: Friday, June 9, 2023 3:32 AM
> To: [email protected] <[email protected]>
> Subject: Re: Parquet without Hadoop dependencies
>
> Hi Atour,
>
> Thanks for bringing this up!
>
> From what I observed in PARQUET-1822, I think it is a valid use
> case to support Parquet reading/writing without Hadoop installed.
> The challenge is backward compatibility. It would be great if you can
> work on it.
>
> Best,
> Gang
>
> On Fri, Jun 9, 2023 at 12:24 AM Atour Mousavi Gourabi <[email protected]>
> wrote:
>
> > Dear all,
> >
> > The Java implementations of the Parquet readers and writers seem pretty
> > tightly coupled to Hadoop (see: PARQUET-1822). For some projects, this can
> > cause issues as it's an unnecessary and big dependency when you might just
> > need to write to disk. Is there any appetite here for separating the Hadoop
> > code and supporting more convenient ways to write to disk out of the box? I
> > am willing to work on these changes but would like some pointers on whether
> > such patches would be reviewed and accepted as PARQUET-1822 has been open
> > for over three years now.
> >
> > Best regards,
> > Atour Mousavi Gourabi
> >
>
