Re: [DISCUSS] Remove Arrow from Hive

Simhadri G Tue, 15 Apr 2025 09:33:24 -0700

Hello,

The only use case I can recall is from when we used the HWC-LLAP mode
configured to leverage Apache Arrow. In this setup, Spark would submit
queries on Hive ACID tables through HWC. The queries were executed on LLAP,
and the data exchange between LLAP and Spark was in the Apache Arrow
format. This allowed the result set to be natively available for further
processing within Spark.


However, the HWC-LLAP mode has not been in use for quite some time now, as
it has been superseded by other modes.

At this point, I am not aware of any other active use cases involving the
Arrow code in Hive.

Best regards,

Simhadri G

On Mon, Apr 14, 2025, 7:35 PM Shohei Okumiya <oku...@apache.org> wrote:

> Hi Laszlo,
>
> I had a chance to talk with Apache Arrow PMC members[1] a couple of
> days ago and shared my thoughts on integrating Hive with Arrow. The
> timing is surprising.
>
> Before the meetup, I looked into the only known use case and couldn't
> find a practical need for it. So, I am +1 on removing the feature.
>
> - [1] https://red-data-tools.connpass.com/event/349680/
>
> Regards,
> Okumin
>
> On Mon, Apr 14, 2025 at 10:13 PM László Bodor <bodorlaszlo0...@gmail.com>
> wrote:
> >
> > Hi all!
> >
> > https://issues.apache.org/jira/browse/HIVE-28904
> > https://github.com/apache/hive/pull/5772
> >
> > What do you think about removing Arrow from Hive?
> >
> > As far as I can remember, it was originally introduced for the Hive
> > Warehouse Connector in its early days. Since then, other reader modes
> > (Direct Reader v1, Direct Reader v2, and later Secure Access Mode) have
> > superseded the initial read mode that used Arrow.
> >
> > Can anyone recall any current use cases where Arrow is still used in
> > Hive?
> >
> >
> > Regards,
> > Laszlo Bodor
> >
>
