Re: Subject: [DISCUSS] Removing Gandiva from the Arrow adapter

Alessandro Solimando Thu, 04 Jun 2026 00:56:53 -0700

If it's no longer supported, that alone is a good enough argument to
replace it.


The proposed plan looks reasonable to me.

Alessandro

On Thu, Jun 4, 2026, 09:45 jensen <[email protected]> wrote:

> I vaguely recall that Arrow's official website also discourages the use of
> Gandiva, and Gandiva is no longer maintained. If possible, I think we
> should remove this dependency.
>
>
>
> Best regards,
>
> Zhen
>
> ---- Replied Message ----
> | From | Cancai Cai<[email protected]> |
> | Date | 6/4/2026 14:20 |
> | To | <[email protected]> |
> | Subject | Subject: [DISCUSS] Removing Gandiva from the Arrow adapter |
> Hi all,
>
> I would like to discuss the future of Gandiva in Calcite's Arrow adapter.
> My preferred long-term direction is to remove the Gandiva dependency from
> the adapter.
>
> The current adapter uses Arrow Java to read Arrow data, but relies on
> Gandiva `Projector` and `Filter` for projection and filter execution.
> Gandiva is a native LLVM-based runtime, and the Java module is a wrapper
> around that native implementation. As a result, basic Arrow adapter queries
> depend on native libraries, LLVM compatibility, platform packaging, and JDK
> baseline details.
>
> This has become a practical maintenance problem when thinking about Arrow
> dependency upgrades.
>
> The upgrade problem is not limited to Java bytecode compatibility. In our
> experiments, newer Arrow versions failed at different layers. Arrow 18
> requires a newer Java baseline than Calcite currently supports in its JDK 8
> jobs. Arrow 17 and 16.1 still use Java 8 class files, but can hit Java
> runtime API incompatibilities on JDK 8, such as `ByteBuffer.flip():
> ByteBuffer`. Arrow 16.0 avoided that Java runtime issue, but exposed Gandiva
> native / LLVM symbol issues on Linux CI.
>
> This means that as long as `arrow-gandiva` is required for the adapter's
> correctness path, upgrading the Arrow Java vector layer also requires
> validating the native Gandiva stack across all CI platforms. Even when
> `arrow-vector` itself is usable, `arrow-gandiva` can still block the
> upgrade.
>
> For that reason, I think the adapter should make projection/filter
> correctness independent of Gandiva first. Once the Java correctness path is
> in place, Arrow vector upgrades can be evaluated separately from Gandiva
> native compatibility.
>
> The direction I have in mind is a pure Java correctness path for the Arrow
> adapter:
>
> * read Arrow data with `ArrowFileReader`, `VectorSchemaRoot`, and
> `ValueVector`;
> * execute simple projections by reading selected vectors directly;
> * execute the simple filters currently translated by `ArrowTranslator` with
> a Java evaluator;
> * leave expressions that are not pushed into the adapter to Calcite's
> normal Enumerable / code generation path.
>
> With that model, Gandiva would no longer be required for correctness. A
> staged migration could be:
>
> 1. Move no-filter simple projection away from Gandiva.
> 2. Add Java evaluation for the simple filter subset currently supported by
> `ArrowTranslator`.
> 3. Validate that existing Arrow adapter tests pass without invoking
> Gandiva.
> 4. Remove `arrow-gandiva` from the adapter dependency set once the Java
> path covers the current behavior.
>
> The tradeoff is that Gandiva may be faster for supported expressions. But
> for this adapter, I think correctness, portability, and dependency
> stability should come first. If acceleration is needed later, it can be
> discussed separately.
>
> Does this direction make sense to the community? Are there current use
> cases that depend on Gandiva pushdown strongly enough that we should keep
> the native dependency?
>
> Thanks,
> Cancai
>
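
On the `ByteBuffer.flip(): ByteBuffer` point in the quoted message: as far
as I understand it, JDK 9 added covariant overrides so that
`ByteBuffer.flip()` returns `ByteBuffer`, while JDK 8 only has
`Buffer.flip()` returning `Buffer`. A library compiled against a newer JDK
class library without `--release 8` can therefore produce Java 8 class files
that still link to the newer method descriptor and fail with
`NoSuchMethodError` on JDK 8. Below is a minimal sketch of the usual
source-level workaround, calling through the `Buffer` supertype. The names
`FlipCompat` and `flipCompat` are hypothetical and not taken from Arrow or
Calcite.

```java
import java.nio.Buffer;
import java.nio.ByteBuffer;

public class FlipCompat {
  /**
   * Flips the buffer through the Buffer supertype, so the compiled call
   * site references Buffer.flip(), which exists on JDK 8 and later.
   */
  public static ByteBuffer flipCompat(ByteBuffer buf) {
    ((Buffer) buf).flip();
    return buf;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(16);
    buf.putInt(42);   // position advances to 4
    flipCompat(buf);  // limit = 4, position = 0
    System.out.println(buf.remaining() + " " + buf.getInt()); // prints "4 42"
  }
}
```

This only helps code we compile ourselves; it does not repair a published
Arrow artifact that already references the newer descriptor.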

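The pure Java correctness path described in the quoted proposal (simple
projection by reading selected vectors, plus a Java evaluator for simple
filters) could look roughly like the sketch below. This is an illustration
under stated assumptions, not the adapter's actual design: plain Java arrays
stand in for Arrow `ValueVector`s so that it runs without the Arrow library,
and `JavaFilterSketch`, `IntColumn`, `filter`, and `project` are hypothetical
names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

public class JavaFilterSketch {
  /** Hypothetical stand-in for a nullable int vector (values + validity). */
  public static final class IntColumn {
    private final int[] values;
    private final boolean[] valid;

    public IntColumn(int[] values, boolean[] valid) {
      this.values = values;
      this.valid = valid;
    }

    public boolean isNull(int row) { return !valid[row]; }
    public int get(int row) { return values[row]; }
    public int rowCount() { return values.length; }
  }

  /** Returns the indices of rows that are non-null and satisfy the predicate. */
  public static List<Integer> filter(IntColumn col, IntPredicate pred) {
    List<Integer> selected = new ArrayList<>();
    for (int row = 0; row < col.rowCount(); row++) {
      // A comparison against NULL is never TRUE, so null rows are dropped.
      if (!col.isNull(row) && pred.test(col.get(row))) {
        selected.add(row);
      }
    }
    return selected;
  }

  /** Projects the given columns at the selected row indices. */
  public static List<Object[]> project(List<Integer> rows, IntColumn... cols) {
    List<Object[]> out = new ArrayList<>();
    for (int row : rows) {
      Object[] tuple = new Object[cols.length];
      for (int c = 0; c < cols.length; c++) {
        tuple[c] = cols[c].isNull(row) ? null : cols[c].get(row);
      }
      out.add(tuple);
    }
    return out;
  }

  public static void main(String[] args) {
    IntColumn a = new IntColumn(new int[] {1, 5, 9, 12},
        new boolean[] {true, true, false, true});
    IntColumn b = new IntColumn(new int[] {10, 20, 30, 40},
        new boolean[] {true, true, true, true});
    // Roughly: SELECT b FROM t WHERE a > 4
    for (Object[] tuple : project(filter(a, v -> v > 4), b)) {
      System.out.println(tuple[0]); // prints 20, then 40
    }
  }
}
```

The filter produces a list of row indices rather than copying data, so
projection can read only the columns it needs at those rows. Anything outside
this simple subset would stay on Calcite's Enumerable path, as the proposal
suggests.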