If it's not supported anymore, that is already a good enough argument to replace it.
The proposed plan looks reasonable to me.

Alessandro

On Thu, Jun 4, 2026, 09:45 jensen <[email protected]> wrote:

> I vaguely recall that Arrow's official website also discourages the use of
> Gandiva, and Gandiva is no longer maintained. If possible, I think we
> should remove this dependency.
>
> Best regards,
>
> Zhen
>
> ---- Replied Message ----
> | From | Cancai Cai<[email protected]> |
> | Date | 6/4/2026 14:20 |
> | To | <[email protected]> |
> | Subject | Subject: [DISCUSS] Removing Gandiva from the Arrow adapter |
> Hi all,
>
> I would like to discuss the future of Gandiva in Calcite's Arrow adapter.
> My preferred long-term direction is to remove the Gandiva dependency from
> the adapter.
>
> The current adapter uses Arrow Java to read Arrow data, but relies on
> Gandiva `Projector` and `Filter` for projection and filter execution.
> Gandiva is a native LLVM-based runtime, and the Java module is a wrapper
> around that native implementation. As a result, basic Arrow adapter queries
> depend on native libraries, LLVM compatibility, platform packaging, and JDK
> baseline details.
>
> This has become a practical maintenance problem when thinking about Arrow
> dependency upgrades.
>
> The upgrade problem is not limited to Java bytecode compatibility. In our
> experiments, newer Arrow versions failed at different layers. Arrow 18
> requires a newer Java baseline than Calcite currently supports in its JDK 8
> jobs. Arrow 17 and 16.1 still use Java 8 class files, but can hit Java
> runtime API incompatibilities on JDK 8, such as
> `ByteBuffer.flip(): ByteBuffer`. Arrow 16.0 avoids that Java runtime issue,
> but exposed Gandiva native / LLVM symbol issues on Linux CI.
>
> This means that as long as `arrow-gandiva` is required for the adapter's
> correctness path, upgrading the Arrow Java vector layer also requires
> validating the native Gandiva stack across all CI platforms. Even when
> `arrow-vector` itself is usable, `arrow-gandiva` can still block the
> upgrade.
>
> For that reason, I think the adapter should make projection/filter
> correctness independent of Gandiva first. Once the Java correctness path is
> in place, Arrow vector upgrades can be evaluated separately from Gandiva
> native compatibility.
>
> The direction I have in mind is a pure Java correctness path for the Arrow
> adapter:
>
> * read Arrow data with `ArrowFileReader`, `VectorSchemaRoot`, and
>   `ValueVector`;
> * execute simple projections by reading selected vectors directly;
> * execute the simple filters currently translated by `ArrowTranslator` with
>   a Java evaluator;
> * leave expressions that are not pushed into the adapter to Calcite's
>   normal Enumerable / code generation path.
>
> With that model, Gandiva would no longer be required for correctness. A
> staged migration could be:
>
> 1. Move no-filter simple projection away from Gandiva.
> 2. Add Java evaluation for the simple filter subset currently supported by
>    `ArrowTranslator`.
> 3. Validate that existing Arrow adapter tests pass without invoking
>    Gandiva.
> 4. Remove `arrow-gandiva` from the adapter dependency set once the Java
>    path covers the current behavior.
>
> The tradeoff is that Gandiva may be faster for supported expressions. But
> for this adapter, I think correctness, portability, and dependency
> stability should come first. If acceleration is needed later, it can be
> discussed separately.
>
> Does this direction make sense to the community? Are there current use
> cases that depend on Gandiva pushdown strongly enough that we should keep
> the native dependency?
>
> Thanks,
> Cancai
>
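
For illustration only, the Java evaluation mentioned in step 2 of the staged
migration could look roughly like the sketch below. It is not code from
Calcite or Arrow: the class `JavaFilterSketch`, the `Op` enum, and the methods
`compare`, `selectRows`, and `project` are hypothetical names, and plain
`Integer[]` arrays stand in for Arrow vectors so that the example runs without
any Arrow dependency. It also assumes that the simple filter subset is a
conjunction of "column op literal" comparisons, which would need to be checked
against what `ArrowTranslator` actually produces.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical sketch; plain arrays stand in for Arrow vectors. */
final class JavaFilterSketch {
  private JavaFilterSketch() {}

  /** Comparison operators assumed to be in the simple filter subset. */
  enum Op { EQ, NE, LT, LE, GT, GE }

  /** Builds a row predicate "column[row] op literal" over a nullable int column. */
  static IntPredicate compare(Integer[] column, Op op, int literal) {
    return row -> {
      Integer v = column[row];
      if (v == null) {
        return false; // a comparison with NULL is never TRUE, so the row is dropped
      }
      int c = Integer.compare(v, literal);
      switch (op) {
        case EQ: return c == 0;
        case NE: return c != 0;
        case LT: return c < 0;
        case LE: return c <= 0;
        case GT: return c > 0;
        case GE: return c >= 0;
        default: throw new AssertionError(op);
      }
    };
  }

  /** Returns the indices of the rows that satisfy every predicate (AND). */
  static int[] selectRows(int rowCount, List<IntPredicate> conjuncts) {
    int[] selected = new int[rowCount];
    int n = 0;
    for (int row = 0; row < rowCount; row++) {
      boolean keep = true;
      for (IntPredicate p : conjuncts) {
        if (!p.test(row)) {
          keep = false;
          break;
        }
      }
      if (keep) {
        selected[n++] = row;
      }
    }
    return Arrays.copyOf(selected, n);
  }

  /** Simple projection: reads only the selected rows of one column. */
  static Integer[] project(Integer[] column, int[] rows) {
    Integer[] out = new Integer[rows.length];
    for (int i = 0; i < rows.length; i++) {
      out[i] = column[rows[i]];
    }
    return out;
  }

  public static void main(String[] args) {
    Integer[] id = {1, 2, 3, 4, 5};
    Integer[] value = {10, null, 30, 40, 50};
    // Roughly: SELECT id FROM t WHERE id >= 2 AND value < 50
    int[] rows = selectRows(id.length,
        Arrays.asList(compare(id, Op.GE, 2), compare(value, Op.LT, 50)));
    System.out.println(Arrays.toString(project(id, rows))); // prints [3, 4]
  }
}
```

In a real adapter the arrays would be replaced by reads from the vectors of a
`VectorSchemaRoot`, and anything outside this subset would stay on Calcite's
normal Enumerable path, as the proposal describes.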
