sezruby commented on PR #12245:
URL: https://github.com/apache/gluten/pull/12245#issuecomment-4628389623

   Closing this — the unbundling direction turns out to be incompatible with 
gluten's current Spark 3.3 / 3.4 support, and I don't think the workaround is 
worth the risk.
   
   cc @zhztheplayer @FelixYBW
   
   **What CI showed.** The Spark 3.5 / 4.0 / 4.1 lanes were on track, but 
`spark-test-spark33` and `spark-test-spark34` (and several `tpc-test-*` lanes 
built against them) failed early. The root cause is that the bundled Arrow is 
load-bearing on older Spark, because each Spark line ships a different Arrow:
   
   - Spark 3.3.1 ships Arrow 7.0.0
   - Spark 3.4.4 ships Arrow 11.0.0
   - Spark 3.5.5 ships Arrow 15.0.0
   - Spark 4.0.x / 4.1.x ship Arrow 18.x
   
   Gluten's parent `pom.xml` pins `<arrow.version>15.0.0</arrow.version>` and 
uses it at compile scope. Today that works because gluten bundles its own Arrow 
15 into the velox bundle, which wins classloader resolution at runtime over 
Spark's older Arrow.
   
   Once `arrow-memory-*` / `arrow-vector` flip to `scope=provided` (this PR), 
the bundle stops shipping Arrow. The compile classpath still has 15, but at 
runtime on Spark 3.3 / 3.4 only the older Arrow (7 / 11) is on the classpath — 
`NoSuchMethodError` / `NoClassDefFoundError` follow.
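
To confirm which Arrow actually wins at runtime, a small diagnostic can print the jar each Arrow class is loaded from. This is a minimal sketch, not part of the PR: the class name `ArrowOrigin` and the helper `whereLoaded` are hypothetical, and it assumes it runs on the same classpath as the Spark driver or executor being checked.

```java
import java.security.CodeSource;

public class ArrowOrigin {
    // Hypothetical helper: reports the jar or directory a class was loaded from.
    static String whereLoaded(String className) {
        try {
            // initialize=false: resolve the class without running static initializers.
            Class<?> cls = Class.forName(className, false, ArrowOrigin.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "<bootstrap or unknown>";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "<not on classpath>";
        }
    }

    public static void main(String[] args) {
        // Two well-known Arrow Java classes (arrow-vector and arrow-memory-core).
        String[] names = {
            "org.apache.arrow.vector.VectorSchemaRoot",
            "org.apache.arrow.memory.BufferAllocator"
        };
        for (String name : names) {
            System.out.println(name + " -> " + whereLoaded(name));
        }
    }
}
```

With the bundle shipping Arrow, the printed location would be expected to point at the gluten bundle jar; with `scope=provided`, it would point at whatever Arrow jar Spark itself ships.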
   
   **Workarounds considered.**
   
   1. Per-Spark-profile `<arrow.version>` overrides (3.3→7.0, 3.4→11.0, 
3.5→15.0, 4.x→18.1). This compiles, but it ships gluten built against Arrow 7 on 
the 3.3 profile. That is exactly the "API stability across versions" concern you 
raised on [#12226](https://github.com/apache/gluten/pull/12226) (`> Memory and vector 
APIs should be stable across minor versions / This sounds a real risk`), now 
applied across an *eight-major-version* gap (Arrow 7 vs. 15) rather than a 
one-or-two-version gap. The surface area is too large to be confident without 
per-version testing.
   2. Conditional `<scope>` (provided on 3.5+, compile on 3.3/3.4). Works 
mechanically but is ugly and leaves the bug 
([#12225](https://github.com/apache/gluten/issues/12225)) latent on Spark 3.3 / 
3.4.
   3. Drop Spark 3.3 / 3.4 support. Out of scope for this fix.
   
   None feels worth it as a one-shot, especially since 
[#12226](https://github.com/apache/gluten/pull/12226) already neutralized the 
immediate `NoSuchMethodError` from 
[#12225](https://github.com/apache/gluten/issues/12225) by un-shading the 
boundary types.
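
For reference, the version gap behind workaround 1 can be tabulated with a few lines of Java. This is an illustrative sketch only: the class `ArrowGap` and its methods are hypothetical, and the version numbers are the ones listed earlier in this comment (major versions only).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class ArrowGap {
    // Arrow major version gluten compiles against (<arrow.version>15.0.0</arrow.version>).
    static final int COMPILE_MAJOR = 15;

    // Arrow major version shipped by each Spark line, as listed in this comment.
    static Map<String, Integer> shippedArrowMajor() {
        Map<String, Integer> m = new LinkedHashMap<>();
        m.put("3.3", 7);
        m.put("3.4", 11);
        m.put("3.5", 15);
        m.put("4.x", 18);
        return m;
    }

    // Positive result: the runtime Arrow is older than the compile-time Arrow.
    static int gap(int runtimeMajor) {
        return COMPILE_MAJOR - runtimeMajor;
    }

    public static void main(String[] args) {
        shippedArrowMajor().forEach((spark, arrow) ->
            System.out.println("Spark " + spark + ": Arrow " + arrow + ", gap " + gap(arrow)));
    }
}
```

The Spark 3.3 row is the problematic one: a gap of eight major versions between what gluten compiles against and what Spark provides.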
   
   **What I'm keeping.** [#12244](https://github.com/apache/gluten/pull/12244) 
— drop the `15.0.0-gluten` artifact rename, drop the dead 
`modify_arrow_dataset_scan_option.patch` from the Arrow JVM build, depend on 
vanilla Apache Arrow from Maven Central. CI green there. That gives non-ppc64le 
contributors a faster build-from-source path without changing the 
runtime/bundling story.
   
   **For follow-up.** If gluten ever drops Spark 3.3 / 3.4, this unbundling 
work is small — the diff is ~3 poms. Happy to revisit then.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

