jja725 commented on issue #12263: URL: https://github.com/apache/gluten/issues/12263#issuecomment-4715459674
Hi @FelixYBW, thanks for opening this. I'm excited to see Lance support being considered for Gluten.

I'm the author of the Velox Lance connector referenced in facebookincubator/velox#16556. I've also built a full Presto Lance Connector on top of it (Java coordinator + native Velox execution). But the Velox connector PR is still WIP: I'm actively refining it and plan to continue work on it in the coming weeks.

On the two approaches:

1. **Velox connector (Approach 1):** This gives you full native execution, with Velox handling everything from scan to output. The tradeoff is that the connector needs to mature further (filter pushdown coverage, write path, stability). I'm working on this and happy to collaborate on getting it Gluten-ready.
2. **Lance-Spark datasource (Approach 2):** This is practical to ship sooner since it builds on the existing lance-spark connector. The Arrow-in/Arrow-out path avoids C2R/R2C, which is the main win. The downside is you still cross the JVM-native boundary for the data handoff, and you miss out on Velox-level optimizations like lazy materialization and direct filter pushdown into the Lance reader.

The two approaches are complementary: Approach 2 gets you running quickly, while Approach 1 can deliver better long-term performance. Happy to help with either path or discuss how the Velox connector can fit into Gluten's architecture.

Let me know if you'd like to discuss more about the direction.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
