jtuglu1 commented on issue #19456: URL: https://github.com/apache/druid/issues/19456#issuecomment-4485878587
> The leaf paths, the ones that read Segment, may require some additional care to avoid materializing things too early.

This is a well-known issue in DataFusion: you need to implement a lot of custom pushdown yourself (e.g. physical operators that read and filter on segment bitmaps/dictionaries) to avoid expensive re-materialization between operators. This will be especially painful if we need to cross the JVM/Rust boundary frequently.

Another thing I want to call out here is starting to move towards deprecating and unifying the available engines in Druid. Currently I see a lot of work being done on various engines (Dart, MSQ, native, etc.), and I think we should aim to build out the native processing core on whatever the "next-gen" engine is decided to be (and its operator interfaces). While I think the native segment readers are a bit more generic and can be shared, this kind of thing will help motivate a deprecation of older execution paths.

@gianm I think there's a lot of value in considering adoption of DataFusion's planner/optimizer/physical operator ecosystem in the long term. While there will need to be many Druid-specific operator overrides, it seems like a net positive to me if we're able to get statistics [support](https://docs.rs/datafusion/latest/datafusion/common/struct.ColumnStatistics.html), advanced planning (better join algos, etc.), and optimized physical operators (spilling grouping, etc.) out of the box without needing to implement these ourselves. That being said, I think distributed DataFusion is still in a nascent phase (Ballista, etc.), so this is definitely a longer-term thing.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
