tustvold commented on PR #14286: URL: https://github.com/apache/datafusion/pull/14286#issuecomment-2613975126
To avoid confusion, as my objections appear to have been somewhat misinterpreted, I'd like to clarify that the non-trivial overhead of this approach is **not** what concerns me. Rather, we know from experience at InfluxData that this pattern is fragile, easy to mess up, and leads to emergent behaviour that is highly non-trivial to reproduce and debug.

That being said, as Andrew says, nobody has emerged who is able/willing to resolve this with a more holistic approach, e.g. something closer to what polars/DuckDB/Hyper are doing to separate IO/compute, and so proceeding with something is better than nothing. I just hoped someone might step up.