isidentical commented on issue #3898: URL: https://github.com/apache/arrow-datafusion/issues/3898#issuecomment-1297564476
A general overview of what came out of the discussions (with regard to the expression analysis part) and the meetup:

- Better value distributions rather than just knowing the `[min, max]`. There isn't any concrete work yet, but this was also part of the proposed 'future' section, and we might be able to turn it into a reality once we have the general system ready (this is what Spark did IIRC: they initially implemented basic ranges and then moved over to histograms when available).
- Support for comparing two expressions (`a > b`) with different ranges. The existing framework can support this, but it still needs an implementation 😇 Currently, for binary expressions, we have a code path that is taken when we know either side has a scalar boundary (e.g. `a=[20, 20]; b=[10, 30]`), and we can introduce a second one very easily depending on how it would work 👀 It requires further research, but I don't think it needs to block this ticket (it can actually be a part of it).
- Not passing a mutable handle for the context but rather giving it up completely. I'd be fine with this, though from a purely aesthetic point of view it looks a bit hard to parse 😄 I'm happy to be convinced though. Let's discuss it again in the code review!
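
To make the `a > b` case a bit more concrete, here is a minimal sketch of how two ranges could be compared and narrowed. Everything in it (`Range`, `GtOutcome`, `analyze_gt`) is a hypothetical name made up for illustration and is not part of DataFusion's API; it also assumes plain `f64` bounds and keeps closed intervals as a conservative over-approximation of the strict inequality.

```rust
/// Hypothetical closed range `[min, max]` for a column or expression.
/// Illustrative only; this is not DataFusion's boundary type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Range {
    pub min: f64,
    pub max: f64,
}

/// What the two ranges alone can tell us about `a > b`.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum GtOutcome {
    /// Every possible pair of values satisfies `a > b`.
    AlwaysTrue,
    /// No possible pair of values satisfies `a > b`.
    AlwaysFalse,
    /// The ranges overlap; rows that pass the filter must fall
    /// inside these narrowed ranges.
    Maybe { a: Range, b: Range },
}

/// Compare two ranges under the predicate `a > b` (assumed helper).
pub fn analyze_gt(a: Range, b: Range) -> GtOutcome {
    if a.min > b.max {
        return GtOutcome::AlwaysTrue;
    }
    if a.max <= b.min {
        return GtOutcome::AlwaysFalse;
    }
    // For a row to pass, `a` has to be above the smallest `b`,
    // and `b` has to be below the largest `a`.
    GtOutcome::Maybe {
        a: Range { min: a.min.max(b.min), max: a.max },
        b: Range { min: b.min, max: b.max.min(a.max) },
    }
}
```

With the example from the comment, `a=[20, 20]` and `b=[10, 30]`, this sketch reports `Maybe` and narrows `b` to `[10, 20]` while leaving `a` untouched. The scalar-boundary path is then just the special case where one side has `min == max`.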
