suremarc commented on issue #8227: URL: https://github.com/apache/datafusion/issues/8227#issuecomment-2457565135
Recently I began looking into implementing #10316. The proposed approach was to add per-partition statistics to `ExecutionPlan` and use that information to determine when the partitions have non-overlapping sort keys. However, I realized this requires #8078 to be implemented first, since determining whether ranges overlap requires both upper and lower bounds. According to @alamb's comment on #8078, the (then) current state of the `Statistics` code means that implementing #8078 is intractable until other issues with the statistics code are resolved (it seems there were some attempts to simplify the statistics code that ended up not getting completed).

I see that work on this epic has stalled since February. Is there interest in continuing it? If so, I'm a willing contributor, but it would help to know what needs to be done first, in particular whether the `Statistics` code is still in need of simplification.
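
To make the dependency on bounds concrete, here is a minimal sketch of the overlap check I have in mind. `KeyRange` and `partitions_are_disjoint` are hypothetical names for illustration, not existing DataFusion types or APIs, and the sort key is assumed to be a single `i64` column. A partition with missing bounds is modeled as `None`, in which case nothing can be proven.

```rust
/// Hypothetical per-partition bounds on the leading sort key.
/// This is an illustrative type, not part of DataFusion.
#[derive(Debug, Clone, Copy, PartialEq)]
struct KeyRange {
    min: i64,
    max: i64,
}

/// Returns true only if every partition has known bounds and no two
/// partitions' [min, max] ranges overlap. Ranges that merely touch
/// (one partition's max equals another's min) are conservatively
/// treated as overlapping.
fn partitions_are_disjoint(ranges: &[Option<KeyRange>]) -> bool {
    // If any partition lacks bounds, disjointness cannot be proven.
    let mut known: Vec<KeyRange> = match ranges.iter().copied().collect::<Option<Vec<_>>>() {
        Some(v) => v,
        None => return false,
    };
    // Order partitions by lower bound, then compare each neighboring pair.
    known.sort_by_key(|r| r.min);
    known.windows(2).all(|w| w[0].max < w[1].min)
}
```

With only a lower bound (or only an upper bound) per partition, the comparison `w[0].max < w[1].min` cannot be evaluated, which is why the check needs both.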
