Thank you for driving this, Gene -- I think it is an important step along the way for DataFusion to take advantage of existing data partitioning and layouts.

Andrew

On Wed, May 27, 2026 at 7:19 AM Gene Bordegaray via dev <[email protected]> wrote:

> Hello everyone,
>
> DataFusion currently has three partitioning variants: Hash,
> RoundRobinBatch, and UnknownPartitioning. It cannot accurately represent
> some partitioning schemes, which makes optimizer and planning behavior
> brittle (relevant discussion thread
> <https://github.com/apache/datafusion/issues/21207>).
>
> A few community members have designed a Range partitioning variant to
> accurately describe range-partitioning schemes users have today. The first
> PR <https://github.com/apache/datafusion/pull/22207> is purely mechanical:
> adds the model and contract, marking unsupported call sites without
> changing planning behavior. Follow-up work will add planning and execution
> support incrementally:
>
> - Implement Range Partitioning Planning
>   <https://github.com/apache/datafusion/issues/22397>
> - Implement Range Repartitioning
>   <https://github.com/apache/datafusion/issues/22395>
> - Expose Range Partitioning Across FFI Boundaries
>   <https://github.com/apache/datafusion/issues/22394>
>
> Would appreciate any input, feel free to join the conversation here
> <https://github.com/apache/datafusion/issues/21992>.
>
> Thanks,
> Gene
