paleolimbot opened a new pull request, #7398: URL: https://github.com/apache/arrow-rs/pull/7398
# Which issue does this PR close?

Experiments in pursuit of #7063 and https://github.com/apache/datafusion/issues/12644 .

# Rationale for this change

Customizing behaviour (e.g., sorting, signatures for user-defined functions, import/export from/to formats like Parquet, Arrow IPC, and FFI) is increasingly requested. arrow-rs is also increasingly the implementation of choice for compute frameworks, and many APIs have been built around the `DataType` and `ArrayRef` (e.g., GeoArrow, parquet, many parts of DataFusion). To support additional types, those frameworks have to either invent a new array type and rewrite substantial wrappers (e.g., GeoArrow), or use something other than a `DataType` (e.g., a `Field` or `MyOwnDataType`) and forgo having that type automatically pass through APIs from other arrow-rs-based libraries (or arrow itself).

Incorporating an `ExtensionType` as a first-class data type provides a potentially less disruptive route for libraries like parquet and DataFusion to implement new Parquet types and other types that other databases implement, such as JSON and UUID. I also have an experiment going for swapping out a `Field` for a `DataType` in DataFusion ( https://github.com/apache/datafusion/pull/15036 ).

# What changes are included in this PR?

A new `DataType::Extension()` enum member was added. I had hoped to have this be `DataType::Extension(Arc<dyn ExtensionType>)`, but the current design of the `ExtensionType` trait is not dyn-compatible. The `DynExtensionType` trait is really a placeholder to collect the requirements of the rest of arrow-rs for this experiment.

This is currently just an experiment: the minimum change that actually builds. I'm happy to continue the experiment if there is interest.

# Are there any user-facing changes?

Yes, at minimum a new data type enum member, which is a breaking change.
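To illustrate the dyn-compatibility problem mentioned above, here is a minimal, self-contained sketch. It does not use the real arrow-rs types: `StorageType`, `TypedExtension`, `DynExtension`, `SketchDataType`, `Uuid`, and `try_extension` are all hypothetical stand-ins invented for this example, and the method names are assumptions rather than the trait in this PR. The idea is that a typed trait with an associated constant, an associated type, and a `Sized` bound cannot be used as `dyn Trait`, whereas a smaller companion trait exposing only `&self` methods can, and a blanket impl can bridge the two.

```rust
use std::fmt::Debug;
use std::sync::Arc;

/// Simplified stand-in for the storage variants of a data type enum.
#[derive(Debug, Clone, PartialEq)]
pub enum StorageType {
    Utf8,
    FixedSizeBinary(i32),
}

/// Typed extension trait (hypothetical). The `Sized` bound, the associated
/// const and the associated type each prevent `dyn TypedExtension`.
pub trait TypedExtension: Sized {
    const NAME: &'static str;
    type Metadata;
    fn metadata(&self) -> &Self::Metadata;
    fn serialize_metadata(&self) -> Option<String>;
    fn supports_storage(&self, storage: &StorageType) -> Result<(), String>;
}

/// Dyn-compatible companion trait (hypothetical): only `&self` methods,
/// no generics, no associated items.
pub trait DynExtension: Debug + Send + Sync {
    fn extension_name(&self) -> &str;
    fn extension_metadata(&self) -> Option<String>;
    fn check_storage(&self, storage: &StorageType) -> Result<(), String>;
}

/// Blanket impl: every typed extension is usable behind `Arc<dyn DynExtension>`.
impl<T> DynExtension for T
where
    T: TypedExtension + Debug + Send + Sync,
{
    fn extension_name(&self) -> &str {
        T::NAME
    }
    fn extension_metadata(&self) -> Option<String> {
        self.serialize_metadata()
    }
    fn check_storage(&self, storage: &StorageType) -> Result<(), String> {
        self.supports_storage(storage)
    }
}

/// Sketch of a data type enum with a first-class extension member.
#[derive(Debug, Clone)]
pub enum SketchDataType {
    Storage(StorageType),
    Extension {
        extension: Arc<dyn DynExtension>,
        storage: StorageType,
    },
}

/// Builds an extension data type after validating the storage type.
pub fn try_extension(
    extension: Arc<dyn DynExtension>,
    storage: StorageType,
) -> Result<SketchDataType, String> {
    extension.check_storage(&storage)?;
    Ok(SketchDataType::Extension { extension, storage })
}

/// Example extension modelled on the canonical `arrow.uuid` type,
/// which is stored as 16-byte fixed-size binary.
#[derive(Debug)]
pub struct Uuid;

impl TypedExtension for Uuid {
    const NAME: &'static str = "arrow.uuid";
    type Metadata = ();
    fn metadata(&self) -> &Self::Metadata {
        &()
    }
    fn serialize_metadata(&self) -> Option<String> {
        None
    }
    fn supports_storage(&self, storage: &StorageType) -> Result<(), String> {
        match storage {
            StorageType::FixedSizeBinary(16) => Ok(()),
            other => Err(format!("{} requires FixedSizeBinary(16), got {other:?}", Self::NAME)),
        }
    }
}
```

With this shape, `try_extension(Arc::new(Uuid), StorageType::FixedSizeBinary(16))` yields an `Extension` member, while passing `StorageType::Utf8` returns an error. Whether the real enum member should carry the storage type alongside the extension, as this sketch does, is an open design question and not something this PR settles.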