vustef commented on issue #7299: URL: https://github.com/apache/arrow-rs/issues/7299#issuecomment-3441964483
> > Because this is a special column, we need to mark it as such. We use a new extension type for this.
>
> I really like the idea of using an Extension type for this use case because it has several existing precedents (e.g. Variant and Geometry) and we already pass the Arrow schema into the parquet reader.
>
> I didn't quite follow the conversation above about `with_metadata_columns` with [@jkylling](https://github.com/jkylling)
>
> BTW I will have more time to help with this feature after I complete the following items (which I think will make threading this information down into the reader much easier)
>
> * [Implement Push Parquet Decoder #7997](https://github.com/apache/arrow-rs/pull/7997)
> * [Rewrite `ParquetRecordBatchStream` (async API) in terms of the PushDecoder #8677](https://github.com/apache/arrow-rs/issues/8677)
>
> I expect to merge / make progress on this tomorrow

I hope it doesn't have to be a canonical extension type? We could still implement it in the arrow-rs repo, right?

> I didn't quite follow the conversation above about `with_metadata_columns`...

Please let me know if we need to clarify. Some parts of that discussion may be more about utility functions on top of the core API, but the idea was not to have a special mechanism only for this column. It should be extensible to other types of metadata/virtual columns as well.

> I expect to merge / make progress on this tomorrow

Cool, looking forward to that too.
