vustef commented on issue #7299:
URL: https://github.com/apache/arrow-rs/issues/7299#issuecomment-3441964483

   > > Because this is a special column, we need to mark it as such. We use a 
new extension type for this.
   > 
   > I really like the idea of using an Extension type for this use case because 
it has several existing precedents (e.g. Variant and Geometry) and we already 
pass the Arrow schema into the parquet reader.
   > 
   > I didn't quite follow the conversation above about `with_metadata_columns` 
with [@jkylling](https://github.com/jkylling)
   > 
   > BTW I will have more time to help with this feature after I complete the 
following items (which I think will make threading this information down into 
the reader much easier)
   > 
   > * [Implement Push Parquet Decoder 
#7997](https://github.com/apache/arrow-rs/pull/7997)
   > * [Rewrite `ParquetRecordBatchStream` (async API) in terms of the 
PushDecoder #8677](https://github.com/apache/arrow-rs/issues/8677)
   > 
   > I expect to merge / make progress on this tomorrow
   
   I hope it doesn't have to be a canonical extension type? We could still 
implement it in the arrow-rs repo, right?
   
   > I didn't quite follow the conversation above about 
`with_metadata_columns`...
   
   Please let me know if we need to clarify, although some parts of the 
discussion may be more about utility functions on top of the core API. The 
idea was not to have a special mechanism only for this column, but to be 
extensible to other types of metadata/virtual columns as well.
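
   To make the extensibility point concrete, here is a rough, self-contained sketch of the idea, not the actual arrow-rs API. It only assumes the standard Arrow convention that an extension type name is stored in field metadata under `ARROW:extension:name`. The `Field` struct is a simplified stand-in for an Arrow field, and the extension names, `VirtualColumn`, `virtual_kind` and `split_schema` are all hypothetical names made up for illustration.

```rust
use std::collections::HashMap;

/// Metadata key the Arrow format uses to record an extension type name.
const EXTENSION_NAME_KEY: &str = "ARROW:extension:name";

// Hypothetical extension names; nothing like this is defined by Arrow today.
const ROW_NUMBER_EXT: &str = "example.virtual.row_number";
const FILE_PATH_EXT: &str = "example.virtual.file_path";

/// Kinds of virtual (metadata) columns a reader could synthesize.
#[derive(Debug, Clone, PartialEq)]
enum VirtualColumn {
    RowNumber,
    FilePath,
}

/// Simplified stand-in for an Arrow field: a name plus string metadata.
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

impl Field {
    fn new(name: &str) -> Self {
        Field { name: name.to_string(), metadata: HashMap::new() }
    }

    /// Tag the field with an extension type name.
    fn with_extension(mut self, ext: &str) -> Self {
        self.metadata.insert(EXTENSION_NAME_KEY.to_string(), ext.to_string());
        self
    }
}

/// Map a field to a virtual column kind, if its extension name is one we know.
/// Supporting a new virtual column only needs one more match arm here.
fn virtual_kind(field: &Field) -> Option<VirtualColumn> {
    match field.metadata.get(EXTENSION_NAME_KEY).map(String::as_str) {
        Some(ROW_NUMBER_EXT) => Some(VirtualColumn::RowNumber),
        Some(FILE_PATH_EXT) => Some(VirtualColumn::FilePath),
        _ => None, // plain field or an unrelated extension type
    }
}

/// Split a schema into columns read from the file and columns to synthesize.
fn split_schema(fields: &[Field]) -> (Vec<&str>, Vec<(&str, VirtualColumn)>) {
    let mut physical = Vec::new();
    let mut virtual_cols = Vec::new();
    for field in fields {
        match virtual_kind(field) {
            Some(kind) => virtual_cols.push((field.name.as_str(), kind)),
            None => physical.push(field.name.as_str()),
        }
    }
    (physical, virtual_cols)
}
```

   With this shape the reader has no row-number-specific code path: it looks at the schema it was given, reads the physical columns from the file, and fills in whatever virtual columns were requested. Fields carrying unrelated extension types (e.g. Geometry) are still treated as ordinary physical columns.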
   
   > I expect to merge / make progress on this tomorrow
   
   Cool, looking forward to that too.


