vustef commented on PR #1824: URL: https://github.com/apache/iceberg-rust/pull/1824#issuecomment-3496405929
> Thanks @gbrgr for this pr. But I think we need to rethink how to compute the `_file`, `_pos` metadata column. While it's somehow trivial to compute `_file`, it's non trivial to compute `_pos` efficient, since when we read parquet files, we have filtered out some row groups. I think the best way is to push reading these two columns to arrow-rs.

@liurenjie1024 I agree for `_pos`, and there is a PR for it in arrow-rs: https://github.com/apache/arrow-rs/pull/8715

But `_file` doesn't seem like something arrow-rs needs to know about. Similarly, in the future, we cannot expect arrow-rs to be responsible for computing `_row_id` from the V3 spec.
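To illustrate the difference between the two columns, here is a minimal std-only sketch. It is not code from this PR or from arrow-rs; `positions_for_selected_row_groups` and `file_column` are hypothetical helpers, and plain `Vec`s stand in for Arrow arrays. It assumes `_pos` is the 0-based row ordinal within the data file, so skipped row groups still advance the position, while `_file` is just the file path repeated for every row.

```rust
/// Hypothetical helper: file-level `_pos` values for the rows of the
/// row groups that survived filtering.
fn positions_for_selected_row_groups(row_group_sizes: &[u64], selected: &[usize]) -> Vec<u64> {
    // Prefix sums give the file-level index of the first row of each row group,
    // including the row groups that were filtered out.
    let mut starts = Vec::with_capacity(row_group_sizes.len());
    let mut next = 0u64;
    for &n in row_group_sizes {
        starts.push(next);
        next += n;
    }

    // Emit one position per row of every selected row group.
    selected
        .iter()
        .flat_map(|&rg| {
            let start = starts[rg];
            start..start + row_group_sizes[rg]
        })
        .collect()
}

/// Hypothetical helper: `_file` is the same value for every row of a data file,
/// so the caller can build it without any help from the Parquet reader.
fn file_column(path: &str, num_rows: usize) -> Vec<String> {
    vec![path.to_string(); num_rows]
}

fn main() {
    // Three row groups with 3, 2 and 4 rows; the middle one was filtered out.
    let sizes = [3, 2, 4];
    let pos = positions_for_selected_row_groups(&sizes, &[0, 2]);
    println!("{:?}", pos); // [0, 1, 2, 5, 6, 7, 8]

    // Made-up path, for illustration only.
    let file = file_column("s3://bucket/data/file-a.parquet", pos.len());
    println!("{} rows from {}", file.len(), file[0]);
}
```

The `_pos` part needs the row counts of all row groups, including the skipped ones, which is why it fits naturally in the reader. The `_file` part only needs the path and the batch length.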
