mhilton opened a new issue, #4702: URL: https://github.com/apache/arrow-rs/issues/4702
**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**

We would like to use the Parquet files written from a set of Arrow record batches as part of an Apache Iceberg snapshot without modification. The Apache Iceberg [Parquet specification](https://iceberg.apache.org/spec/#parquet) requires that field IDs are present.

**Describe the solution you'd like**

The [solution](https://github.com/apache/arrow/blob/f3010bac94cbd588ecebd6e7839f9d56e97b1a9b/go/parquet/pqarrow/schema.go#L397) implemented by (at least) the Go Parquet package seems reasonable. It uses a metadata value with the key `PARQUET:field_id` to determine the `field_id` when converting an Arrow schema into a Parquet schema. If there is no such metadata entry, the `field_id` is not written.

**Describe alternatives you've considered**

An alternative would be to add a mechanism to `WriterProperties` for specifying the `field_id` to use with a column. This would presumably work in a similar manner to [encoding](https://docs.rs/parquet/latest/parquet/file/properties/struct.WriterProperties.html#method.encoding).

**Additional context**

N/A
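To make the proposed convention concrete, here is a minimal sketch (using only the standard library) of the lookup logic the Go package's approach implies. The constant `PARQUET_FIELD_ID_KEY` matches the key named above; `field_id_from_metadata` is a hypothetical helper, not an existing arrow-rs API, and the i32 type for field ids is an assumption for illustration:

```rust
use std::collections::HashMap;

// Metadata key used by the Go pqarrow package to carry a Parquet field id
// on an Arrow field (assumed convention for this sketch).
const PARQUET_FIELD_ID_KEY: &str = "PARQUET:field_id";

// Hypothetical helper: read the field id from an Arrow field's metadata map.
// Returns None when the key is absent or unparseable, in which case no
// field_id would be written to the Parquet schema.
fn field_id_from_metadata(metadata: &HashMap<String, String>) -> Option<i32> {
    metadata
        .get(PARQUET_FIELD_ID_KEY)
        .and_then(|v| v.parse::<i32>().ok())
}

fn main() {
    let mut metadata = HashMap::new();
    metadata.insert(PARQUET_FIELD_ID_KEY.to_string(), "1".to_string());
    assert_eq!(field_id_from_metadata(&metadata), Some(1));

    // Absent key: the writer would simply omit the field_id.
    assert_eq!(field_id_from_metadata(&HashMap::new()), None);
    println!("ok");
}
```

Under this design, a caller would attach the metadata entry to each Arrow `Field` before writing, and the Arrow-to-Parquet schema conversion would consult it per field.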
