tustvold commented on issue #4702: URL: https://github.com/apache/arrow-rs/issues/4702#issuecomment-1682124392
Couple of notes from digging into this:

From https://iceberg.apache.org/spec/#column-projection:

> Tables may also define a property schema.name-mapping.default with a JSON name mapping containing a list of field mapping objects. These mappings provide fallback field ids to be used when a data file does not contain field id information

So it would appear that field IDs are not strictly required to be present in the data files; a name mapping may be a way to avoid needing to rewrite data lacking such attributes.

Additionally, also from https://iceberg.apache.org/spec/#column-projection:

> List types should contain a mapping in fields for element.
> Map types should contain mappings in fields for key and value.

This would appear to suggest that Iceberg only _requires_ that field IDs are present for the bottom level of the three-level list declaration:

```
<list-repetition> group <name> (LIST) {
  repeated group list {
    <element-repetition> <element-type> element;
  }
}
```

I think the approach suggested in this PR is perfectly acceptable: whilst it provides no mechanism to provide a field id for `repeated group list`, I suspect this is fine for most use-cases.
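To make the fallback behaviour concrete, here is a minimal, std-only sketch of how a name mapping could resolve field IDs for a list column. The `MappedField` type, `fallback_field_id`, and the example IDs are all hypothetical and are not part of arrow-rs or any Iceberg library; the sketch only assumes, per the spec quotes above, that a list's mapping nests an entry for `element` and a map's mapping nests entries for `key` and `value`.

```rust
/// Simplified, hypothetical model of one entry in an Iceberg-style name mapping.
#[derive(Debug, Clone)]
struct MappedField {
    /// Names this entry matches in the data file.
    names: Vec<String>,
    /// Fallback field id; modelled as optional in this sketch.
    field_id: Option<i32>,
    /// Nested mappings (struct children, list `element`, map `key`/`value`).
    fields: Vec<MappedField>,
}

impl MappedField {
    fn new(name: &str, field_id: i32, fields: Vec<MappedField>) -> Self {
        MappedField {
            names: vec![name.to_string()],
            field_id: Some(field_id),
            fields,
        }
    }
}

/// Walk `path` through the mapping and return the fallback field id, if any.
fn fallback_field_id(mapping: &[MappedField], path: &[&str]) -> Option<i32> {
    let (first, rest) = path.split_first()?;
    let field = mapping
        .iter()
        .find(|f| f.names.iter().any(|n| n.as_str() == *first))?;
    if rest.is_empty() {
        field.field_id
    } else {
        fallback_field_id(&field.fields, rest)
    }
}

/// Example mapping with made-up ids: a scalar, a list and a map column.
fn example_mapping() -> Vec<MappedField> {
    vec![
        MappedField::new("id", 1, vec![]),
        MappedField::new("tags", 2, vec![MappedField::new("element", 3, vec![])]),
        MappedField::new(
            "attrs",
            4,
            vec![
                MappedField::new("key", 5, vec![]),
                MappedField::new("value", 6, vec![]),
            ],
        ),
    ]
}

fn main() {
    let mapping = example_mapping();
    // The list column and its element both resolve to an id.
    println!("{:?}", fallback_field_id(&mapping, &["tags"]));
    println!("{:?}", fallback_field_id(&mapping, &["tags", "element"]));
    // The intermediate `repeated group list` level has no entry in the mapping.
    println!("{:?}", fallback_field_id(&mapping, &["tags", "list"]));
}
```

In this sketch the lookup for `tags.list` finds nothing, which mirrors the point above: the mapping has no slot for the intermediate repeated group, so a writer that cannot attach an ID to that level loses nothing the mapping could express.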
