westonpace commented on pull request #10289: URL: https://github.com/apache/arrow/pull/10289#issuecomment-840008403
> * not attempt to auto-generate any field_ids if they are not present in metadata

@pitrou That should simplify things. Just to clarify, this will be a bit of a regression as we currently auto-generate field IDs today.

> https://issues.apache.org/jira/browse/PARQUET-951 informs field IDs a little bit better. It is from other systems, in this case protobuf (and I imagine thrift might also have something similar) has each field in a message annotated with a unique ID. Based on this I agree with Antoine's assessment, haven't actually looked at the code (is this not what is done?).

@emkornfield Correct. We already pulled the field id out of thrift and into Arrow metadata. The only problem was that the logic to do the reverse was missing. This PR is only adding that.

There could be some follow-up work for integrating with other parts of the Arrow ecosystem. I will send some questions to the ML.
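
For readers following along, here is a rough sketch of the round trip discussed above: the field id is copied into Arrow field metadata on read, restored on write, and never invented when it is absent. It is a toy model in plain Python, not the Arrow C++ code from this PR. `ParquetNode`, `ArrowField`, and both helper functions are hypothetical names, and the metadata key `PARQUET:field_id` is assumed here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Assumed metadata key; treat it as illustrative rather than authoritative.
FIELD_ID_KEY = "PARQUET:field_id"


@dataclass
class ParquetNode:
    """Toy stand-in for a Parquet schema element (hypothetical type)."""
    name: str
    field_id: Optional[int] = None  # None means no field_id was written


@dataclass
class ArrowField:
    """Toy stand-in for an Arrow field with string key/value metadata."""
    name: str
    metadata: Dict[str, str] = field(default_factory=dict)


def parquet_to_arrow(node: ParquetNode) -> ArrowField:
    """Read direction: copy the field_id into Arrow metadata if present."""
    metadata: Dict[str, str] = {}
    if node.field_id is not None:
        metadata[FIELD_ID_KEY] = str(node.field_id)
    return ArrowField(node.name, metadata)


def arrow_to_parquet(arrow_field: ArrowField) -> ParquetNode:
    """Write direction: restore the field_id, never auto-generate one."""
    raw = arrow_field.metadata.get(FIELD_ID_KEY)
    return ParquetNode(arrow_field.name, int(raw) if raw is not None else None)


if __name__ == "__main__":
    with_id = ParquetNode("price", field_id=7)
    without_id = ParquetNode("note")
    # Both cases survive the round trip unchanged.
    assert arrow_to_parquet(parquet_to_arrow(with_id)) == with_id
    assert arrow_to_parquet(parquet_to_arrow(without_id)) == without_id
```

In this model a field written without an id comes back without one, which matches the "do not auto-generate" behavior proposed in the quoted bullet.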
