niyue commented on PR #12812: URL: https://github.com/apache/arrow/pull/12812#issuecomment-1097344685
> The IPC message's custom_metadata is independent from record batch (i.e. schema) metadata

This was actually my initial understanding. The documentation at https://arrow.apache.org/docs/format/Columnar.html#custom-application-metadata says `This includes Field, Schema, and Message.`, which seems to indicate that schema metadata is distinct from message metadata.

However, the documentation at https://arrow.apache.org/docs/python/data.html#custom-schema-and-field-metadata says `Arrow supports both schema-level and field-level custom key-value metadata allowing for systems to insert their own application defined metadata to customize behavior.` and does not mention record batch (message-level) metadata. I also only see APIs and tests that let users modify the schema's metadata (I may be missing something), so I assumed my initial understanding was incorrect and came up with this PR change.

If message metadata is different from schema metadata, the question becomes slightly more complex, since each record batch may then provide:

1) the record batch's schema-specific metadata, which does not seem to be serialized/deserialized
2) the record batch's message-specific metadata, for which we don't seem to provide an API for reading/writing (internally we have `AppendCustomMetadata`, but it is not called and there seems to be no higher-level API allowing users to pass this information)

Please advise what the correct behavior should be so that I can give it a try. Thanks.
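
To make the two metadata levels concrete, here is a minimal toy sketch in plain Python. It is not Arrow's API: `Schema`, `RecordBatch`, `Message`, `write_stream`, and `read_stream` are hypothetical stand-ins. It assumes a stream that writes the schema once up front and then one message per record batch, which is how the two cases listed above would diverge.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


# Hypothetical stand-ins for illustration only; these are NOT Arrow classes.
@dataclass
class Schema:
    fields: List[str]
    metadata: Dict[str, str] = field(default_factory=dict)  # schema-level metadata


@dataclass
class RecordBatch:
    schema: Schema
    columns: List[list]


@dataclass
class Message:
    kind: str     # "schema" or "record_batch"
    body: object  # a Schema, or the batch's columns
    custom_metadata: Dict[str, str] = field(default_factory=dict)  # message-level


def write_stream(batches: List[RecordBatch],
                 message_metadata: Optional[List[Dict[str, str]]] = None) -> List[Message]:
    """Emit one schema message, then one message per record batch.

    Assumption: only the first batch's schema (and its metadata) is written.
    """
    if message_metadata is None:
        message_metadata = [{} for _ in batches]
    messages = [Message("schema", batches[0].schema)]
    for batch, meta in zip(batches, message_metadata):
        messages.append(Message("record_batch", batch.columns, dict(meta)))
    return messages


def read_stream(messages: List[Message]) -> List[Tuple[RecordBatch, Dict[str, str]]]:
    """Rebuild batches; every batch shares the single schema from the stream."""
    schema = messages[0].body
    return [(RecordBatch(schema, m.body), m.custom_metadata) for m in messages[1:]]


# Two batches whose schemas differ only in metadata (case 1 above).
batches = [
    RecordBatch(Schema(["x"], {"origin": "batch-1"}), [[1, 2]]),
    RecordBatch(Schema(["x"], {"origin": "batch-2"}), [[3]]),
]
# Per-message metadata (case 2 above), passed alongside each batch.
stream = write_stream(batches, [{"seq": "0"}, {"seq": "1"}])
decoded = read_stream(stream)

for batch, msg_meta in decoded:
    print(batch.schema.metadata, msg_meta)
```

In this model the second batch's own schema metadata is lost on the round trip, because only one schema message exists, while the per-message metadata survives for each batch. Whether Arrow should behave this way is exactly the open question in this comment.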
