niyue commented on PR #12812:
URL: https://github.com/apache/arrow/pull/12812#issuecomment-1097344685

   > The IPC message's custom_metadata is independent from record batch (i.e. 
schema) metadata
   
   This was actually my initial understanding. The documentation 
(https://arrow.apache.org/docs/format/Columnar.html#custom-application-metadata)
 says `This includes Field, Schema, and Message.`, which seems to indicate 
that schema metadata is different from message metadata. 
   
   But the documentation here 
(https://arrow.apache.org/docs/python/data.html#custom-schema-and-field-metadata)
 says `Arrow supports both schema-level and field-level custom key-value 
metadata allowing for systems to insert their own application defined metadata 
to customize behavior.` and does not mention record batch (message-level) 
metadata.
   
   I also only see APIs/tests that let users modify the schema's metadata 
(I am probably missing something), so I assumed my initial understanding was 
incorrect and came up with this PR change.
   
   If message metadata is different from schema metadata, the question here 
may be slightly more complex, since each record batch may then provide: 1) 
schema-specific metadata for this record batch, which does not seem to be 
serialized/deserialized; 2) message-specific metadata for this record batch, 
which we don't seem to provide an API for reading/writing (internally we have 
`AppendCustomMetadata`, but it is not called, and there seems to be no 
higher-level API that lets users pass this information).
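
   To make the distinction concrete, here is a minimal sketch of the two 
levels as I understand them. The `Schema` and `RecordBatchMessage` classes are 
hypothetical stand-ins written only for illustration; they are not the Arrow 
API and say nothing about how the IPC layer actually stores these maps.

```python
from dataclasses import dataclass, field
from typing import Dict, List


# Hypothetical stand-ins, NOT the Arrow API: they only model where each
# kind of key-value metadata would live.
@dataclass
class Schema:
    fields: List[str]
    # Schema-level metadata, shared by every batch that uses this schema.
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class RecordBatchMessage:
    schema: Schema
    num_rows: int
    # Message-level metadata, attached to one IPC message only.
    custom_metadata: Dict[str, str] = field(default_factory=dict)


schema = Schema(fields=["x"], metadata={"origin": "sensor-a"})
batch1 = RecordBatchMessage(schema, 3, custom_metadata={"batch_id": "1"})
batch2 = RecordBatchMessage(schema, 5, custom_metadata={"batch_id": "2"})
```

Under this model the two batches share one schema-level map but each carries 
its own message-level map, which is the case the question above is about.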
   
   Please advise on what the correct behavior should be so that I can give it 
a try. Thanks.

