TheNeuralBit commented on PR #41769:
URL: https://github.com/apache/arrow/pull/41769#issuecomment-2136092470

   Thanks very much for the context, @jorisvandenbossche and @pitrou. To be 
clear, de-duplicating metadata when store_schema is set is the write-side 
change that has to wait until the corresponding read-side change is widely 
distributed. How should we handle this particular change (copying schema-level 
metadata to Parquet file-level metadata regardless of the store_schema flag)?
   
   If there's concern about opting everyone in to this, I could add another 
flag to ArrowWriterProperties, as suggested in #31723. It could be a tri-state 
to maintain backward compatibility:
   - unset: use value of store_schema
   - false: never copy schema metadata
   - true: always copy schema metadata
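
   As a rough illustration of how the three states could resolve, here is a 
minimal Python sketch. It is not existing Arrow API: the function name and the 
`copy_flag` parameter are hypothetical, and `None` stands in for "unset".

```python
from typing import Optional


def should_copy_schema_metadata(copy_flag: Optional[bool], store_schema: bool) -> bool:
    """Resolve a hypothetical tri-state flag against store_schema.

    copy_flag is None  -> unset: fall back to the value of store_schema
    copy_flag is False -> never copy schema metadata
    copy_flag is True  -> always copy schema metadata
    """
    if copy_flag is None:
        # Unset preserves today's behaviour, where store_schema decides.
        return store_schema
    # An explicit value overrides store_schema in either direction.
    return copy_flag
```

   With this shape, existing callers who never touch the new flag see no 
behaviour change, and only an explicit true or false departs from store_schema.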
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

