ggershinsky commented on PR #41821: URL: https://github.com/apache/arrow/pull/41821#issuecomment-2146591031
> Perhaps I am wrong, but I assumed that the metadata collector collected metadata that was already encrypted, and that the issue mostly stems from the fact that when we write out the metadata file, we don't set the file-level encryption algorithm in it, which then causes reads of the metadata to fail because the reader doesn't expect encrypted metadata.
>
> We could require passing decryption and encryption properties to write_metadata, which would decrypt and re-encrypt, or just set the right properties in the file footer to indicate that it's encrypted (this assumes that the collected metadata encryption keys are self-contained and do not need per-file footer keys).

I'm not familiar with how the collector works or how the metadata files are written and read (is there a writeup on this?).

Regarding Parquet file encryption: the footer can be encrypted or unencrypted; inside the footer, the column metadata (stats etc.) is encrypted with, e.g., a column-specific key, as required for its protection. The details are [here](https://github.com/apache/parquet-format/blob/master/Encryption.md).
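As a rough illustration of the two footer modes, the linked specification has files with an encrypted footer end in the magic bytes `PARE`, while plaintext-footer files (including those whose column metadata is individually encrypted) keep the regular `PAR1` magic. The sketch below only inspects those trailing bytes; the helper name `footer_mode` is hypothetical and is not part of Arrow or the PR under discussion.

```python
import io


def footer_mode(stream):
    """Classify a Parquet byte stream by its trailing 4-byte magic.

    Hypothetical helper for illustration only; it does not parse or
    decrypt the footer, and it says nothing about column-level keys.
    """
    size = stream.seek(0, io.SEEK_END)
    if size < 4:
        return "unknown"
    stream.seek(size - 4)
    magic = stream.read(4)
    if magic == b"PARE":
        return "encrypted footer"
    if magic == b"PAR1":
        return "plaintext footer"
    return "unknown"
```

A reader could run this kind of check on a written metadata file (opened in binary mode) to see which footer mode it advertises before attempting to decode it.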
