adamreeve opened a new pull request, #34181:
URL: https://github.com/apache/arrow/pull/34181

   This PR replaces #10491, which appears to be abandoned. I've fixed the 
issues raised in its review comments, along with a few others I noticed. I've 
also made one significant API change: `CryptoFactory::RotateMasterKeys` now 
rotates keys for a single file rather than a whole directory. This is more 
flexible, since users can choose which files to rotate keys for, and there 
didn't appear to be a good reason to operate at the directory level. It does 
mean the API diverges slightly from the parquet-mr API.
   
   ### Rationale for this change
   
   Use of external key material allows rotating master encryption keys without 
having to rewrite Parquet file data. See 
https://docs.google.com/document/d/1bEu903840yb95k9q2X-BlsYKuXoygE4VnMDl9xz_zhk/edit?usp=sharing
 for more details.
   
   ### What changes are included in this PR?
   
   Adds support for writing and reading external key material for Parquet files 
from C++, as well as a new `CryptoFactory::RotateMasterKeys` function that 
re-encrypts key encryption keys or data encryption keys with the latest 
versions of the master keys.
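   
   To illustrate why external key material makes rotation cheap, here is a 
minimal, self-contained sketch of the idea. It is not the code in this PR and 
does not use the Arrow or Parquet APIs: `ToyKms`, `KeyMaterial`, 
`ExternalKeyMaterialStore`, `ToyEncryptedFile` and `RotateKeysForFile` are 
hypothetical names invented for the example, and XOR stands in for real key 
wrapping purely so the example runs without a crypto library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy stand-in for encryption and key wrapping: XOR with a repeating key.
// NOT real cryptography; it only keeps the sketch dependency-free.
Bytes ToyXor(const Bytes& input, const Bytes& key) {
  if (key.empty()) throw std::invalid_argument("key must not be empty");
  Bytes out(input.size());
  for (std::size_t i = 0; i < input.size(); ++i) {
    out[i] = static_cast<std::uint8_t>(input[i] ^ key[i % key.size()]);
  }
  return out;
}

// Hypothetical in-memory KMS that keeps every version of one master key.
struct ToyKms {
  std::vector<Bytes> versions;  // versions.back() is the latest master key

  void AddVersion(Bytes key) { versions.push_back(std::move(key)); }
  int LatestVersion() const { return static_cast<int>(versions.size()) - 1; }
  Bytes Wrap(const Bytes& dek) const { return ToyXor(dek, versions.back()); }
  Bytes Unwrap(const Bytes& wrapped, int version) const {
    return ToyXor(wrapped, versions.at(static_cast<std::size_t>(version)));
  }
};

// A wrapped data encryption key (DEK) plus the master key version that wrapped it.
struct KeyMaterial {
  Bytes wrapped_dek;
  int master_key_version;
};

// Key material for one data file, stored outside that file and looked up by key ID.
using ExternalKeyMaterialStore = std::map<std::string, KeyMaterial>;

// The data file holds only ciphertext and a reference to its key material.
struct ToyEncryptedFile {
  std::string key_id;
  Bytes ciphertext;
};

ToyEncryptedFile WriteFile(const Bytes& plaintext, const Bytes& dek,
                           const std::string& key_id, const ToyKms& kms,
                           ExternalKeyMaterialStore* store) {
  (*store)[key_id] = KeyMaterial{kms.Wrap(dek), kms.LatestVersion()};
  return ToyEncryptedFile{key_id, ToyXor(plaintext, dek)};
}

Bytes ReadFile(const ToyEncryptedFile& file, const ToyKms& kms,
               const ExternalKeyMaterialStore& store) {
  const KeyMaterial& material = store.at(file.key_id);
  Bytes dek = kms.Unwrap(material.wrapped_dek, material.master_key_version);
  return ToyXor(file.ciphertext, dek);
}

// Re-wrap every DEK for one file with the latest master key version.
// Only the external key material changes; the data file is never touched.
void RotateKeysForFile(const ToyKms& kms, ExternalKeyMaterialStore* store) {
  for (auto& entry : *store) {
    KeyMaterial& material = entry.second;
    Bytes dek = kms.Unwrap(material.wrapped_dek, material.master_key_version);
    material = KeyMaterial{kms.Wrap(dek), kms.LatestVersion()};
  }
}

// Returns true if rotation re-wraps the DEK while leaving the ciphertext as-is.
bool RotationLeavesDataUntouched() {
  ToyKms kms;
  kms.AddVersion({0x11, 0x22, 0x33});
  ExternalKeyMaterialStore store;
  const Bytes plaintext = {'d', 'a', 't', 'a'};
  const Bytes dek = {0xA5, 0x5A, 0x3C, 0xC3};
  ToyEncryptedFile file = WriteFile(plaintext, dek, "dek0", kms, &store);

  const Bytes ciphertext_before = file.ciphertext;
  const Bytes wrapped_before = store.at("dek0").wrapped_dek;

  kms.AddVersion({0x44, 0x55, 0x66});  // a new master key version appears
  RotateKeysForFile(kms, &store);

  return file.ciphertext == ciphertext_before &&
         store.at("dek0").wrapped_dek != wrapped_before &&
         store.at("dek0").master_key_version == 1 &&
         ReadFile(file, kms, store) == plaintext;
}
```

   In this sketch the rotation step only rewrites the small key material 
entries, so its cost does not depend on the size of the data. The same 
separation is what lets master keys be rotated without rewriting Parquet file 
data; the real implementation additionally supports double wrapping with key 
encryption keys, which the sketch leaves out.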
   
   ### Are these changes tested?
   
   Yes, unit tests are included. I've added an additional test that reads a 
file generated with parquet-mr from the parquet-testing repository. This 
requires merging the PR at https://github.com/apache/parquet-testing/pull/36 
and updating the parquet-testing submodule before the new test will pass.
   
   ### Are there any user-facing changes?
   
   Yes, setting the existing `internal_key_material` option in 
`parquet::encryption::EncryptionConfiguration` to `false` will now work and 
use external key material. This requires passing two new parameters 
(`file_path` and `file_system`) to `CryptoFactory::GetFileEncryptionProperties` 
and `CryptoFactory::GetFileDecryptionProperties`, which are needed so that we 
know where to write and read the external key material. Note that this means 
external key material won't work from Python until the new parameters are 
exposed in Python too.
   
   This changes the `CryptoFactory` ABI but the API is still source compatible.
   
   The `CryptoFactory::RotateMasterKeys` function is also a new public-facing 
API.

