tustvold commented on PR #7387:
URL: https://github.com/apache/arrow-rs/pull/7387#issuecomment-2789490252

   > This is a good point but I don't think that's really feasible, it would 
require changing all code paths where encryption is used to be async which I 
expect would be a very large and breaking change.
   
   I can't help feeling this is something we are going to need eventually, and 
we should probably work out how it would work... If it means the sync APIs only 
support encryption with static keys, then maybe that is fine...
   
   > But I believe users should be directed to using this KMS based API if 
possible to push them towards better security practices
   
   I mean even better would be for them to be using the envelope encryption 
facilities the cloud providers themselves provide... My understanding of this 
PR is that users still have to manage and store the KEKs themselves, handle 
rotation, etc... Is there an argument that whilst perhaps better than the 
low-level interface it still requires quite a sophisticated user to use it 
securely?
   
   IMO parquet-rs should provide the minimal hooks to allow people to securely 
support modular encryption in their environment with the primitives available 
to them, be they a cloud-based KMS or HSM solution or something else. I 
understand the thinking of providing an out of the box key management toolkit, 
especially as a way to dog-food these interfaces, but I worry about being able 
to maintain what is a piece of complex and security critical code.
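
   As a rough illustration of what such a "minimal hook" could look like, here is a sketch only. `DataKeyProvider`, `StaticKey` and `ToyKms` are hypothetical names invented for this example, not existing parquet-rs APIs, and the XOR "wrapping" is a placeholder standing in for a real KMS or HSM unwrap call. The idea is that the library only asks "given the key metadata stored in the file, give me the data key", and everything else stays in the user's environment:

```rust
use std::collections::HashMap;

/// Hypothetical hook: map the key metadata stored in a file to a data key.
/// The library would call this and otherwise know nothing about key management.
pub trait DataKeyProvider {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String>;
}

/// Simplest implementation: one static key, metadata is ignored.
pub struct StaticKey(pub Vec<u8>);

impl DataKeyProvider for StaticKey {
    fn retrieve_key(&self, _key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        Ok(self.0.clone())
    }
}

/// Stand-in for an external KMS/HSM client. The XOR mask is NOT real
/// cryptography; it only marks where a remote unwrap call would happen.
pub struct ToyKms {
    master_keys: HashMap<String, u8>,
}

impl ToyKms {
    pub fn new() -> Self {
        ToyKms { master_keys: HashMap::new() }
    }

    pub fn add_master_key(&mut self, key_id: &str, mask: u8) {
        self.master_keys.insert(key_id.to_string(), mask);
    }

    /// Produce key metadata: [id length][id bytes][wrapped data key].
    pub fn wrap(&self, key_id: &str, dek: &[u8]) -> Result<Vec<u8>, String> {
        let mask = *self
            .master_keys
            .get(key_id)
            .ok_or_else(|| format!("unknown key id: {}", key_id))?;
        if key_id.len() > 255 {
            return Err("key id too long".to_string());
        }
        let mut out = vec![key_id.len() as u8];
        out.extend_from_slice(key_id.as_bytes());
        out.extend(dek.iter().map(|b| b ^ mask));
        Ok(out)
    }
}

impl DataKeyProvider for ToyKms {
    fn retrieve_key(&self, key_metadata: &[u8]) -> Result<Vec<u8>, String> {
        // Parse the [id length][id bytes][wrapped key] layout written by `wrap`.
        let (&id_len, rest) = key_metadata
            .split_first()
            .ok_or_else(|| "empty key metadata".to_string())?;
        let id_len = id_len as usize;
        if rest.len() < id_len {
            return Err("truncated key metadata".to_string());
        }
        let (id_bytes, wrapped) = rest.split_at(id_len);
        let key_id = std::str::from_utf8(id_bytes).map_err(|e| e.to_string())?;
        let mask = *self
            .master_keys
            .get(key_id)
            .ok_or_else(|| format!("unknown key id: {}", key_id))?;
        // A real implementation would send `wrapped` to the KMS here.
        Ok(wrapped.iter().map(|b| b ^ mask).collect())
    }
}
```

   With a hook of this shape, a static-key user and a KMS user implement the same small trait, and rotation, storage and access control for the KEKs remain the responsibility of whatever service sits behind the implementation.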
   
   > Adding this layer will help arrow-rs readers to open files written by 
[AWS PME](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/hive-parquet-modular-encryption.html), 
[Spark PME](https://spark.apache.org/docs/latest/sql-data-sources-parquet.html#columnar-encryption) 
or PyArrow/Arrow C++. They all package a similar mechanism.
   
   At least AWS PME appears to integrate with AWS KMS, i.e. the approach I am 
alluding to above. 

