itamarst commented on pull request #9631:
URL: https://github.com/apache/arrow/pull/9631#issuecomment-830182532


   So here is what the end-user explained about why they specifically want the 
low-level API; I've bolded the most important part.
   
   > The way I understand things, the low-level API only cares about 
encrypting/decrypting the Parquet data and does no key management beyond 
storing the key identifiers.
   >
   > This is really useful because it means that the low-level API makes no 
assumptions about how the keys should be handled. Instead, this responsibility 
is left to the application. Personally I think this is the right call: Apache 
Arrow/Parquet is not a Key Management System, and there are plenty of other 
libraries for that.
   >
   > (Again, from what I understand) the high-level API defines how the keys 
are stored in relation to the key identifier, handles symmetric vs asymmetric 
keys, and defines how Parquet file metadata should be stored in order to be 
compatible with a de-facto standard (vague recollection that they want to be 
compatible with existing ecosystems such as Hadoop or Spark).
   > 
   > IMHO, trying to be compatible with the key management of an existing 
Parquet ecosystem is a mistake if it forces everyone to adopt that system. 
Instead it should be a separate library, away from Arrow/Parquet, as one of the 
possible ways to manage keys for Parquet files.
   >
   > Unlike a data-format specification like Parquet, I am not convinced that 
KMS requirements are universal enough to mandate a standard solution that 
everyone is forced to use (which would be the case if it's the only one 
exposed in Python). I'm convinced of the opposite: that a lot of companies have 
their own tailor-made KMS. **In fact, we already have our own KMS and 
KMS-related routines to securely manage keys and access based on the key 
identifiers. So not only is the high-level API redundant with what we already 
have, but we also cannot use it, as it would be incompatible with the rest of 
the company.**
   
   I don't want to waste time if this PR will never be accepted, but AFAICT 
there is a real use-case for the low-level API.
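
   To make the split concrete, here is a rough sketch of the division of 
responsibility the user describes: the file layer stores only a key identifier 
and is handed key bytes, while the application's own KMS decides who may obtain 
which key. Every name below (`InHouseKms`, `EncryptedFile`, `low_level_write`, 
`low_level_read`, `toy_cipher`) is hypothetical and is not the pyarrow or 
Parquet C++ API, and the XOR "cipher" is only a placeholder so the example runs 
with the standard library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder XOR transform. NOT secure; it only stands in for a real cipher."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


@dataclass(frozen=True)
class EncryptedFile:
    """What the file layer persists: ciphertext plus an opaque key identifier."""
    key_id: str
    ciphertext: bytes


def low_level_write(data: bytes, key: bytes, key_id: str) -> EncryptedFile:
    # The low-level layer receives raw key bytes and records only the identifier.
    return EncryptedFile(key_id=key_id, ciphertext=toy_cipher(key, data))


def low_level_read(f: EncryptedFile, key_retriever: Callable[[str], bytes]) -> bytes:
    # The low-level layer asks the application for the key; it does not know
    # where keys live or who is allowed to use them.
    return toy_cipher(key_retriever(f.key_id), f.ciphertext)


class InHouseKms:
    """Hypothetical stand-in for a company's existing key management system."""

    def __init__(self) -> None:
        self._keys: Dict[str, bytes] = {}
        self._acl: Dict[str, Set[str]] = {}

    def register(self, key_id: str, key: bytes, allowed_users: Set[str]) -> None:
        self._keys[key_id] = key
        self._acl[key_id] = set(allowed_users)

    def fetch(self, key_id: str, user: str) -> bytes:
        # Access control is the application's concern, not the file format's.
        if user not in self._acl.get(key_id, set()):
            raise PermissionError(f"{user} may not use key {key_id}")
        return self._keys[key_id]


kms = InHouseKms()
kms.register("footer-key-1", b"0123456789abcdef", {"alice"})

stored = low_level_write(
    b"secret rows", kms.fetch("footer-key-1", "alice"), "footer-key-1"
)
plain = low_level_read(stored, lambda key_id: kms.fetch(key_id, "alice"))
```

   In this arrangement, swapping in a different KMS only changes the callable 
passed to `low_level_read`; nothing about the stored file has to change.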

