itamarst commented on pull request #9631: URL: https://github.com/apache/arrow/pull/9631#issuecomment-830182532
So here is what the end-user explained about why they specifically want the low-level API; I've bolded the most important part.

> The way I understand things, the low-level API only cares about encrypting/decrypting the Parquet data but does not do any key management beyond simply storing the key identifiers.
>
> This is really useful because it means that the low-level API makes no assumption about how the key should be handled. Instead, this responsibility is left to the application. Personally I think this is the right call: Apache Arrow/Parquet is not a Key Management System, and there are plenty of other libraries for that.
>
> (Again, from what I understand) the high-level API defines how the keys are stored in relation to the key identifier, handles symmetric vs. asymmetric keys, and defines how Parquet file metadata should be stored in order to be compatible with a de facto standard (I vaguely recall that they want to be compatible with existing ecosystems such as Hadoop or Spark).
>
> IMHO, trying to be compatible with the key management of an existing Parquet ecosystem is a mistake if it forces everyone to adopt that system. Instead it should be a separate library, away from Arrow/Parquet, as one of the possible ways to manage keys for Parquet files.
>
> Unlike a data-format specification like Parquet, I am not convinced that KMS requirements are so universal as to mandate a standard solution that everyone is forced to use (which would be the case if it's the only one exposed in Python). I'm convinced of the opposite: that a lot of companies have their own tailor-made KMS. **In fact, we already have our own KMS and KMS-related routines to securely manage keys and access based on the key identifiers. So not only is the high-level API redundant with what we already have, but we also cannot use it, as it would be incompatible with the rest of the company.**

I don't want to waste time if this PR will never be accepted, but AFAICT there is a real use case for the low-level API.
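
To make the split of responsibilities described above concrete, here is a minimal sketch of the application-side half: the file only carries key identifiers, and the application resolves them through its own KMS. Everything in it (`InHouseKms`, `KeyAccessDenied`, `resolve_keys`, the key identifiers) is hypothetical and for illustration only; it is not the API proposed in this PR and it calls no Arrow/Parquet functions.

```python
import secrets
from dataclasses import dataclass, field


class KeyAccessDenied(Exception):
    """Raised when a user is not allowed to use the requested key."""


@dataclass
class InHouseKms:
    """Hypothetical stand-in for a company's existing key management system."""

    _keys: dict = field(default_factory=dict)  # key identifier -> raw key bytes
    _acl: dict = field(default_factory=dict)   # key identifier -> allowed users

    def create_key(self, key_id: str, allowed_users: set) -> None:
        # 16 random bytes, i.e. a 128-bit symmetric key.
        self._keys[key_id] = secrets.token_bytes(16)
        self._acl[key_id] = set(allowed_users)

    def get_key(self, key_id: str, user: str) -> bytes:
        # Access control lives here, in the application, not in the file format.
        if user not in self._acl.get(key_id, set()):
            raise KeyAccessDenied(f"{user} may not use key {key_id!r}")
        return self._keys[key_id]


def resolve_keys(kms: InHouseKms, user: str, key_ids: list) -> dict:
    """Map key identifiers (as stored alongside the data) to raw key bytes."""
    return {key_id: kms.get_key(key_id, user) for key_id in key_ids}


# Usage: the identifiers below are made up; in practice they would be read
# from the encrypted file's metadata and the bytes handed to the low-level API.
kms = InHouseKms()
kms.create_key("footer-key", {"alice"})
kms.create_key("column-key-1", {"alice"})
keys = resolve_keys(kms, "alice", ["footer-key", "column-key-1"])
```

The point of the sketch is only that nothing in it depends on how Parquet stores or wraps keys, which is the property the end user is asking the low-level API to preserve.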
