EnricoMi opened a new pull request, #45462:
URL: https://github.com/apache/arrow/pull/45462

   ### Rationale for this change
   
   Columns can be encrypted with individual keys. For this, the column names have 
to be set in `EncryptionConfiguration::column_keys`. This poses the following 
challenges for columns with nested fields like `MapType`, `ListType`, and 
`StructType`:
   
   - Encrypting a column of such type requires providing an encryption key for 
all nested (leaf) fields. Ideally, the column name should be sufficient (as it 
is for any other data type) to encrypt all nested fields.
   - The actual names of nested fields are neither obvious nor intuitive from the 
Arrow schema of the table. An intuitive naming scheme should be possible.
   
   ### What changes are included in this PR?
   
   This adds a user-friendly notation for nested fields:
   - Columns `col.key` and `col.value` can be used to reference the key and 
value nested fields of a `MapType` column. Currently, `col.key_value.key` and 
`col.key_value.value` are required, respectively.
   - Column `col.element` can be used to reference the individual list 
elements of a `ListType` column. Currently, `col.list.element` is required.
   - The actual column name can be used to encrypt all nested fields with the 
same encryption key.
   - The current column naming scheme can still be used for backward 
compatibility.
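
To illustrate the naming rules listed above, here is a minimal pure-Python sketch of how a configured column name could map to full Parquet leaf paths. It is not the implementation in this PR: the tuple-based type model and the helpers `nested_paths` and `resolve` are hypothetical and exist only for this example.

```python
# Hypothetical, simplified type model (not Arrow's API):
#   None                      -> leaf (primitive) field
#   ("map", key, value)       -> MapType
#   ("list", element)         -> ListType
#   ("struct", {name: type})  -> StructType


def nested_paths(short, full, typ):
    """Yield (short_path, full_path) for every leaf field below a column."""
    if typ is None:
        yield short, full
    elif typ[0] == "map":
        yield from nested_paths(f"{short}.key", f"{full}.key_value.key", typ[1])
        yield from nested_paths(f"{short}.value", f"{full}.key_value.value", typ[2])
    elif typ[0] == "list":
        yield from nested_paths(f"{short}.element", f"{full}.list.element", typ[1])
    elif typ[0] == "struct":
        for field, field_type in typ[1].items():
            yield from nested_paths(f"{short}.{field}", f"{full}.{field}", field_type)
    else:
        raise ValueError(f"unknown type kind: {typ[0]!r}")


def resolve(name, schema):
    """Return the full leaf paths that a configured column name refers to.

    Accepted forms: the top-level column name (all leaves), the short
    notation, or the existing full notation.
    """
    matches = []
    for column, typ in schema.items():
        for short, full in nested_paths(column, column, typ):
            if name == column or name in (short, full):
                matches.append(full)
    return matches


# Example schema, made up for illustration.
schema = {
    "m": ("map", None, None),
    "l": ("list", None),
    "s": ("struct", {"a": None, "b": ("list", None)}),
    "x": None,
}

print(resolve("m.key", schema))    # short notation for a map key
print(resolve("l.element", schema))  # short notation for list elements
print(resolve("s", schema))        # column name covers all nested leaves
```

In this sketch, `resolve("m.key_value.key", schema)` returns the same path as `resolve("m.key", schema)`, which mirrors the backward-compatibility point above.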
   
   ### Are these changes tested?
   Tested in C++ and Python.
   
   ### Are there any user-facing changes?
   Column encryption can be configured through simpler and more intuitive naming.
   
   Documentation will be extended once #45411 is merged.
   
   Fixes #41246.

