EnricoMi opened a new pull request, #45462: URL: https://github.com/apache/arrow/pull/45462
### Rationale for this change

Columns can be encrypted with individual keys. For this, the column names have to be set in `EncryptionConfiguration::column_keys`. This poses the following challenges for columns with nested fields, such as `MapType`, `ListType`, and `StructType`:

- Encrypting a column of such a type requires providing an encryption key for every nested (leaf) field. Ideally, the column name alone should be sufficient (as it is for any other data type) to encrypt all nested fields.
- The actual names of nested fields are not obvious from the Arrow schema of the table. An intuitive naming scheme should be possible.

### What changes are included in this PR?

This adds a user-friendly notation for nested fields:

- Columns `col.key` and `col.value` can be used to reference the key and value nested fields of a `MapType` column. Currently, `col.key_value.key` and `col.key_value.value` are required, respectively.
- Column `col.element` can be used to reference the individual list elements of a `ListType` column. Currently, `col.list.element` is required.
- The actual column name can be used to encrypt all nested fields with the same encryption key.
- The current column naming scheme can still be used for backward compatibility.

### Are these changes tested?

Tested in C++ and Python.

### Are there any user-facing changes?

Column encryption can be configured through simpler, more intuitive naming. Documentation will be extended once #45411 is merged.

Fixes #41246.
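### Illustrative sketch of the naming rules

The naming rules listed under "What changes are included in this PR?" can be sketched in plain Python. This is not the code from this PR: `short_path` and `resolve_key` are hypothetical helpers, the lookup order (full path, then short path, then top-level column name) is an assumption, and `column_keys` is assumed to map a key ID to a list of column names.

```python
from typing import Dict, List, Optional


def short_path(parquet_path: str) -> str:
    """Drop the intermediate `key_value` / `list` segments of a leaf path.

    Hypothetical helper: "m.key_value.key" -> "m.key",
    "l.list.element" -> "l.element".
    """
    parts = parquet_path.split(".")
    kept = []
    for i, part in enumerate(parts):
        nxt = parts[i + 1] if i + 1 < len(parts) else None
        if part == "key_value" and nxt in ("key", "value"):
            continue  # map wrapper segment
        if part == "list" and nxt == "element":
            continue  # list wrapper segment
        kept.append(part)
    return ".".join(kept)


def resolve_key(parquet_path: str,
                column_keys: Dict[str, List[str]]) -> Optional[str]:
    """Return the key ID configured for a leaf column, or None.

    Assumed precedence: full path (backward compatible), then the
    short notation, then the top-level column name.
    """
    # Invert {key_id: [column, ...]} into {column: key_id}.
    by_column = {col: key_id
                 for key_id, cols in column_keys.items()
                 for col in cols}
    candidates = (parquet_path,
                  short_path(parquet_path),
                  parquet_path.split(".")[0])
    for name in candidates:
        if name in by_column:
            return by_column[name]
    return None


if __name__ == "__main__":
    keys = {"kA": ["m.key"], "kB": ["m.value", "l"]}
    for leaf in ("m.key_value.key", "m.key_value.value", "l.list.element", "id"):
        print(leaf, "->", resolve_key(leaf, keys))
```

With this sketch, a struct field that is itself named `list` or `key_value` could be shortened by mistake; handling that correctly would need the schema's type information rather than the path string alone.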
