tcrasset opened a new issue, #38914: URL: https://github.com/apache/arrow/issues/38914
### Describe the enhancement requested

In the C++ library, `EncryptionConfiguration` has a `uniform_encryption` option, which encrypts all columns and the footer with the same encryption key. The alternative is to provide a list of `column_keys`, each with its own encryption key.

From `parquet/encryption/crypto_factory.h`:

```c++
struct PARQUET_EXPORT EncryptionConfiguration {
  explicit EncryptionConfiguration(const std::string& footer_key)
      : footer_key(footer_key) {}

  /// ID of the master key for footer encryption/signing
  std::string footer_key;

  /// List of columns to encrypt, with master key IDs (see HIVE-21848).
  /// Format: "masterKeyID:colName,colName;masterKeyID:colName..."
  /// Either
  /// (1) column_keys must be set
  /// or
  /// (2) uniform_encryption must be set to true
  /// If none of (1) and (2) are true, or if both are true, an exception will be
  /// thrown.
  std::string column_keys;

  /// Encrypt footer and all columns with the same encryption key.
  bool uniform_encryption = kDefaultUniformEncryption;

  ...
};
```

I'm using the Python wrapper around the C++ library, where `uniform_encryption` is not yet exposed.
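To make the rule in the C++ comment concrete, here is a minimal pure-Python sketch of the two things it describes: the `masterKeyID:colName,colName;masterKeyID:colName` string format, and the requirement that exactly one of `column_keys` and `uniform_encryption` is set. The helper names `format_column_keys` and `check_encryption_mode` are hypothetical and are not part of Arrow or pyarrow. The sketch only mirrors the documented behaviour and does not call the library.

```python
from typing import Dict, List, Optional


def format_column_keys(column_keys: Dict[str, List[str]]) -> str:
    """Hypothetical helper: join {master_key_id: [column, ...]} into the
    "masterKeyID:colName,colName;masterKeyID:colName" format quoted above."""
    return ";".join(
        f"{key_id}:{','.join(columns)}" for key_id, columns in column_keys.items()
    )


def check_encryption_mode(
    column_keys: Optional[Dict[str, List[str]]], uniform_encryption: bool
) -> str:
    """Hypothetical helper: enforce that exactly one of the two options is set,
    as the C++ comment describes (neither or both is an error)."""
    has_column_keys = bool(column_keys)
    if has_column_keys == uniform_encryption:
        raise ValueError(
            "exactly one of column_keys or uniform_encryption must be set"
        )
    return "uniform" if uniform_encryption else "per-column"


if __name__ == "__main__":
    # Per-column mode: the column names must be known up front.
    keys = {"key_a": ["col1", "col2"], "key_b": ["col3"]}
    print(check_encryption_mode(keys, uniform_encryption=False))
    print(format_column_keys(keys))

    # Uniform mode: no column names are needed, which is what a streaming
    # writer without a known schema would want.
    print(check_encryption_mode(None, uniform_encryption=True))
```

The uniform branch is the one that matters for this request, because it is the only mode that does not need column names before writing starts.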
From `python/pyarrow/_parquet_encryption.pyx`:

```python
cdef class EncryptionConfiguration(_Weakrefable):
    """Configuration of the encryption, such as which columns to encrypt"""
    # Avoid mistakingly creating attributes
    __slots__ = ()

    def __init__(self, footer_key, *, column_keys=None,
                 encryption_algorithm=None,
                 plaintext_footer=None, double_wrapping=None,
                 cache_lifetime=None, internal_key_material=None,
                 data_key_length_bits=None):
        self.configuration.reset(
            new CEncryptionConfiguration(tobytes(footer_key)))
        if column_keys is not None:
            self.column_keys = column_keys
        if encryption_algorithm is not None:
            self.encryption_algorithm = encryption_algorithm
        if plaintext_footer is not None:
            self.plaintext_footer = plaintext_footer
        if double_wrapping is not None:
            self.double_wrapping = double_wrapping
        if cache_lifetime is not None:
            self.cache_lifetime = cache_lifetime
        if internal_key_material is not None:
            self.internal_key_material = internal_key_material
        if data_key_length_bits is not None:
            self.data_key_length_bits = data_key_length_bits
```

My use case requires encrypting all the columns, but I don't know the column names in advance because I'm streaming the file from an external source, chunk by chunk.

Would it be possible to add `uniform_encryption` to the Python implementation?

### Component(s)

Parquet, Python

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org