adamreeve commented on code in PR #48009:
URL: https://github.com/apache/arrow/pull/48009#discussion_r2508322948
##########
python/pyarrow/tests/parquet/test_encryption.py:
##########
@@ -514,31 +538,126 @@ def kms_factory(kms_connection_configuration):
# assert table.num_rows == result_table.num_rows
[email protected](reason="External key material not supported yet")
-def test_encrypted_parquet_write_external(tempdir, data_table):
- """Write an encrypted parquet, with external key
- material.
- Currently it's not implemented, so should throw
- an exception"""
+def test_encrypted_parquet_write_read_external(tempdir, data_table,
+ external_encryption_config):
+ """Write an encrypted parquet file with external key material, verify
+ it's encrypted, then read both the table and external store.
+ """
path = tempdir / PARQUET_NAME
- # Encrypt the file with the footer key
+ kms_connection_config, crypto_factory = write_encrypted_file(
+ path, data_table, FOOTER_KEY_NAME, COL_KEY_NAME, FOOTER_KEY, COL_KEY,
+ external_encryption_config)
+
+ verify_file_encrypted(path)
+
+ decryption_config = pe.DecryptionConfiguration()
+ result_table = read_encrypted_parquet(
+ path, decryption_config, kms_connection_config, crypto_factory,
+ internal_key_material=False)
+ store = pa._parquet_encryption.FileSystemKeyMaterialStore.for_file(path)
+
+ assert len(key_ids := store.get_key_id_set()) == (
+ len(external_encryption_config.column_keys[COL_KEY_NAME]) + 1)
+ assert all([store.get_key_material(k) is not None for k in key_ids])
+ assert data_table.equals(result_table)
+
+
+def test_external_key_material_rotation(
+ tempdir,
+ data_table,
+ double_wrap_initial=True,
+ double_wrap_rotated=True):
+ """Tests CryptoFactory.rotate_master_keys
+
+ Note: The CryptoFactory.rotate_master_keys() double_wrapping keword arg
+ may be either True (the default) or False regardless of whether
+ EncryptionConfig.double_wrapping was set to true (also the default) when
+ the external key material store was written. This means double wrapping may
+ be set one way initially and then applied or removed during rotation.
+ When run as a regular pytest test, this function tests the default
+ scenario - double_wrapping before and after rotation.
+
+ A separate wrapper test function below is driven by hypothesis strategies
+ to run through the permutations."""
Review Comment:
Hmm OK fair enough then, I had assumed that the hypothesis tests are always
run. I don't see where that environment variable is actually used though. We
load different profiles depending on the `HYPOTHESIS_PROFILE` variable
[here](https://github.com/apache/arrow/blob/bbb3d48000d98b5cec868a417734422b4c9a6976/python/pyarrow/tests/conftest.py#L35-L45)
but I can't find anywhere that disables them completely.
@raulcd do you have any idea how/if the `PYARROW_TEST_HYPOTHESIS` variable
is used?
If I've interpreted this correctly and the hypothesis tests will always be
run but with the `dev` profile by default, then I think it makes sense to
simplify this to one hypothesis driven function. Otherwise we could just use
`@pytest.mark.parametrize` here instead of hypothesis as the number of
permutations is small.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]