pfparsons commented on code in PR #48009:
URL: https://github.com/apache/arrow/pull/48009#discussion_r2507036062


##########
python/pyarrow/tests/parquet/test_encryption.py:
##########
@@ -514,31 +538,126 @@ def kms_factory(kms_connection_configuration):
     # assert table.num_rows == result_table.num_rows
 
 
[email protected](reason="External key material not supported yet")
-def test_encrypted_parquet_write_external(tempdir, data_table):
-    """Write an encrypted parquet, with external key
-    material.
-    Currently it's not implemented, so should throw
-    an exception"""
+def test_encrypted_parquet_write_read_external(tempdir, data_table,
+                                               external_encryption_config):
+    """Write an encrypted parquet file with external key material, verify
+    it's encrypted, then read both the table and external store.
+    """
     path = tempdir / PARQUET_NAME
 
-    # Encrypt the file with the footer key
+    kms_connection_config, crypto_factory = write_encrypted_file(
+        path, data_table, FOOTER_KEY_NAME, COL_KEY_NAME, FOOTER_KEY, COL_KEY,
+        external_encryption_config)
+
+    verify_file_encrypted(path)
+
+    decryption_config = pe.DecryptionConfiguration()
+    result_table = read_encrypted_parquet(
+        path, decryption_config, kms_connection_config, crypto_factory,
+        internal_key_material=False)
+    store = pa._parquet_encryption.FileSystemKeyMaterialStore.for_file(path)
+
+    assert len(key_ids := store.get_key_id_set()) == (
+        len(external_encryption_config.column_keys[COL_KEY_NAME]) + 1)
+    assert all([store.get_key_material(k) is not None for k in key_ids])
+    assert data_table.equals(result_table)
+
+
+def test_external_key_material_rotation(
+        tempdir,
+        data_table,
+        double_wrap_initial=True,
+        double_wrap_rotated=True):
+    """Tests CryptoFactory.rotate_master_keys
+
+    Note: The CryptoFactory.rotate_master_keys() double_wrapping keword arg
+    may be either True (the default) or False regardless of whether
+    EncryptionConfig.double_wrapping was set to true (also the default) when
+    the external key material store was written. This means double wrapping may
+    be set one way initially and then applied or removed during rotation.
+    When run as a regular pytest test, this function tests the default
+    scenario - double_wrapping before and after rotation.
+
+    A separate wrapper test function below is driven by hypothesis strategies
+    to run through the permutations."""

Review Comment:
   I did feel a bit silly writing it this way and my (shaky) rationale might be 
based on a misunderstanding of how/when hypothesis tests are run in ci. I see 
PYARROW_TEST_HYPOTHESIS is set to ON in 
[/dev/tasks.yml:596](https://github.com/apache/arrow/blob/main/dev/tasks/tasks.yml#L596)
 for a conda-python-3.11 docker test, but I couldn't find any other place where 
the required env var or cli flag is explicitly set. So I did this to ensure at 
least one test with the default values would run regardless of whether 
hypothesis is enabled. Maybe this is all a bit too much for the sake of 
permuting a couple booleans :) but I'm also interested in learning a little 
more about how the build works.
   
   I'm interpreting your comment to mean the test function may be simplified to 
one hypothesis driven function and so I'll make this change now.  Happy to 
follow any other suggestion you may have.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to