adamreeve commented on code in PR #7459: URL: https://github.com/apache/arrow-rs/pull/7459#discussion_r2067700447
########## parquet/src/encryption/ciphers.rs: ########## @@ -23,12 +23,14 @@ use ring::rand::{SecureRandom, SystemRandom}; use std::fmt::Debug; const RIGHT_TWELVE: u128 = 0x0000_0000_ffff_ffff_ffff_ffff_ffff_ffff; -const NONCE_LEN: usize = 12; -const TAG_LEN: usize = 16; -const SIZE_LEN: usize = 4; +pub(crate) const NONCE_LEN: usize = 12; +pub(crate) const TAG_LEN: usize = 16; +pub(crate) const SIZE_LEN: usize = 4; pub(crate) trait BlockDecryptor: Debug + Send + Sync { fn decrypt(&self, length_and_ciphertext: &[u8], aad: &[u8]) -> Result<Vec<u8>>; + + fn compute_tag(&self, nonce: &[u8], aad: &[u8], ciphertext: &mut [u8]) -> Result<Vec<u8>>; Review Comment: I'm not sure it makes sense to have this method on the `BlockDecryptor` trait as computing a tag for a GCM-CTR decyptor is non-sensical. But I can't see an obvious better approach so maybe this is fine and if/when there's a CTR implementation it could just return an error? ########## parquet/tests/encryption/encryption.rs: ########## @@ -43,14 +43,25 @@ use std::sync::Arc; fn test_non_uniform_encryption_plaintext_footer() { let test_data = arrow::util::test_util::parquet_test_data(); let path = format!("{test_data}/encrypt_columns_plaintext_footer.parquet.encrypted"); - let file = File::open(path).unwrap(); + let file = File::open(path.clone()).unwrap(); // There is always a footer key even with a plaintext footer, // but this is used for signing the footer. let footer_key = "0123456789012345".as_bytes(); // 128bit/16 let column_1_key = "1234567890123450".as_bytes(); let column_2_key = "1234567890123451".as_bytes(); + let decryption_properties = FileDecryptionProperties::builder(footer_key.to_vec()) Review Comment: I think it's worth having a separate test function that sets `disable_footer_signature_verification` and provides an incorrect footer key to verify we can still read the file in that scenario. ########## parquet/src/encryption/decrypt.rs: ########## @@ -496,6 +511,13 @@ impl DecryptionPropertiesBuilder { } Ok(self) } + + /// Specify if footer signature verification should be disabled. Review Comment: ```suggestion /// Disable verification of footer tags for files that use plaintext footers. ``` ########## parquet/src/encryption/decrypt.rs: ########## @@ -351,6 +357,11 @@ impl FileDecryptionProperties { self.aad_prefix.as_ref() } + /// Returns true if footer signature verification is enabled. Review Comment: ```suggestion /// Returns true if footer signature verification is enabled for files with plaintext footers. ``` ########## parquet/src/file/metadata/reader.rs: ########## @@ -963,15 +962,24 @@ impl ParquetMetaDataReader { let schema_descr = Arc::new(SchemaDescriptor::new(schema)); if let (Some(algo), Some(file_decryption_properties)) = ( - t_file_metadata.encryption_algorithm, + t_file_metadata.encryption_algorithm.clone(), file_decryption_properties, ) { // File has a plaintext footer but encryption algorithm is set file_decryptor = Some(get_file_decryptor( algo, - t_file_metadata.footer_signing_key_metadata.as_deref(), + t_file_metadata + .footer_signing_key_metadata + .clone() + .as_deref(), file_decryption_properties, )?); + if file_decryption_properties.check_plaintext_footer_integrity() { + file_decryptor + .clone() + .unwrap() + .verify_signature(&t_file_metadata, buf.as_ref())?; Review Comment: Rather than verifying the signature using the deserialized file metadata and having to re-serialize it, it would make more sense to verify the signature using the serialized data. Eg. `verify_signature` would only take the `buf` parameter and not `t_file_metadata`. Internally it would need to take a copy of the buffer (excluding size, nonce, tag) and use that as the plaintext to compute the tag. I don't think we can assume that the metadata written by another implementation (or a different version of arrow-rs) then deserialized by us and re-serialized will result in byte-for-byte identical thrift. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org