adamreeve commented on code in PR #7439:
URL: https://github.com/apache/arrow-rs/pull/7439#discussion_r2067722573


##########
parquet/src/encryption/encrypt.rs:
##########
@@ -360,6 +361,26 @@ impl FileEncryptor {
             Some(column_key) => 
Ok(Box::new(RingGcmBlockEncryptor::new(column_key.key())?)),
         }
     }
+    
+    /// Get encryption algorithm for the plaintext footer
+    pub(crate) fn get_footer_encryptor_for_plaintext(&self) -> 
Result<Option<EncryptionAlgorithm>> {

Review Comment:
   This name doesn't seem right. Maybe `get_footer_encryption_algorithm`?
   
   There's also a lot of overlap with 
`MetadataObjectWriter::file_crypto_metadata`. It might be cleaner if this was 
moved to a method on that type and the logic de-duplicated.



##########
parquet/src/encryption/encrypt.rs:
##########
@@ -374,6 +395,33 @@ pub(crate) fn encrypt_object<T: TSerializable, W: Write>(
     Ok(())
 }
 
+pub(crate) fn sign_and_write_object<T: TSerializable, W: Write>(
+    object: &T,
+    encryptor: &mut Box<dyn BlockEncryptor>,
+    sink: &mut W,
+    module_aad: &[u8],
+) -> Result<()> {
+    let mut buffer: Vec<u8> = vec![];
+    {
+        let mut protocol = TCompactOutputProtocol::new(&mut buffer);
+        object.write_to_out_protocol(&mut protocol)?;
+    }
+    sink.write_all(&buffer)?;
+    buffer = encryptor.encrypt(buffer.as_ref(), module_aad)?;
+
+    let ciphertext_length : u32 = buffer.len()
+        .try_into()
+        .map_err(|err| general_err!("Plaintext data too long. {:?}", err))?;
+
+    // Format is: [ciphertext size, nonce, ciphertext, authentication tag]

Review Comment:
   ```suggestion
       // Format of encrypted buffer is: [ciphertext size, nonce, ciphertext, 
authentication tag]
   ```



##########
parquet/src/encryption/encrypt.rs:
##########
@@ -374,6 +395,33 @@ pub(crate) fn encrypt_object<T: TSerializable, W: Write>(
     Ok(())
 }
 
+pub(crate) fn sign_and_write_object<T: TSerializable, W: Write>(
+    object: &T,
+    encryptor: &mut Box<dyn BlockEncryptor>,
+    sink: &mut W,
+    module_aad: &[u8],
+) -> Result<()> {
+    let mut buffer: Vec<u8> = vec![];
+    {
+        let mut protocol = TCompactOutputProtocol::new(&mut buffer);
+        object.write_to_out_protocol(&mut protocol)?;
+    }
+    sink.write_all(&buffer)?;
+    buffer = encryptor.encrypt(buffer.as_ref(), module_aad)?;
+
+    let ciphertext_length : u32 = buffer.len()
+        .try_into()
+        .map_err(|err| general_err!("Plaintext data too long. {:?}", err))?;
+
+    // Format is: [ciphertext size, nonce, ciphertext, authentication tag]
+    let nonce = buffer[SIZE_LEN..SIZE_LEN + NONCE_LEN].to_vec();
+    let tag = buffer[(ciphertext_length - TAG_LEN as u32) as usize..].to_vec();
+    sink.write_all(&nonce)?;
+    sink.write_all(&tag)?;

Review Comment:
   I don't think you need the u32 `ciphertext_length` above, this can be 
simplified. Plus you don't need to copy the nonce and tag to new vectors:
   
   ```suggestion
       let nonce = &buffer[SIZE_LEN..SIZE_LEN + NONCE_LEN];
       let tag = &buffer[(buffer.len() - TAG_LEN)..];
       sink.write_all(nonce)?;
       sink.write_all(tag)?;
   ```



##########
parquet/src/encryption/encrypt.rs:
##########
@@ -374,6 +395,33 @@ pub(crate) fn encrypt_object<T: TSerializable, W: Write>(
     Ok(())
 }
 
+pub(crate) fn sign_and_write_object<T: TSerializable, W: Write>(

Review Comment:
   I think it's worth being extra-clear that this writes in plaintext, maybe:
   ```suggestion
   pub(crate) fn write_signed_plaintext_object<T: TSerializable, W: Write>(
   ```



##########
parquet/src/file/metadata/writer.rs:
##########
@@ -141,6 +142,15 @@ impl<'a, W: Write> ThriftMetadataWriter<'a, W> {
             .object_writer
             .apply_row_group_encryption(self.row_groups)?;
 
+        #[cfg(not(feature = "encryption"))]

Review Comment:
   Rather than having `#[cfg...` inline, this logic should be moved to a method 
on `MetadataObjectWriter` that has different implementations depending on 
whether encryption is enabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to