etseidl commented on code in PR #6637:
URL: https://github.com/apache/arrow-rs/pull/6637#discussion_r1955020840


##########
parquet/src/arrow/async_reader/store.rs:
##########
@@ -163,15 +164,29 @@ impl AsyncFileReader for ParquetObjectReader {
     // an `impl MetadataFetch` and calls those methods to get data from it. 
Due to `Self`'s impl of
     // `AsyncFileReader`, the calls to `MetadataFetch::fetch` are just 
delegated to
     // `Self::get_bytes`.
-    fn get_metadata(&mut self) -> BoxFuture<'_, Result<Arc<ParquetMetaData>>> {
+    fn get_metadata<'a>(
+        &'a mut self,
+        #[cfg(feature = "encryption")] file_decryption_properties: Option<
+            &'a FileDecryptionProperties,
+        >,
+    ) -> BoxFuture<'a, Result<Arc<ParquetMetaData>>> {
         Box::pin(async move {
             let file_size = self.meta.size;
+            #[cfg(not(feature = "encryption"))]

Review Comment:
   To reduce duplication, maybe instead do something like
   ```rust
               let metadata = ParquetMetaDataReader::new()
                   .with_column_indexes(self.preload_column_index)
                   .with_offset_indexes(self.preload_offset_index)
                   .with_prefetch_hint(self.metadata_size_hint);
               #[cfg(feature = "encryption")]
               let metadata = 
metadata.with_decryption_properties(file_decryption_properties);
   
               let metadata = metadata.load_and_finish(self, file_size).await?;
   ```



##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -532,6 +554,13 @@ impl<T: ChunkReader + 'static> 
ParquetRecordBatchReaderBuilder<T> {
         Ok(Self::new_with_metadata(reader, metadata))
     }
 
+    /// Create a new [`ParquetRecordBatchReaderBuilder`] with 
[`ArrowReaderOptions`] and [`FileDecryptionProperties`]
+    #[cfg(feature = "encryption")]
+    pub fn try_new_with_decryption(reader: T, options: ArrowReaderOptions) -> 
Result<Self> {

Review Comment:
   Is this needed? It seems to be identical to `try_new_with_options`.



##########
parquet/src/arrow/arrow_reader/mod.rs:
##########
@@ -788,6 +854,22 @@ impl ParquetRecordBatchReader {
             .build()
     }
 
+    /// Create a new [`ParquetRecordBatchReader`] from the provided chunk 
reader and [`FileDecryptionProperties`]
+    ///
+    /// Note: this is needed when the parquet file is encrypted
+    #[cfg(feature = "encryption")]
+    pub fn try_new_with_decryption<T: ChunkReader + 'static>(
+        reader: T,
+        batch_size: usize,
+        file_decryption_properties: Option<&FileDecryptionProperties>,
+    ) -> Result<Self> {
+        let options = ArrowReaderOptions::default()
+            
.with_file_decryption_properties(file_decryption_properties.cloned().unwrap());
+        ParquetRecordBatchReaderBuilder::try_new_with_decryption(reader, 
options)?

Review Comment:
   Same...this could just be `try_new_with_options`



##########
parquet/src/arrow/async_reader/mod.rs:
##########
@@ -343,8 +386,8 @@ impl<T: AsyncFileReader + Send + 'static> 
ParquetRecordBatchStreamBuilder<T> {
         Self::new_with_options(input, Default::default()).await
     }
 
-    /// Create a new [`ParquetRecordBatchStreamBuilder`] with the provided 
async source
-    /// and [`ArrowReaderOptions`]
+    /// Create a new [`ParquetRecordBatchStreamBuilder`] with the provided 
async source,
+    /// [`ArrowReaderOptions`] and [`FileDecryptionProperties`] if the data is 
encrypted.

Review Comment:
   I'd revert to the old wording, but add a following sentence along the lines 
of "If the data is encrypted, ArrowReaderOptions should have the 
FileDecryptionProperties set".



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to