jecsand838 commented on code in PR #8100:
URL: https://github.com/apache/arrow-rs/pull/8100#discussion_r2265411002


##########
arrow-avro/src/reader/mod.rs:
##########
@@ -282,6 +310,31 @@ impl Decoder {
     pub fn batch_is_full(&self) -> bool {
         self.remaining_capacity == 0
     }
+
+    // Decode either the block count of remaining capacity from `data` (an OCF block payload).
+    //
+    // Returns the number of bytes consumed from `data` along with the number of records decoded.
+    fn decode_block(&mut self, data: &[u8], count: usize) -> Result<(usize, usize), ArrowError> {

Review Comment:
   That occurred to me as well. I decided to leave it that way because the only caller is `Reader::read` and `decode_block` is not public.

   As an aside, I was unsure what value a public `decode_block` method would offer, since block encodings only exist in Object Container Files, which the `Reader` already handles. If there's future demand for decoding blocks outside of the `Reader`, we'd probably want to refactor the code to support `Decoder::decode_block(block: Block, codec: Option<CompressionCodec>) -> Result<DecodeRes, ArrowError>` or something along those lines, as you pointed out. I was just concerned this would be a premature optimization.
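
   For illustration only, here is a rough sketch of what the public shape floated above might look like. Nothing in it is taken from the PR: `Block`, `CompressionCodec`, `DecodeRes`, and `DecodeError` (a stand-in for `ArrowError`) are hypothetical definitions made up so the snippet compiles on its own. The fixed 4-byte record width is a toy assumption, not how Avro encodes records. The sketch also borrows the block (`&Block`) rather than taking it by value as in the signature above, so a caller could resume a partially decoded block.

```rust
/// Stand-in for `ArrowError` so the sketch compiles without the arrow crates.
#[derive(Debug, PartialEq)]
pub struct DecodeError(pub String);

/// Hypothetical: one OCF block, i.e. a record count plus its payload bytes.
pub struct Block {
    pub count: usize,
    pub data: Vec<u8>,
}

/// Hypothetical codec selector; decompression is not implemented here.
#[derive(Debug, Clone, Copy)]
pub enum CompressionCodec {
    Deflate,
}

/// Hypothetical named replacement for the `(usize, usize)` tuple.
#[derive(Debug, PartialEq)]
pub struct DecodeRes {
    pub bytes_consumed: usize,
    pub records_decoded: usize,
}

/// Toy assumption: every record is exactly 4 bytes wide.
const RECORD_LEN: usize = 4;

pub struct Decoder {
    remaining_capacity: usize,
}

impl Decoder {
    pub fn new(batch_size: usize) -> Self {
        Self { remaining_capacity: batch_size }
    }

    pub fn batch_is_full(&self) -> bool {
        self.remaining_capacity == 0
    }

    /// Decodes up to `min(block.count, remaining_capacity)` records and
    /// reports how many bytes and records were consumed.
    pub fn decode_block(
        &mut self,
        block: &Block,
        codec: Option<CompressionCodec>,
    ) -> Result<DecodeRes, DecodeError> {
        if let Some(codec) = codec {
            // A real implementation would decompress `block.data` first.
            return Err(DecodeError(format!(
                "{codec:?} decompression is outside this sketch"
            )));
        }
        let records = block.count.min(self.remaining_capacity);
        let bytes = records * RECORD_LEN;
        if block.data.len() < bytes {
            return Err(DecodeError(
                "block payload is shorter than its record count implies".to_string(),
            ));
        }
        // A real decoder would parse `block.data[..bytes]` into Arrow arrays here.
        self.remaining_capacity -= records;
        Ok(DecodeRes { bytes_consumed: bytes, records_decoded: records })
    }
}
```

   With a batch size of 3 and a block holding 5 records, a single call would decode 3 records, report 12 bytes consumed, and leave `batch_is_full()` returning `true`. The caller would then flush the batch and call again with the remaining payload.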



