jecsand838 commented on code in PR #8100:
URL: https://github.com/apache/arrow-rs/pull/8100#discussion_r2265411002

##########
arrow-avro/src/reader/mod.rs:
##########
@@ -282,6 +310,31 @@ impl Decoder {
     pub fn batch_is_full(&self) -> bool {
         self.remaining_capacity == 0
     }
+
+    // Decode either the block count of remaining capacity from `data` (an OCF block payload).
+    //
+    // Returns the number of bytes consumed from `data` along with the number of records decoded.
+    fn decode_block(&mut self, data: &[u8], count: usize) -> Result<(usize, usize), ArrowError> {

Review Comment:
   That occurred to me as well. I decided to leave it that way because the only caller is `Reader::read` and `decode_block` is not public.

   As an aside, I was unsure what value a public `decode_block` method would offer, since block encoding only exists in Object Container Files (OCF), which the `Reader` already handles. If there's demand in the future for decoding blocks outside of the `Reader`, then we'd probably want to refactor the code to support `Decoder::decode_block(block: Block, codec: Option<CompressionCodec>) -> Result<DecodeRes, ArrowError>` or something along those lines, like you pointed out. I was just concerned that doing this now would be a premature optimization.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
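
For illustration only, here is a rough sketch of what the public `decode_block` shape mentioned in the review comment might look like. Everything in it is an assumption rather than arrow-avro code: `Block`, `CompressionCodec`, `DecodeRes`, and `DecodeError` are hypothetical stand-ins (the last one replaces `ArrowError` so the sketch compiles on its own), the record format is a toy schema of one Avro `long` per record, compressed blocks are simply rejected, and the block is borrowed rather than taken by value so a caller could resume from `bytes_consumed`.

```rust
// Hypothetical stand-ins; none of these are the actual arrow-avro types.
#[derive(Debug)]
pub struct Block {
    /// Number of records the OCF block header says the payload holds.
    pub count: usize,
    /// Block payload bytes (assumed uncompressed in this sketch).
    pub data: Vec<u8>,
}

#[derive(Debug, Clone, Copy)]
pub enum CompressionCodec {
    Deflate,
}

#[derive(Debug, PartialEq)]
pub struct DecodeRes {
    pub bytes_consumed: usize,
    pub records_decoded: usize,
}

// Stand-in for `ArrowError`, to keep the sketch dependency-free.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    UnsupportedCodec,
    UnexpectedEof,
}

pub struct Decoder {
    remaining_capacity: usize,
    values: Vec<i64>,
}

impl Decoder {
    pub fn new(batch_size: usize) -> Self {
        Self { remaining_capacity: batch_size, values: Vec::new() }
    }

    pub fn batch_is_full(&self) -> bool {
        self.remaining_capacity == 0
    }

    pub fn values(&self) -> &[i64] {
        &self.values
    }

    /// Decodes either the block's record count or the remaining batch
    /// capacity, whichever is smaller, and reports how far it got.
    pub fn decode_block(
        &mut self,
        block: &Block,
        codec: Option<CompressionCodec>,
    ) -> Result<DecodeRes, DecodeError> {
        // Decompression is out of scope for this sketch.
        if codec.is_some() {
            return Err(DecodeError::UnsupportedCodec);
        }
        let to_decode = block.count.min(self.remaining_capacity);
        let mut offset = 0;
        for _ in 0..to_decode {
            let (value, read) = read_long(&block.data[offset..])?;
            self.values.push(value);
            self.remaining_capacity -= 1;
            offset += read;
        }
        Ok(DecodeRes { bytes_consumed: offset, records_decoded: to_decode })
    }
}

/// Reads one Avro `long` (zigzag-encoded varint) from the front of `buf`,
/// returning the value and the number of bytes read.
fn read_long(buf: &[u8]) -> Result<(i64, usize), DecodeError> {
    let mut raw: u64 = 0;
    for (i, &byte) in buf.iter().enumerate().take(10) {
        raw |= u64::from(byte & 0x7f) << (7 * i);
        if byte & 0x80 == 0 {
            // Undo zigzag encoding: even values are non-negative, odd are negative.
            let value = (raw >> 1) as i64 ^ -((raw & 1) as i64);
            return Ok((value, i + 1));
        }
    }
    Err(DecodeError::UnexpectedEof)
}
```

With a batch size of 2 and a block declaring 3 records, a call would decode only 2 records and leave the batch full, which mirrors the "block count or remaining capacity" behaviour of the private method in the diff. Whether a result struct is preferable to the current `(usize, usize)` tuple is exactly the open design question in this thread.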