jecsand838 commented on code in PR #8930:
URL: https://github.com/apache/arrow-rs/pull/8930#discussion_r2600791449


##########
arrow-avro/src/reader/async_reader/async_file_reader.rs:
##########
@@ -0,0 +1,18 @@
+use crate::reader::async_reader::DataFetchFutureBoxed;
+use std::ops::Range;
+
+/// A broad generic trait definition allowing fetching bytes from any source 
asynchronously.
+/// This trait has very few limitations, mostly in regard to ownership and 
lifetime,
+/// but it must return a boxed Future containing [`bytes::Bytes`] or an error.
+pub trait AsyncFileReader: Send + Unpin {
+    /// Fetch a range of bytes asynchronously using a custom reading method
+    fn fetch_range(&mut self, range: Range<u64>) -> DataFetchFutureBoxed;
+
+    /// Fetch a range that is beyond the originally provided file range,
+    /// such as reading the header before reading the file,
+    /// or fetching the remainder of the block in case the range ended before 
the block's end.
+    /// By default, this will simply point to the fetch_range function.
+    fn fetch_extra_range(&mut self, range: Range<u64>) -> DataFetchFutureBoxed 
{
+        self.fetch_range(range)
+    }
+}

Review Comment:
   What's your take on aligning this a bit more with the trait used in 
`parquet` and `arrow/async_reader`?
   
   ```suggestion
   pub trait AsyncFileReader: Send {
       /// Retrieve the bytes in `range`
       fn get_bytes(&mut self, range: Range<u64>) -> BoxFuture<'_, 
Result<Bytes>>;
   }
   ```
   
   My thinking is this:
   1. The `get_bytes` trait method is just "fetch these bytes". It doesn't know 
or care whether the range is within some "expected" range. The out-of-band 
reads (header, partial block completion) could be a concern of the reader 
logic, not the I/O trait.
   2. Users already understand `get_bytes` / `get_byte_ranges`. Reusing that 
mental model reduces friction. Plus consistency across crates is generally a 
best practice.
   3. This would unlock a clean default `impl for AsyncRead + AsyncSeek` (like 
`tokio::fs::File`) the same way Parquet does . The  current`'static` 
requirement forces all implementations to be fully owned or `Arc`-wrapped, 
which seems unnecessarily rigid for simple file readers.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to