EmilyMatt commented on code in PR #8930:
URL: https://github.com/apache/arrow-rs/pull/8930#discussion_r2713366094
##########
arrow-avro/src/reader/async_reader/mod.rs:
##########
@@ -0,0 +1,1330 @@
+use crate::compression::CompressionCodec;
+use crate::reader::Decoder;
+use crate::reader::block::{BlockDecoder, BlockDecoderState};
+use arrow_array::RecordBatch;
+use arrow_schema::ArrowError;
+use bytes::Bytes;
+use futures::future::BoxFuture;
+use futures::{FutureExt, Stream};
+use std::mem;
+use std::ops::Range;
+use std::pin::Pin;
+use std::task::{Context, Poll};
+
+mod async_file_reader;
+mod builder;
+
+pub use async_file_reader::AsyncFileReader;
+pub use builder::AsyncAvroFileReaderBuilder;
+
+#[cfg(feature = "object_store")]
+mod store;
+
+#[cfg(feature = "object_store")]
+pub use store::AvroObjectReader;
+
+enum FetchNextBehaviour {
+ /// Initial read: scan for sync marker, then move to decoding blocks
+ ReadSyncMarker,
+ /// Parse VLQ header bytes one at a time until Data state, then continue decoding
+ DecodeVLQHeader,
+ /// Continue decoding the current block with the fetched data
+ ContinueDecoding,
+}
+
+enum ReaderState<R: AsyncFileReader> {
+ /// Intermediate state to fix ownership issues
+ InvalidState,
+ /// Initial state, fetch initial range
+ Idle { reader: R },
+ /// Fetching data from the reader
+ FetchingData {
+ future: BoxFuture<'static, Result<(R, Bytes), ArrowError>>,
+ next_behaviour: FetchNextBehaviour,
+ },
+ /// Decode a block in a loop until completion
+ DecodingBlock { data: Bytes, reader: R },
+ /// Output batches from a decoded block
+ ReadingBatches {
+ data: Bytes,
+ block_data: Bytes,
+ remaining_in_block: usize,
+ reader: R,
+ },
+ /// An error occurred, should not have been polled again.
+ Error,
+ /// Done, flush decoder and return
+ Finished,
+}
Review Comment:
Addressed this, and also added a lint to prevent any accidental `?` in that
function.
I will take a look at coverage later today, as it requires setting up some
infra to test locally.
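
For context, here is a minimal sketch of why a stray `?` is risky in a state machine that moves its state out through a placeholder variant like `InvalidState`. It assumes the lint in question is Clippy's `question_mark_used` restriction lint; the PR may use a different mechanism. `State`, `Machine`, `step` and `advance` are hypothetical names for illustration, not the PR's code.

```rust
use std::mem;

/// Hypothetical, simplified stand-in for a reader state enum.
#[derive(Debug, PartialEq)]
enum State {
    /// Placeholder left behind while the real state is moved out.
    Invalid,
    Idle { remaining: u32 },
    Error,
    Finished,
}

struct Machine {
    state: State,
}

/// Hypothetical fallible operation; fails for one specific input.
fn step(remaining: u32) -> Result<u32, String> {
    if remaining == 13 {
        Err("simulated failure".to_string())
    } else {
        Ok(remaining - 1)
    }
}

impl Machine {
    // Assumption: denying `clippy::question_mark_used` makes Clippy reject
    // any `?` in this function. An early return via `?` would skip the
    // assignment below and leave `self.state` stuck at `State::Invalid`.
    #[deny(clippy::question_mark_used)]
    fn advance(&mut self) -> Result<bool, String> {
        // Move the current state out, leaving the placeholder behind.
        match mem::replace(&mut self.state, State::Invalid) {
            State::Idle { remaining: 0 } => {
                self.state = State::Finished;
                Ok(false)
            }
            State::Idle { remaining } => match step(remaining) {
                Ok(next) => {
                    self.state = State::Idle { remaining: next };
                    Ok(true)
                }
                // Handle the error explicitly so the state is always restored.
                Err(e) => {
                    self.state = State::Error;
                    Err(e)
                }
            },
            State::Finished => {
                self.state = State::Finished;
                Ok(false)
            }
            State::Error | State::Invalid => {
                self.state = State::Error;
                Err("polled after error".to_string())
            }
        }
    }
}
```

Every arm writes a real state back before returning, which is the invariant such a lint would help protect.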