mapleFU opened a new issue, #37434:
URL: https://github.com/apache/arrow/issues/37434

   ### Describe the enhancement requested
   
   ```c++
     Result<int64_t> Read(int64_t nbytes, void* out) {
       if (ARROW_PREDICT_FALSE(nbytes < 0)) {
         return Status::Invalid("Bytes to read must be positive. Received:", nbytes);
       }
   
       if (nbytes < buffer_size_) {
         // Pre-buffer for small reads
         RETURN_NOT_OK(BufferIfNeeded());
       }
   
       if (nbytes > bytes_buffered_) {
         // Copy buffered bytes into out, then read rest
         memcpy(out, buffer_data_ + buffer_pos_, bytes_buffered_);
   
         int64_t bytes_to_read = nbytes - bytes_buffered_;
         if (raw_read_bound_ >= 0) {
           bytes_to_read = std::min(bytes_to_read, raw_read_bound_ - raw_read_total_);
         }
         ARROW_ASSIGN_OR_RAISE(
             int64_t bytes_read,
             raw_->Read(bytes_to_read, reinterpret_cast<uint8_t*>(out) + bytes_buffered_));
         raw_read_total_ += bytes_read;
   
         // Do not make assumptions about the raw stream position
         raw_pos_ = -1;
         bytes_read += bytes_buffered_;
         RewindBuffer();
         return bytes_read;
       } else {
         memcpy(out, buffer_data_ + buffer_pos_, nbytes);
         ConsumeBuffer(nbytes);
         return nbytes;
       }
     }
   ```
   
   Suppose we set the buffer size to 100k and read 3k bytes per call. On the 34th call to `Read`, the requested range is `(99k, 102k]`.
   
   In `Read`, the buffered bytes `(99k, 100k]` are copied out first, and then a raw IO is issued for only the remaining `(100k, 102k]`, rather than for `(100k, 202k]`. Is this expected?
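
To make the read pattern concrete, here is a minimal sketch that models only the size bookkeeping of the `Read` quoted above. `SimulateRawReads` and its parameters are hypothetical names invented for this illustration, not part of Arrow. It assumes that `BufferIfNeeded()` refills the whole buffer only when no bytes are buffered, that the raw stream never returns a short read, and that `raw_read_bound_` is unset; `k` is taken as 1000 bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model of the quoted Read(): it tracks sizes only and copies no data.
// Returns the size of every read issued to the raw stream, in order.
std::vector<int64_t> SimulateRawReads(int64_t buffer_size, int64_t read_size,
                                      int num_reads) {
  assert(buffer_size > 0 && read_size >= 0 && num_reads >= 0);
  std::vector<int64_t> raw_reads;
  int64_t bytes_buffered = 0;
  for (int i = 0; i < num_reads; ++i) {
    // Assumed BufferIfNeeded() behavior: refill only when the buffer is empty.
    if (read_size < buffer_size && bytes_buffered == 0) {
      raw_reads.push_back(buffer_size);
      bytes_buffered = buffer_size;
    }
    if (read_size > bytes_buffered) {
      // Drain what is buffered, then read only the remainder from the raw stream.
      raw_reads.push_back(read_size - bytes_buffered);
      bytes_buffered = 0;  // RewindBuffer()
    } else {
      bytes_buffered -= read_size;  // ConsumeBuffer(nbytes)
    }
  }
  return raw_reads;
}
```

Under these assumptions, `SimulateRawReads(100000, 3000, 35)` yields raw reads of 100000, 2000, and 100000 bytes: the 34th call triggers a small 2k raw IO, and the buffer is refilled only on the 35th call.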
   
   ### Component(s)
   
   C++

