jinchengchenghh commented on issue #7860:
URL: https://github.com/apache/incubator-gluten/issues/7860#issuecomment-2467101676

   Velox uses `SpillReadFile` to read the spill file. It uses `FileInputStream` 
to read the bytes and `simd::memcpy` to copy them, and it outputs batch 
`RowVector`s one by one. `FileInputStream` in turn reads the file through 
`velox::LocalReadFile`'s `pread` or `preadv`.
   
   As far as I can see, it reads `bufferSize_` bytes at a time, which is 
controlled by the QueryConfig `kSpillReadBufferSize` (default 1MB). Note: if the 
file system supports async read, it reads double `bufferSize_` at a time.
   @FelixYBW 
   ```cpp
   readBytes = readSize();
   VELOX_CHECK_LT(
       0, readBytes, "Read past end of FileInputStream {}", fileSize_);
   NanosecondTimer timer_2{&readTimeNs};
   file_->pread(fileOffset_, readBytes, buffer()->asMutable<char>());

   uint64_t FileInputStream::readSize() const {
     return std::min(fileSize_ - fileOffset_, bufferSize_);
   }
   ```
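   The loop above boils down to: read `min(fileSize_ - fileOffset_, bufferSize_)` bytes per `pread` call and advance the offset until the file is exhausted. A minimal standalone sketch of that pattern (not Velox code; `readSize` and `readAll` here are hypothetical stand-ins, using plain POSIX `pread`):

   ```cpp
   #include <algorithm>
   #include <cassert>
   #include <cstdint>
   #include <string>
   #include <unistd.h>
   #include <fcntl.h>
   #include <vector>

   // Mirrors FileInputStream::readSize(): never read past end of file.
   uint64_t readSize(uint64_t fileSize, uint64_t fileOffset, uint64_t bufferSize) {
     return std::min(fileSize - fileOffset, bufferSize);
   }

   // Buffered read loop: one pread() per chunk of at most bufferSize bytes,
   // tracked by an explicit offset, without moving the file pointer.
   std::string readAll(int fd, uint64_t fileSize, uint64_t bufferSize) {
     std::string out;
     std::vector<char> buffer(bufferSize);
     uint64_t fileOffset = 0;
     while (fileOffset < fileSize) {
       const uint64_t readBytes = readSize(fileSize, fileOffset, bufferSize);
       const ssize_t n = ::pread(fd, buffer.data(), readBytes, fileOffset);
       assert(n == static_cast<ssize_t>(readBytes));
       out.append(buffer.data(), n);
       fileOffset += n;
     }
     return out;
   }

   int main() {
     // Write a small temp file, then read it back in 8-byte chunks.
     const std::string payload = "spill data read in fixed-size buffered chunks";
     char path[] = "/tmp/spillXXXXXX";
     int fd = ::mkstemp(path);
     assert(fd >= 0);
     assert(::write(fd, payload.data(), payload.size()) ==
            static_cast<ssize_t>(payload.size()));

     assert(readAll(fd, payload.size(), /*bufferSize=*/8) == payload);

     ::close(fd);
     ::unlink(path);
     return 0;
   }
   ```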
   ```c
   /* Read data from file descriptor FD at the given position OFFSET
      without change the file pointer, and put the result in the buffers
      described by IOVEC, which is a vector of COUNT 'struct iovec's.
      The buffers are filled in the order specified.  Operates just like
      'pread' (see <unistd.h>) except that data are put in IOVEC instead
      of a contiguous buffer.
   
      This function is a cancellation point and therefore not marked with
      __THROW.  */
   extern ssize_t preadv (int __fd, const struct iovec *__iovec, int __count,
                       __off_t __offset) __wur;
   ```
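   To make the scatter-read semantics in that declaration concrete, here is a small self-contained sketch (not from Velox; `totalLen` is a hypothetical helper) showing that `preadv` fills the iovec buffers in order from the given offset and leaves the descriptor's file pointer untouched:

   ```cpp
   #include <cassert>
   #include <cstring>
   #include <fcntl.h>
   #include <string>
   #include <sys/uio.h>
   #include <unistd.h>

   // Sum of the buffer lengths an iovec array describes: what a full
   // preadv() call returns when enough bytes remain in the file.
   size_t totalLen(const struct iovec* iov, int count) {
     size_t total = 0;
     for (int i = 0; i < count; ++i) {
       total += iov[i].iov_len;
     }
     return total;
   }

   int main() {
     char path[] = "/tmp/preadvXXXXXX";
     int fd = ::mkstemp(path);
     assert(fd >= 0);
     const std::string payload = "abcdefghij";
     assert(::write(fd, payload.data(), payload.size()) == 10);

     char a[4] = {};
     char b[6] = {};
     struct iovec iov[2] = {{a, sizeof(a)}, {b, sizeof(b)}};

     // One syscall scatters 10 bytes across the two buffers, in order.
     const ssize_t n = ::preadv(fd, iov, 2, /*offset=*/0);
     assert(n == static_cast<ssize_t>(totalLen(iov, 2)));
     assert(std::memcmp(a, "abcd", 4) == 0);
     assert(std::memcmp(b, "efghij", 6) == 0);

     // The file pointer did not move: it still sits after the write().
     assert(::lseek(fd, 0, SEEK_CUR) == 10);

     ::close(fd);
     ::unlink(path);
     return 0;
   }
   ```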


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

