asfimport commented on issue #42257:
URL: https://github.com/apache/arrow/issues/42257#issuecomment-2184204288

   [Wes 
McKinney](https://issues.apache.org/jira/browse/PARQUET-474?#comment-15334441) 
/ @wesm:
   Took a look at this. Reading a range of bytes from a local file must be made 
an atomic operation so that we can lock the source while performing a seek then 
read. Currently we have code like
   
   ```cpp
     source_->Seek(filesize - FOOTER_SIZE);
     int64_t bytes_read = source_->Read(FOOTER_SIZE, footer_buffer);
   ```
   
   This can be made thread-safe by having an API like
   
   ```cpp
   source_->ReadAt(filesize - FOOTER_SIZE, FOOTER_SIZE, footer_buffer);
   ```
   
   We already have
   
   ```cpp
     std::shared_ptr<Buffer> ReadAt(int64_t pos, int64_t nbytes);
   ```
   
   in which we can also block other threads (as needed) while performing the 
read.
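   
   To illustrate the locking idea, here is a minimal sketch, not the actual parquet-cpp code. The class name `LockedFileSource` and its use of `std::FILE*` are assumptions made for the example; the point is only that the seek and the read happen under one mutex, so two threads cannot interleave them on the shared file position.
   
   ```cpp
   #include <cstddef>
   #include <cstdint>
   #include <cstdio>
   #include <mutex>
   #include <stdexcept>
   #include <string>
   
   // Hypothetical wrapper (not a parquet-cpp class): serializes seek + read
   // pairs issued against one shared file handle.
   class LockedFileSource {
    public:
     explicit LockedFileSource(const std::string& path)
         : file_(std::fopen(path.c_str(), "rb")) {
       if (file_ == nullptr) throw std::runtime_error("cannot open " + path);
     }
     ~LockedFileSource() { std::fclose(file_); }
     LockedFileSource(const LockedFileSource&) = delete;
     LockedFileSource& operator=(const LockedFileSource&) = delete;
   
     // Reads up to nbytes starting at pos into out and returns the number of
     // bytes actually read. Holding the mutex across both calls makes the
     // seek + read pair atomic with respect to other ReadAt callers.
     int64_t ReadAt(int64_t pos, int64_t nbytes, uint8_t* out) {
       std::lock_guard<std::mutex> guard(mutex_);
       // Sketch only: the cast limits offsets where long is 32 bits wide.
       if (std::fseek(file_, static_cast<long>(pos), SEEK_SET) != 0) {
         throw std::runtime_error("seek failed");
       }
       return static_cast<int64_t>(
           std::fread(out, 1, static_cast<std::size_t>(nbytes), file_));
     }
   
    private:
     std::FILE* file_;
     std::mutex mutex_;
   };
   ```
   
   With such a wrapper the footer read becomes a single call, e.g. `source.ReadAt(filesize - FOOTER_SIZE, FOOTER_SIZE, footer_buffer)`. On POSIX systems `pread` is an alternative that reads at an offset without touching the shared file position, which avoids the lock entirely.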
   
   The stream classes are a slightly different rabbit hole – to get the best 
performance we'd want to have a buffered stream that continues to buffer data 
from the source in a background thread (presently it is synchronous / on-demand 
buffering).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

