asfimport commented on issue #42257: URL: https://github.com/apache/arrow/issues/42257#issuecomment-2184204288
[Wes McKinney](https://issues.apache.org/jira/browse/PARQUET-474?#comment-15334441) / @wesm:
Took a look at this. Reading a range of bytes from a local file must be made an atomic operation, so that we can lock the source while performing a seek followed by a read. Currently we have code like

```cpp
source_->Seek(filesize - FOOTER_SIZE);
int64_t bytes_read = source_->Read(FOOTER_SIZE, footer_buffer);
```

This can be made thread-safe by having an API like

```cpp
source_->ReadAt(filesize - FOOTER_SIZE, FOOTER_SIZE, footer_buffer);
```

We already have

```cpp
std::shared_ptr<Buffer> ReadAt(int64_t pos, int64_t nbytes);
```

in which we can also block other threads (as needed) while performing the read.

The stream classes are a slightly different rabbit hole: to get the best performance we'd want a buffered stream that continues to buffer data from the source in a background thread (presently buffering is synchronous / on-demand).
