mapleFU commented on code in PR #43409:
URL: https://github.com/apache/arrow/pull/43409#discussion_r1706409445

##########
cpp/src/arrow/io/buffered.cc:
##########
@@ -434,6 +434,32 @@ class BufferedInputStream::Impl : public BufferedBase {
     return std::shared_ptr<Buffer>(std::move(buffer));
   }
 
+  Status Advance(int64_t nbytes) {
+    if (nbytes < 0) {
+      return Status::Invalid("Bytes to advance must be non-negative. Received:", nbytes);
+    }
+    if (nbytes == 0) {
+      return Status::OK();
+    }
+
+    if (nbytes < bytes_buffered_) {
+      ConsumeBuffer(nbytes);
+      return Status::OK();
+    }
+
+    // Invalidate buffered data, as with a Seek or large Read
+    int64_t remain_skip_bytes = nbytes - bytes_buffered_;
+    RewindBuffer();
+    // TODO(mwish): Considering using raw_->Advance if available,
+    // currently we don't have a way to know if the underlying stream supports fast
+    // skipping. So we just read and discard the data.
+    auto result = Read(remain_skip_bytes);

Review Comment:
> Why not simply call Advance? The default implementation calls Read anyway.

If the underlying stream doesn't have `supports_fast_advance`, `raw_->Advance` would call `Read` without buffering. That might be an inefficient direct read, and less efficient than filling a large buffer and reading from it.
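
To make the concern concrete, here is a rough, self-contained sketch. It is a toy model, not the code in this PR and not Arrow's API: `CountingRawStream`, `SkipDirect`, and `ToyBufferedSkipper` are hypothetical names, and the model assumes the cost that matters is the number of reads issued against the raw stream. It compares forwarding every skip to the raw stream with serving skips from a buffer that is refilled in large chunks.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for a raw stream without fast skipping: the only way
// to move forward is to read, and every call counts as one raw read.
struct CountingRawStream {
  int64_t size = 0;
  int64_t pos = 0;
  int64_t raw_reads = 0;

  // Pretends to read up to nbytes; returns the number of bytes consumed.
  int64_t Read(int64_t nbytes) {
    ++raw_reads;
    const int64_t n = std::min(nbytes, size - pos);
    pos += n;
    return n;
  }
};

// Strategy A: forward each skip to the raw stream (one raw read per call).
inline int64_t SkipDirect(CountingRawStream& raw, int64_t nbytes) {
  return raw.Read(nbytes);
}

// Strategy B: refill a buffer in buffer_size chunks and serve skips from the
// buffered bytes, touching the raw stream only when the buffer is empty.
struct ToyBufferedSkipper {
  CountingRawStream& raw;
  int64_t buffer_size;
  int64_t bytes_buffered;

  ToyBufferedSkipper(CountingRawStream& raw_stream, int64_t size)
      : raw(raw_stream), buffer_size(size), bytes_buffered(0) {}

  // Skips up to nbytes; returns the number of bytes actually skipped.
  int64_t Skip(int64_t nbytes) {
    int64_t skipped = 0;
    while (skipped < nbytes) {
      if (bytes_buffered == 0) {
        bytes_buffered = raw.Read(buffer_size);
        if (bytes_buffered == 0) break;  // end of stream
      }
      const int64_t n = std::min(nbytes - skipped, bytes_buffered);
      bytes_buffered -= n;
      skipped += n;
    }
    return skipped;
  }
};
```

Under these assumptions, 1000 skips of 16 bytes cost 1000 raw reads with `SkipDirect`, while `ToyBufferedSkipper` with a 4096-byte buffer needs only 4. Whether this holds for a real stream depends on how expensive each raw read is and on whether the stream can skip without reading at all.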