etseidl commented on issue #8534: URL: https://github.com/apache/arrow-rs/issues/8534#issuecomment-3386204551
I've looked at the PR and must admit I'm confused. IIUC the PR does not address this issue: calling `ArrowWriter::flush` does not flush the underlying writer. Instead, `ArrowWriter::flush` simply closes off the current row group and writes it to the underlying writer. Unless the underlying writer is a raw file, there will still be buffering at the OS level and still no sync to disk, correct?

`TrackedWrite` has a `flush` call, which does call `flush` on the wrapped `Write`, so the issue isn't the buffering; it's the lack of a call to `self.writer.flush()` from `ArrowWriter::flush`. Perhaps we could instead add a `flush_and_sync` call to `ArrowWriter` that calls `ArrowWriter::flush` and then `self.writer.flush()`. To me, the double buffering in `TrackedWrite` is a separate issue.
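To illustrate the distinction, here is a rough std-only sketch; it does not use the `parquet` crate. `SketchWriter` and `demo` are hypothetical names, and `flush_and_sync` is only the method proposed above, not an existing `ArrowWriter` API. A `BufWriter` over a `Vec<u8>` stands in for any buffered sink.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for `ArrowWriter`: rows accumulate in memory and are
/// handed to the wrapped writer when the current "row group" is closed.
struct SketchWriter<W: Write> {
    writer: W,
    pending: Vec<u8>,
}

impl<W: Write> SketchWriter<W> {
    fn new(writer: W) -> Self {
        Self { writer, pending: Vec::new() }
    }

    /// Buffer some bytes as part of the current row group.
    fn write(&mut self, bytes: &[u8]) {
        self.pending.extend_from_slice(bytes);
    }

    /// Mirrors the behaviour described above: write the pending row group to
    /// the wrapped writer, but do NOT flush that writer.
    fn flush(&mut self) -> io::Result<()> {
        self.writer.write_all(&self.pending)?;
        self.pending.clear();
        Ok(())
    }

    /// Assumed shape of the proposed method: close the row group, then also
    /// flush the wrapped writer.
    fn flush_and_sync(&mut self) -> io::Result<()> {
        self.flush()?;
        self.writer.flush()
    }

    fn inner(&self) -> &W {
        &self.writer
    }
}

/// Returns how many bytes reached the sink after `flush` and after
/// `flush_and_sync`, respectively.
fn demo() -> io::Result<(usize, usize)> {
    let sink: Vec<u8> = Vec::new();
    let mut w = SketchWriter::new(BufWriter::new(sink));

    w.write(b"hello");
    w.flush()?;
    // The bytes are still sitting in the BufWriter's internal buffer.
    let after_flush = w.inner().get_ref().len();

    w.flush_and_sync()?;
    // Now the BufWriter has been flushed into the Vec.
    let after_sync = w.inner().get_ref().len();

    Ok((after_flush, after_sync))
}
```

In this sketch, `demo()` shows no bytes in the sink after `flush` alone and all five bytes after `flush_and_sync`. Note that `Write::flush` only pushes data out of user-space buffers; for a `std::fs::File`, durability on disk would additionally need something like `File::sync_all`.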
