etseidl commented on issue #8534:
URL: https://github.com/apache/arrow-rs/issues/8534#issuecomment-3386204551

   I've looked at the PR and must admit I'm confused. IIUC the PR does not 
address this issue: calling `ArrowWriter::flush` does not flush the underlying 
writer. Instead, `ArrowWriter::flush` simply closes off the current row group 
and writes it to the underlying writer. Unless the underlying writer is a raw 
file, there will still be buffering at the OS level and still no sync to disk, 
correct? 
   
   `TrackedWrite` has a `flush` call, which does call `flush` on the wrapped 
`Write`, so the issue isn't the buffering, it's the lack of a call to 
`self.writer.flush()` from `ArrowWriter::flush`.
   
   Perhaps we could instead add a `flush_and_sync` call to `ArrowWriter` that 
calls `ArrowWriter::flush` and then `self.writer.flush()`.
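
   A minimal sketch of that idea, using only the standard library. `ToyWriter` and `CountingWrite` are hypothetical stand-ins for `ArrowWriter` and `TrackedWrite`, not the real parquet types. `flush_and_sync` is the name proposed above and is not an existing API. A `BufWriter` over a `Vec<u8>` plays the buffered underlying writer so the effect is observable.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical stand-in for a byte-counting wrapper around the underlying writer.
struct CountingWrite<W: Write> {
    inner: W,
    bytes_written: usize,
}

impl<W: Write> Write for CountingWrite<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        let n = self.inner.write(buf)?;
        self.bytes_written += n;
        Ok(n)
    }

    /// Forwards the flush to the wrapped writer.
    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Hypothetical toy writer: accumulates rows, then emits them as one "row group".
struct ToyWriter<W: Write> {
    writer: CountingWrite<W>,
    pending: Vec<u8>,
}

impl<W: Write> ToyWriter<W> {
    fn new(inner: W) -> Self {
        ToyWriter {
            writer: CountingWrite { inner, bytes_written: 0 },
            pending: Vec::new(),
        }
    }

    fn write_row(&mut self, row: &[u8]) {
        self.pending.extend_from_slice(row);
    }

    /// Closes off the pending "row group" and hands it to the underlying
    /// writer, but does NOT flush that writer.
    fn flush(&mut self) -> io::Result<()> {
        self.writer.write_all(&self.pending)?;
        self.pending.clear();
        Ok(())
    }

    /// Proposed addition (assumed name): flush the row group, then flush
    /// the underlying writer as well.
    fn flush_and_sync(&mut self) -> io::Result<()> {
        self.flush()?;
        self.writer.flush()
    }
}

/// Returns (bytes in the sink after `flush`, bytes in the sink after
/// `flush_and_sync`, bytes counted by the wrapper).
fn demo() -> io::Result<(usize, usize, usize)> {
    let sink = BufWriter::with_capacity(1024, Vec::new());
    let mut w = ToyWriter::new(sink);
    w.write_row(b"0123456789");

    w.flush()?; // data is still sitting in the BufWriter's buffer
    let after_flush = w.writer.inner.get_ref().len();

    w.flush_and_sync()?; // now the BufWriter is drained into the Vec
    let after_sync = w.writer.inner.get_ref().len();

    Ok((after_flush, after_sync, w.writer.bytes_written))
}
```

   Note that `Write::flush` only drains user-space buffers. If the underlying writer is a `std::fs::File`, durability on disk would additionally need something like `File::sync_all`, which this sketch does not attempt.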
   
   Double buffering in `TrackedWrite` to me is a separate issue.

