marsupialtail opened a new pull request, #13640: URL: https://github.com/apache/arrow/pull/13640
This PR attempts to add direct I/O writes to output streams. The C++ API adds a DirectOutputStream class, which is instantiated by calling open_output_stream on a LocalFileSystem created with use_directio = True. DirectOutputStream implements O_DIRECT writes aligned to 4096 bytes, and it performs a memcpy so that the address it writes from is always 4096-byte aligned. Most disks in use today have a sector size of 512 bytes, but newer disks might have a 4096-byte sector size; since 4096 is a multiple of 512, a 4096-byte alignment covers both. Remainder bytes from write calls are buffered, then padded and flushed upon close(). In Python, if you instantiate a LocalFileSystem with use_directio = True, open_output_stream returns a NativeFile that wraps the DirectOutputStream.
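The buffering behaviour described above (write out whole 4096-byte blocks, hold back the remainder, pad and flush it on close) can be illustrated with a small pure-Python sketch. This is not the PR's implementation: the class name `AlignedBufferingWriter`, the in-memory `io.BytesIO` sink, and the choice of zero bytes as padding are all assumptions made for illustration. It models only the length alignment, not the memory-address alignment that real O_DIRECT writes also need.

```python
import io

ALIGNMENT = 4096  # alignment stated in the PR description


class AlignedBufferingWriter:
    """Hypothetical illustration only; not the PR's DirectOutputStream."""

    def __init__(self, sink, alignment=ALIGNMENT):
        self._sink = sink
        self._alignment = alignment
        self._pending = bytearray()  # remainder bytes not yet written out
        self.logical_size = 0        # bytes the caller actually wrote

    def write(self, data):
        self._pending += data
        self.logical_size += len(data)
        # Largest prefix whose length is a multiple of the alignment.
        aligned_len = len(self._pending) - len(self._pending) % self._alignment
        if aligned_len:
            self._sink.write(bytes(self._pending[:aligned_len]))
            del self._pending[:aligned_len]

    def close(self):
        if self._pending:
            # Assumption: pad with zero bytes up to the next aligned length.
            padding = -len(self._pending) % self._alignment
            self._sink.write(bytes(self._pending) + b"\x00" * padding)
            self._pending.clear()


sink = io.BytesIO()
writer = AlignedBufferingWriter(sink)
writer.write(b"a" * 5000)  # one 4096-byte block is written, 904 bytes stay buffered
writer.write(b"b" * 100)   # 1004 bytes buffered, nothing written yet
writer.close()             # remainder is padded to 4096 bytes and written
```

Because the final block is padded, the bytes on disk can exceed what the caller wrote. The sketch therefore tracks `logical_size` separately, which a reader would need in order to ignore the padding.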
