joosthooz opened a new pull request, #14400: URL: https://github.com/apache/arrow/pull/14400
The normal pyarrow feather writer supports setting a small set of properties when writing: compression, compression_level and chunk_size. These are stored in a struct, `ipc::feather::WriteProperties`. In this PR I've used the same struct to add the compression options to the dataset IPC writer. Note that this struct is different from both `ipc::IpcWriteOptions` and `dataset::IpcFileWriteOptions`. When creating the writer, `WriteProperties` is used to overwrite the default compression options in `ipc::IpcWriteOptions`, the same way as happens in `ipc::feather::WriteTable`.

The alternative is to mimic the way it works for CSV and expose `ipc::IpcWriteOptions` in Python. But then the dataset IPC writer would have more functionality. Also, there are a bunch of properties in it that I don't think make sense to use from Python. Lastly, the compression codec in that struct needs to be initialized by C++ code that must run whenever the compression is set.
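To illustrate the override described above, here is a minimal pure-Python sketch of the pattern: a small properties struct is applied on top of a set of default write options, and the codec is built by a factory call rather than assigned directly. The classes below are hypothetical stand-ins named after the C++ structs. Their fields, defaults, the `SUPPORTED_COMPRESSION` set and the helpers `create_codec` and `apply_write_properties` are assumptions made for illustration, not the Arrow API or the code in this PR.

```python
from dataclasses import dataclass, replace
from typing import Optional

# Hypothetical stand-ins for the C++ structs named in the PR description.
# Field names and default values are illustrative assumptions only.
SUPPORTED_COMPRESSION = {"uncompressed", "lz4", "zstd"}


@dataclass(frozen=True)
class WriteProperties:
    """Small, user-facing set of options (models ipc::feather::WriteProperties)."""
    compression: str = "uncompressed"
    compression_level: Optional[int] = None
    chunk_size: int = 1 << 16


@dataclass(frozen=True)
class Codec:
    """Models an initialized compression codec."""
    name: str
    level: Optional[int]


@dataclass(frozen=True)
class IpcWriteOptions:
    """Larger options struct (models ipc::IpcWriteOptions); only two fields shown."""
    codec: Optional[Codec] = None
    allow_64bit: bool = False


def create_codec(name: str, level: Optional[int]) -> Optional[Codec]:
    """Validate the requested compression and build a codec (None if uncompressed)."""
    if name not in SUPPORTED_COMPRESSION:
        raise ValueError(f"unsupported compression: {name!r}")
    if name == "uncompressed":
        return None
    return Codec(name=name, level=level)


def apply_write_properties(defaults: IpcWriteOptions,
                           props: WriteProperties) -> IpcWriteOptions:
    """Return a copy of `defaults` with only the compression codec overridden."""
    codec = create_codec(props.compression, props.compression_level)
    return replace(defaults, codec=codec)


# Usage: start from the defaults, override compression only.
options = apply_write_properties(
    IpcWriteOptions(),
    WriteProperties(compression="zstd", compression_level=3),
)
print(options)
```

Routing the codec through a factory function mirrors the point made in the description: the codec cannot simply be set as a plain field, it has to be constructed and validated by code that runs when the compression is chosen.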