thisisnic commented on issue #36864: URL: https://github.com/apache/arrow/issues/36864#issuecomment-1654108667
This was discussed in today's Arrow R package dev community discussion. The main points raised were:

- Convenience wrappers matter in principle: they make our functions simpler to document and easier to use.
- Soft deprecation would help with the issue of making a breaking change.
- Changing `write_csv_arrow()` purely for the sake of writing a good wrapper for the `write_*_dataset()` functions is extra work.

Given my speculation that the individual write function (`write_csv_arrow()`) isn't widely used, we could implement the new `write_*_dataset()` wrappers to reflect the `readr::write_csv` parameter names (similarly to `open_*_dataset()`), and come back to potentially updating `write_csv_arrow()` later if there's community interest or a more compelling reason to do so.

I agree with your points about the maintenance burden, @paleolimbot, and I think a middle ground here would be to:

- finish off any partial implementations we've made, but not necessarily make new changes without good reason (i.e. it makes sense to have `write_*_dataset()` mirror `open_*_dataset()` as we've already done the latter, but there's no need to rewrite `write_csv_arrow()` just for the sake of it);
- increase our rigour around testing (there have been clear gaps in the past);
- where we feel wrappers are necessary, create thin wrappers which don't do additional work (like the thin wrappers in https://github.com/apache/arrow/pull/36851), and favour requesting upstream fixes/changes in the C++ library, or documenting any peculiarities, over coding around them.
