paleolimbot commented on issue #36864: URL: https://github.com/apache/arrow/issues/36864#issuecomment-1650224074
There is certainly a lot of precedent in the arrow R package for providing drop-in interfaces for tidyverse packages. I don't happen to think this is a good idea, for many of the reasons that Dane mentioned. Notably, it creates maintenance work by requiring that our interface keep up with an interface defined elsewhere (or it introduces potential bugs in user code that relies on the expectation that everything "just works" when we do not necessarily test every possible combination).

I would favour future design decisions that center around what Arrow C++ provides and how it provides it. This is the approach that pyarrow takes, and it has the advantage that many C++ PRs also update pyarrow because the relationship between the two is so straightforward. Packages that depend on pyarrow (e.g., pandas 2.0, duckdb) take on the responsibility of providing its features in a more user-facing form (and the associated maintenance responsibility of keeping the implementations in sync).

My ideal world would be one where the arrow R package wraps Arrow C++ and tidyverse compatibility lives in another package where it can focus on doing that one thing well; however, I am not aware of any other arrow R developer who shares that opinion.

On `write_csv_arrow()` specifically...I am neither for nor against compatibility with readr. I don't consider it something I would spend time doing; however, if somebody else finds it a useful exercise and it can be done well, I don't think that CSV writing specifically is likely to change at a pace that we can't maintain.
