thisisnic commented on issue #38456: URL: https://github.com/apache/arrow/issues/38456#issuecomment-1781072827
I've swapped this over to a "usage question" label as it's not a bug, but no need to apologise - it's useful having this kind of user feedback!

I agree that the API is a little inconsistent (in more places than this) and yeah, I believe that `read_csv_arrow()` is named like that so we don't have a clash when folks have `readr` loaded at the same time. When I implemented `open_csv_dataset()`, I prioritised the parameters matching `read_csv_arrow()` rather than the function name, but I see your point.

> as_data_frame should probably be as_tibble

I think this used to return a `data.frame` object, which is the origin of the name, and I'd be hesitant to change it given that a tibble is a kind of data frame *and* not wanting to break folks' existing code.

FWIW I agree with a lot of your points. There are lots of other little bits and pieces like this that could be prettier, but there have always been higher priorities in terms of my own dev time, and I just don't know whether the community would prefer a stable API versus cleaner UX, though I'll count this issue as a vote for the latter!

I've been thinking it might be worth us spelling out the different functions more directly in an article - what do you think? I'd be keen to hear more about what's made you think about this - your own usage, comments from clients of yours who are using arrow, etc.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
