nealrichardson commented on issue #35593: URL: https://github.com/apache/arrow/issues/35593#issuecomment-1551385062
> `select |> compute` is appealing as it does _something_ and removes friction, and even if it's not necessarily the typical way of working with Dataset objects, it doesn't feel like adding additionally unnecessary ways of doing something. I'm not entirely opposed to an error message with a suggestion, but I think here, letting it "just work" is preferable.

I agree; the risk is that it might be slower or do a lot more work than the user intended. We generally avoid scanning/evaluating a query except on explicit command. So the question is which is more surprising or worse: that `$` raises a not-implemented error if you try to extract a column (but be sure to let it work for the R6 methods!), or that `$` triggers the query to evaluate.

Also, I misspoke before: `$` should map to `pull(as_vector = FALSE)`. `compute()` still returns a Table, not the ChunkedArray.
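To make the trade-off concrete, here is a toy Python model of the two options. It is not arrow code: `LazyQuery`, `on_dollar`, and `scans` are hypothetical names, attribute access stands in for R's `$`, and nested lists stand in for a ChunkedArray. It only illustrates the behaviour under discussion, assuming that extracting a column either errors or forces a scan.

```python
class LazyQuery:
    """Toy stand-in for a lazy dataset query (hypothetical, not the arrow API)."""

    def __init__(self, columns, on_dollar="error"):
        self._columns = columns      # dict: column name -> list of chunks
        self._on_dollar = on_dollar  # "error" or "evaluate"
        self.scans = 0               # how many times the data was scanned

    def compute(self):
        """Explicit evaluation: returns the whole table, not a single column."""
        self.scans += 1
        return {name: [list(c) for c in chunks]
                for name, chunks in self._columns.items()}

    def pull(self, name):
        """Evaluate and return one column, kept chunked rather than flattened."""
        self.scans += 1
        return [list(c) for c in self._columns[name]]

    def __getattr__(self, name):
        # Only reached when normal lookup fails, so real methods such as
        # compute() and pull() keep working (the "R6 methods" caveat).
        columns = self.__dict__.get("_columns", {})
        if name not in columns:
            raise AttributeError(name)
        if self.__dict__.get("_on_dollar") == "evaluate":
            return self.pull(name)   # option 2: extraction triggers the query
        raise NotImplementedError(   # option 1: fail with a suggestion
            f"extracting '{name}' would evaluate the query; call pull() explicitly"
        )


eager = LazyQuery({"x": [[1, 2], [3]]}, on_dollar="evaluate")
strict = LazyQuery({"x": [[1, 2], [3]]})

column = eager.x                     # silently scans the data once
try:
    strict.x
    strict_raised = False
except NotImplementedError:
    strict_raised = True             # no scan happened
```

In the "evaluate" variant the scan is invisible at the call site, which is the surprise being weighed against the friction of the error.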
