nealrichardson commented on issue #35593:
URL: https://github.com/apache/arrow/issues/35593#issuecomment-1551385062

   > `select |> compute` is appealing as it does _something_ and removes 
friction, and even if it's not necessarily the typical way of working with 
Dataset objects, it doesn't feel like adding additionally unnecessary ways of 
doing something. I'm not entirely opposed to an error message with a 
suggestion, but I think here, letting it "just work" is preferable.
   
   I agree; the risk is that it might be slower/do a lot more work than the 
user intended. We generally avoid scanning/evaluating a query except on 
explicit command. So the question is which is more surprising/worse: that `$` 
raises a not-implemented error if you try to extract a column (but be sure to 
let it work for the R6 methods!), or that `$` triggers the query to evaluate.
   
   Also, I misspoke before: `$` should map to `pull(as_vector = FALSE)`. 
`compute()` still returns a Table, not a ChunkedArray.
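   
   To make the Table vs. ChunkedArray distinction concrete, here is a minimal sketch. It assumes arrow >= 12.0.0 (where `pull()` accepts `as_vector`) with dplyr attached; the in-memory dataset and its column names are made up for illustration and are not from the issue.

```r
library(arrow)
library(dplyr)

# Hypothetical toy data wrapped in a Dataset so the dplyr verbs build a lazy query
tab <- arrow_table(x = 1:3, y = c("a", "b", "c"))
ds <- InMemoryDataset$create(tab)

# select() only records the projection; compute() evaluates it and returns a Table
res_table <- ds |> select(x) |> compute()

# pull(as_vector = FALSE) also evaluates the query, but returns the single
# column as a ChunkedArray rather than a one-column Table or an R vector
res_col <- ds |> pull(x, as_vector = FALSE)

class(res_table)
class(res_col)
```

   Both calls scan the data, which is the cost in question if `$` were to evaluate implicitly.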


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
