nealrichardson commented on issue #34437:
URL: https://github.com/apache/arrow/issues/34437#issuecomment-1481235488
I checked again with the current state of my branch, and 4 tests fail if I
use the FetchNode, 3 of them on Datasets. The fourth is a query on a
table where the fetch comes after an aggregation:
```
library(arrow)
library(dplyr)
mtcars %>%
  arrow_table() %>%
  summarize(mean(hp)) %>%
  head() %>%
  collect()
# Error in `compute.arrow_dplyr_query()`:
# ! Invalid: Fetch node's input has no meaningful ordering and so
#   limit/offset will be non-deterministic. Please establish order in some way
#   (e.g. by inserting an order_by node)
```
I guess it's reasonable that this should error (or at least warn)?
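For reference, a minimal sketch of what the error message suggests:
establishing an order before the fetch. It assumes that `arrange()` in the
dplyr bindings is translated to an order_by node ahead of `head()`, which is
not verified against this branch. The `group_by(cyl)` and the `mean_hp`
column name are added here only so that there is something meaningful to
sort on.

```r
library(arrow)
library(dplyr)

# Variation on the failing query: aggregate per group, then sort explicitly
# so that the limit applied by head() is deterministic.
# Assumption: arrange() supplies the ordering that the fetch requires.
result <- mtcars %>%
  arrow_table() %>%
  group_by(cyl) %>%
  summarize(mean_hp = mean(hp)) %>%
  arrange(cyl) %>%
  head() %>%
  collect()
```

If that assumption holds, the table case goes through without special
handling, and only the Dataset tests remain.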
It seems like I should wait for https://github.com/apache/arrow/issues/34698 to
be resolved so that I don't have to special-case datasets temporarily.