wjones127 commented on issue #12371: URL: https://github.com/apache/arrow/issues/12371#issuecomment-1033337388
> Error: x is not a character vector

Unfortunately once you run `select` it returns a query and not a pure dataset, so it's no longer a valid input to `open_dataset()`. We haven't implemented `union_all()` for queries yet; if we did I think it would look like this:

```r
df <- union_all(select(ds1, x), select(ds2, x))

# Currently, you'd have to materialize each side
df <- union_all(select(ds1, x) %>% collect(), select(ds2, x) %>% collect())
```

I think for now you'd have to either collect it in memory or if the data really is too large, process each part separately and then union together later.

Thanks for bringing up this use case; it's definitely something we want better support for. I'll create a Jira ticket to implement `union_all` for arrow queries.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
