wjones127 commented on issue #12371:
URL: https://github.com/apache/arrow/issues/12371#issuecomment-1033337388


   > Error: x is not a character vector
   
   Unfortunately, once you run `select()`, it returns a query rather than a 
pure dataset, so the result is no longer a valid input to `open_dataset()`.
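   
   As a rough illustration of that distinction, the sketch below checks the 
class of the object before and after `select()`. The temporary directory and 
sample data are made up for the example and are not from the original report.
   
   ```r
   library(arrow)
   library(dplyr)

   # Hypothetical sample data written to a temporary directory
   tmp <- tempfile()
   write_dataset(data.frame(x = c("a", "b"), y = 1:2), tmp)

   ds <- open_dataset(tmp)
   q <- select(ds, x)

   # The opened object is a Dataset; the result of select() is a lazy query
   is_dataset <- inherits(ds, "Dataset")
   is_query <- inherits(q, "arrow_dplyr_query")
   ```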
   
   We haven't implemented `union_all()` for queries yet; if we did, I think it 
would look like this:
   
   ```r
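   library(arrow)
   library(dplyr)
   # Not supported yet: union_all() applied directly to two arrow queries
   df <- union_all(select(ds1, x), select(ds2, x))
   # Currently, you'd have to materialize each side in memory first
   df <- union_all(select(ds1, x) %>% collect(), select(ds2, x) %>% collect())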
   ```
   
   I think for now you'd have to either collect each side in memory or, if the 
data really is too large, process each part separately and then union the 
results later.
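   
   One way the "process each part separately" route might look is sketched 
below. It assumes both sides end up with the same schema and that writing each 
query to a subdirectory of a shared output directory suits your setup. The 
sample data, `out_dir`, and the `part1`/`part2` names are hypothetical.
   
   ```r
   library(arrow)
   library(dplyr)

   # Hypothetical stand-ins for the two real datasets
   dir1 <- tempfile()
   dir2 <- tempfile()
   write_dataset(data.frame(x = c("a", "b"), y = 1:2), dir1)
   write_dataset(data.frame(x = c("c", "d"), z = 3:4), dir2)
   ds1 <- open_dataset(dir1)
   ds2 <- open_dataset(dir2)

   # Process each part separately, writing each query result to disk
   # instead of collecting it into memory
   out_dir <- tempfile()
   ds1 %>% select(x) %>% write_dataset(file.path(out_dir, "part1"))
   ds2 %>% select(x) %>% write_dataset(file.path(out_dir, "part2"))

   # "Union" later by opening the shared output directory as one dataset
   combined <- open_dataset(out_dir)
   ```
   
   The combined dataset stays on disk until you call `collect()`, so you can 
filter or aggregate it further before pulling anything into memory.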
   
   Thanks for bringing up this use case; it's definitely something we want 
better support for. I'll create a Jira ticket to implement `union_all` for 
arrow queries.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
