jonkeane commented on PR #41425:
URL: https://github.com/apache/arrow/pull/41425#issuecomment-2082990834
Would you mind adding examples of what this looks like? That might help the
discussion too.
> I'm not entirely sure this makes sense to add; since the original issue
> was opened, we've mostly satisfied the original use case as we implemented
> glimpse(), and calling head() then collect() makes a lot of sense as a workflow
> for previewing results, as we are explicitly pulling the data into R with those.
Agreed that `glimpse()` does a lot of what we need, though I suspect
there's a strong contingent of folks who don't know about or use `glimpse()`,
and it is a second command needed in these circumstances. `glimpse()` output is
also transposed, which folks might not care for (either aesthetically or on
principle). There is value in having a "real" `print()` method that
acts like a data.frame's, IMO.
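For reference, here is a rough sketch of the two preview workflows being compared. The table is made up for illustration (it mirrors the values in the `dbplyr` example), and this assumes a version of arrow recent enough to have `glimpse()` methods:

```r
library(arrow)
library(dplyr)

# Illustrative in-memory Arrow table (assumption: not from the PR itself)
tbl <- arrow_table(
  a = rep(c(3, 4, 1, 2), 1000),
  b = rep(c(5, 1, 2, NA), 1000)
)

# A lazy query: nothing is evaluated yet
query <- tbl |> filter(!is.na(b))

# Workflow 1: transposed preview, one line per column
glimpse(query)

# Workflow 2: explicitly pull the first rows into R as a data frame
preview <- query |> head(10) |> collect()
print(preview)
```

Both work today, but each needs an extra call beyond typing the query at the console.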
> For this to make sense, we'd need to swap out the R types for the Arrow
> types,
Agreed
> We still have the schema printing with Datasets, unless we want to extend
> this to those too, but then are we going to only do it for local Datasets?
Yeah, I thought this might be a step towards that, where under the hood
(possibly with some opt-in or opt-out) we run the query with `head()` and
return the result.
With `dbplyr` we get the head by default:
```
> library(dplyr)
> library(dbplyr)
> db <- memdb_frame(a = rep(c(3, 4, 1, 2), 100000), b = rep(c(5, 1, 2, NA), 100000))
> db |> filter(!is.na(b))
# Source:   SQL [?? x 2]
# Database: sqlite 3.45.2 [:memory:]
       a     b
   <dbl> <dbl>
 1     3     5
 2     4     1
 3     1     2
 4     3     5
 5     4     1
 6     1     2
 7     3     5
 8     4     1
 9     1     2
10     3     5
# ℹ more rows
# ℹ Use `print(n = ...)` to see more rows
```
With arrow queries that might not be possible in a lightweight way for
every case, but I bet we could cover most of them relatively cheaply
(again, possibly only as an opt-in if we're really worried about
long-running commands...).
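To make the opt-in idea concrete, here is a minimal sketch. Everything named here is hypothetical: `preview_query()` and the `arrow.print_preview` option do not exist in arrow, and this version shows R types rather than the Arrow types we'd want in a real implementation:

```r
library(arrow)
library(dplyr)

# Hypothetical helper (not part of arrow): preview a lazy query by
# evaluating only its first `n` rows, and only when the user has opted in.
preview_query <- function(x, n = 10) {
  # "arrow.print_preview" is a made-up option name used for illustration
  if (isTRUE(getOption("arrow.print_preview", FALSE))) {
    out <- x |> head(n) |> collect()
    print(out)
  } else {
    # Not opted in: fall back to the existing print method, no evaluation
    print(x)
    out <- NULL
  }
  invisible(out)
}

# Illustrative data (assumption: not from the PR itself)
tbl <- arrow_table(
  a = rep(c(3, 4, 1, 2), 1000),
  b = rep(c(5, 1, 2, NA), 1000)
)
query <- tbl |> filter(!is.na(b))

options(arrow.print_preview = TRUE)
res <- preview_query(query, n = 5)
```

Wrapping the query in `head(n)` before `collect()` keeps the amount of data pulled into R small, though it does not by itself guarantee the query is cheap to run, which is why the sketch keeps it behind an option.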