jonkeane commented on PR #41425:
URL: https://github.com/apache/arrow/pull/41425#issuecomment-2082990834

   Would you mind adding examples of what this looks like? That might help the 
discussion too.
   
   > I'm not entirely sure this makes sense to add; since the original issue 
was opened, we've mostly satisfied the original use case as we implemented 
glimpse(), and calling head() then collect() makes a lot of sense as a workflow 
for previewing results, as we are explicitly pulling the data into R with those.
   Agreed that `glimpse()` does a lot of what we need, though I suspect 
there's a strong contingent of folks who don't know about or use `glimpse()`, 
and it is a second command needed in these circumstances. `glimpse()` is also 
transposed, which folks might not care for (either aesthetically or on 
principle). There is value in having a "real" `print()` method that acts like 
a data.frame's, IMO.
   
   > For this to make sense, we'd need to swap out the R types for the Arrow 
types, 
   Agreed
   
   
   > We still have the schema printing with Datasets, unless we want to extend 
this to those too, but then are we going to only do it for local Datasets?
   Yeah, though this might be a step towards a setup where, under the hood 
(possibly with some opt-in or opt-out), we run the query with `head()` and 
return that.
   
   With `dbplyr` we get the head by default:
   
   ```
   > db <- memdb_frame(a = rep(c(3, 4, 1, 2), 100000), b = rep(c(5, 1, 2, NA), 100000))
   > db |> filter(!is.na(b))
   # Source:   SQL [?? x 2]
   # Database: sqlite 3.45.2 [:memory:]
          a     b
      <dbl> <dbl>
    1     3     5
    2     4     1
    3     1     2
    4     3     5
    5     4     1
    6     1     2
    7     3     5
    8     4     1
    9     1     2
   10     3     5
   # ℹ more rows
   # ℹ Use `print(n = ...)` to see more rows
   ```
   
   With arrow queries, that might not always be possible to do in a 
lightweight way, but I bet we could cover most of these cases relatively 
cheaply (and, again, possibly only as an opt-in if we're really worried about 
long-running commands...)
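   
   As a rough sketch of the "run the query with head and return that" idea, 
something like the following could work. This is only an illustration, not 
what the PR implements: `preview_query()` and the `arrow.preview_on_print` 
option are hypothetical names made up here, and it assumes an in-memory Table 
rather than a Dataset.
   
   ```
   library(arrow)
   library(dplyr)

   # Hypothetical helper (not part of arrow): evaluate only the first n rows
   # of a lazy query and pull them into R as a tibble.
   preview_query <- function(query, n = 10) {
     query |>
       head(n) |>
       collect()
   }

   tbl <- arrow_table(
     a = rep(c(3, 4, 1, 2), 100000),
     b = rep(c(5, 1, 2, NA), 100000)
   )
   query <- tbl |> filter(!is.na(b))

   # Opt-in through a hypothetical option, so printing stays cheap by default.
   if (isTRUE(getOption("arrow.preview_on_print", FALSE))) {
     print(preview_query(query))
   } else {
     print(query)
   }
   ```
   
   Keeping it behind an option would mean nobody gets a surprise long-running 
command just from printing a query.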
   


