DavZim opened a new issue, #14732:
URL: https://github.com/apache/arrow/issues/14732

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   When I print an Arrow dataset query that filters and then summarises, the 
filter operation is not shown.
   
   For example:
   ``` r
   library(arrow)
   library(dplyr)
   ds_file <- file.path(tempdir(), "mtcars")
   
   write_dataset(mtcars |> select(mpg, cyl), ds_file)
   ds <- open_dataset(ds_file)
   
   # filter is printed | EXPECTED
   ds |> filter(mpg > 25)
   #> FileSystemDataset (query)
   #> mpg: double
   #> cyl: double
   #> 
   #> * Filter: (mpg > 25)                                #<======
   #> See $.data for the source Arrow object
   
   
   # filter is NOT printed | NOT EXPECTED!
   ds |> 
     filter(mpg > 25) |> 
     summarise(mpg = mean(mpg))
   #> FileSystemDataset (query)
   #> mpg: double
   #>                                                               #<==== Missing?!
   #> See $.data for the source Arrow object
   
   # first filter is NOT printed | NOT EXPECTED!
   # second filter is printed | EXPECTED
   ds |> 
     filter(mpg > 25) |> 
     summarise(mpg = mean(mpg)) |> 
     filter(mpg  > 0)
   #> FileSystemDataset (query)
   #> mpg: double
   #> 
   #> * Filter: (mpg > 0)                                  #<==== Missing mpg > 25 ?!
   #> See $.data for the source Arrow object
   ```
   
   I would expect the printed query to show the filter step as well as the 
summarise step.
   
   I use R 4.1.1 with arrow version 10.0.0.
   
   ### Component(s)
   
   R

