[ 
https://issues.apache.org/jira/browse/ARROW-16577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537051#comment-17537051
 ] 

Jonathan Keane commented on ARROW-16577:
----------------------------------------

Thanks for the report! We don't currently support calling functions with the 
package namespace attached (for example, `dplyr::n()`), though it is something 
we are thinking about and plan to support (see ARROW-14575 for some discussion 
and possible approaches). We don't have a timeline for this, but it helps to 
know that someone is looking for it!
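
In the meantime, here is a rough sketch of two workarounds, based on the reprex 
quoted below and on the suggestion in the error message itself. The 
`result_arrow` and `result_r` names are only illustrative, and this assumes 
arrow 8.0.0 behaviour as described in the report:

``` r
library(arrow)
library(dplyr)

# Same toy dataset as in the report
dir <- file.path(tempdir(), "test-data")
write_dataset(data.frame(A = 1:10), dir)

# Option 1: call n() without the dplyr:: prefix, so arrow can translate it
result_arrow <- open_dataset(dir) %>%
  summarise(N = n()) %>%
  collect()

# Option 2: collect() first, then summarise in R, where dplyr::n() is fine
# (this pulls the whole dataset into memory, so it may not suit large data)
result_r <- open_dataset(dir) %>%
  collect() %>%
  summarise(N = dplyr::n())
```

Both should give a single row with `N` equal to 10 for this dataset; option 1 
keeps the aggregation inside Arrow, while option 2 does it in R after loading.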

If you don't mind, I'm going to close this issue, but please feel free to 
continue the discussion on ARROW-14575.

Thanks again!

> [R] dplyr `n` function cannot be called with `dplyr::n()`
> ---------------------------------------------------------
>
>                 Key: ARROW-16577
>                 URL: https://issues.apache.org/jira/browse/ARROW-16577
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: R
>    Affects Versions: 8.0.0
>            Reporter: Sam Bashevkin
>            Priority: Major
>
> I am trying to summarize an arrow dataset in R using the `n` function from 
> dplyr, but I noticed that it does not work when called via the `dplyr::n` 
> syntax, even though it works fine just as `n`. I also tried the `n_distinct` 
> function and ran into the same issue.
> ``` r
> library(arrow)
> #> 
> #> Attaching package: 'arrow'
> #> The following object is masked from 'package:utils':
> #> 
> #>     timestamp
> library(dplyr)
> #> 
> #> Attaching package: 'dplyr'
> #> The following objects are masked from 'package:stats':
> #> 
> #>     filter, lag
> #> The following objects are masked from 'package:base':
> #> 
> #>     intersect, setdiff, setequal, union
> dir<-file.path(tempdir(), "test-data")
> test_data <- data.frame(A=1:10)
> write_dataset(test_data, dir)
> # This does work
> data2<-open_dataset(dir)%>%
>     summarise(N=n())
> data2
> #> FileSystemDataset (query)
> #> N: int32
> #> 
> #> See $.data for the source Arrow object
> collect(data2)
> #> # A tibble: 1 × 1
> #>       N
> #>   <int>
> #> 1    10
> # But this does not work
> data1<-open_dataset(dir)%>%
>     summarise(N=dplyr::n())
> #> Error: Error : Expression dplyr::n() not supported in Arrow
> #> Call collect() first to pull data into R.
> data1
> #> Error in eval(expr, envir, enclos): object 'data1' not found
> ```
> <sup>Created on 2022-05-13 by the [reprex 
> package](https://reprex.tidyverse.org) (v2.0.1)</sup>
> <details style="margin-bottom:10px;">
> <summary>
> Session info
> </summary>
> ``` r
> sessioninfo::session_info()
> #> ─ Session info ───────────────────────────────────────────────────────────────
> #>  setting  value
> #>  version  R version 4.2.0 (2022-04-22 ucrt)
> #>  os       Windows 10 x64 (build 19044)
> #>  system   x86_64, mingw32
> #>  ui       RTerm
> #>  language (EN)
> #>  collate  English_United States.utf8
> #>  ctype    English_United States.utf8
> #>  tz       America/Los_Angeles
> #>  date     2022-05-13
> #>  pandoc   2.17.1.1 @ C:/Program Files/RStudio/bin/quarto/bin/ (via rmarkdown)
> #> 
> #> ─ Packages ───────────────────────────────────────────────────────────────────
> #>  package     * version date (UTC) lib source
> #>  arrow       * 8.0.0   2022-05-09 [1] CRAN (R 4.2.0)
> #>  assertthat    0.2.1   2019-03-21 [1] CRAN (R 4.2.0)
> #>  bit           4.0.4   2020-08-04 [1] CRAN (R 4.2.0)
> #>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.2.0)
> #>  cli           3.3.0   2022-04-25 [1] CRAN (R 4.2.0)
> #>  crayon        1.5.1   2022-03-26 [1] CRAN (R 4.2.0)
> #>  DBI           1.1.2   2021-12-20 [1] CRAN (R 4.2.0)
> #>  digest        0.6.29  2021-12-01 [1] CRAN (R 4.2.0)
> #>  dplyr       * 1.0.9   2022-04-28 [1] CRAN (R 4.2.0)
> #>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.2.0)
> #>  evaluate      0.15    2022-02-18 [1] CRAN (R 4.2.0)
> #>  fansi         1.0.3   2022-03-24 [1] CRAN (R 4.2.0)
> #>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.2.0)
> #>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.2.0)
> #>  generics      0.1.2   2022-01-31 [1] CRAN (R 4.2.0)
> #>  glue          1.6.2   2022-02-24 [1] CRAN (R 4.2.0)
> #>  highr         0.9     2021-04-16 [1] CRAN (R 4.2.0)
> #>  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.2.0)
> #>  knitr         1.39    2022-04-26 [1] CRAN (R 4.2.0)
> #>  lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.2.0)
> #>  magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.2.0)
> #>  pillar        1.7.0   2022-02-01 [1] CRAN (R 4.2.0)
> #>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.2.0)
> #>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.2.0)
> #>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.2.0)
> #>  reprex        2.0.1   2021-08-05 [1] CRAN (R 4.2.0)
> #>  rlang         1.0.2   2022-03-04 [1] CRAN (R 4.2.0)
> #>  rmarkdown     2.14    2022-04-25 [1] CRAN (R 4.2.0)
> #>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.2.0)
> #>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.2.0)
> #>  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.2.0)
> #>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.2.0)
> #>  tibble        3.1.7   2022-05-03 [1] CRAN (R 4.2.0)
> #>  tidyselect    1.1.2   2022-02-21 [1] CRAN (R 4.2.0)
> #>  tzdb          0.3.0   2022-03-28 [1] CRAN (R 4.2.0)
> #>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.2.0)
> #>  vctrs         0.4.1   2022-04-13 [1] CRAN (R 4.2.0)
> #>  withr         2.5.0   2022-03-03 [1] CRAN (R 4.2.0)
> #>  xfun          0.31    2022-05-10 [1] CRAN (R 4.2.0)
> #>  yaml          2.3.5   2022-02-21 [1] CRAN (R 4.2.0)
> #> 
> #>  [1] C:/Users/sbashevkin/AppData/Local/R/win-library/4.2
> #>  [2] C:/Program Files/R/R-4.2.0/library
> #> 
> #> ──────────────────────────────────────────────────────────────────────────────
> ```
> </details>



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
