TPDeramus opened a new issue, #39038:
URL: https://github.com/apache/arrow/issues/39038
### Describe the bug, including details regarding any error messages, version, and platform.
Hi developers.
I'm trying to use `full_join()` on two tables (both subset from the same
data, then filtered, transformed, and appended one at a time to save memory),
but it keeps throwing the following error:
```
Error: NotImplemented: Function 'coalesce' has no kernel matching input
types (numeric(0)
attr(,"class")
[1] NA, numeric(0)
attr(,"class")
```
Specifically, it looks something like the following:
```
library(arrow)
library(tidyverse)
library(fastDummies)

# `cohort_csvs` is defined elsewhere (the CSV sources for the dataset)
temp <- open_csv_dataset(sources = cohort_csvs) %>% compute()
# One row per distinct key
Subs <- data.frame(temp %>% distinct(key) %>% collect())

for (Subnum in 1:dim(Subs)[1]) {
  # Pull the rows for a single key into memory
  out <-
    data.frame(temp %>% filter(key == Subs[Subnum, ]) %>% collect())
  out[is.na(out)] <- 'NA'
  out$tags <- 'NA'
  # Add dummy columns for `terms`, then for `tags`
  out <-
    dummy_cols(
      out,
      select_columns = "terms",
      remove_selected_columns = FALSE,
      omit_colname_prefix = TRUE
    )
  out <-
    dummy_cols(
      out,
      select_columns = "tags",
      remove_selected_columns = FALSE,
      omit_colname_prefix = TRUE
    )
  # First iteration creates the Arrow table; later ones are joined onto it
  if (Subnum == 1) {
    Out_table <- arrow_table(out)
  } else {
    Out_table <- Out_table %>% full_join(out)
  }
}
```
However, once the loop gets past the first iteration and reaches the full
join, it throws the error no matter how I call `full_join()`:
```
Out_table %>% full_join(out)
Error: NotImplemented: Function 'coalesce' has no kernel matching input
types (numeric(0)
attr(,"class")
[1] NA, numeric(0)
attr(,"class")
[1] NA)
full_join(Out_table,arrow_table(out))
Error: NotImplemented: Function 'coalesce' has no kernel matching input
types (numeric(0)
attr(,"class")
[1] NA, numeric(0)
attr(,"class")
[1] NA)
full_join(Out_table,out)
Error: NotImplemented: Function 'coalesce' has no kernel matching input
types (numeric(0)
attr(,"class")
[1] NA, numeric(0)
attr(,"class")
[1] NA)
```
It does **not**, however, throw any error or show any issues with left,
right, inner, semi, or anti joins.
I need all columns to be retained in the join, even if they are filled with NAs.
Any idea what might be causing the issue?
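To make the expected behaviour concrete, here is a minimal sketch with made-up data frames (`a`, `b`, `implicit_keys`, and `expected` are hypothetical names, not from my script). It uses plain dplyr on data frames rather than arrow, so it only illustrates the result I would expect from `full_join()` and does not exercise the failing code path. Since `by` is omitted in my calls, the join keys are the column names the two inputs share, which the sketch spells out explicitly:

```r
library(dplyr)

# Made-up stand-ins for two iterations of `out` (not the real data).
a <- data.frame(key = 1L, terms = "x", x = 1L)
b <- data.frame(key = 2L, terms = "y", y = 1L)

# With `by` omitted, the join keys are the column names both inputs share.
implicit_keys <- intersect(names(a), names(b))
implicit_keys

# Expected shape of the result, computed with plain dplyr on data frames:
# every column from both inputs is kept, and cells without a match are NA.
expected <- full_join(a, b, by = implicit_keys)
expected
```

Listing the shared columns this way may also help check which columns end up as join keys in the real tables.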
Version info:
OS:
NAME="Red Hat Enterprise Linux Server"
VERSION="7.9 (Maipo)"
R Version:
R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
RStudio Version:
RStudio Server 2022.07.0 Build 548
Session Info:
```
sessionInfo()
R version 4.2.1 (2022-06-23)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Red Hat Enterprise Linux
Matrix products: default
BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices datasets utils methods base
other attached packages:
[1] lubridate_1.9.0 timechange_0.1.1 rmarkdown_2.18 here_1.0.1
fastDummies_1.7.3 arrow_12.0.1.1
[7] data.table_1.14.6 toolbox_0.1.0 janitor_2.2.0 forcats_0.5.2
stringr_1.4.1 dplyr_1.0.10
[13] purrr_0.3.5 readr_2.1.3 tidyr_1.2.1 tibble_3.1.8
ggplot2_3.4.0 tidyverse_1.3.2
loaded via a namespace (and not attached):
[1] httr_1.4.4 vroom_1.6.0 bit64_4.0.5
jsonlite_1.8.3 viridisLite_0.4.1 splines_4.2.1
[7] modelr_0.1.10 assertthat_0.2.1 pander_0.6.5 renv_0.16.0
googlesheets4_1.0.1 cellranger_1.1.0
[13] yaml_2.3.6 pillar_1.8.1 backports_1.4.1
lattice_0.20-45 glue_1.6.2 digest_0.6.30
[19] rvest_1.0.3 snakecase_0.11.1 colorspace_2.0-3
htmltools_0.5.5 Matrix_1.5-3 survey_4.1-1
[25] pkgconfig_2.0.3 broom_1.0.1 haven_2.5.1
scales_1.2.1 webshot_0.5.4 svglite_2.1.0
[31] tzdb_0.3.0 googledrive_2.0.0 generics_0.1.3 tictoc_1.1
ellipsis_0.3.2 DT_0.26
[37] withr_2.5.0 cli_3.4.1 survival_3.3-1
magrittr_2.0.3 crayon_1.5.2 readxl_1.4.1
[43] evaluate_0.18 fs_1.5.2 fansi_1.0.3 xml2_1.3.3
tableone_0.13.2 tools_4.2.1
[49] mitools_2.4 hms_1.1.2 gargle_1.2.1
lifecycle_1.0.3 munsell_0.5.0 reprex_2.0.2
[55] kableExtra_1.3.4 compiler_4.2.1 systemfonts_1.0.4 rlang_1.1.2
grid_4.2.1 rstudioapi_0.14
[61] htmlwidgets_1.6.0 gtable_0.3.1 DBI_1.1.3 R6_2.5.1
knitr_1.41 bit_4.0.5
[67] fastmap_1.1.0 utf8_1.2.2 rprojroot_2.0.3
stringi_1.7.8 parallel_4.2.1 Rcpp_1.0.9
[73] vctrs_0.6.4 dbplyr_2.2.1 tidyselect_1.2.0 xfun_0.35
```
### Component(s)
R