TPDeramus opened a new issue, #39038:
URL: https://github.com/apache/arrow/issues/39038

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   Hi developers.
   
   I'm trying to use `full_join()` on two tables (subsets of the same data that are filtered, transformed, and appended one at a time to save memory), but it keeps throwing the following error:
   
   ```
   Error: NotImplemented: Function 'coalesce' has no kernel matching input types (numeric(0)
   attr(,"class")
   [1] NA, numeric(0)
   attr(,"class")
   [1] NA)
   ```
   
   Specifically, it looks something like the following:
   ```
   library(arrow)
   library(tidyverse)
   library(fastDummies)
   
   # cohort_csvs (a vector of CSV paths) is not defined in this snippet
   temp <- open_csv_dataset(sources = cohort_csvs) %>% compute()
   
   # One row per distinct key
   Subs <- data.frame(temp %>% distinct(key) %>% collect())
   
   for (Subnum in seq_len(nrow(Subs))) {
     # Pull the rows for a single key into memory
     out <-
       data.frame(temp %>% filter(key == Subs[Subnum, ]) %>% collect())
     out[is.na(out)] <- 'NA'
     out$tags <- 'NA'
     # Add one dummy column per distinct value of `terms`, then of `tags`
     out <-
       dummy_cols(
         out,
         select_columns = "terms",
         remove_selected_columns = FALSE,
         omit_colname_prefix = TRUE
       )
     out <-
       dummy_cols(
         out,
         select_columns = "tags",
         remove_selected_columns = FALSE,
         omit_colname_prefix = TRUE
       )
     # First chunk seeds the Arrow table; later chunks are full-joined onto it
     if (Subnum == 1) {
       Out_table <- arrow_table(out)
     } else {
       Out_table <- Out_table %>% full_join(out)
     }
   }
   ```
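   
   Because `full_join(out)` is called without `by`, the join uses every column name the two tables share. One thing worth checking is therefore whether those shared columns have the same type on both sides. A minimal base-R sketch of that check follows; `compare_shared_columns`, `chunk_a`, and `chunk_b` are hypothetical names with made-up data, and a type mismatch is only a guess at the cause, not a confirmed diagnosis.
   
   ```r
   # Hypothetical helper: list the columns two data frames share (the implicit
   # join keys when `by` is omitted) together with each side's class.
   compare_shared_columns <- function(left, right) {
     shared <- intersect(names(left), names(right))
     first_class <- function(col) class(col)[1]
     data.frame(
       column = shared,
       left_class = unname(vapply(left[shared], first_class, character(1))),
       right_class = unname(vapply(right[shared], first_class, character(1))),
       stringsAsFactors = FALSE
     )
   }
   
   # Made-up stand-ins for two per-key chunks after dummy_cols()
   chunk_a <- data.frame(key = "s1", terms = "x", x = 1L, stringsAsFactors = FALSE)
   chunk_b <- data.frame(key = "s2", terms = "y", y = 1L, stringsAsFactors = FALSE)
   
   shared_info <- compare_shared_columns(chunk_a, chunk_b)
   print(shared_info)
   
   # Rows where the two sides disagree on type
   mismatched <- shared_info[shared_info$left_class != shared_info$right_class, ]
   print(mismatched)
   ```
   
   In the loop above, the same comparison could be run on `out` and a collected copy of `Out_table` just before the join.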
   
   However, once the loop gets past the first iteration and reaches the full join, it throws the error no matter how `full_join()` is called:
   ```
   Out_table %>% full_join(out)
   Error: NotImplemented: Function 'coalesce' has no kernel matching input types (numeric(0)
   attr(,"class")
   [1] NA, numeric(0)
   attr(,"class")
   [1] NA)
   
   full_join(Out_table,arrow_table(out))
   Error: NotImplemented: Function 'coalesce' has no kernel matching input types (numeric(0)
   attr(,"class")
   [1] NA, numeric(0)
   attr(,"class")
   [1] NA)
   
   full_join(Out_table,out)
   Error: NotImplemented: Function 'coalesce' has no kernel matching input types (numeric(0)
   attr(,"class")
   [1] NA, numeric(0)
   attr(,"class")
   [1] NA)
   ```
   
   It does **not**, however, throw any error or show any issues with left, right, inner, semi, or anti joins.
   
   I kind of need all columns to be retained during the joining, even if as NAs.
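   
   To make the desired result concrete, here is a minimal sketch using base R's `merge()` with `all = TRUE` on small in-memory data frames. The names `chunk_a`, `chunk_b`, and `wanted` and all values are hypothetical. This only illustrates the shape of the output that is wanted; it is not a fix for the Arrow error.
   
   ```r
   # Made-up stand-ins for two per-key chunks with different dummy columns
   chunk_a <- data.frame(key = "s1", terms = "x", x = 1L, stringsAsFactors = FALSE)
   chunk_b <- data.frame(key = "s2", terms = "y", y = 1L, stringsAsFactors = FALSE)
   
   # Full outer join on the shared columns (key, terms): every row and every
   # column is kept, and cells with no match are filled with NA.
   wanted <- merge(chunk_a, chunk_b, all = TRUE)
   print(wanted)
   ```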
   
   Any idea what might be causing the issue?
   
   Version info:
   OS:
   NAME="Red Hat Enterprise Linux Server"
   VERSION="7.9 (Maipo)"
   
   R Version:
   R version 4.2.1 (2022-06-23) -- "Funny-Looking Kid"
   
   RStudio Version:
   RStudio Server 2022.07.0 Build 548
   
   Session Info:
   ```
   sessionInfo()
   R version 4.2.1 (2022-06-23)
   Platform: x86_64-pc-linux-gnu (64-bit)
   Running under: Red Hat Enterprise Linux
   
   Matrix products: default
   BLAS/LAPACK: /usr/lib64/libopenblasp-r0.3.3.so
   
   locale:
    [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
    [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
    [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
   
   attached base packages:
   [1] stats     graphics  grDevices datasets  utils     methods   base     
   
   other attached packages:
    [1] lubridate_1.9.0   timechange_0.1.1  rmarkdown_2.18    here_1.0.1        fastDummies_1.7.3 arrow_12.0.1.1   
    [7] data.table_1.14.6 toolbox_0.1.0     janitor_2.2.0     forcats_0.5.2     stringr_1.4.1     dplyr_1.0.10     
   [13] purrr_0.3.5       readr_2.1.3       tidyr_1.2.1       tibble_3.1.8      ggplot2_3.4.0     tidyverse_1.3.2  
   
   loaded via a namespace (and not attached):
    [1] httr_1.4.4          vroom_1.6.0         bit64_4.0.5         jsonlite_1.8.3      viridisLite_0.4.1   splines_4.2.1      
    [7] modelr_0.1.10       assertthat_0.2.1    pander_0.6.5        renv_0.16.0         googlesheets4_1.0.1 cellranger_1.1.0   
   [13] yaml_2.3.6          pillar_1.8.1        backports_1.4.1     lattice_0.20-45     glue_1.6.2          digest_0.6.30      
   [19] rvest_1.0.3         snakecase_0.11.1    colorspace_2.0-3    htmltools_0.5.5     Matrix_1.5-3        survey_4.1-1       
   [25] pkgconfig_2.0.3     broom_1.0.1         haven_2.5.1         scales_1.2.1        webshot_0.5.4       svglite_2.1.0      
   [31] tzdb_0.3.0          googledrive_2.0.0   generics_0.1.3      tictoc_1.1          ellipsis_0.3.2      DT_0.26            
   [37] withr_2.5.0         cli_3.4.1           survival_3.3-1      magrittr_2.0.3      crayon_1.5.2        readxl_1.4.1       
   [43] evaluate_0.18       fs_1.5.2            fansi_1.0.3         xml2_1.3.3          tableone_0.13.2     tools_4.2.1        
   [49] mitools_2.4         hms_1.1.2           gargle_1.2.1        lifecycle_1.0.3     munsell_0.5.0       reprex_2.0.2       
   [55] kableExtra_1.3.4    compiler_4.2.1      systemfonts_1.0.4   rlang_1.1.2         grid_4.2.1          rstudioapi_0.14    
   [61] htmlwidgets_1.6.0   gtable_0.3.1        DBI_1.1.3           R6_2.5.1            knitr_1.41          bit_4.0.5          
   [67] fastmap_1.1.0       utf8_1.2.2          rprojroot_2.0.3     stringi_1.7.8       parallel_4.2.1      Rcpp_1.0.9         
   [73] vctrs_0.6.4         dbplyr_2.2.1        tidyselect_1.2.0    xfun_0.35 
   ```
   
   ### Component(s)
   
   R

