wjones127 commented on code in PR #15077:
URL: https://github.com/apache/arrow/pull/15077#discussion_r1060826403


##########
r/R/dplyr-join.R:
##########
@@ -72,7 +72,18 @@ right_join.arrow_dplyr_query <- function(x,
                                          suffix = c(".x", ".y"),
                                          ...,
                                          keep = FALSE) {
-  do_join(x, y, by, copy, suffix, ..., keep = keep, join_type = "RIGHT_OUTER")
+
+  # Initially keep join keys so we can coalesce them after when keep=FALSE
+  query <- do_join(x, y, by, copy, suffix, ..., keep = TRUE, join_type = "RIGHT_OUTER")
+
+  # If we are doing a right outer join and not keeping the join keys of
+  # both sides, we need to coalesce. Otherwise, rows that exist in the
+  # RHS will have NAs for the join keys.
+  if (!keep) {
+    query$selected_columns <- post_join_projection(names(x), names(y), handle_join_by(by, x, y), suffix)
+  }

Review Comment:
   Try this (works for me locally):
   
   ```R
     left_output <- if (!keep && join_type == "RIGHT_OUTER") {
       setdiff(names(x), by)
     } else {
       names(x)
     }
     right_output <- if (keep || join_type %in% c("FULL_OUTER", "RIGHT_OUTER")) {
       names(y)
     } else {
       setdiff(names(y), by)
     }
   ```
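
   To see what the snippet above selects for each join type, here is a rough
   sketch that wraps it in a standalone function. `join_output_names` is a
   hypothetical helper, not part of arrow, and it assumes `by` is a plain
   character vector of key names shared by both sides (no renamed keys).

   ```R
   # Hypothetical helper (not part of arrow): wraps the suggested logic so it
   # can be tried on plain character vectors of column names.
   # Assumption: `by` holds key names that are identical on both sides.
   join_output_names <- function(x_names, y_names, by, join_type, keep = FALSE) {
     # For a right join without keep, drop the left-hand key columns so the
     # keys come from the right-hand side only.
     left_output <- if (!keep && join_type == "RIGHT_OUTER") {
       setdiff(x_names, by)
     } else {
       x_names
     }
     # Keep the right-hand keys when keep = TRUE or for full/right outer joins;
     # otherwise drop them because the left-hand keys are already present.
     right_output <- if (keep || join_type %in% c("FULL_OUTER", "RIGHT_OUTER")) {
       y_names
     } else {
       setdiff(y_names, by)
     }
     list(left = left_output, right = right_output)
   }

   right <- join_output_names(c("id", "a"), c("id", "b"), by = "id",
                              join_type = "RIGHT_OUTER")
   left <- join_output_names(c("id", "a"), c("id", "b"), by = "id",
                             join_type = "LEFT_OUTER")
   ```

   With these inputs, the right join takes `id` from the right-hand side only
   (`right$left` is `"a"`, `right$right` is `c("id", "b")`), while the left
   join takes it from the left-hand side (`left$left` is `c("id", "a")`,
   `left$right` is `"b"`).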


