thisisnic commented on a change in pull request #12073:
URL: https://github.com/apache/arrow/pull/12073#discussion_r779651711
##########
File path: r/R/metadata.R
##########
@@ -133,24 +133,24 @@ remove_attributes <- function(x) {
 }
 
 arrow_attributes <- function(x, only_top_level = FALSE) {
+
+  att <- attributes(x)
+  removed_attributes <- remove_attributes(x)
+
   if (inherits(x, "grouped_df")) {
     # Keep only the group var names, not the rest of the cached data that dplyr
     # uses, which may be large
     if (requireNamespace("dplyr", quietly = TRUE)) {
       gv <- dplyr::group_vars(x)
       x <- dplyr::ungroup(x)
       # ungroup() first, then set attribute, bc ungroup() would erase it
-      attr(x, ".group_vars") <- gv
-    } else {
-      # Regardless, we shouldn't keep groups around
-      attr(x, "groups") <- NULL
+      att[[".group_vars"]] <- gv
+      removed_attributes <- c(removed_attributes, "groups", "class")

Review comment:
   When `remove_attributes()` is called on `x`, it hits this block of code:

       if (identical(class(x), c("tbl_df", "tbl", "data.frame"))) {
         removed_attributes <- c("class", "row.names", "names")
       } else if (inherits(x, "data.frame")) {
         removed_attributes <- c("row.names", "names")

   As `x` is a `grouped_df`, it fails the first condition and hits the second. That branch doesn't include "class", for reasons I don't fully understand, but adding it there makes other tests fail and changes the returned object type. I've tried a few other approaches, but they also make other tests fail, and I'm going down a bit of a rabbit hole.
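
   To illustrate the branch selection described in the comment, here is a minimal sketch in base R. `which_branch()` is a hypothetical stand-in that reproduces only the two quoted conditions; the final `else` returning `character()` is an assumption, not taken from the quoted code. The class vectors are set by hand so the sketch does not need dplyr or tibble installed.

```r
# Hypothetical stand-in for the quoted part of remove_attributes().
# Only the two quoted conditions are reproduced; the final else branch
# is an assumption added so the function always returns something.
which_branch <- function(x) {
  if (identical(class(x), c("tbl_df", "tbl", "data.frame"))) {
    c("class", "row.names", "names")
  } else if (inherits(x, "data.frame")) {
    c("row.names", "names")
  } else {
    character()
  }
}

df <- data.frame(g = c("a", "a", "b"), v = 1:3)

# Class vectors assigned manually to mimic a tibble and a grouped tibble.
plain_tbl <- structure(df, class = c("tbl_df", "tbl", "data.frame"))
grouped_tbl <- structure(
  df,
  class = c("grouped_df", "tbl_df", "tbl", "data.frame")
)

# Exact class match: first branch, so "class" is listed for removal.
which_branch(plain_tbl)

# The extra "grouped_df" entry makes identical() FALSE, so the second
# branch is used and "class" is not listed.
which_branch(grouped_tbl)
"class" %in% which_branch(grouped_tbl)
```

   Because `identical()` compares the whole class vector, any additional leading class such as "grouped_df" is enough to skip the first branch, which would explain why "class" has to be appended separately in the diff above.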