On 25/10/2023 2:18 a.m., Christian Asseburg wrote:
Hi! I came across this unexpected behaviour in R. First I thought it was a bug in 
the assignment operator <- but now I think it's maybe a bug in the way data 
frames are being printed. What do you think?

Using R 4.3.1:

x <- data.frame(A = 1, B = 2, C = 3)
y <- data.frame(A = 1)
x
  A B C
1 1 2 3
x$B <- y$A # works as expected
x
  A B C
1 1 1 3
x$C <- y[1] # makes C disappear
x
  A B A
1 1 1 1
str(x)
'data.frame':   1 obs. of  3 variables:
 $ A: num 1
 $ B: num 1
 $ C:'data.frame':      1 obs. of  1 variable:
  ..$ A: num 1

Why does the print(x) not show "C" as the name of the third element? I did mess 
up the data frame (and this was a mistake on my part), but finding the bug was harder 
because print(x) didn't show the C any longer.

y[1] is a dataframe with one column, i.e. it is identical to y. To get the result you expected, you should have used y[[1]] (or y$A, as in your first assignment), which extracts column 1 as a plain vector.
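
A minimal sketch of the difference between the two kinds of indexing, using only base R and the same x and y as in your example:

```r
y <- data.frame(A = 1)

# Single brackets index y as a list and keep the data frame wrapper.
class(y[1])          # "data.frame"

# Double brackets extract the column itself, here a numeric vector.
class(y[[1]])        # "numeric"

# Assigning the extracted vector gives an ordinary numeric column C.
x <- data.frame(A = 1, B = 2, C = 3)
x$C <- y[[1]]
names(x)             # "A" "B" "C"
is.data.frame(x$C)   # FALSE
```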

Since dataframes are lists, you can assign a dataframe as a column of another dataframe. The result has a single new column, and each row of that column holds the corresponding row of the dataframe you assigned, with all of its columns. This means that

 x$C <- y[1]

replaces the C column of x with a dataframe. The column keeps the name C (you can see this if you print names(x)), but because it contains a dataframe, print() shows the column name from y instead.
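
To check this for yourself, something like the following should work in base R. The sapply() call at the end is just my suggestion for spotting a nested dataframe column when the printed header is misleading; it is not something print() does for you:

```r
x <- data.frame(A = 1, B = 2, C = 3)
y <- data.frame(A = 1)
x$C <- y[1]

# The stored column names are unchanged.
names(x)                     # "A" "B" "C"

# The third column is itself a data frame, with its own column name.
class(x$C)                   # "data.frame"
names(x$C)                   # "A"

# First line of the printed output, i.e. the header that print() shows.
header <- capture.output(print(x))[1]
header

# Flag any column that is a nested data frame.
nested <- sapply(x, is.data.frame)
nested
```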

If you try

 x$D <- x

you'll see print() generate new names for the nested columns, but names(x) still returns A, B, C, D.
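
As a small sketch, starting from a fresh x so that the nested dataframe has several plain columns (my assumption here, to keep the output simple):

```r
x <- data.frame(A = 1, B = 2, C = 3)
x$D <- x              # D holds a copy of x as it was before the assignment

# The top-level names are exactly what was assigned.
names(x)              # "A" "B" "C" "D"

# The nested data frame keeps its own column names.
names(x$D)            # "A" "B" "C"

# print() flattens the nested columns and makes up names for them;
# compare the header here with names(x) above.
print(x)
```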

This is a situation where tibbles do a better job than dataframes: if you created x and y as tibbles instead of dataframes and executed your code, you'd see this:

  library(tibble)
  x <- tibble(A = 1, B = 2, C = 3)
  y <- tibble(A = 1)
  x$C <- y[1]
  x
  #> # A tibble: 1 × 3
  #>       A     B   C$A
  #>   <dbl> <dbl> <dbl>
  #> 1     1     2     1

Duncan Murdoch
